English SpeechDat Polyphone database DB1

Base de données "Polyphone" en anglais SpeechDat(M) DB1

119-026-533-342-5

ID: ELRA-S0011

The (polyphone-like) English SpeechDat(M) database was recorded within the framework of the SPEECHDAT(M) project. It contains recordings of 1,000 speakers, selected according to their demographic profile, who were recorded over digital telephone lines using fixed telephone sets. The material to be spoken was provided to each caller on a prompt sheet. The database is divided into two subsets: the phonetically rich sentences (one CD), known as DB2, and the application-oriented utterances (two CDs), known as DB1.
The recorded material in DB1 comprises immediately usable and relevant speech, including number and letter sequences, common control keywords, dates, times, money amounts, etc. This provides a realistic basis for training and assessing speaker-independent recognition of both isolated and continuous speech utterances, using whole-word modeling, phoneme-based approaches, or both. The sample rate for speech is 8 kHz, quantisation is 8 bit, and A-law encoding is used. This results in a data rate of 64 kbit/s.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
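
The signal format described above (8 kHz, 8 bit, A-law) can be illustrated with a short Python sketch. It checks the data-rate arithmetic and expands A-law bytes to linear PCM using the standard ITU-T G.711 expansion. The function names are hypothetical, and nothing here assumes a particular file layout or header format for the distributed CDs; the decoder simply takes a sequence of A-law bytes.

```python
# Minimal sketch: data-rate arithmetic and G.711 A-law expansion.
# Function names are illustrative and are not part of any ELRA tooling.

SAMPLE_RATE_HZ = 8000   # 8 kHz sampling, as stated in the description
BITS_PER_SAMPLE = 8     # 8-bit A-law samples


def data_rate_bits_per_second(sample_rate_hz=SAMPLE_RATE_HZ,
                              bits_per_sample=BITS_PER_SAMPLE):
    """Return the raw data rate of a single channel in bit/s."""
    return sample_rate_hz * bits_per_sample


def alaw_to_linear(code):
    """Expand one 8-bit A-law code (0..255) to a signed 16-bit PCM value."""
    code ^= 0x55                       # undo the even-bit inversion
    mantissa = (code & 0x0F) << 4      # 4-bit mantissa
    segment = (code & 0x70) >> 4       # 3-bit segment (exponent)
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # In A-law, a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(data):
    """Decode a bytes object of A-law samples into a list of PCM integers."""
    return [alaw_to_linear(b) for b in data]


if __name__ == "__main__":
    rate = data_rate_bits_per_second()
    print(f"{rate} bit/s = {rate // 8} bytes/s")
    print(decode_alaw(bytes([0xD5, 0x55, 0xAA, 0x2A])))
```

At this rate, one second of speech occupies 8,000 bytes, so one minute of audio is roughly 480 kB before any file headers or annotation files are counted.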

This is the English SpeechDat(M) database, recorded according to the conventions of the "Polyphone" databases. It contains recordings of 1,000 speakers, selected according to demographic criteria (age, sex, location, ...) and recorded over the telephone network.
The database is divided into two sets: the phonetically rich sentences and the application-oriented command words (digits, numbers, letter sequences, common command words, etc.). This provides a solid basis for building speaker-independent speech recognition systems, in both isolated-word and continuous-speech modes, using phonetic models or whole-word models. The sampling frequency is 8 kHz, with 8-bit A-law coding, giving a data rate of 64 kbit/s.
A pronunciation lexicon with its phonetic transcription in SAMPA is also provided.

MEMBER
Licence: Non Commercial Use - ELRA END USER
  academic: 11000.00 €
  commercial: 14000.00 €
Licence: Commercial Use - ELRA VAR
  academic: 14000.00 €
  commercial: 14000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
  academic: 20000.00 €
  commercial: 20000.00 €
Licence: Commercial Use - ELRA VAR
  academic: 20000.00 €
  commercial: 20000.00 €
23/01/1997 Downloadable