636 Language Resources (Page 1 of 32)

 Accented English GlobalPhone    
  • English

ID: ELRA-S0389

ISLRN: 574-579-221-841-3

The Accented English part of the GlobalPhone resources contains 63 recording sessions of Bulgarian, Chinese, German, and Indian native speakers reading 37 English sentences each, produced in GlobalPhone-style, i.e. 16kHz PCM encoded audio recordings of utterance-segmented read speech from the new...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 600.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR: 3000.00 € / 3000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 700.00 € / 3600.00 €
Licence: Commercial Use - ELRA VAR: 3600.00 € / 3600.00 €
 ACCOR - English    
  • English

ID: ELRA-S0001

ISLRN: 936-783-643-804-4

ACCOR is a unique acoustic and articulatory database recorded as part of the ESPRIT-ACCOR project investigating cross-language acoustic-articulatory correlations in coarticulatory processes. The European languages covered are: Catalan, English, French, German, Irish Gaelic, Italian and Swedish. ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 25.00 €
Licence: Commercial Use - ELRA VAR: 25.00 € / 25.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 75.00 €
Licence: Commercial Use - ELRA VAR: 75.00 € / 75.00 €
 Acoustic database for Polish concatenative speech synthesis    
  • Polish

ID: ELRA-S0342

ISLRN: 305-222-372-690-4

This database consists of 1443 nonsense words including all the diphones for the Polish language. The diphone is always placed at an unstressed syllable. The neighbourhood doesn’t influence the co-articulation of the diphone. The database includes information such as: the name of the diphone, co...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 35.00 € / 35.00 €
Licence: Commercial Use - ELRA VAR: 35.00 € / 35.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 100.00 € / 100.00 €
Licence: Commercial Use - ELRA VAR: 100.00 € / 100.00 €
 Acoustic database for Polish unit selection speech synthesis    
  • Polish

ID: ELRA-S0339

ISLRN: 981-910-282-065-4

This database contains parliamentary statements and newspaper reviews read by a semi-professional male speaker. It consists of a selection of 2150 sentences annotated and manually verified, including 100 rare phonemes in words. Prompts vary in length from 2.3 to 13.4 seconds, with an average leng...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 250.00 € / 1000.00 €
Licence: Commercial Use - ELRA VAR: 1000.00 € / 1000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 300.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR: 2000.00 € / 2000.00 €
 aGender    
  • German

ID: ELRA-S0365

ISLRN: 038-476-412-610-4

aGender contains speech sample recordings over public telephone lines with read and (semi-)spontaneous speech. Native German speakers called a voice portal from their private phone, read text, and answered some open questions. The purpose of the corpus is the automatic detection of gender and/or...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 327.00 € / 8127.00 €
Licence: Commercial Use - ELRA VAR: 8127.00 € / 8127.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 455.00 € / 8255.00 €
Licence: Commercial Use - ELRA VAR: 8255.00 € / 8255.00 €
 Albayzin corpus    
  • Spanish; Castilian

ID: ELRA-S0089

ISLRN: 443-392-902-600-9

This corpus consists of 3 sub-corpora of 16 kHz, 16-bit signals, recorded by 304 Castilian speakers. The 3 sub-corpora are:
- Phonetic corpus: 6,800 utterances of phonetically balanced sentences, including 1000 with phonetic segmentation.
- Geographic corpus: 6,800 utterances of sentences ext...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1000.00 € / 10000.00 €
Licence: Commercial Use - ELRA VAR: 10000.00 € / 10000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 2000.00 € / 12000.00 €
Licence: Commercial Use - ELRA VAR: 12000.00 € / 12000.00 €

Special offers are also available.

 Alcohol Language Corpus (BAS ALC)    
  • German

ID: ELRA-S0299

ISLRN: 780-368-852-139-3

ALC contains recordings of German speakers who are either intoxicated or sober. The type of speech ranges from read single digits to full conversation style. Recordings were made during a drinking test in which speakers drank beer or wine to reach a self-chosen level of alcoholic intoxication. The ac...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 510.00 € / 510.00 €
Licence: Commercial Use - ELRA VAR: 510.00 € / 510.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 1020.00 € / 1020.00 €
Licence: Commercial Use - ELRA VAR: 1020.00 € / 1020.00 €
 Al-Hayat Arabic Corpus    
  • Arabic

ID: ELRA-W0030

ISLRN: 365-777-769-398-7

The corpus was developed in the course of a research project at the University of Essex, in collaboration with the Open University. The corpus contains Al-Hayat newspaper articles with value added for Language Engineering and Information Retrieval applications development purposes. The data have ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 480.00 € / 960.00 €
Licence: Commercial Use - ELRA VAR: 960.00 € / 960.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 720.00 € / 1440.00 €
Licence: Commercial Use - ELRA VAR: 1440.00 € / 1440.00 €
 American/Canadian English Speech Recognition Corpus (headset+mobile)    
  • English

ID: ELRA-S0228-102

ISLRN: 992-319-311-431-0

This corpus comprises 12,974 entries uttered by 30 speakers (15 males and 15 females), recorded over 2 channels (headset and mobile in noisy restaurant/shopping mall/info center/hospital/station/car). Speech samples are stored as a sequence of 16-bit 48kHz for a total of 12 hours of speech per ch...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 3600.00 € / 3600.00 €
Licence: Commercial Use - ELRA VAR: 3600.00 € / 3600.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 3600.00 € / 3600.00 €
Licence: Commercial Use - ELRA VAR: 3600.00 € / 3600.00 €
 American Children Speech Data by Microphone - 50 Hours    
  • English

ID: ELRA-S0468

ISLRN: 178-575-028-743-8

This corpus was recorded by 219 American children, all native speakers. The recording texts are mainly storybooks, children's songs, spoken expressions, etc., with 350 sentences per speaker. Each sentence contains 4.5 words on average and is repeated 2.1 times on average. The recording device is hi-fi Bl...

MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 28785.00 € / 28785.00 €
NON MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 28785.00 € / 28785.00 €

Special offers are also available.

 American English Conversational Speech Recognition Corpus (Multi-Channel)    
  • English

ID: ELRA-S0228-93

ISLRN: 576-996-121-023-5

This corpus was recorded by 20 speakers (10 males and 10 females), over 7 channels (multi-channel in quiet office/home). Speech samples are stored as a sequence of 16-bit 16 kHz for a total of 10 hours of speech per channel.
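
As a rough guide to storage needs, the figures in this description can be turned into an estimate of raw audio size. The sketch below assumes uncompressed mono PCM per channel with no file headers or metadata; the actual delivery format of the corpus is not stated here, and the function name `pcm_bytes` is hypothetical.

```python
def pcm_bytes(hours: float, sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Raw PCM payload size in bytes, ignoring file headers and compression."""
    seconds = hours * 3600
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


# Values taken from the description: 16-bit, 16 kHz, 10 hours per channel, 7 channels.
per_channel = pcm_bytes(hours=10, sample_rate_hz=16_000, bit_depth=16)
total = pcm_bytes(hours=10, sample_rate_hz=16_000, bit_depth=16, channels=7)

print(f"{per_channel / 1e9:.2f} GB per channel, {total / 1e9:.2f} GB for 7 channels")
```

Under these assumptions the estimate is about 1.15 GB per channel and about 8.06 GB across all 7 channels.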

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5600.00 € / 5600.00 €
Licence: Commercial Use - ELRA VAR: 5600.00 € / 5600.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 5600.00 € / 5600.00 €
Licence: Commercial Use - ELRA VAR: 5600.00 € / 5600.00 €
 American English Speech Data by Mobile Phone - 800 Hours    
  • English

ID: ELRA-S0437

ISLRN: 629-877-109-625-1

1842 native American English speakers with authentic accents participated in the recording. The recorded script was designed by linguists, is based on scenes, and covers a wide range of topics including generic, interactive, on-board and home. The text is manually proofread with high accuracy. It matches with ...

MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 136800.00 € / 136800.00 €
NON MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 136800.00 € / 136800.00 €

Special offers are also available.

 American English Speech Data by Mobile Phone_Reading - 215 Hours    
  • English

ID: ELRA-S0467

ISLRN: 921-365-371-849-5

The data set contains speech data from 349 American English speakers, all of whom are American locals. It was recorded in a quiet environment. The recording contents cover various categories such as economics, entertainment, news and spoken language. It is manually transcribed and annotated with the starti...

MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 34722.50 € / 34722.50 €
NON MEMBER (academic / commercial)
Licence: Commercial Use - ELRA VAR: 34722.50 € / 34722.50 €

Special offers are also available.

 American English Speech Recognition Corpus (Desktop)    
  • English

ID: ELRA-S0228-79

ISLRN: 254-019-000-249-3

This corpus comprises 49,990 entries uttered by 50 speakers (25 males and 25 females), recorded over 2 channels (desktop in quiet office). Speech samples are stored as a sequence of 16-bit 16kHz for a total of 24.9 hours of speech per channel.

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 6000.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 6000.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
 American English Speech Recognition Corpus (Mobile) - 14.67 hours    
  • English

ID: ELRA-S0228-73

ISLRN: 817-988-141-738-4

This corpus comprises 14,988 entries uttered by 50 speakers (23 males and 27 females), recorded over the mobile telephone network. Speech samples are stored as a sequence of 16-bit 16 kHz for a total of 14.67 hours of speech.

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 6000.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 6000.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR: 6000.00 € / 6000.00 €
 American English Speech Recognition Corpus (Mobile) - 19.4 hours    
  • English

ID: ELRA-S0228-58

ISLRN: 968-856-860-742-9

This corpus comprises 39,243 entries uttered by 151 speakers (74 males and 77 females), recorded over the mobile telephone network. Speech samples are stored as a sequence of 16-bit 16kHz for a total of 19.4 hours of speech.
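
The entry count and total duration above give a rough sense of how long each recorded entry is. The sketch below simply divides one by the other; it assumes the stated 19.4 hours covers exactly the 39,243 entries, which the description does not confirm, and the variable names are illustrative only.

```python
# Figures taken from the description above.
entries = 39_243
speakers = 151
total_hours = 19.4

total_seconds = total_hours * 3600
avg_seconds_per_entry = total_seconds / entries   # mean entry duration
avg_entries_per_speaker = entries / speakers      # mean number of entries per speaker

print(f"about {avg_seconds_per_entry:.2f} s per entry")
print(f"about {avg_entries_per_speaker:.0f} entries per speaker")
```

Under that assumption, each entry averages a little under 2 seconds, which suggests short prompts rather than long sentences.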

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 2700.00 € / 2700.00 €
Licence: Commercial Use - ELRA VAR: 2700.00 € / 2700.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 2700.00 € / 2700.00 €
Licence: Commercial Use - ELRA VAR: 2700.00 € / 2700.00 €
 American Spanish Recognition Corpus (Desktop+Mobile)    
  • English

ID: ELRA-S0228-68

ISLRN: 100-009-143-020-4

This corpus comprises 33,527 entries uttered by 40 speakers (21 males and 19 females), recorded over 2 channels (desktop in quiet office and mobile in noisy restaurant). Speech samples are stored as a sequence of 16-bit 16kHz for a total of 14.7 hours of speech per channel.

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 4800.00 € / 4800.00 €
Licence: Commercial Use - ELRA VAR: 4800.00 € / 4800.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 4800.00 € / 4800.00 €
Licence: Commercial Use - ELRA VAR: 4800.00 € / 4800.00 €
 Amharic-English bilingual corpus    
  • Amharic
  • English

ID: ELRA-W0074

ISLRN: 590-255-335-719-0

The Amharic-English bilingual corpus contains parallel text from legal and news domains in Amharic script, in transliterated form and in English. The corpus contains 232,653 words in Amharic and 291,701 in English. This parallel corpus contains documents from two domains, namely legal...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR: 2000.00 € / 2000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER: 0.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR: 4000.00 € / 4000.00 €
 AnCora Catalan 2.0.0    
  • Catalan; Valencian

ID: ELRA-W0327

ISLRN: 186-654-762-852-8

The AnCora Catalan Corpus 2.0.0 is a corpus of 500,000 words annotated at different levels:
- Lemma and Part of Speech,
- Syntactic constituents and functions,
- Argument structure and thematic roles,
- Semantic classes of the verb,
- Denotative type of deverbal nouns,
- Nouns related to W...

MEMBER (academic / commercial)
Licence: Attribution, Commercial Use - GPL: 0.00 € / 0.00 €
NON MEMBER (academic / commercial)
Licence: Attribution, Commercial Use - GPL: 0.00 € / 0.00 €
