20 Language Resources

 Collins Multilingual database (MLD) – PhraseBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0383

ISLRN: 398-655-047-044-5

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the audio files corresponding t...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
3360.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4480.00 €
 Collins Multilingual database (MLD) – WordBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0382

ISLRN: 309-438-781-042-2

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the corresponding audio files c...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
3640.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
5200.00 €
 GlobalPhone 2000 Speaker Package    
  • Arabic
  • Bulgarian
  • Chinese
  • Croatian
  • Czech
  • French
  • German
  • Hausa
  • Japanese
  • Korean
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swahili (macrolanguage)
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

ID: ELRA-S0400

ISLRN: 331-592-378-424-7

The GlobalPhone 2000 Speaker Package contains transcribed read speech spoken by 2000 native speakers in 22 languages. The data are sampled from the GlobalPhone Speech and Text Data available in the ELRA Catalogue, i.e.: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Mandarin (ELRA-S0193), C...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
1200.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR
6000.00 € / 6000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
1400.00 € / 7200.00 €
Licence: Commercial Use - ELRA VAR
7200.00 € / 7200.00 €
 GlobalPhone Korean    
  • Korean

ID: ELRA-S0200

ISLRN: 520-329-707-787-0

The GlobalPhone corpus developed in collaboration with the Karlsruhe Institute of Technology (KIT) was designed to provide read speech data for the development and evaluation of large continuous speech recognition systems in the most widespread languages of the world, and to provide a uniform, mu...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
600.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR
3000.00 € / 3000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
700.00 € / 3600.00 €
Licence: Commercial Use - ELRA VAR
3600.00 € / 3600.00 €

Special offers are also available.

 GlobalPhone Multilingual Model Package    
  • Arabic
  • Bulgarian
  • Chinese
  • Croatian
  • Czech
  • French
  • German
  • Hausa
  • Japanese
  • Korean
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swahili (macrolanguage)
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

ID: ELRA-S0399

ISLRN: 204-945-263-927-6

The GlobalPhone Multilingual Model Package contains about 22 hours of transcribed read speech spoken by native speakers in 22 languages. The data are sampled from the GlobalPhone Speech and Text Data available in the ELRA Catalogue, i.e.: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Manda...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
1200.00 € / 6000.00 €
Licence: Commercial Use - ELRA VAR
6000.00 € / 6000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
1400.00 € / 7200.00 €
Licence: Commercial Use - ELRA VAR
7200.00 € / 7200.00 €
 Korean Speech Recognition Corpus (desktop) – digit string (110 people)    
  • Korean

ID: ELRA-S0228-52

ISLRN: 331-652-936-814-0

This corpus comprises 13,200 Korean digit strings uttered by 110 speakers of different dialects, ages and various educational levels, recorded over 4 channels. Speech samples are stored as a sequence of 16-bit 48kHz WAV for 18.89 hours of speech per channel. The total capacity of the data is 24.2...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4000.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € / 4000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4000.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € / 4000.00 €
 Korean Speech Recognition Corpus (Desktop+Mobile)    
  • Korean

ID: ELRA-S0228-103

ISLRN: 852-908-669-816-4

This corpus comprises 32,247 entries uttered by 52 speakers (26 males and 26 females), recorded over 3 channels (desktop and mobile in quiet office). Speech samples are stored as a sequence of 16-bit 48kHz for a total of 15.76 hours of speech per channel.
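As a rough sanity check on storage needs, the figures above (16-bit, 48 kHz, 15.76 hours per channel, 3 channels) can be turned into an approximate data volume. The sketch below assumes uncompressed mono linear PCM per channel and ignores file headers; neither assumption is stated in the catalogue entry, and the function name `pcm_size_bytes` is purely illustrative.

```python
def pcm_size_bytes(hours, sample_rate_hz=48_000, bits_per_sample=16, channels=1):
    """Estimate the size of uncompressed PCM audio in bytes.

    Assumes linear PCM with no compression and ignores container headers,
    so the result is an approximation, not a catalogue figure.
    """
    seconds = hours * 3600
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8)
    return round(seconds * bytes_per_second * channels)


# Figures taken from the entry above: 15.76 hours per channel, 3 channels.
per_channel = pcm_size_bytes(15.76)
total = pcm_size_bytes(15.76, channels=3)
print(f"{per_channel / 1e9:.2f} GB per channel, {total / 1e9:.2f} GB in total")
```

The same function can be applied to the other Korean Speech Recognition entries by substituting their hours-per-channel and channel counts; small differences from the listed capacities are to be expected, since those may include transcriptions, headers, or a different unit convention.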

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR
5000.00 € / 5000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
5000.00 € / 5000.00 €
Licence: Commercial Use - ELRA VAR
5000.00 € / 5000.00 €
 Korean Speech Recognition corpus (Desktop) - name, digit string, place, sentences    
  • Korean

ID: ELRA-S0228-62

ISLRN: 832-286-766-358-8

This corpus comprises 83,756 entries uttered by 150 speakers (66 males and 84 females), recorded over 4 channels (desktop in quiet office). Speech samples are stored as a sequence of 16-bit 48kHz for a total of 29.65 hours of speech per channel. This set combines ELRA-S0228-50, ELRA-S0228-51, ELR...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
15000.00 € / 15000.00 €
Licence: Commercial Use - ELRA VAR
15000.00 € / 15000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
15000.00 € / 15000.00 €
Licence: Commercial Use - ELRA VAR
15000.00 € / 15000.00 €
 Korean Speech Recognition Corpus (desktop) – person name (150 people)    
  • Korean

ID: ELRA-S0228-50

ISLRN: 146-478-758-877-6

This corpus comprises 1,500 Korean person names uttered by 150 speakers of different dialects, ages and various educational levels, recorded over 4 channels. Speech samples are stored as a sequence of 16-bit 48kHz WAV for 1.69 hours of speech per channel. The total capacity of the data is 2 Gb. ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4000.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € / 4000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4000.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € / 4000.00 €
 Korean Speech Recognition Corpus (desktop) – place name (150 people)    
  • Korean

ID: ELRA-S0228-51

ISLRN: 227-476-631-391-1

This corpus comprises 1,500 Korean place names uttered by 150 speakers of different dialects, ages and various educational levels, recorded over 4 channels. Speech samples are stored as a sequence of 16-bit 48kHz WAV for 1.65 hours of speech per channel. The total capacity of the data is 2 Gb. E...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4000.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € / 4000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
4000.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € / 4000.00 €
 Korean Speech Recognition Corpus (desktop) – single Korean sentences (40 people)    
  • Korean

ID: ELRA-S0228-53

ISLRN: 892-259-683-285-7

This corpus comprises 4,800 Korean sentences uttered by 40 speakers of different dialects, ages and various educational levels, recorded over 4 channels. Speech samples are stored as a sequence of 16-bit 48kHz WAV for 7.43 hours of speech per channel. The total capacity of the data is 9.82 Gb. E...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
3000.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR
3000.00 € / 3000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
3000.00 € / 3000.00 €
Licence: Commercial Use - ELRA VAR
3000.00 € / 3000.00 €
 Korean Speecon database    
  • Korean

ID: ELRA-S0177

ISLRN: 429-596-342-929-3

The Korean Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 568 adult Korean speakers (259 males, 309 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises the record...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
50000.00 € / 67000.00 €
Licence: Commercial Use - ELRA VAR
67000.00 € / 67000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
60000.00 € / 75000.00 €
Licence: Commercial Use - ELRA VAR
75000.00 € / 75000.00 €
 LILA Korean database    
  • Korean

ID: ELRA-S0295

ISLRN: 391-771-784-796-1

The LILA Korean database collected in South Korea was recorded within the scope of the LILA project. It contains the recordings of 1,000 Korean speakers (500 males and 500 females) recorded over the Korean mobile telephone network. The following acoustic conditions were selected as representativ...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
43125.00 € / 47500.00 €
Licence: Commercial Use - ELRA VAR
47500.00 € / 47500.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
46405.00 € / 51875.00 €
Licence: Commercial Use - ELRA VAR
51875.00 € / 51875.00 €
 Phonetically Balanced Sentences    
  • Korean

ID: ELRA-S0129

ISLRN: 134-396-214-473-8

Large acoustic corpus in Korean produced by Kaist Korterm. 20 native Korean speakers (males and females) read 539 sentences and a set of 50 common sentences once. Information such as the size and the level of studies of the speakers is provided. The recordings took place in a soundproof room. T...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
500.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR
2000.00 € / 2000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
1000.00 € / 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € / 4000.00 €
 Phonetically Balanced Words (1)    
  • Korean

ID: ELRA-S0124

ISLRN: 511-274-208-444-3

Large acoustic corpus of read text in Korean. 2 announcers and 70 native speakers were recorded (38 males, 32 females), distributed across 4 age classes. They read 452 eojeols (Korean terms) twice, and 2 announcers read 2000 eojeols once. In these 2000 eojeols, the above 452 eo...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
250.00 € / 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € / 1000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
500.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR
2000.00 € / 2000.00 €
 Phonetically Balanced Words (2)    
  • Korean

ID: ELRA-S0125

ISLRN: 270-301-778-832-8

Large acoustic corpus of read text in Korean produced by Kaist Korterm. Native Korean speakers (males and females) uttered 36 geographical proper nouns. Information such as the size and the level of studies of the speakers is provided. The recordings took place in a soundproof room. The dat...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
50.00 € / 200.00 €
Licence: Commercial Use - ELRA VAR
200.00 € / 200.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
100.00 € / 400.00 €
Licence: Commercial Use - ELRA VAR
400.00 € / 400.00 €
 Phonetically Balanced Words (3)    
  • Korean

ID: ELRA-S0126

ISLRN: 327-664-548-453-3

Large acoustic corpus in Korean produced by Kaist Korterm. Two announcers and 70 native speakers (males and females) read one paragraph twice. Information such as the size and the level of studies of the speakers is provided. The recordings took place in a soundproof room. The data are store...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
63.00 € / 250.00 €
Licence: Commercial Use - ELRA VAR
250.00 € / 250.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
125.00 € / 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € / 500.00 €
 Phonetically Balanced Words (4)    
  • Korean

ID: ELRA-S0127

ISLRN: 081-051-013-524-4

Large acoustic corpus in Korean produced by Kaist Korterm. 70 native Korean speakers (males and females) read 32 cardinal numbers and 9 one-syllable determinatives 4 times. Two announcers read these only twice. Information such as the size and the level of studies of the speakers are provide...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
200.00 € / 800.00 €
Licence: Commercial Use - ELRA VAR
800.00 € / 800.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
400.00 € / 1600.00 €
Licence: Commercial Use - ELRA VAR
1600.00 € / 1600.00 €
 Phonetically Balanced Words (5)    
  • Korean

ID: ELRA-S0128

ISLRN: 605-115-604-193-3

Large acoustic corpus in Korean produced by Kaist Korterm. 70 native Korean speakers (males and females) read 35 cardinal numbers, each composed of 4 single numbers, 4 times. Two announcers read these only twice. Information such as the size and the level of studies of the speakers is provided. T...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
250.00 € / 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € / 1000.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
500.00 € / 2000.00 €
Licence: Commercial Use - ELRA VAR
2000.00 € / 2000.00 €
 Phonetically Rich Words    
  • Korean

ID: ELRA-S0130

ISLRN: 222-999-434-634-2

Large acoustic corpus in Korean produced by Kaist Korterm. 500 native speakers were recorded (250 males, 250 females). They uttered 32 single cardinal numbers, 1620 cardinal numbers composed of 4 single numbers, and 3813 phonetically rich words. The recordings took place in natural env...

MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
313.00 € / 1250.00 €
Licence: Commercial Use - ELRA VAR
1250.00 € / 1250.00 €
NON MEMBER (academic / commercial)
Licence: Non Commercial Use - ELRA END USER
625.00 € / 2500.00 €
Licence: Commercial Use - ELRA VAR
2500.00 € / 2500.00 €