Resource Type:
Corpus: | ![]() |
Lexical/Conceptual: | ![]() |
Tool/Service: | ![]() |
Language Description: | ![]() |
Media Type:
Text: | ![]() |
Audio: | ![]() |
Image: | ![]() |
Video: | ![]() |
Text Numerical: | ![]() |
Text N-Gram: | ![]() |
1684 Language Resources (Page 67 of 85)
« Previous | Next »Order by:


- Chinese
- French
ID: ELRA-W0111
ISLRN: 153-566-144-442-2This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in French. The source texts are newspaper articles from the Chinese version of Voice of America. Articles are dated from 2011 and 2012. The translation has been conducted by two dif...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
150.00 €
![]() |
500.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
500.00 €
![]() |
500.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
300.00 €
![]() |
1000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
1000.00 €
![]() |
1000.00 €
![]() |


- Chinese
- French
ID: ELRA-W0125
ISLRN: 713-266-631-883-0TRAD Chinese-French Parallel Text - Blog was developed by ELDA as part of the PEA-TRAD project. It contains English translations of a subset of approximately 10,000 Chinese words from GALE Phase 1 Chinese Blog Parallel Text (available here: https://catalog.ldc.upenn.edu/LDC2008T06). The PEA-TR...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - Non Standard Licence Terms |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - Non Standard Licence Terms |


- Chinese
- French
ID: ELRA-W0109
ISLRN: 464-017-697-777-3This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in French. The source texts are blog articles dealing with various subjects such as economy, environment, society, technologies, etc. Articles are dated from June 2013. The translat...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
150.00 €
![]() |
500.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
500.00 €
![]() |
500.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
300.00 €
![]() |
1000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
1000.00 €
![]() |
1000.00 €
![]() |


- Pushto; Pashto
ID: ELRA-S0381
ISLRN: 918-508-885-913-7This corpus contains transcribed broadcast news recordings in Pashto. Recordings are collected from 5 sources: Ashna TV, Azadi Radio, Deewa Radio, Mashaal Radio and Shamshad TV. The corpus contains 108 hours of recordings covering more than 1,000 speakers. Transcriptions are provided together ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
2000.00 €
![]() |
20000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
20000.00 €
![]() |
20000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
3500.00 €
![]() |
28000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
28000.00 €
![]() |
28000.00 €
![]() |


- English
- Pushto; Pashto
ID: ELRA-W0097
ISLRN: 612-936-517-010-2This is a parallel corpus, which contains 10,000 Pashto words translated into English by two different translators. The source texts have been collected from the following news websites: Azadiradio, Mashaal and Voice of America Pashto. The content has also been translated into French (see ELRA-W...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
350.00 €
![]() |
1000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
1000.00 €
![]() |
1000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
500.00 €
![]() |
2000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
2000.00 €
![]() |
2000.00 €
![]() |


- English
- Pushto; Pashto
ID: ELRA-W0095
ISLRN: 006-102-605-738-4This is a parallel corpus, which contains 10,000 Pashto words translated into English. The source texts come from 3 broadcast news transcriptions of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). These texts are VOA Ashna TV programs recorded on 15/01/2011, 18/01/2011 and 19/01/2011. ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
350.00 €
![]() |
1000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
1000.00 €
![]() |
1000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
500.00 €
![]() |
2000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
2000.00 €
![]() |
2000.00 €
![]() |


- French
- Pushto; Pashto
ID: ELRA-W0096
ISLRN: 649-628-149-051-7This is a parallel corpus, which contains 10,000 Pashto words translated into French by two different translators. The source texts have been collected from the following news websites: Azadiradio, Mashaal and Voice of America Pashto. The content has also been translated into English (see ELRA-W...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
350.00 €
![]() |
1000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
1000.00 €
![]() |
1000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
500.00 €
![]() |
2000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
2000.00 €
![]() |
2000.00 €
![]() |


- French
- Pushto; Pashto
ID: ELRA-W0094
ISLRN: 547-897-479-723-3This is a parallel corpus, which contains 10,000 Pashto words translated into French by two different translators. The source texts come from 3 broadcast news transcriptions of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). These texts are VOA Ashna TV programs recorded on 15/01/2011,...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
350.00 €
![]() |
1000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
1000.00 €
![]() |
1000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
500.00 €
![]() |
2000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
2000.00 €
![]() |
2000.00 €
![]() |


- French
- Pushto; Pashto
ID: ELRA-W0093
ISLRN: 802-643-297-429-4The corpus consists of the transcription of 106 hours of recordings in Pashto translated into French. The transcriptions are extracted from the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). It contains about 832,000 source words and 747,000 target words. No audio file is provided. Pasht...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
3000.00 €
![]() |
10000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
10000.00 €
![]() |
10000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
4000.00 €
![]() |
18000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
18000.00 €
![]() |
18000.00 €
![]() |


- Pushto; Pashto
ID: ELRA-W0092
ISLRN: 394-903-293-388-0This is a monolingual text corpus in Pashto. The corpus contains about 112,000,000 tokens collected from 46 different blogs and websites. Identified and negotiated or freely available sources have been crawled in 2012, cleaned and XML-formatted. Pashto is an indo-iranian language spoken by th...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
1200.00 €
![]() |
3500.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
3500.00 €
![]() |
3500.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
2000.00 €
![]() |
5000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
5000.00 €
![]() |
5000.00 €
![]() |


- Arabic
- English
ID: ELRA-W0126
ISLRN: 986-364-744-303-9The dataset is composed of two distinct resources: 1) A collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts. The training part of the corpus contains: 522 tweets composed of 5,207 token...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
500.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
500.00 €
![]() |
500.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
650.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
650.00 €
![]() |
650.00 €
![]() |


- English
ID: ELRA-S0120
ISLRN: 502-719-830-448-5LDC reference: https://catalog.ldc.upenn.edu/LDC2002T03 The Translanguage English Database (TED) Transcripts corpus contains transcriptions of thirty-nine of the 188 speeches of the TED Corpus made at Eurospeech'93 in Berlin. The thirty-nine transcripts in this publication are in Universal Tra...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
0.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
0.00 €
![]() |
0.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
0.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
0.00 €
![]() |
0.00 €
![]() |


- English
- Norwegian
ID: ELRA-W0156
ISLRN: 909-695-133-060-3This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Translation memories containing translations of EU legis...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 |
0.00 €
![]() |
0.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 |
0.00 €
![]() |
0.00 €
![]() |


- English
- Swedish
ID: ELRA-W0236
ISLRN: 709-518-556-855-4This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Translation memory from Swedish National Audit Office
MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain |
0.00 €
![]() |
0.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain |
0.00 €
![]() |
0.00 €
![]() |


- English
- Lithuanian
ID: ELRA-W0165
ISLRN: 691-158-541-313-8This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Translation Memories of Lithuanian legislation from Seim...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 |
0.00 €
![]() |
0.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution - CC-BY-4.0 |
0.00 €
![]() |
0.00 €
![]() |


- English
- French
- Modern Greek (1453-)
ID: ELRA-W0307
ISLRN: 954-287-236-137-4This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Trilingual (Greek-English-French) documents - standard f...
MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain |
0.00 €
![]() |
0.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Other - Public Domain |
0.00 €
![]() |
0.00 €
![]() |


- English
- French
- German
ID: ELRA-W0013
ISLRN: 717-350-913-018-8The TSNLP project (LRE 62-089) has produced a database of test suites for English, French and German containing over 4,000 test items (sentences or fragment of sentences) per language which have been constructed for evaluating natural language processing systems, but which may also be useful for ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
100.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
100.00 €
![]() |
100.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
100.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
100.00 €
![]() |
100.00 €
![]() |



- English
ID: ELRA-W0048
ISLRN: 799-660-957-954-5TUNA (Towards a UNified Algorithm for the generation of referring expressions) is a research project funded by the UK's Engineering and Physical Sciences Research Council (EPSRC). The TUNA Corpus of Referring Expressions is built with the contributions from 50 native or fluent speakers of Engl...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
45.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
45.00 €
![]() |
45.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
![]() |
45.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
45.00 €
![]() |
45.00 €
![]() |


- Turkish
ID: ELRA-S0121
ISLRN: 192-049-804-522-1This Turkish speech database was produced by the department of Théorie des Circuits et Traitement de Signal at the Faculté Polytechnique de Mons. The corpus was designed to provide read speech data for speech recognition purposes. The database contains 14 hours of speech (1618 words) from 43 Turk...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
400.00 €
![]() |
3000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
3000.00 €
![]() |
3000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
800.00 €
![]() |
6000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
6000.00 €
![]() |
6000.00 €
![]() |


- Turkish
ID: ELRA-S0178
ISLRN: 539-782-381-710-6The Turkish Speecon database is divided into 2 sets: 1) The first set comprises the recordings of 550 adult Turkish speakers (280 males, 270 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place). 2) The second set comprises the reco...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
50000.00 €
![]() |
67000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
67000.00 €
![]() |
67000.00 €
![]() |
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
60000.00 €
![]() |
75000.00 €
![]() |
Licence: Commercial Use - ELRA VAR |
75000.00 €
![]() |
75000.00 €
![]() |