22 Language Resources (Page 1 of 2)


 ARCADE II Evaluation Package    
  • Arabic
  • Chinese
  • English
  • French
  • German
  • Italian
  • Japanese
  • Modern Greek (1453-)
  • Persian
  • Russian
  • Spanish; Castilian

ID: ELRA-E0018

ISLRN: 875-865-064-331-9

The ARCADE II Evaluation Package was produced within the French national project ARCADE II (Evaluation of parallel text alignment systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ARCADE II project made it possible to carry out a cam...

MEMBER
Licence: Evaluation Use - ELRA EVALUATION (academic: 150.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Evaluation Use - ELRA EVALUATION (academic: 300.00 €; commercial: 1000.00 €)
 ECI/MCI (European Corpus Initiative/Multilingual Corpus I)    
  • Albanian
  • Bulgarian
  • Chinese
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • French
  • German
  • Italian
  • Japanese
  • Latin
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Portuguese
  • Russian
  • Scottish Gaelic; Gaelic
  • Serbian
  • Spanish; Castilian
  • Swedish
  • Turkish
  • Uzbek

ID: ELRA-W0004

ISLRN: 511-168-567-582-5

The European Corpus Initiative (ECI) was founded to oversee the acquisition and preparation of a large multilingual corpus, and supports existing and projected national and international efforts to carefully design, collect and publish large-scale multilingual written and spoken corpora. ECI has ...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 50.00 €; commercial: 50.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 50.00 €; commercial: 50.00 €)
 Multilingual Corpus    
  • Chinese
  • English
  • Korean

ID: ELRA-W0035

ISLRN: 731-151-596-869-3

Multilingual parallel corpus produced by KAIST KORTERM, containing 60,000 expressions in Korean, Chinese and English.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 750.00 €; commercial: 3000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 3000.00 €; commercial: 3000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 1500.00 €; commercial: 6000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 6000.00 €; commercial: 6000.00 €)
 NE3L named entities Chinese corpus    
  • Chinese

ID: ELRA-W0079

ISLRN: 187-154-782-686-9

The NE3L project (Named Entities 3 Languages) consisted of annotating corpora in several languages with named entities. The text data were extracted from newspapers and cover various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 5000.00 €; commercial: 5000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 5000.00 €; commercial: 5000.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 5000.00 €; commercial: 5000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 5000.00 €; commercial: 5000.00 €)
 Original Short-Message Data Collation II in Chinese    
  • Chinese

ID: ELRA-W0045-05

ISLRN: 004-512-635-005-4

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains the original messages. All data have been proofread manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 11942.00 €; commercial: 11942.00 €)
Licence: Commercial Use - ELRA VAR (academic: 11942.00 €; commercial: 11942.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 11942.00 €; commercial: 11942.00 €)
Licence: Commercial Use - ELRA VAR (academic: 11942.00 €; commercial: 11942.00 €)
 Original Short-Message Data Collation II in Chinese (named entities)    
  • Chinese

ID: ELRA-W0045-08

ISLRN: 753-094-616-225-9

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains original messages together with named entities. All data have been proofread manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
 Original Short-Message Data Collation II in Chinese (participles)    
  • Chinese

ID: ELRA-W0045-07

ISLRN: 747-585-323-393-8

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains original messages together with participles. All data have been proofread manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
 Original Short-Message Data Collation II in Chinese (PinYin)    
  • Chinese

ID: ELRA-W0045-06

ISLRN: 745-287-055-486-8

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains original messages together with PinYin transcription. All data, including the PinYin transcription, have been proofread manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
 Original Short-Message Data Collation I in Chinese    
  • Chinese

ID: ELRA-W0045-01

ISLRN: 453-260-875-772-3

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains the original messages. All data have been proofread manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 14928.00 €; commercial: 14928.00 €)
Licence: Commercial Use - ELRA VAR (academic: 14928.00 €; commercial: 14928.00 €)
 Original Short-Message Data Collation I in Chinese (named entities)    
  • Chinese

ID: ELRA-W0045-04

ISLRN: 169-161-744-054-8

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains original messages together with named entities. All data have been proofread and tagged manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 18659.00 €; commercial: 18659.00 €)
Licence: Commercial Use - ELRA VAR (academic: 18659.00 €; commercial: 18659.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 18659.00 €; commercial: 18659.00 €)
Licence: Commercial Use - ELRA VAR (academic: 18659.00 €; commercial: 18659.00 €)
 Original Short-Message Data Collation I in Chinese (participles)    
  • Chinese

ID: ELRA-W0045-03

ISLRN: 327-586-643-099-5

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains original messages together with participles. All data have been proofread manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 18659.00 €; commercial: 18659.00 €)
Licence: Commercial Use - ELRA VAR (academic: 18659.00 €; commercial: 18659.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 18659.00 €; commercial: 18659.00 €)
Licence: Commercial Use - ELRA VAR (academic: 18659.00 €; commercial: 18659.00 €)
 Original Short-Message Data Collation I in Chinese (PinYin)    
  • Chinese

ID: ELRA-W0045-02

ISLRN: 910-780-238-099-2

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains original messages together with PinYin transcription. All data, including the PinYin transcription, have been proofread manually.

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 18659.00 €; commercial: 18659.00 €)
Licence: Commercial Use - ELRA VAR (academic: 18659.00 €; commercial: 18659.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 18659.00 €; commercial: 18659.00 €)
Licence: Commercial Use - ELRA VAR (academic: 18659.00 €; commercial: 18659.00 €)
 The Lancaster Corpus of Mandarin Chinese (LCMC)    
  • Chinese

ID: ELRA-W0039

ISLRN: 990-638-120-277-2

The Lancaster Corpus of Mandarin Chinese (LCMC) is designed as a Chinese match for the FLOB and FROWN corpora for modern British and American English. The corpus is suitable for use in both monolingual research into modern Mandarin Chinese and cross-linguistic contrast of Chinese and British/Ame...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €; commercial: 7500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 7500.00 €; commercial: 7500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 0.00 €; commercial: 12000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 12000.00 €; commercial: 12000.00 €)
 TRAD Chinese-English Email Parallel corpus – Development Set    
  • Chinese
  • English

ID: ELRA-W0113

ISLRN: 447-281-370-489-0

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and a reference translation in English. The source texts are a selection of emails from the Speechocean King-NLP-001 corpus, a corpus of private emails collected from the daily life and business domains. The ...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €; commercial: 500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 500.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €; commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1000.00 €; commercial: 1000.00 €)
 TRAD Chinese-English Email Parallel corpus – Test Set    
  • Chinese
  • English

ID: ELRA-W0115

ISLRN: 985-956-234-357-3

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in English. The source texts are a selection of emails from the Speechocean King-NLP-001 corpus, a corpus of private emails collected from the daily life and business domains. The t...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €; commercial: 500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 500.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €; commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1000.00 €; commercial: 1000.00 €)
 TRAD Chinese-English News Articles Parallel corpus    
  • Chinese
  • English

ID: ELRA-W0112

ISLRN: 626-096-751-907-7

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in English. The source texts are newspaper articles from the Chinese version of Voice of America. The articles date from 2011 and 2012. The translation has been conducted by two di...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €; commercial: 500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 500.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €; commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1000.00 €; commercial: 1000.00 €)
 TRAD Chinese-English Web domain (blogs) Parallel corpus    
  • Chinese
  • English

ID: ELRA-W0110

ISLRN: 982-341-079-331-4

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in English. The source texts are blog articles dealing with various subjects such as economy, environment, society, technologies, etc. The articles date from June 2013. The transla...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €; commercial: 500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 500.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €; commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1000.00 €; commercial: 1000.00 €)
 TRAD Chinese-French Email Parallel corpus – Development Set    
  • Chinese
  • French

ID: ELRA-W0114

ISLRN: 255-358-917-604-3

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and a reference translation in French. The source texts are a selection of emails from the Speechocean King-NLP-001 corpus, a corpus of private emails collected from the daily life and business domains. The c...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €; commercial: 500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 500.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €; commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1000.00 €; commercial: 1000.00 €)
 TRAD Chinese-French Email Parallel corpus – Test Set    
  • Chinese
  • French

ID: ELRA-W0116

ISLRN: 239-027-077-538-0

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in French. The source texts are a selection of emails from the Speechocean King-NLP-001 corpus, a corpus of private emails collected from the daily life and business domains. The tr...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €; commercial: 500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 500.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €; commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1000.00 €; commercial: 1000.00 €)
 TRAD Chinese-French News Articles Parallel corpus    
  • Chinese
  • French

ID: ELRA-W0111

ISLRN: 153-566-144-442-2

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in French. The source texts are newspaper articles from the Chinese version of Voice of America. The articles date from 2011 and 2012. The translation has been conducted by two dif...

MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 150.00 €; commercial: 500.00 €)
Licence: Commercial Use - ELRA VAR (academic: 500.00 €; commercial: 500.00 €)
NON MEMBER
Licence: Non Commercial Use - ELRA END USER (academic: 300.00 €; commercial: 1000.00 €)
Licence: Commercial Use - ELRA VAR (academic: 1000.00 €; commercial: 1000.00 €)
