26 Language Resources (Page 1 of 2)

 ARCADE II Evaluation Package    
  • Arabic
  • Chinese
  • English
  • French
  • German
  • Italian
  • Japanese
  • Modern Greek (1453-)
  • Persian
  • Russian
  • Spanish; Castilian

ID: ELRA-E0018

ISLRN: 875-865-064-331-9

The ARCADE II Evaluation Package was produced within the French national project ARCADE II (Evaluation of parallel text alignment systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ARCADE II project enabled to carry out a cam...

MEMBER (academic | commercial)
Licence: Evaluation Use - ELRA EVALUATION
150.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Evaluation Use - ELRA EVALUATION
300.00 € | 1000.00 €
 Collins Multilingual database (MLD) - PhraseBank    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-T0377

ISLRN: 452-383-219-228-0

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, distributed separately under reference ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank). The PhraseBank consists of 2,000 p...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
1680.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
2240.00 €
 Collins Multilingual database (MLD) - WordBank    
  • Arabic
  • Bengali
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Malayalam
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Spanish; Castilian
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

ID: ELRA-T0376

ISLRN: 990-814-402-335-7

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank) and a multilingual set of sentences in 28 languages (the PhraseBank, distributed separately under reference ELRA-T0377). The WordBank contains 10,000 words...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
2400.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
3600.00 €
 ECI/MCI (European Corpus Initiative/Multilingual Corpus I)    
  • Albanian
  • Bulgarian
  • Chinese
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • French
  • German
  • Italian
  • Japanese
  • Latin
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Portuguese
  • Russian
  • Scottish Gaelic; Gaelic
  • Serbian
  • Spanish; Castilian
  • Swedish
  • Turkish
  • Uzbek

ID: ELRA-W0004

ISLRN: 511-168-567-582-5

The European Corpus Initiative (ECI) was founded to oversee the acquisition and preparation of a large multilingual corpus, and supports existing and projected national and international efforts to carefully design, collect and publish large-scale multilingual written and spoken corpora. ECI has ...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
50.00 € | 50.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
50.00 € | 50.00 €
 GlobalPhone Chinese-Mandarin Pronunciation Dictionary      
  • Chinese

ID: ELRA-S0363

ISLRN: 457-511-870-286-9

The GlobalPhone pronunciation dictionaries, created within the framework of the multilingual speech and language corpus GlobalPhone, were developed in collaboration with the Karlsruhe Institute of Technology (KIT). The GlobalPhone pronunciation dictionaries contain the pronunciations of all wo...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
600.00 € | 3000.00 €
Licence: Commercial Use - ELRA VAR
3000.00 € | 3000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
700.00 € | 3600.00 €
Licence: Commercial Use - ELRA VAR
3600.00 € | 3600.00 €

Special offers are also available.

 LC-STAR Mandarin Chinese Phonetic lexicon      
  • Chinese

ID: ELRA-S0256

ISLRN: 103-062-804-789-9

The LC-STAR Mandarin Chinese Phonetic lexicon was created within the scope of the LC-STAR project (IST 2001-32216) which was sponsored by the European Commission. The lexicon comprises 104,368 entries, distributed over three categories: - a set of 38,098 common word entries. This set is extract...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
27000.00 € | 40000.00 €
Licence: Commercial Use - ELRA VAR
40000.00 € | 40000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
38000.00 € | 50000.00 €
Licence: Commercial Use - ELRA VAR
50000.00 € | 50000.00 €
 Multilingual Corpus    
  • Chinese
  • English
  • Korean

ID: ELRA-W0035

ISLRN: 731-151-596-869-3

Multilingual parallel corpus produced by KAIST KORTERM, containing 60,000 expressions in Korean, Chinese and English.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
750.00 € | 3000.00 €
Licence: Commercial Use - ELRA VAR
3000.00 € | 3000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
1500.00 € | 6000.00 €
Licence: Commercial Use - ELRA VAR
6000.00 € | 6000.00 €
 NE3L named entities Chinese corpus    
  • Chinese

ID: ELRA-W0079

ISLRN: 187-154-782-686-9

The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. The text data were extracted from newspapers and cover various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
5000.00 € | 5000.00 €
Licence: Commercial Use - ELRA VAR
5000.00 € | 5000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
5000.00 € | 5000.00 €
Licence: Commercial Use - ELRA VAR
5000.00 € | 5000.00 €
 Original Short-Message Data Collation II in Chinese    
  • Chinese

ID: ELRA-W0045-05

ISLRN: 004-512-635-005-4

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains the original messages. All data have been proofread manually.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
11942.00 € | 11942.00 €
Licence: Commercial Use - ELRA VAR
11942.00 € | 11942.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
11942.00 € | 11942.00 €
Licence: Commercial Use - ELRA VAR
11942.00 € | 11942.00 €
 Original Short-Message Data Collation II in Chinese (named entities)    
  • Chinese

ID: ELRA-W0045-08

ISLRN: 753-094-616-225-9

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains original messages together with named entities. All data have been proofread manually.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
 Original Short-Message Data Collation II in Chinese (participles)    
  • Chinese

ID: ELRA-W0045-07

ISLRN: 747-585-323-393-8

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains original messages together with participles. All data have been proofread manually.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
 Original Short-Message Data Collation II in Chinese (PinYin)    
  • Chinese

ID: ELRA-W0045-06

ISLRN: 745-287-055-486-8

This corpus comprises 2,604,901 characters, corresponding to 202,277 daily life short messages (SMS). This subset contains original messages together with PinYin transcription. All data have been proofread manually with PinYin.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
 Original Short-Message Data Collation I in Chinese    
  • Chinese

ID: ELRA-W0045-01

ISLRN: 453-260-875-772-3

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains the original messages. All data have been proofread manually.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
14928.00 € | 14928.00 €
Licence: Commercial Use - ELRA VAR
14928.00 € | 14928.00 €
 Original Short-Message Data Collation I in Chinese (named entities)    
  • Chinese

ID: ELRA-W0045-04

ISLRN: 169-161-744-054-8

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains original messages together with named entities. All data have been proofread and tagged manually.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
18659.00 € | 18659.00 €
Licence: Commercial Use - ELRA VAR
18659.00 € | 18659.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
18659.00 € | 18659.00 €
Licence: Commercial Use - ELRA VAR
18659.00 € | 18659.00 €
 Original Short-Message Data Collation I in Chinese (participles)    
  • Chinese

ID: ELRA-W0045-03

ISLRN: 327-586-643-099-5

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains original messages together with participles. All data have been proofread manually.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
18659.00 € | 18659.00 €
Licence: Commercial Use - ELRA VAR
18659.00 € | 18659.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
18659.00 € | 18659.00 €
Licence: Commercial Use - ELRA VAR
18659.00 € | 18659.00 €
 Original Short-Message Data Collation I in Chinese (PinYin)    
  • Chinese

ID: ELRA-W0045-02

ISLRN: 910-780-238-099-2

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily life short messages. This subset contains original messages together with PinYin transcription. All data have been proofread manually with PinYin.

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
18659.00 € | 18659.00 €
Licence: Commercial Use - ELRA VAR
18659.00 € | 18659.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
18659.00 € | 18659.00 €
Licence: Commercial Use - ELRA VAR
18659.00 € | 18659.00 €
 The Lancaster Corpus of Mandarin Chinese (LCMC)    
  • Chinese

ID: ELRA-W0039

ISLRN: 990-638-120-277-2

The Lancaster Corpus of Mandarin Chinese (LCMC) is designed as a Chinese match for the FLOB and FROWN corpora for modern British and American English. The corpus is suitable for use in both monolingual research into modern Mandarin Chinese and cross-linguistic contrast of Chinese and British/Ame...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 € | 7500.00 €
Licence: Commercial Use - ELRA VAR
7500.00 € | 7500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 € | 12000.00 €
Licence: Commercial Use - ELRA VAR
12000.00 € | 12000.00 €
 TRAD Chinese-English Email Parallel corpus – Development Set    
  • Chinese
  • English

ID: ELRA-W0113

ISLRN: 447-281-370-489-0

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and a reference translation in English. The source texts are a selection of emails from the Speechocean King-NLP-001 corpus, a corpus of private emails collected from the daily life and business domains. The ...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
 TRAD Chinese-English Email Parallel corpus – Test Set    
  • Chinese
  • English

ID: ELRA-W0115

ISLRN: 985-956-234-357-3

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in English. The source texts are a selection of emails from the Speechocean King-NLP-001 corpus, a corpus of private emails collected from the daily life and business domains. The t...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
 TRAD Chinese-English News Articles Parallel corpus    
  • Chinese
  • English

ID: ELRA-W0112

ISLRN: 626-096-751-907-7

This is a parallel corpus of 15,000 characters in Chinese (equivalent to 10,000 words) and 2 reference translations in English. The source texts are newspaper articles from the Chinese version of Voice of America. Articles are dated from 2011 and 2012. The translation has been conducted by two di...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
