7 Language Resources

Order by:

 Annotated tweet corpus in Arabizi, French and English    
  • Arabic
  • English
  • French

ID: ELRA-W0323

ISLRN: 482-848-308-105-6

The annotated tweet corpus in Arabizi, French and English was built by ELDA on behalf of INSA Rouen Normandie (Normandie Université, LITIS team), in the framework of the SAPhIRS project (System for the Analysis of Information Propagation in Social Networks), funded by the DGE (Direction Générale ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
7000.00 € submit
Licence: Commercial Use - ELRA VAR
7000.00 € submit
7000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
10000.00 € submit
Licence: Commercial Use - ELRA VAR
10000.00 € submit
10000.00 € submit
 ARCADE/ROMANSEVAL corpus    
  • English
  • French
  • Italian

ID: ELRA-W0018

ISLRN: 681-769-134-114-2

The ARCADE/ROMANSEVAL corpus was used as a reference corpus in two international competitions: · ARCADE, an exercise on multilingual text alignment financed by AUPELF-UREF · ROMANSEVAL, part of the SENSEVAL exercise sponsored by ACL-SIGLEX and EURALEX, on word sense disambiguation. The corpus ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 CRATER 2 Corpus    
  • English
  • French
  • Spanish; Castilian

ID: ELRA-W0033

ISLRN: 052-466-219-226-4

The CRATER corpus was built upon the foundations of an earlier project, ET10/63, which was funded in the final phase of the Eurotra programme. The Corpus Resources and Terminology Extraction project (MLAP-93 20) extended the bilingual annotated English-French International Telecommunications Unio...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
25.00 € submit
Licence: Commercial Use - ELRA VAR
25.00 € submit
25.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
125.00 € submit
Licence: Commercial Use - ELRA VAR
125.00 € submit
125.00 € submit
 CRATER corpus    
  • English
  • French
  • Spanish; Castilian

ID: ELRA-W0003

ISLRN: 645-721-607-031-5

The Corpus Resources and Terminology Extraction project (MLAP-93 20) has extended the bilingual annotated English-French International Telecommunications Union corpus to include Spanish, and has also debugged the existing corpus. The offer consists of a multi-lingual aligned corpus of 1,000,000 t...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
20.00 € submit
Licence: Commercial Use - ELRA VAR
20.00 € submit
20.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
100.00 € submit
Licence: Commercial Use - ELRA VAR
100.00 € submit
100.00 € submit
 MULTEXT JOC Corpus    
  • English
  • French
  • German
  • Italian
  • Spanish; Castilian

ID: ELRA-W0017

ISLRN: 900-482-746-635-0

This CD-ROM contains a part of the corpus developed in the MULTEXT project financed by the European Commission (LRE 62-050). This part contains raw, tagged and aligned data from the Written Questions and Answers of the Official Journal of the European Community. The corpus contains approx. 5 mill...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
5000.00 € submit
Licence: Commercial Use - ELRA VAR
5000.00 € submit
5000.00 € submit
 Parallel Corpora & Domains (bilingual and multilingual)    
  • Arabic
  • Chinese
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hebrew
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Northern Sami
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Turkish

ID: ELRA-W0336

ISLRN: 471-919-856-164-1

Parallel corpora for nearly 400 language pairs and numerous multilingual combinations, including 10 million bilingual segments and 90 million tokens in 20 languages: Arabic, Chinese (Simplified), Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Italian, Japanese, Korean, North Sami...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
0.10 € submit
0.10 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
0.11 € submit
0.11 € submit

Special offers are also available. Check here for details.

 TSNLP (Test Suites for NLP Testing)    
  • English
  • French
  • German

ID: ELRA-W0013

ISLRN: 717-350-913-018-8

The TSNLP project (LRE 62-089) has produced a database of test suites for English, French and German containing over 4,000 test items (sentences or fragment of sentences) per language which have been constructed for evaluating natural language processing systems, but which may also be useful for ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
100.00 € submit
Licence: Commercial Use - ELRA VAR
100.00 € submit
100.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
100.00 € submit
Licence: Commercial Use - ELRA VAR
100.00 € submit
100.00 € submit