19 Language Resources

Order by:

 Annotated tweet corpus in Arabizi, French and English    
  • Arabic
  • English
  • French

ID: ELRA-W0323

ISLRN: 482-848-308-105-6

The annotated tweet corpus in Arabizi, French and English was built by ELDA on behalf of INSA Rouen Normandie (Normandie Université, LITIS team), in the framework of the SAPhIRS project (System for the Analysis of Information Propagation in Social Networks), funded by the DGE (Direction Générale ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
7000.00 € submit
Licence: Commercial Use - ELRA VAR
7000.00 € submit
7000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
10000.00 € submit
Licence: Commercial Use - ELRA VAR
10000.00 € submit
10000.00 € submit
 ArabLEX: Database of Arabic Place Names (DAP)    
  • Arabic
  • English

ID: ELRA-M0105

ISLRN: 161-842-321-771-2

This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
5000.00 € submit
15000.00 € submit
Licence: Commercial Use - ELRA VAR
15000.00 € submit
15000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
7000.00 € submit
22000.00 € submit
Licence: Commercial Use - ELRA VAR
22000.00 € submit
22000.00 € submit

Special offers are also available. Check here for details.

 ArabLEX: Database of Arab Names (DAN)    
  • Arabic
  • English

ID: ELRA-M0107

ISLRN: 773-974-582-139-4

This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
15000.00 € submit
45000.00 € submit
Licence: Commercial Use - ELRA VAR
45000.00 € submit
45000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
24000.00 € submit
71000.00 € submit
Licence: Commercial Use - ELRA VAR
71000.00 € submit
71000.00 € submit

Special offers are also available. Check here for details.

 ArabLEX: Database of Foreign Names in Arabic (DAF)    
  • Arabic
  • English

ID: ELRA-M0106

ISLRN: 943-592-129-040-2

This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
9000.00 € submit
27000.00 € submit
Licence: Commercial Use - ELRA VAR
27000.00 € submit
27000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
16000.00 € submit
49000.00 € submit
Licence: Commercial Use - ELRA VAR
49000.00 € submit
49000.00 € submit

Special offers are also available. Check here for details.

 GEOLINGUAL Multilingual Geographical Entity Tables    
  • Arabic
  • Chinese
  • Danish
  • Dutch; Flemish
  • English
  • French
  • German
  • Hebrew
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Turkish

ID: ELRA-L0205

ISLRN: 816-648-322-249-9

A table of over 200 countries and other major geographical names worldwide – including their adjectives, persons, and main languages – in the following languages: Arabic, Chinese Simplified, Danish, Dutch, English, French, German, Greek, Hebrew, Japanese, Korean, Polish, Portuguese, Russian, Span...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
1050.00 € submit
1050.00 € submit
 GLOBAL Multilingual Lexical Data - Bilingual - Level 1    
  • Arabic
  • Chinese
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • French
  • German
  • Hebrew
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Latin
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish

ID: ELRA-M0111-04

ISLRN: 255-971-767-096-3

The GLOBAL Multilingual Lexical Data (references ELRA-M0111-01 to ELRA-M0111-06 in the ELRA Catalogue) consists of a network of lexicographic cores for major world languages, comprising diverse monolingual, bilingual and multilingual combinations, in different sizes, originally built for language...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
6800.00 € submit
6800.00 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
7140.00 € submit
7140.00 € submit

Special offers are also available. Check here for details.

 GLOBAL Multilingual Lexical Data - Monolingual - Level 1    
  • Arabic
  • Chinese
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • French
  • German
  • Hebrew
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Latin
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish

ID: ELRA-M0111-01

ISLRN: 604-974-454-390-3

The GLOBAL Multilingual Lexical Data (references ELRA-M0111-01 to ELRA-M0111-06 in the ELRA Catalogue) consists of a network of lexicographic cores for major world languages, comprising diverse monolingual, bilingual and multilingual combinations, in different sizes, originally built for language...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
4250.00 € submit
4250.00 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
4462.50 € submit
4462.50 € submit

Special offers are also available. Check here for details.

 MAURDOR Evaluation Package  
  • Arabic
  • English
  • French

ID: ELRA-E0045

ISLRN: 364-018-517-901-2

The MAURDOR project consists in evaluating systems for automatic processing of written documents. Collected written documents are scanned documents (printed, typewritten or manuscripts). In order to get images for the evaluation of automatic analysis systems, 10,000 original documents were c...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
500.00 € submit
10000.00 € submit
Licence: Evaluation Use - ELRA EVALUATION
5000.00 € submit
Licence: Commercial Use - ELRA VAR
10000.00 € submit
10000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
750.00 € submit
15000.00 € submit
Licence: Evaluation Use - ELRA EVALUATION
7500.00 € submit
Licence: Commercial Use - ELRA VAR
15000.00 € submit
15000.00 € submit
 MULTIGLOSS Multilingual Glossaries - L1-English pair    
  • Afrikaans
  • Arabic
  • Azerbaijani
  • Bulgarian
  • Catalan; Valencian
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hebrew
  • Hindi
  • Hungarian
  • Icelandic
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Latin
  • Latvian
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Ukrainian
  • Urdu
  • Vietnamese
  • Western Frisian

ID: ELRA-M0112-01

ISLRN: 098-079-939-987-5

A series of innovative multilingual word-to-sense glossaries, based on a human-edited word-to-sense bilingual index of each language to English, which is linked automatically to the translation equivalents in 45 target languages. Each word and expression in every language is translated via its...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
2500.00 € submit
2500.00 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
2625.00 € submit
2625.00 € submit

Special offers are also available. Check here for details.

 MULTIGLOSS Multilingual Glossaries - L1-English pair + 1 language    
  • Afrikaans
  • Arabic
  • Azerbaijani
  • Bulgarian
  • Catalan; Valencian
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hebrew
  • Hindi
  • Hungarian
  • Icelandic
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Latin
  • Latvian
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Ukrainian
  • Urdu
  • Vietnamese
  • Western Frisian

ID: ELRA-M0112-02

ISLRN: 610-290-284-705-6

A series of innovative multilingual word-to-sense glossaries, based on a human-edited word-to-sense bilingual index of each language to English, which is linked automatically to the translation equivalents in 45 target languages. Each word and expression in every language is translated via its...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
3750.00 € submit
3750.00 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
3937.50 € submit
3937.50 € submit

Special offers are also available. Check here for details.

 Multilingual Dictionary of Sports – English-French-Arabic trilingual database    
  • Arabic
  • English
  • French

ID: ELRA-T0372-04

ISLRN: 351-230-082-450-3

This dictionary was produced within the French national project EuRADic (European and Arabic Dictionaries and Corpora), as part of the Technolangue programme funded by the French Ministry of Industry. A needs study in the field of sport terminology, which covered an overall category of users, ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
1500.00 € submit
Licence: Commercial Use - ELRA VAR
1500.00 € submit
1500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
200.00 € submit
2000.00 € submit
Licence: Commercial Use - ELRA VAR
2000.00 € submit
2000.00 € submit
 Multilingual Dictionary of Sports – English-French-Greek-Arabic-German-Spanish-Portuguese multilingual database    
  • Arabic
  • English
  • French
  • German
  • Modern Greek (1453-)
  • Portuguese
  • Spanish; Castilian

ID: ELRA-T0372-01

ISLRN: 753-372-742-011-3

This dictionary was produced within the French national project EuRADic (European and Arabic Dictionaries and Corpora), as part of the Technolangue programme funded by the French Ministry of Industry. A needs study in the field of sport terminology, which covered an overall category of users, ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
300.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
400.00 € submit
4000.00 € submit
Licence: Commercial Use - ELRA VAR
4000.00 € submit
4000.00 € submit
 Parallel Corpora & Domains (bilingual and multilingual)    
  • Arabic
  • Chinese
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hebrew
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Northern Sami
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Turkish

ID: ELRA-W0336

ISLRN: 471-919-856-164-1

Parallel corpora for nearly 400 language pairs and numerous multilingual combinations, including 10 million bilingual segments and 90 million tokens in 20 languages: Arabic, Chinese (Simplified), Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Italian, Japanese, Korean, North Sami...

MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
0.10 € submit
0.10 € submit
NON MEMBERacademiccommercial
Licence: Commercial Use - ELRA VAR
0.11 € submit
0.11 € submit

Special offers are also available. Check here for details.

 TRAD Arabic-English Mailing lists Parallel corpus - Development set    
  • Arabic
  • English

ID: ELRA-W0108

ISLRN: 213-044-240-074-6

This is a parallel corpus of 10,000 words in Arabic and a reference translation in English. The source texts are emails collected from Wikiar-I, a mailing list for discussions about the Arabic Wikipedia. The collected emails are dated from 2004 to 2007. The translation has been conducted follow...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
500.00 € submit
Licence: Commercial Use - ELRA VAR
500.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
300.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
 TRAD Arabic-English Mailing lists Parallel corpus - Test set    
  • Arabic
  • English

ID: ELRA-W0106

ISLRN: 858-529-510-480-2

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are emails collected from Wikiar-I, a mailing list for discussions about the Arabic Wikipedia. The collected emails are dated from 2010 to 2012. The translation has been conducted by tw...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
500.00 € submit
Licence: Commercial Use - ELRA VAR
500.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
300.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
 TRAD Arabic-English Newspaper Parallel corpus - Test set 1    
  • Arabic
  • English

ID: ELRA-W0099

ISLRN: 764-187-795-074-0

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are articles collected in 2012 from the Arabic version of Le Monde Diplomatique. The translation has been conducted by two different translation teams following a strict protocol aimed at...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
500.00 € submit
Licence: Commercial Use - ELRA VAR
500.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
300.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
 TRAD Arabic-English Parallel corpus of transcribed Broadcast News Speech    
  • Arabic
  • English

ID: ELRA-W0102

ISLRN: 812-050-111-234-9

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are transcriptions of broadcast news in Arabic recorded on France 24. The translation has been conducted by two different translation teams following a strict protocol aimed at producing ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
500.00 € submit
Licence: Commercial Use - ELRA VAR
500.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
300.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
 TRAD Arabic-English Web domain (blogs) Parallel corpus    
  • Arabic
  • English

ID: ELRA-W0104

ISLRN: 762-161-069-435-5

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are blog articles written between 2008 and 2013. The translation has been conducted by two different translation teams following a strict protocol aimed at producing high quality translat...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
500.00 € submit
Licence: Commercial Use - ELRA VAR
500.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
300.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
 Training and test data for Arabizi detection and transliteration    
  • Arabic
  • English

ID: ELRA-W0126

ISLRN: 986-364-744-303-9

The dataset is composed of two distinct resources: 1) A collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts. The training part of the corpus contains: 522 tweets composed of 5,207 token...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
500.00 € submit
Licence: Commercial Use - ELRA VAR
500.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
650.00 € submit
Licence: Commercial Use - ELRA VAR
650.00 € submit
650.00 € submit