18 Language Resources

 2007 CoNLL Shared Task - Arabic & English    
  • Arabic
  • English

ID: ELRA-W0123

ISLRN: 505-782-255-628-8

2007 CoNLL Shared Task - Arabic & English consists of dependency treebanks in two languages used as part of the CoNLL 2007 shared task on multi-lingual dependency parsing and domain adaptation. The languages covered in this release are: Arabic and English. The Conference on Computational Natur...

MEMBER (academic | commercial)
Licence: Non Commercial Use - Non Standard Licence Terms
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - Non Standard Licence Terms
 Annotated tweet corpus in Arabizi, French and English    
  • Arabic
  • English
  • French

ID: ELRA-W0323

ISLRN: 482-848-308-105-6

The annotated tweet corpus in Arabizi, French and English was built by ELDA on behalf of INSA Rouen Normandie (Normandie Université, LITIS team), in the framework of the SAPhIRS project (System for the Analysis of Information Propagation in Social Networks), funded by the DGE (Direction Générale ...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 € | 7000.00 €
Licence: Commercial Use - ELRA VAR
7000.00 € | 7000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 € | 10000.00 €
Licence: Commercial Use - ELRA VAR
10000.00 € | 10000.00 €
 ArabLEX: Database of Arabic Place Names (DAP)    
  • Arabic
  • English

ID: ELRA-M0105

ISLRN: 161-842-321-771-2

This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
5000.00 € | 15000.00 €
Licence: Commercial Use - ELRA VAR
15000.00 € | 15000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
7000.00 € | 22000.00 €
Licence: Commercial Use - ELRA VAR
22000.00 € | 22000.00 €

Special offers are also available.

 ArabLEX: Database of Arab Names (DAN)    
  • Arabic
  • English

ID: ELRA-M0107

ISLRN: 773-974-582-139-4

This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
15000.00 € | 45000.00 €
Licence: Commercial Use - ELRA VAR
45000.00 € | 45000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
24000.00 € | 71000.00 €
Licence: Commercial Use - ELRA VAR
71000.00 € | 71000.00 €

Special offers are also available.

 ArabLEX: Database of Foreign Names in Arabic (DAF)    
  • Arabic
  • English

ID: ELRA-M0106

ISLRN: 943-592-129-040-2

This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
9000.00 € | 27000.00 €
Licence: Commercial Use - ELRA VAR
27000.00 € | 27000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
16000.00 € | 49000.00 €
Licence: Commercial Use - ELRA VAR
49000.00 € | 49000.00 €

Special offers are also available.

 Collins Multilingual database (MLD) - PhraseBank    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-T0377

ISLRN: 452-383-219-228-0

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, distributed separately under reference ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank). The PhraseBank consists of 2,000 p...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
1680.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
2240.00 €
 Collins Multilingual database (MLD) – PhraseBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0383

ISLRN: 398-655-047-044-5

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the audio files corresponding t...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
3360.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
4480.00 €
 Collins Multilingual database (MLD) - WordBank    
  • Arabic
  • Bengali
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Malayalam
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Spanish; Castilian
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

ID: ELRA-T0376

ISLRN: 990-814-402-335-7

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank) and a multilingual set of sentences in 28 languages (the PhraseBank, distributed separately under reference ELRA-T0377). The WordBank contains 10,000 words...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
2400.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
3600.00 €
 Collins Multilingual database (MLD) – WordBank with audio files    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-S0382

ISLRN: 309-438-781-042-2

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377). This version includes the corresponding audio files c...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
3640.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
5200.00 €
 MAURDOR Evaluation Package  
  • Arabic
  • English
  • French

ID: ELRA-E0045

ISLRN: 364-018-517-901-2

The MAURDOR project evaluates systems for the automatic processing of written documents. The collected documents are scanned (printed, typewritten or handwritten). To obtain images for the evaluation of automatic analysis systems, 10,000 original documents were c...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
500.00 € | 10000.00 €
Licence: Evaluation Use - ELRA EVALUATION
5000.00 €
Licence: Commercial Use - ELRA VAR
10000.00 € | 10000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
750.00 € | 15000.00 €
Licence: Evaluation Use - ELRA EVALUATION
7500.00 €
Licence: Commercial Use - ELRA VAR
15000.00 € | 15000.00 €
 Multilingual Dictionary of Sports – English-French-Arabic trilingual database    
  • Arabic
  • English
  • French

ID: ELRA-T0372-04

ISLRN: 351-230-082-450-3

This dictionary was produced within the French national project EuRADic (European and Arabic Dictionaries and Corpora), as part of the Technolangue programme funded by the French Ministry of Industry. A needs study in the field of sport terminology, which covered an overall category of users, ...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 1500.00 €
Licence: Commercial Use - ELRA VAR
1500.00 € | 1500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
200.00 € | 2000.00 €
Licence: Commercial Use - ELRA VAR
2000.00 € | 2000.00 €
 Multilingual Dictionary of Sports – English-French-Greek-Arabic-German-Spanish-Portuguese multilingual database    
  • Arabic
  • English
  • French
  • German
  • Modern Greek (1453-)
  • Portuguese
  • Spanish; Castilian

ID: ELRA-T0372-01

ISLRN: 753-372-742-011-3

This dictionary was produced within the French national project EuRADic (European and Arabic Dictionaries and Corpora), as part of the Technolangue programme funded by the French Ministry of Industry. A needs study in the field of sport terminology, which covered an overall category of users, ...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 3000.00 €
Licence: Commercial Use - ELRA VAR
3000.00 € | 3000.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
400.00 € | 4000.00 €
Licence: Commercial Use - ELRA VAR
4000.00 € | 4000.00 €
 TRAD Arabic-English Mailing lists Parallel corpus - Development set    
  • Arabic
  • English

ID: ELRA-W0108

ISLRN: 213-044-240-074-6

This is a parallel corpus of 10,000 words in Arabic and a reference translation in English. The source texts are emails collected from Wikiar-I, a mailing list for discussions about the Arabic Wikipedia. The collected emails are dated from 2004 to 2007. The translation has been conducted follow...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
 TRAD Arabic-English Mailing lists Parallel corpus - Test set    
  • Arabic
  • English

ID: ELRA-W0106

ISLRN: 858-529-510-480-2

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are emails collected from Wikiar-I, a mailing list for discussions about the Arabic Wikipedia. The collected emails are dated from 2010 to 2012. The translation has been conducted by tw...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
 TRAD Arabic-English Newspaper Parallel corpus - Test set 1    
  • Arabic
  • English

ID: ELRA-W0099

ISLRN: 764-187-795-074-0

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are articles collected in 2012 from the Arabic version of Le Monde Diplomatique. The translation has been conducted by two different translation teams following a strict protocol aimed at...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
 TRAD Arabic-English Parallel corpus of transcribed Broadcast News Speech    
  • Arabic
  • English

ID: ELRA-W0102

ISLRN: 812-050-111-234-9

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are transcriptions of broadcast news in Arabic recorded on France 24. The translation has been conducted by two different translation teams following a strict protocol aimed at producing ...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
 TRAD Arabic-English Web domain (blogs) Parallel corpus    
  • Arabic
  • English

ID: ELRA-W0104

ISLRN: 762-161-069-435-5

This is a parallel corpus of 10,000 words in Arabic and 2 reference translations in English. The source texts are blog articles written between 2008 and 2013. The translation has been conducted by two different translation teams following a strict protocol aimed at producing high quality translat...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
150.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
300.00 € | 1000.00 €
Licence: Commercial Use - ELRA VAR
1000.00 € | 1000.00 €
 Training and test data for Arabizi detection and transliteration    
  • Arabic
  • English

ID: ELRA-W0126

ISLRN: 986-364-744-303-9

The dataset is composed of two distinct resources: 1) A collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts. The training part of the corpus contains: 522 tweets composed of 5,207 token...

MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 € | 500.00 €
Licence: Commercial Use - ELRA VAR
500.00 € | 500.00 €
NON MEMBER (academic | commercial)
Licence: Non Commercial Use - ELRA END USER
0.00 € | 650.00 €
Licence: Commercial Use - ELRA VAR
650.00 € | 650.00 €