934 Language Resources (Page 2 of 47)

 American English Speech Data by Mobile Phone_Reading - 215 Hours    
  • English

ID: ELRA-S0467

ISLRN: 921-365-371-849-5

The data set contains speech data from 349 American English speakers, all of whom are American locals. It was recorded in a quiet environment. The recorded content covers categories such as economics, entertainment, news and spoken language. It is manually transcribed and annotated with the starti...

MEMBER
Licence: Commercial Use - ELRA VAR
academic: 34722.50 €
commercial: 34722.50 €
NON MEMBER
Licence: Commercial Use - ELRA VAR
academic: 34722.50 €
commercial: 34722.50 €

Special offers are also available.

 American English Speech Recognition Corpus (Desktop)    
  • English

ID: ELRA-S0228-79

ISLRN: 254-019-000-249-3

This corpus comprises 49,990 entries uttered by 50 speakers (25 males and 25 females), recorded over 2 channels (desktop in quiet office). Speech samples are stored as a sequence of 16-bit 16kHz for a total of 24.9 hours of speech per channel.

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 6000.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 6000.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
 American English Speech Recognition Corpus (Mobile) - 14.67 hours    
  • English

ID: ELRA-S0228-73

ISLRN: 817-988-141-738-4

This corpus comprises 14,988 entries uttered by 50 speakers (23 males and 27 females), recorded over the mobile telephone network. Speech samples are stored as a sequence of 16-bit 16 kHz for a total of 14.67 hours of speech.

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 6000.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 6000.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
 American English Speech Recognition Corpus (Mobile) - 19.4 hours    
  • English

ID: ELRA-S0228-58

ISLRN: 968-856-860-742-9

This corpus comprises 39,243 entries uttered by 151 speakers (74 males and 77 females), recorded over the mobile telephone network. Speech samples are stored as a sequence of 16-bit 16kHz for a total of 19.4 hours of speech.

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 2700.00 €
commercial: 2700.00 €
Licence: Commercial Use - ELRA VAR
academic: 2700.00 €
commercial: 2700.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 2700.00 €
commercial: 2700.00 €
Licence: Commercial Use - ELRA VAR
academic: 2700.00 €
commercial: 2700.00 €
 American Spanish Recognition Corpus (Desktop+Mobile)    
  • English

ID: ELRA-S0228-68

ISLRN: 100-009-143-020-4

This corpus comprises 33,527 entries uttered by 40 speakers (21 males and 19 females), recorded over 2 channels (desktop in quiet office and mobile in noisy restaurant). Speech samples are stored as a sequence of 16-bit 16kHz for a total of 14.7 hours of speech per channel.

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 4800.00 €
commercial: 4800.00 €
Licence: Commercial Use - ELRA VAR
academic: 4800.00 €
commercial: 4800.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 4800.00 €
commercial: 4800.00 €
Licence: Commercial Use - ELRA VAR
academic: 4800.00 €
commercial: 4800.00 €
 Amharic-English bilingual corpus    
  • Amharic
  • English

ID: ELRA-W0074

ISLRN: 590-255-335-719-0

The Amharic-English bilingual corpus contains parallel text from legal and news domains in Amharic script, in transliterated form and in English. The corpus comprises 232,653 words in Amharic and 291,701 in English. This parallel corpus contains documents from two domains, namely legal...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 2000.00 €
Licence: Commercial Use - ELRA VAR
academic: 2000.00 €
commercial: 2000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 4000.00 €
Licence: Commercial Use - ELRA VAR
academic: 4000.00 €
commercial: 4000.00 €
 AnCora Catalan 2.0.0    
  • Catalan; Valencian

ID: ELRA-W0327

ISLRN: 186-654-762-852-8

The AnCora Catalan Corpus 2.0.0 is a corpus of 500,000 words annotated at different levels: - Lemma and Part of Speech, - Syntactic constituents and functions, - Argument structure and thematic roles, - Semantic classes of the verb, - Denotative type of deverbal nouns, - Nouns related to W...

MEMBER
Licence: Attribution, Commercial Use - GPL
academic: 0.00 €
commercial: 0.00 €
NON MEMBER
Licence: Attribution, Commercial Use - GPL
academic: 0.00 €
commercial: 0.00 €
 AnCora Spanish 2.0.0    
  • Spanish; Castilian

ID: ELRA-W0326

ISLRN: 252-495-813-736-1

The AnCora Spanish Corpus 2.0.0 is a corpus of 500,000 words annotated at different levels: - Lemma and Part of Speech, - Syntactic constituents and functions, - Argument structure and thematic roles, - Semantic classes of the verb, - Denotative type of deverbal nouns, - Nouns related to W...

MEMBER
Licence: Attribution, Commercial Use - GPL
academic: 0.00 €
commercial: 0.00 €
NON MEMBER
Licence: Attribution, Commercial Use - GPL
academic: 0.00 €
commercial: 0.00 €
 ANITA (Audio eNhancement In Telecom Applications)    
  • English
  • French
  • German
  • Spanish; Castilian

ID: ELRA-S0156

ISLRN: 537-894-870-719-4

ANITA (Audio eNhancement In secured Telecommunication Applications) is a European project launched on the initiative of EADS TELECOM with the objective of reducing audio acoustics noise in secured communications in adverse environments (sirens, alarms, engines, water pumps, stress situations, etc...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 1000.00 €
commercial: 2000.00 €
Licence: Commercial Use - ELRA VAR
academic: 2000.00 €
commercial: 2000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 1500.00 €
commercial: 2500.00 €
Licence: Commercial Use - ELRA VAR
academic: 2500.00 €
commercial: 2500.00 €
 An-Nahar Newspaper Text Corpus    
  • Arabic

ID: ELRA-W0027

ISLRN: 083-457-618-309-8

The An-Nahar Lebanon Newspaper Text Corpus comprises articles in standard Arabic from 1995 to 2000 (6 years) stored as HTML files on CD-ROM. Each year contains 45,000 articles and 24 million words. Each article includes information such as title, newspaper's name, date, country, type, page, ...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 2016.00 €
commercial: 3192.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 3024.00 €
commercial: 4788.00 €

Special offers are also available.

 Annotated tweet corpus in Arabizi, French and English    
  • Arabic
  • English
  • French

ID: ELRA-W0323

ISLRN: 482-848-308-105-6

The annotated tweet corpus in Arabizi, French and English was built by ELDA on behalf of INSA Rouen Normandie (Normandie Université, LITIS team), in the framework of the SAPhIRS project (System for the Analysis of Information Propagation in Social Networks), funded by the DGE (Direction Générale ...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 7000.00 €
Licence: Commercial Use - ELRA VAR
academic: 7000.00 €
commercial: 7000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 10000.00 €
Licence: Commercial Use - ELRA VAR
academic: 10000.00 €
commercial: 10000.00 €
 APASCI    
  • Italian

ID: ELRA-S0039

ISLRN: 501-292-014-931-9

APASCI is an Italian speech database recorded in an insulated room with a Sennheiser MKH 416 T microphone. It includes 5,290 phonetically rich sentences and 10,800 isolated digits, for a total of 58,924 word occurrences (2,191 different words) and 641 minutes of speech. The speech material was re...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 800.00 €
commercial: 20000.00 €
Licence: Commercial Use - ELRA VAR
academic: 20000.00 €
commercial: 20000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 1600.00 €
commercial: 25000.00 €
Licence: Commercial Use - ELRA VAR
academic: 25000.00 €
commercial: 25000.00 €
 Arabic Speech Corpus    
  • Arabic

ID: ELRA-S0384

ISLRN: 866-568-447-697-8

This speech corpus was developed as part of PhD work carried out by Nawar Halabi at the University of Southampton. The corpus was recorded through a Neumann TLM 103 Studio Microphone by one male speaker in South Levantine Arabic (Damascian accent) in a professional studio. The transcript w...

MEMBER
Licence: Commercial Use - ELRA VAR
9000.00 €
Licence: Attribution - CC-BY
academic: 0.00 €
commercial: 0.00 €
NON MEMBER
Licence: Commercial Use - ELRA VAR
11200.00 €
Licence: Attribution - CC-BY
academic: 0.00 €
commercial: 0.00 €
 Arbobanko (Esperanto Treebank)    
  • Esperanto

ID: ELRA-W0129

ISLRN: 185-602-618-699-2

The Arbobanko (Esperanto Treebank) is a 52,000 token dependency treebank of Esperanto with texts from the MONATO news magazine, consisting of random excerpts from the period 2000-2010. All words were annotated for lemma, part-of-speech, inflection, compounding and affixing, syntactic function, de...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 900.00 €
Licence: Commercial Use - ELRA VAR
academic: 900.00 €
commercial: 900.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 1500.00 €
Licence: Commercial Use - ELRA VAR
academic: 1500.00 €
commercial: 1500.00 €
 Arboretum treebank    
  • Danish

ID: ELRA-W0084

ISLRN: 025-729-182-451-2

The Arboretum treebank is a morphologically and syntactically annotated repository of Danish sentences, taken from Korpus 90 and Korpus 2000, both compiled by the Society for Danish Language and Literature (http://ordnet.dk/korpusdk/fakta), and containing samples of written Danish from the 1990s...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 1500.00 €
commercial: 7000.00 €
Licence: Commercial Use - ELRA VAR
academic: 7000.00 €
commercial: 7000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 2200.00 €
commercial: 10000.00 €
Licence: Commercial Use - ELRA VAR
academic: 10000.00 €
commercial: 10000.00 €
 ARCADE II Evaluation Package    
  • Arabic
  • Chinese
  • English
  • French
  • German
  • Italian
  • Japanese
  • Modern Greek (1453-)
  • Persian
  • Russian
  • Spanish; Castilian

ID: ELRA-E0018

ISLRN: 875-865-064-331-9

The ARCADE II Evaluation Package was produced within the French national project ARCADE II (Evaluation of parallel text alignment systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ARCADE II project enabled to carry out a cam...

MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 150.00 €
commercial: 500.00 €
NON MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 300.00 €
commercial: 1000.00 €
 ARCADE/ROMANSEVAL corpus    
  • English
  • French
  • Italian

ID: ELRA-W0018

ISLRN: 681-769-134-114-2

The ARCADE/ROMANSEVAL corpus was used as a reference corpus in two international competitions: · ARCADE, an exercise on multilingual text alignment financed by AUPELF-UREF · ROMANSEVAL, part of the SENSEVAL exercise sponsored by ACL-SIGLEX and EURALEX, on word sense disambiguation. The corpus ...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 2000.00 €
Licence: Commercial Use - ELRA VAR
academic: 2000.00 €
commercial: 2000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 0.00 €
commercial: 5000.00 €
Licence: Commercial Use - ELRA VAR
academic: 5000.00 €
commercial: 5000.00 €
 Archives of "El Mundo" Newspaper – Year 2020    
  • Spanish; Castilian

ID: ELRA-W0333

ISLRN: 573-498-319-304-6

This corpus consists of 15,073 articles in Spanish from the electronic archives of the "El Mundo" newspaper published in 2020. A few articles also come from other related media: El Mundo Alicante, El Mundo Andalucía, El Mundo Baleares, El Mundo Catalunya, El Mundo Valéncia and E...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 700.00 €
commercial: 2000.00 €
Licence: Commercial Use - ELRA VAR
academic: 2000.00 €
commercial: 2000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 800.00 €
commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR
academic: 3000.00 €
commercial: 3000.00 €
 Archives of "El Mundo" Newspaper – Year 2021    
  • Spanish; Castilian

ID: ELRA-W0334

ISLRN: 196-909-664-343-4

This corpus consists of 14,461 articles in Spanish from the electronic archives of the "El Mundo" newspaper published in 2021. A few articles also come from other related media: El Mundo Alicante, El Mundo Andalucía, El Mundo Baleares, El Mundo Catalunya, El Mundo Valéncia and E...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 700.00 €
commercial: 2000.00 €
Licence: Commercial Use - ELRA VAR
academic: 2000.00 €
commercial: 2000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 800.00 €
commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR
academic: 3000.00 €
commercial: 3000.00 €
 Archives of "El Mundo" Newspaper – Year 2022    
  • Spanish; Castilian

ID: ELRA-W0335

ISLRN: 261-537-224-628-2

This corpus consists of 16,124 articles in Spanish from the electronic archives of the "El Mundo" newspaper published in 2022. A few articles also come from other related media: El Mundo Alicante, El Mundo Andalucía, El Mundo Baleares, El Mundo Catalunya, El Mundo Valéncia and E...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 700.00 €
commercial: 2000.00 €
Licence: Commercial Use - ELRA VAR
academic: 2000.00 €
commercial: 2000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 800.00 €
commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR
academic: 3000.00 €
commercial: 3000.00 €