63 Language Resources (Page 1 of 4)


 88milSMS. A corpus of authentic text messages in French    
  • French

ID: ELRA-W0082

ISLRN: 024-713-187-947-8

A multidisciplinary team of linguists and computer scientists (Rachel Panckhurst, Catherine Détrie, Cédric Lopez, Claudine Moïse, Mathieu Roche and Bertrand Verine; Praxiling, Lirmm, Lidilem, Tetis, Viseo) collected more than 88,000 authentic French text messages in Montpellier (2011), as part of th...

MEMBER
Licence: Non Commercial Use - Non Standard Licence Terms
academic: 0.00 €
commercial: 0.00 €
NON MEMBER
Licence: Non Commercial Use - Non Standard Licence Terms
academic: 0.00 €
commercial: 0.00 €
 Amaryllis Corpus - Evaluation Package    
  • French

ID: ELRA-W0029

ISLRN: 786-395-313-491-8

Launched at the end of 1995, the AMARYLLIS project aimed at evaluating information retrieval software for French text corpora in order to provide a methodology for the evaluation of other similar tools. AMARYLLIS was organised by the Institut de l'Information Scientifique et Technique (INIST) wit...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 45.00 €
commercial: 100.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 45.00 €
commercial: 100.00 €
 ANITA (Audio eNhancement In Telecom Applications)    
  • English
  • French
  • German
  • Spanish; Castilian

ID: ELRA-S0156

ISLRN: 537-894-870-719-4

ANITA (Audio eNhancement In secured Telecommunication Applications) is a European project launched on the initiative of EADS TELECOM with the objective of reducing audio acoustics noise in secured communications in adverse environments (sirens, alarms, engines, water pumps, stress situations, etc...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 1000.00 €
commercial: 2000.00 €
Licence: Commercial Use - ELRA VAR
academic: 2000.00 €
commercial: 2000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 1500.00 €
commercial: 2500.00 €
Licence: Commercial Use - ELRA VAR
academic: 2500.00 €
commercial: 2500.00 €
 A "scientific" corpus of modern French ("La Recherche" magazine) - Complete version    
  • French

ID: ELRA-W0025-02

ISLRN: 798-363-116-656-4

This "scientific" corpus of modern French was produced by the University of Nantes (France) within the European Commission funded project LRsP&P (Language Resources Production & Packaging - LE4-8335). The corpus contains all articles published in La Recherche magazine in 1998, including issues 30...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 400.00 €
commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR
academic: 3000.00 €
commercial: 3000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 500.00 €
commercial: 5000.00 €
Licence: Commercial Use - ELRA VAR
academic: 5000.00 €
commercial: 5000.00 €
 A "scientific" corpus of modern French ("La Recherche" magazine) - Raw data    
  • French

ID: ELRA-W0025-01

ISLRN: 508-941-013-339-7

This "scientific" corpus of modern French was produced by the University of Nantes (France) with funding from ELRA in the framework of the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335). The corpus contains all articles published in La Recherche mag...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 240.00 €
commercial: 1200.00 €
Licence: Commercial Use - ELRA VAR
academic: 1200.00 €
commercial: 1200.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 310.00 €
commercial: 1500.00 €
Licence: Commercial Use - ELRA VAR
academic: 1500.00 €
commercial: 1500.00 €
 BDBRUIT    
  • French

ID: ELRA-S0033

ISLRN: 067-749-878-515-8

A French speech database dedicated to the study of the perturbations of speech production due to noisy environments, and especially the Lombard effect. Environment: 4 noise conditions and the reference condition (quiet). The 2 noises used (a "white noise" and a "cocktail-party noise") were both p...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 385.00 €
commercial: 775.00 €
Licence: Commercial Use - ELRA VAR
academic: 775.00 €
commercial: 775.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 775.00 €
commercial: 1400.00 €
Licence: Commercial Use - ELRA VAR
academic: 1400.00 €
commercial: 1400.00 €
 BDSONS Base de données des sons du français    
  • French

ID: ELRA-S0005

ISLRN: 353-598-244-017-0

The BDSONS database is a French speech database with two subsets: evaluation and acoustic modelling. The corpora comprise 32 speakers, 16 male and 16 female (7 CD-ROMs of approximately 3.5 gigabytes), with phonetic labelling (partly) available on additional floppies, of the following data: "Evalu...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 630.00 €
commercial: 950.00 €
Licence: Commercial Use - ELRA VAR
academic: 950.00 €
commercial: 950.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 950.00 €
commercial: 1580.00 €
Licence: Commercial Use - ELRA VAR
academic: 1580.00 €
commercial: 1580.00 €
 BREF-120 - A large corpus of French read speech    
  • French

ID: ELRA-S0067

ISLRN: 843-228-642-422-1

BREF-120 resulted from the efforts of LIMSI-CNRS researchers under sponsorship from the GDR-PRC CHM, the ACCT (OFIL), the EEC (ESPRIT Polyglot project), and the Aupelf-Uref. A sub-set of BREF-120 is BREF-80 (ELRA-S0006), which consists of about 50-60 sentences per speaker and recordings conducted...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 2500.00 €
commercial: 10000.00 €
Licence: Commercial Use - ELRA VAR
academic: 10000.00 €
commercial: 10000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 4000.00 €
commercial: 15000.00 €
Licence: Commercial Use - ELRA VAR
academic: 15000.00 €
commercial: 15000.00 €
 BREF-80    
  • French

ID: ELRA-S0006

ISLRN: 310-036-258-354-7

The BREF corpus was designed to provide enough read speech data for the development and evaluation of continuous speech recognition systems (both speaker-dependent and speaker-independent), and to provide a large corpus of continuous speech for the acquisition of acoustic-phonetic knowledge of sp...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 400.00 €
commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR
academic: 3000.00 €
commercial: 3000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 800.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
 BREF-POLYGLOT    
  • French

ID: ELRA-S0007

ISLRN: 382-431-956-363-1

The BREF-Polyglot is a sub-corpus of the BREF corpus (1 ISO9660 CDROM); it contains speaker-dependent training data from 6 speakers. There are a total of 3193 sentences (2 signal files for each sentence), on average 530 per speaker. While this data represents only a small portion of the entire BR...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 400.00 €
commercial: 3000.00 €
Licence: Commercial Use - ELRA VAR
academic: 3000.00 €
commercial: 3000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 800.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
 Canadian French Speech Recognition Corpus (Mobile)    
  • French

ID: ELRA-S0228-72

ISLRN: 360-129-212-036-3

This corpus comprises 75,147 entries uttered by 50 speakers (25 males and 25 females), recorded over 3 channels (mobile quiet office). Speech samples are stored as 16-bit, 16 kHz audio, for a total of 25.67 hours of speech per channel.

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 6000.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 6000.00 €
commercial: 6000.00 €
Licence: Commercial Use - ELRA VAR
academic: 6000.00 €
commercial: 6000.00 €
 CESART Evaluation Package    
  • French

ID: ELRA-E0019

ISLRN: 154-799-255-123-0

The CESART Evaluation Package was produced within the French national project CESART (Evaluation of terminology extraction tools), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The CESART project made it possible to carry out a campaign for th...

MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 150.00 €
commercial: 500.00 €
NON MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 300.00 €
commercial: 1000.00 €
 CLEF QAST (2007-2009) – Evaluation Package    
  • English
  • French
  • Spanish; Castilian

ID: ELRA-E0039

ISLRN: 460-370-870-489-0

The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access (MLIA) by (i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) c...

MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 150.00 €
commercial: 500.00 €
NON MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 300.00 €
commercial: 1000.00 €


 C-ORAL-ROM - Integrated reference corpora for spoken romance languages. Multi-media edition; tools of analysis; standard linguistic measurements for validation in HLT    
  • French
  • Italian
  • Portuguese
  • Spanish; Castilian

ID: ELRA-S0172

ISLRN: 318-977-046-077-4

The C-ORAL-ROM resource is a multilingual corpus of spontaneous speech for the main Romance languages of around 1,200,000 words (IST 2000-26228). The resource comprises three components: a) Multimedia corpus; b) Speech software; c) Appendix. The corpus consists of four comparable recor...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 1500.00 €
commercial: 10000.00 €
Licence: Commercial Use - ELRA VAR
academic: 10000.00 €
commercial: 10000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 3000.00 €
commercial: 20000.00 €
Licence: Commercial Use - ELRA VAR
academic: 20000.00 €
commercial: 20000.00 €
 DEFT'08 Evaluation Package    
  • French

ID: ELRA-E0035

ISLRN: 161-881-080-899-5

DEFT (DEfi Fouille de Texte – Text Mining Challenge) organizes evaluation campaigns in the field of text mining. The topic of the DEFT 2008 edition is the classification of texts by topic and genre. Automatic classification has multiple applications in text mining. Many application fiel...

MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 240.91 €
commercial: 240.91 €
NON MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 313.18 €
commercial: 313.18 €
 EASy Evaluation Package    
  • French

ID: ELRA-E0034

ISLRN: 238-723-334-894-5

The EASy Evaluation Package was produced within the French national project EASy (Evaluation of syntactic parsers of French), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The project made it possible to carry out a campaign for the evaluation...

MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 150.00 €
commercial: 500.00 €
NON MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 300.00 €
commercial: 1000.00 €
 EPAC Corpus: orthographic transcriptions    
  • French

ID: ELRA-S0305

ISLRN: 483-703-007-740-8

This corpus consists of approx. 100 hours of manual orthographic transcriptions, which were produced from 1,677 hours of non-transcribed recordings from the ESTER Evaluation Campaign (Technolangue programme, see also ELRA-E0021). This corpus also consists of automatic transcriptions of the full 1...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 300.00 €
commercial: 5000.00 €
Licence: Commercial Use - ELRA VAR
academic: 20000.00 €
commercial: 20000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 2000.00 €
commercial: 7500.00 €
Licence: Commercial Use - ELRA VAR
academic: 25000.00 €
commercial: 25000.00 €
 EQueR Evaluation Package    
  • French

ID: ELRA-E0022

ISLRN: 725-358-759-122-3

The EQueR Evaluation Package was produced within the French national project EQueR (Evaluation campaign for Question-Answering systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The EQueR project made it possible to carry out a campaign f...

MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 150.00 €
commercial: 500.00 €
NON MEMBER
Licence: Evaluation Use - ELRA EVALUATION
academic: 300.00 €
commercial: 1000.00 €
 ESTER 2 Corpus    
  • French

ID: ELRA-S0338

ISLRN: 123-207-221-143-8

The ESTER 2 evaluation campaign (Evaluation of Broadcast News enriched transcription systems) is based, on the one hand, on the full corpus from the first ESTER campaign (see ELRA-E0021 and ELRA-S0241), which was, on the other hand, completed with a training corpus of about a hundred hours, specif...

MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 300.00 €
commercial: 5000.00 €
Licence: Commercial Use - ELRA VAR
academic: 20000.00 €
commercial: 20000.00 €
NON MEMBER
Licence: Non Commercial Use - ELRA END USER
academic: 2000.00 €
commercial: 7500.00 €
Licence: Commercial Use - ELRA VAR
academic: 25000.00 €
commercial: 25000.00 €
