35 Language Resources (Page 1 of 2)

« Previous | Next »Order by:

 2006 CoNLL Shared Task - Ten Languages    
  • Bulgarian
  • Danish
  • Dutch; Flemish
  • German
  • Japanese
  • Portuguese
  • Slovenian
  • Spanish; Castilian
  • Swedish
  • Turkish

ID: ELRA-W0086

ISLRN: 578-227-532-044-0

2006 CoNLL Shared Task - Ten Languages consists of dependency treebanks in ten languages used as part of the CoNLL 2006 shared task on multi-lingual dependency parsing. The languages covered in this release are: Bulgarian, Danish, Dutch, German, Japanese, Portuguese, Slovene, Spanish, Swedish and...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
 BrasiLEX Brazilian Portuguese lexicon    
  • Portuguese

ID: ELRA-L0034

ISLRN: 654-505-941-943-8

BrasiLEX is a multifunctional monolingual lexicon of the Brazilian variety of Portuguese, developed by the Natural Language Group of INESC. It has about 65,000 entries (lemmas) and 1,600 correspondent inflexion paradigms. The set of entries includes compound words and the inflexion paradigms incl...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3000.00 € submit
25000.00 € submit
Licence: Commercial Use - ELRA VAR
25000.00 € submit
25000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
5000.00 € submit
30000.00 € submit
Licence: Commercial Use - ELRA VAR
30000.00 € submit
30000.00 € submit

This resource is also available in a bundle. Check here for bundled pricing.

 CEPLEXicon    
  • Portuguese

ID: ELRA-L0094

ISLRN: 408-817-203-152-3

CEPLEXicon is a lexicon based on two different corpora of child speech – Santos corpus (Santos, 2006, Santos et al., 2014, see http://www.clul.ul.pt/resources/546?lang=en) and Freitas corpus (Freitas, 1997, Freitas et al. 2012). This lexicon results from the automatic tagging of the two corpora, ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
0.00 € submit
0.00 € submit
 CINTIL-DeepBank    
  • Portuguese

ID: ELRA-W0062

ISLRN: 368-672-631-502-0

The CINTIL-DeepBank (Branco et al., 2010) is a corpus of sentences annotated with their full-fledged deep grammatical representations, composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), and novels (399 sentences; 3,082...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
 CINTIL-DependencyBank    
  • Portuguese

ID: ELRA-W0061

ISLRN: 133-035-138-613-6

The CINTIL-DependencyBank (Silva and Branco, 2012) is a corpus of sentences annotated with their syntactic dependency graphs and grammatical function tags composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
 CINTIL-PropBank    
  • Portuguese

ID: ELRA-W0056

ISLRN: 723-486-478-286-6

The CINTIL-PropBank is a corpus of sentences annotated with their constituency structure and semantic role tags, composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), and novels (399 sentences; 3,082 tokens). In addition,...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
 CINTIL-TreeBank    
  • Portuguese

ID: ELRA-W0055

ISLRN: 411-691-515-701-9

The CINTIL-TreeBank is a corpus of syntactic constituency trees of Portuguese texts composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 sentences; 3,082 tokens). In addition, there are 779 sentences (5,654 t...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
 CLEF AdHoc-News Test Suites (2004-2008) – Evaluation Package    
  • Bulgarian
  • Czech
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hungarian
  • Italian
  • Persian
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish

ID: ELRA-E0036

ISLRN: 378-279-085-589-0

The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access (MLIA) by (i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) c...

MEMBERacademiccommercial
Licence: Evaluation Use - ELRA EVALUATION
150.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Evaluation Use - ELRA EVALUATION
300.00 € submit
1000.00 € submit

Special offers are also available. Check here for details.

 CLEF Question Answering Test Suites (2003-2008) – Evaluation Package    
  • Bulgarian
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Italian
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Spanish; Castilian

ID: ELRA-E0038

ISLRN: 394-993-527-034-7

The Cross-Language Evaluation Forum (CLEF) promotes R&D in multilingual information access (MLIA) by (i) developing an infrastructure for the testing, tuning and evaluation of information retrieval systems operating on European languages in both monolingual and cross-language contexts, and (ii) c...

MEMBERacademiccommercial
Licence: Evaluation Use - ELRA EVALUATION
150.00 € submit
500.00 € submit
NON MEMBERacademiccommercial
Licence: Evaluation Use - ELRA EVALUATION
300.00 € submit
1000.00 € submit

Special offers are also available. Check here for details.

 Collins Multilingual database (MLD) - PhraseBank    
  • Arabic
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Modern Greek (1453-)
  • Norwegian
  • Persian
  • Polish
  • Portuguese
  • Russian
  • Spanish; Castilian
  • Swedish
  • Thai
  • Turkish
  • Vietnamese

ID: ELRA-T0377

ISLRN: 452-383-219-228-0

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank, distributed separately under reference ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank). The PhraseBank consists of 2,000 p...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
1680.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
2240.00 € submit
 Collins Multilingual database (MLD) - WordBank    
  • Arabic
  • Bengali
  • Chinese
  • Croatian
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Finnish
  • French
  • German
  • Hindi
  • Italian
  • Japanese
  • Korean
  • Malayalam
  • Modern Greek (1453-)
  • Norwegian
  • Polish
  • Portuguese
  • Romanian; Moldavian; Moldovan
  • Russian
  • Spanish; Castilian
  • Swedish
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

ID: ELRA-T0376

ISLRN: 990-814-402-335-7

The Collins Multilingual database covers Real Life Daily vocabulary. It is composed of a multilingual lexicon in 32 languages (the WordBank) and a multilingual set of sentences in 28 languages (the PhraseBank, distributed separately under reference ELRA-T0377). The WordBank contains 10,000 words...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
2400.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3600.00 € submit
 ECI/MCI (European Corpus Initiative/Multilingual Corpus I)    
  • Albanian
  • Bulgarian
  • Chinese
  • Czech
  • Danish
  • Dutch; Flemish
  • English
  • Estonian
  • French
  • German
  • Italian
  • Japanese
  • Latin
  • Lithuanian
  • Malay (macrolanguage)
  • Modern Greek (1453-)
  • Norwegian
  • Portuguese
  • Russian
  • Scottish Gaelic; Gaelic
  • Serbian
  • Spanish; Castilian
  • Swedish
  • Turkish
  • Uzbek

ID: ELRA-W0004

ISLRN: 511-168-567-582-5

The European Corpus Initiative (ECI) was founded to oversee the acquisition and preparation of a large multilingual corpus, and supports existing and projected national and international efforts to carefully design, collect and publish large-scale multilingual written and spoken corpora. ECI has ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
50.00 € submit
50.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
50.00 € submit
50.00 € submit
 EUROPARL Corpus Parallel Corpora: Portuguese-English    
  • English
  • Portuguese

ID: ELRA-W0090

ISLRN: 435-502-922-727-2

The EUROPARL Corpus (Portuguese-English subpart of the parallel corpora), was extracted from the proceedings of the European Parliament. It contains transcriptions of sessions dating back from 1996 to 2011, with a total of approximately 58,324,562 tokens of European Portuguese (L1) and 49,216,896...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
0.00 € submit
0.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
0.00 € submit
Licence: Commercial Use - ELRA VAR
0.00 € submit
0.00 € submit
 GlobalPhone Portuguese (Brazilian) Pronunciation Dictionary      
  • Portuguese

ID: ELRA-S0355

ISLRN: 240-605-210-375-4

The GlobalPhone pronunciation dictionaries, created within the framework of the multilingual speech and language corpus GlobalPhone, were developed in collaboration with the Karlsruhe Institute of Technology (KIT). The GlobalPhone pronunciation dictionaries contain the pronunciations of all wo...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
600.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
700.00 € submit
3600.00 € submit
Licence: Commercial Use - ELRA VAR
3600.00 € submit
3600.00 € submit

Special offers are also available. Check here for details.

 LABEL-LEX (MW)    
  • Portuguese

ID: ELRA-L0054

ISLRN: 502-837-497-805-9

LABEL-LEX (MW) is a Portuguese formalized lexicon, containing 88 619 inflected multiword lexical units (formally, sequences of simple words). The units are distributed as follows: - 85,881 nouns, with information about type, gender, number, inflected forms, irregular inflected forms and subcatego...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3000.00 € submit
10000.00 € submit
Licence: Commercial Use - ELRA VAR
10000.00 € submit
10000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
5000.00 € submit
15000.00 € submit
Licence: Commercial Use - ELRA VAR
15000.00 € submit
15000.00 € submit
 LABEL-LEX (SW)    
  • Portuguese

ID: ELRA-L0055

ISLRN: 154-511-437-811-6

LABEL-LEX (SW) is a Portuguese formalized lexicon, containing 1,545,481 simple inflected words. The words are distributed as follows: - 142,236 nouns, with information about type, gender, number, inflected forms, and irregular inflected forms - 3,155 adverbs, with information about degree, polari...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
2500.00 € submit
10000.00 € submit
Licence: Commercial Use - ELRA VAR
10000.00 € submit
10000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
5000.00 € submit
15000.00 € submit
Licence: Commercial Use - ELRA VAR
15000.00 € submit
15000.00 € submit
 LEX-MWE-PT - Word Combination in Portuguese    
  • Portuguese

ID: ELRA-L0097

ISLRN: 353-430-176-260-6

LEX-MWE-PT is a lexicon of European Portuguese containing multiword expressions (MWE) extracted from a balanced 50.8M-word written corpus – a subcorpus of the Reference Corpus of Contemporary Portuguese (CRPC). This corpus covers different genres, being mainly constituted by journalistic texts (5...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
100.00 € submit
1000.00 € submit
Licence: Commercial Use - ELRA VAR
1000.00 € submit
1000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
180.00 € submit
1800.00 € submit
Licence: Commercial Use - ELRA VAR
1800.00 € submit
1800.00 € submit
 LORETO Thesaurus    
  • English
  • French
  • German
  • Italian
  • Polish
  • Portuguese
  • Spanish; Castilian
  • Turkish

ID: ELRA-T0089

ISLRN: 200-766-790-065-1

Entries: 800 Languages: French, Spanish, German, Polish, Turkish, Italian, Portuguese, English Format: ASCII Medium: Floppy disk (Mac or PC) Card Description: This thesaurus is an interesting basis for starting up an information system in this area. 3 languages can be displayed at a time. The the...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
150.00 € submit
150.00 € submit
Licence: Commercial Use - ELRA VAR
150.00 € submit
150.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
200.00 € submit
200.00 € submit
Licence: Commercial Use - ELRA VAR
200.00 € submit
200.00 € submit
 LT Corpus    
  • Portuguese

ID: ELRA-W0059

ISLRN: 569-208-468-863-2

The LT Corpus is composed of 70 fiction texts from Portuguese renowned authors. The corpus contains 1,781,083 tokens. The texts date from before 1940. The corpus is delivered in one file, in two different formats. The txt version has one sentence per line, an identification number for each text ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
2500.00 € submit
Licence: Commercial Use - ELRA VAR
2500.00 € submit
2500.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
0.00 € submit
3000.00 € submit
Licence: Commercial Use - ELRA VAR
3000.00 € submit
3000.00 € submit
 LusoLEX European Portuguese Lexicon    
  • Portuguese

ID: ELRA-L0033

ISLRN: 686-955-010-935-8

LusoLEX is a multifunctional monolingual lexicon of the European variety of Portuguese, developed by the Natural Language Group of INESC. It has about 61,000 entries (lemmas) and 1,600 correspondent inflexion paradigms. The set of entries includes compound words and the inflexion paradigms includ...

MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
3000.00 € submit
25000.00 € submit
Licence: Commercial Use - ELRA VAR
25000.00 € submit
25000.00 € submit
NON MEMBERacademiccommercial
Licence: Non Commercial Use - ELRA END USER
5000.00 € submit
30000.00 € submit
Licence: Commercial Use - ELRA VAR
30000.00 € submit
30000.00 € submit

This resource is also available in a bundle. Check here for bundled pricing.

« Previous | Next »