1 Language Resource

The EMILLE/CIIL Corpus
  • Assamese
  • Bengali
  • English
  • Gujarati
  • Hindi
  • Kannada
  • Kashmiri
  • Malayalam
  • Marathi
  • Oriya (macrolanguage)
  • Panjabi; Punjabi
  • Sinhala; Sinhalese
  • Tamil
  • Telugu
  • Urdu

ID: ELRA-W0037

ISLRN: 039-846-040-604-0

The EMILLE/CIIL Corpus consists of three components: monolingual, parallel and annotated corpora. There are fourteen monolingual corpora, including both written and (for some languages) spoken data for fourteen South Asian languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayala...

Member: academic / commercial
Licence: Non Commercial Use - ELRA END USER
0.00 €
Non-member: academic / commercial
Licence: Non Commercial Use - ELRA END USER
0.00 €