2 Language Resources
The EMILLE/CIIL Corpus
- Assamese
- Bengali
- English
- Gujarati
- Hindi
- Kannada
- Kashmiri
- Malayalam
- Marathi
- Oriya (macrolanguage)
- Panjabi; Punjabi
- Sinhala; Sinhalese
- Tamil
- Telugu
- Urdu
ID: ELRA-W0037
ISLRN: 039-846-040-604-0

The EMILLE/CIIL Corpus consists of three components: monolingual, parallel and annotated corpora. There are fourteen monolingual corpora, including both written and (for some languages) spoken data for fourteen South Asian languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Kashmiri, Malayala...
Membership | Licence | Price |
---|---|---|
Member | Non Commercial Use - ELRA END USER | 0.00 € |
Non-member | Non Commercial Use - ELRA END USER | 0.00 € |
The EMILLE Lancaster Corpus
- Bengali
- English
- Gujarati
- Hindi
- Panjabi; Punjabi
- Sinhala; Sinhalese
- Tamil
- Urdu
ID: ELRA-W0038
ISLRN: 438-045-014-925-0

The EMILLE Lancaster Corpus consists of three components: monolingual, parallel and annotated corpora. There are monolingual corpora for seven South Asian languages: Bengali, Gujarati, Hindi, Punjabi, Sinhala, Tamil, Urdu. The EMILLE monolingual corpora contain approximately 58,880,000 words (i...
Membership | Licence | Price |
---|---|---|
Member | Commercial Use - ELRA VAR | 7500.00 € |
Non-member | Commercial Use - ELRA VAR | 12000.00 € |