Search and Browse – ELRA Catalogue

Resource Type:

Corpus:
Lexical/Conceptual:
Tool/Service:
Language Description:

Media Type:

Text:
Audio:
Image:
Video:
Text Numerical:
Text N-Gram:

2 Language Resources

Order by:

English-Punjabi Code-Mixed Social Media Content text

English
Panjabi; Punjabi

ID: ELRA-W0319

ISLRN: 695-759-706-170-8

The English-Punjabi Code-Mixed Social Media Content corpus is composed is composed of 893,615 parallel sentences of English-Punjabi distributed over the following domains: - 82,341 parallel sentences of English-Punjabi code-mixed Agriculture Domain Data, - 59,158 parallel sentences of English-P...

MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €	0.00 €
Licence: Commercial Use - ELRA VAR	0.00 €	0.00 €

NON MEMBER	academic	commercial
Licence: Non Commercial Use - ELRA END USER	0.00 €	0.00 €
Licence: Commercial Use - ELRA VAR	0.00 €	0.00 €

The EMILLE Lancaster Corpus text

Bengali
English
Gujarati
Hindi
Panjabi; Punjabi
Sinhala; Sinhalese
Tamil
Urdu

ID: ELRA-W0038

ISLRN: 438-045-014-925-0

The EMILLE Lancaster Corpus consists of three components: monolingual, parallel and annotated corpora. There are monolingual corpora for seven South Asian languages: Bengali, Gujarati, Hindi, Punjabi, Sinhala, Tamil, Urdu. The EMILLE monolingual corpora contain approximately 58,880,000 words (i...

MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR		7500.00 €

NON MEMBER	academic	commercial
Licence: Commercial Use - ELRA VAR		12000.00 €