2 Language Resources

 Ema-lon Manipuri Corpus (including word embedding and language model)    
  • English
  • Manipuri

ID: ELRA-W0316

ISLRN: 588-170-827-016-7

The Ema-lon Manipuri Corpus is a set of resources for the Manipuri language (locally known as Meiteilon), built for machine translation. The main source for these resources is the Sangai Express news website. The resources that constitute the present corpus are listed below: 1. EM C...

Member prices
Licence: Attribution, Non Commercial Use - CC-BY-NC-4.0
  Academic: 0.00 €
  Commercial: 0.00 €

Non-member prices
Licence: Attribution, Non Commercial Use - CC-BY-NC-4.0
  Academic: 0.00 €
  Commercial: 0.00 €

 Training and test data for Arabizi detection and transliteration    
  • Arabic
  • English

ID: ELRA-W0126

ISLRN: 986-364-744-303-9

The dataset is composed of two distinct resources: 1) A collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts. The training part of the corpus contains: 522 tweets composed of 5,207 token...

Member prices
Licence: Non Commercial Use - ELRA END USER
  Academic: 0.00 €
  Commercial: 500.00 €
Licence: Commercial Use - ELRA VAR
  Academic: 500.00 €
  Commercial: 500.00 €

Non-member prices
Licence: Non Commercial Use - ELRA END USER
  Academic: 0.00 €
  Commercial: 650.00 €
Licence: Commercial Use - ELRA VAR
  Academic: 650.00 €
  Commercial: 650.00 €