30 Language Resources (Page 2 of 2)
- Arabic
ID: ELRA-S0247
ISLRN: 716-224-338-250-4
The LC-STAR Standard Arabic Phonetic lexicon was created within the scope of the LC-STAR project (IST 2001-32216) which was sponsored by the European Commission. The lexicon comprises 110,271 entries, distributed over three categories: - a set of 52,981 common word entries. This set is extracte...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 21250.00 € | 28000.00 € |
Licence: Commercial Use - ELRA VAR | 28000.00 € | 28000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 27625.00 € | 36400.00 € |
Licence: Commercial Use - ELRA VAR | 36400.00 € | 36400.00 € |
- Arabic
ID: ELRA-W0049
ISLRN: 124-139-628-259-2
This corpus contains 102,960 vowelised, lemmatised and tagged words (58 texts from Le Monde Diplomatique Arabic, see also ELRA-W0036-04). Each text comes with 3 files: - raw text in Arabic, - vowelised text in Arabic, - one XML file containing the morphological annotation of the text. ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 185.00 € | 975.00 € |
Licence: Commercial Use - ELRA VAR | 975.00 € | 975.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 400.00 € | 2000.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |
- Arabic
ID: ELRA-W0036-04
ISLRN: 231-368-326-920-2
Electronic archive of "Le Monde Diplomatique" articles in Arabic from 2000. The corpus is available in HTML. Each HTML file contains one article. Number of articles available per year: • 2000: 61 articles (November and December only) (75,305 words) • 2001: 346 articles (479,435 ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 46.00 € | 46.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 69.00 € | 69.00 € |
- Arabic
ID: ELRA-L0134
ISLRN: 977-057-254-691-5
Moroccan Arabic Dialect Electronic Dictionary (MADED) is an electronic lexicon containing almost 11,500 entries. Entries are written in Arabic script, and each Modern Standard Arabic (MSA) lemma is provided with its corresponding Moroccan Arabic equivalent. In addition, MADED entries are annotate...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, No Derivatives - CC-BY-NC-ND | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, No Derivatives - CC-BY-NC-ND | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |
- Arabic
ID: ELRA-L0135
ISLRN: 064-194-729-767-0
The Moroccan Morphological vocabulary is a lexicon containing more than 4.6 M entries, each describing a Moroccan Arabic word with fourteen (14) morphological and semantic features: the word orthographic form, the segmentation (prefix and suffix), part-of-speech (POS), gender, number, tense and t...
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, No Derivatives - CC-BY-NC-ND | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 6000.00 € | 6000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Non Commercial Use, No Derivatives - CC-BY-NC-ND | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 12000.00 € | 12000.00 € |
- Arabic
ID: ELRA-W0078
ISLRN: 398-979-151-557-0
The NE3L project (Named Entities 3 Languages) consisted of annotating several corpora in different languages with named entities. Text-format data were extracted from newspapers and cover various topics. Three languages were annotated: Arabic, Chinese and Russian. For this project, 5...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 5000.00 € | 5000.00 € |
Licence: Commercial Use - ELRA VAR | 5000.00 € | 5000.00 € |
- Arabic
ID: ELRA-W0042
ISLRN: 050-693-158-326-9
This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: the NEMLAR Broadcast News Speech Corpus (ELRA-S0219) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The NEMLAR Written Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 150.00 € | 250.00 € |
Licence: Commercial Use - ELRA VAR | 1000.00 € | 1000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 300.00 € | 500.00 € |
Licence: Commercial Use - ELRA VAR | 2000.00 € | 2000.00 € |
Special offers are also available for this resource.
- Arabic
ID: ELRA-W0127
ISLRN: 305-450-745-774-1
Normalized Arabic Fragments for Inestimable Stemming (NAFIS) is an Arabic stemming gold standard corpus composed of a collection of sentences, selected to be representative of Arabic stemming tasks and manually annotated. Indeed, NAFIS is: Comprehensive: The content of NAFIS can be generalized...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 0.00 € |
Licence: Commercial Use - ELRA VAR | 0.00 € | 0.00 € |
- Arabic
- English
ID: ELRA-W0126
ISLRN: 986-364-744-303-9
The dataset is composed of two distinct resources: 1) A collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts. The training part of the corpus contains: 522 tweets composed of 5,207 token...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 500.00 € |
Licence: Commercial Use - ELRA VAR | 500.00 € | 500.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 650.00 € |
Licence: Commercial Use - ELRA VAR | 650.00 € | 650.00 € |
- Arabic
ID: ELRA-W0325
ISLRN: 688-718-284-176-0
Wojood consists of about 550,000 tokens (Modern Standard Arabic and dialect) that are manually annotated with 21 entity types (person, group of people, occupation, organization, geopolitical entity, location, facility, event, date, time, language, website, law, product, cardinal number, ordinal n...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 8000.00 € |
Licence: Commercial Use - ELRA VAR | 8000.00 € | 8000.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER | 0.00 € | 10000.00 € |
Licence: Commercial Use - ELRA VAR | 10000.00 € | 10000.00 € |