Resource Type:
Corpus: | |
Lexical/Conceptual: | |
Tool/Service: | |
Language Description: |
Media Type:
Text: | |
Audio: | |
Image: | |
Video: | |
Text Numerical: | |
Text N-Gram: |
9 Language Resources
Order by:
- Arabic
ID: ELRA-W0030
ISLRN: 365-777-769-398-7The corpus was developed in the course of a research project at the University of Essex, in collaboration with the Open University. The corpus contains Al-Hayat newspaper articles with value added for Language Engineering and Information Retrieval applications development purposes. The data have ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
480.00 €
|
960.00 €
|
Licence: Commercial Use - ELRA VAR |
960.00 €
|
960.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
720.00 €
|
1440.00 €
|
Licence: Commercial Use - ELRA VAR |
1440.00 €
|
1440.00 €
|
- Arabic
ID: ELRA-W0027
ISLRN: 083-457-618-309-8The An-Nahar Lebanon Newspaper Text Corpus comprises articles in standard Arabic from 1995 to 2000 (6 years) stored as HTML files on CDRom media. Each year contains 45 000 articles and 24 million words. Each article includes information such as title, newspaper's name, date, country, type, page, ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
2016.00 €
|
3192.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
3024.00 €
|
4788.00 €
|
Special offers are also available. Check here for details.
- Arabic
ID: ELRA-W0049
ISLRN: 124-139-628-259-2This corpus contains 102,960 vowelised, lemmatised and tagged words (58 texts from Le Monde Diplomatique Arabic, see also ELRA-W0036-04). To each text are associated 3 files : - raw text in Arabic, - vowelized text in Arabic, - one XML file containing the morphological annotation of the text. ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
185.00 €
|
975.00 €
|
Licence: Commercial Use - ELRA VAR |
975.00 €
|
975.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
400.00 €
|
2000.00 €
|
Licence: Commercial Use - ELRA VAR |
2000.00 €
|
2000.00 €
|
- Arabic
ID: ELRA-W0036-04
ISLRN: 231-368-326-920-2Electronic archiving of "Le Monde Diplomatique" articles in Arabic from 2000. The corpus is available in HTML. Each HTML file contains one article. Number of articles available per year : • 2000: 61 articles (November and December available only) (75,305 words) • 2001: 346 articles (479,435 ...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
46.00 €
|
46.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
69.00 €
|
69.00 €
|
- Arabic
ID: ELRA-W0078
ISLRN: 398-979-151-557-0The NE3L project (Named Entities 3 Languages) consisted in annotating several corpora with different languages with named entities. Text format data were extracted from newspapers and deal with various topics. 3 different languages were annotated: Arabic, Chinese and Russian. For this project, 5...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
5000.00 €
|
5000.00 €
|
Licence: Commercial Use - ELRA VAR |
5000.00 €
|
5000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
5000.00 €
|
5000.00 €
|
Licence: Commercial Use - ELRA VAR |
5000.00 €
|
5000.00 €
|
- Arabic
ID: ELRA-W0042
ISLRN: 050-693-158-326-9This corpus was produced within the NEMLAR project (http://www.nemlar.org). Two other resources, produced within the same project, are also available: NEMLAR Broadcast News Speech Corpus (ELRA-S0219) and the NEMLAR Speech Synthesis Corpus (ELRA-S0220). The NEMLAR Written Corpus consists of about...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
150.00 €
|
250.00 €
|
Licence: Commercial Use - ELRA VAR |
1000.00 €
|
1000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
300.00 €
|
500.00 €
|
Licence: Commercial Use - ELRA VAR |
2000.00 €
|
2000.00 €
|
Special offers are also available. Check here for details.
- Arabic
ID: ELRA-W0127
ISLRN: 305-450-745-774-1Normalized Arabic Fragments for Inestimable Stemming (NAFIS) is an Arabic stemming gold standard corpus composed by a collection of sentences, selected to be representative of Arabic stemming tasks and manually annotated. Indeed, NAFIS is: Comprehensive: The content of NAFIS can be generalized...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
0.00 €
|
Licence: Commercial Use - ELRA VAR |
0.00 €
|
0.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
0.00 €
|
Licence: Commercial Use - ELRA VAR |
0.00 €
|
0.00 €
|
- Arabic
- English
ID: ELRA-W0126
ISLRN: 986-364-744-303-9The dataset is composed of two distinct resources: 1) A collection of mixed English and Arabizi text intended to train and test a system for the automatic detection of code-switching in mixed English and Arabizi texts. The training part of the corpus contains: 522 tweets composed of 5,207 token...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
500.00 €
|
Licence: Commercial Use - ELRA VAR |
500.00 €
|
500.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
650.00 €
|
Licence: Commercial Use - ELRA VAR |
650.00 €
|
650.00 €
|
- Arabic
ID: ELRA-W0325
ISLRN: 688-718-284-176-0Wojood consists of about 550,000 tokens (Modern Standard Arabic and dialect) that are manually annotated with 21 entity types (person, group of people, occupation, organization, geopolitical entity, location, facility, event, date, time, language, website, law, product, cardinal number, ordinal n...
MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
8000.00 €
|
Licence: Commercial Use - ELRA VAR |
8000.00 €
|
8000.00 €
|
NON MEMBER | academic | commercial |
---|---|---|
Licence: Non Commercial Use - ELRA END USER |
0.00 €
|
10000.00 €
|
Licence: Commercial Use - ELRA VAR |
10000.00 €
|
10000.00 €
|