ELRA
  • Catalogue of Language Resources

    ELRA releases free Language Resources.


    The ELRA Catalogue of Language Resources offers a repository of Language Resources (LRs) made available through ELRA.


    A growing number of LRs in the various fields of Human Language Technology (HLT; see image on the left-hand side) are distributed on behalf of ELRA through its operational body, ELDA, thanks to contributions from various members of the HLT community.

    Through this repository, we aim to spare researchers and developers the effort of rebuilding resources that already exist, and to help them identify and access those resources.

    Other resources identified, but not available through ELRA, can be viewed in the Universal Catalogue.

    If you have any suggestions or comments, or need further details about ELRA and its Catalogue of Language Resources, please refer to the Contact Us section.

    ELRA is a partner of OLAC (Open Language Archives Community). The catalogue can be viewed as an OLAC repository.

    New Resources
  • ELRA-T0377 : Collins Multilingual database (MLD) - PhraseBank
    This multilingual dataset covers Real
    Life Daily vocabulary in 28 languages.
    It contains 2,000 phrases for each
    language, organised under 12 topics and
    67 subtopics. Romanization is provided
    for Arabic, Farsi and Hindi.

  • ELRA-S0382 : Collins Multilingual database (MLD) – WordBank with audio files
    This multilingual lexicon covers Real
    Life Daily vocabulary in 26 languages.
    It contains 10,000 words for each
    language, XML-annotated for
    part-of-speech, gender, irregular forms
    and disambiguating information for
    homographs, with the corresponding audio
    files recorded by a native speaker and
    10,000 additional headwords with audio
    for 12 languages.

  • ELRA-S0383 : Collins Multilingual database (MLD) – PhraseBank with audio files
    This multilingual dataset covers Real
    Life Daily vocabulary in 28 languages.
    It contains 2,000 phrases for each
    language, organised under 12 topics and
    67 subtopics, and the corresponding
    audio files recorded by a native
    speaker.

  • ELRA-T0376 : Collins Multilingual database (MLD) - WordBank
    This multilingual lexicon covers Real
    Life Daily vocabulary in 32 languages.
    It contains 10,000 words for each
    language, XML-annotated for
    part-of-speech, gender, irregular forms
    and disambiguating information for
    homographs, and 10,000 additional
    headwords for 12 languages.

  • ELRA-S0374 : FoxPersonTracks: a Benchmark for Person Re-Identification from TV Broadcast Shows
    FoxPersonTracks is a person track
    dataset dedicated to person
    re-identification. The dataset is built
    from a set of real-life TV shows
    broadcast on the French TV channels
    BFMTV and LCP, provided during the
    REPERE challenge. It contains a total of
    4,604 persontracks (short video
    sequences featuring an individual with
    no background) from 266 persons. The
    dataset also provides re-identification
    results using space-time histograms as a
    baseline, together with an evaluation
    tool to ease comparison with other
    re-identification methods.

  • (last update: July 2016)

    Copyright © 2008 ELRA