Bulgarian Treebank Corpus
ID: ELRA-W0328
The Bulgarian Treebank Corpus comprises 156,149 tokens (11,138 sentences) drawn from three main sources: grammar notebooks (1,391 sentences), news (6,698 sentences), and other texts (3,049 sentences). It is available with sentence-level syntactic and morphological annotation in Universal Dependencies format. This subset of BulTreeBank excludes ellipses and some rare phenomena. The conversion of BulTreeBank into Universal Dependencies format was supported by the EU project QTLeap (http://qtleap.eu/).
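As a quick sanity check after download, the sentence and token counts quoted above can be compared against the files themselves. The following is a minimal sketch, assuming the corpus is distributed as CoNLL-U text (the usual file format for Universal Dependencies treebanks); the function name `count_conllu` and the sample sentences are illustrative and are not taken from the resource. It counts syntactic words, so multiword-token range lines and empty nodes are skipped, and the result may differ from the catalogue figure if tokens were counted differently.

```python
from typing import Iterable, Tuple


def count_conllu(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (sentences, words) for CoNLL-U formatted lines.

    Hypothetical helper for illustration only. Sentences are blocks of
    token lines separated by blank lines; lines starting with '#' are
    sentence-level comments.
    """
    sentences = 0
    words = 0
    in_sentence = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        if line.startswith("#"):
            continue
        token_id = line.split("\t", 1)[0]
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "3.1").
        if "-" in token_id or "." in token_id:
            continue
        words += 1
        in_sentence = True
    if in_sentence:
        # The file may end without a trailing blank line.
        sentences += 1
    return sentences, words


# Made-up two-sentence sample in CoNLL-U layout (10 tab-separated columns).
SAMPLE = (
    "# sent_id = 1\n"
    "1\tТой\tтой\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tчете\tчета\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
    "# sent_id = 2\n"
    "1\tДа\tда\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)

if __name__ == "__main__":
    n_sentences, n_words = count_conllu(SAMPLE.splitlines())
    print(f"sentences={n_sentences} words={n_words}")
```

To run it on the real data, pass an open file handle instead of the sample, for example `count_conllu(open(path, encoding="utf-8"))`, where `path` points at whichever CoNLL-U file the distribution contains.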
MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |

NON MEMBER | academic | commercial |
---|---|---|
Licence: Attribution, Share Alike - CC-BY-SA-3.0 | 0.00 € | 0.00 € |