
PoeTree: Poetry Treebanks in Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian and Spanish

Plecháč, Petr
Cinková, Silvie
Kolár, Robert
Šeļa, Artjoms
De Sisto, Mirella
Nugues, Lara
Haider, Thomas
Kočnik, Neža
Abstract
This article presents a set of standardised poetry corpora comprising over 330,000 poems in ten languages (Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian, and Spanish). Each corpus has been deduplicated, enriched with Universal Dependencies annotation, provided with additional metadata, and converted into a unified JSON structure.
Description
Publisher Copyright: © 2024 Petr Plecháč et al.
Date
2024-09
Keywords
poetry, computational poetry, corpus linguistics, digital humanities
Citation
Plecháč, P, Cinková, S, Kolár, R, Šeļa, A, De Sisto, M, Nugues, L, Haider, T & Kočnik, N 2024, 'PoeTree: Poetry Treebanks in Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian and Spanish', Research Data Journal for the Humanities and Social Sciences. https://doi.org/10.1163/24523666-bja10044
License
info:eu-repo/semantics/openAccess