Towards UDPipe 3: Insights from the LatinPipe winning submission to EvaLatin 2024

Speakers:
Jana Straková and Federica Gamba (ÚFAL MFF UK)
Abstract:
UDPipe 2 is a popular tool for dependency parsing, morphological analysis, and lemmatization. In this talk, we present our ongoing work towards UDPipe 3 and share insights from winning the EvaLatin 2024 Dependency Parsing shared task with our new architecture, dubbed LatinPipe for this purpose. We describe our experience with multi-treebank and multilingual training in practical usage and in other shared tasks, the importance of annotation style harmonization, how contextualization and position encoding matter for dependency parsing, and some other tricks that boost performance in competition. Besides what we hope will be an accessible, high-level explanation of the LatinPipe architecture, we also describe how we harmonized the annotation styles of the publicly available Latin corpora and investigate the impact of this harmonization on parsing results.
Length:
00:54:27
Date:
18/03/2024
