Comparison of coreferential expressions in Czech and English
Speakers:
Michal Novák, Anna Nedoluzhko
Abstract:
In this work, we present a comprehensive study of mappings between certain classes of coreferential expressions in English and Czech. We focused on central pronouns, relative pronouns and anaphoric zeros. For instance, the English sentence 'It switched to a caffeine-free formula using its new Coke in 1985' has been translated in PCEDT as 'V roce 1985 přešla na bezkofeinovou recepturu, kterou používá pro svojí novou kolu'. This sentence pair exhibits several types of changes in how coreference is expressed: the English personal pronoun turns into a Czech zero, the possessive pronoun into a possessive reflexive, and the -ing participle is translated as a relative clause. In a similar manner, we have collected statistics on such mappings from a subsection of PCEDT, which we will illustrate with multiple examples and contrast with theoretical assumptions. For such a study, the quality of word alignment is crucial. We therefore designed a rule-based refining algorithm for English personal and possessive pronouns and Czech relative pronouns, which served as an automatic alignment pre-annotation. This annotation was then manually corrected and completed, providing the basis for this empirical study.