Video Recordings

Learning Morphology From The Corpus

Speaker:

Ondřej Dušek

Abstract:

This talk presents a simple method for learning morphological analysis and generation from annotated data, with an emphasis on robustness to unseen inputs. We focus mainly on generation, where a trainable classifier predicts "edit scripts" that transform lemmas into inflected word forms. Our morphology generation system has been evaluated on six languages and shown to learn most morphological phenomena and to generalize to unseen inputs. The analysis part shows a few simple experiments with edit scripts on Czech, including a comparison with the dictionary-based morphological analysis of Hajič (2004).
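
To illustrate the idea of an edit script, here is a minimal Python sketch. It assumes a deliberately simplified, suffix-only representation (strip N characters from the end of the lemma, then append a string); the representation used in the talk may well be richer. The function names `make_edit_script` and `apply_edit_script` are hypothetical, and the trainable classifier that would choose a script for a given lemma and morphological tag is left out.

```python
import os


def make_edit_script(lemma, form):
    """Derive a suffix-only edit script that turns `lemma` into `form`.

    Simplifying assumption: the script is (number of trailing characters
    to strip from the lemma, string to append afterwards).
    """
    prefix_len = len(os.path.commonprefix([lemma, form]))
    return (len(lemma) - prefix_len, form[prefix_len:])


def apply_edit_script(lemma, script):
    """Apply a (strip, suffix) edit script to a lemma."""
    strip, suffix = script
    return lemma[:len(lemma) - strip] + suffix


# Learn a script from one annotated (lemma, form) pair ...
script = make_edit_script("carry", "carries")

# ... and reuse it on a lemma that was not in the training data.
unseen_form = apply_edit_script("study", script)
```

Because the script records only the change and not the word itself, the same script applies to any lemma that inflects the same way, which is what allows generalization to unseen inputs. In a full system, a classifier would predict which script to apply from features of the lemma and the target morphological tag.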

Length:

00:52:43

Date:

11/11/2013

Institute of Formal and Applied Linguistics
