Learning Morphology From The Corpus

Speaker: Ondřej Dušek
Abstract:
This talk will present a simple method for learning morphological analysis and generation from annotated data, aimed at robustness to unseen inputs. We will focus mainly on generation, where a trainable classifier predicts "edit scripts" that transform lemmas into inflected word forms. Our morphology generation system has been evaluated on six languages and shown to learn most morphological phenomena and to generalize to unseen inputs. The analysis part will show a few simple experiments with edit scripts on Czech, including a comparison with the dictionary-based morphological analysis of Hajič (2004).
Length: 00:52:43
Date: 11/11/2013
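
The edit-script idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simplified, suffix-only script of the form (number of characters to strip from the end of the lemma, suffix to append); the script format used in the talk may well be richer, and the classifier that predicts a script for each lemma is not shown. The function names `make_edit_script` and `apply_edit_script` are hypothetical and chosen only for this illustration.

```python
from typing import Tuple

# Assumed simplification: a script is (chars_to_strip, suffix_to_append).
EditScript = Tuple[int, str]


def make_edit_script(lemma: str, form: str) -> EditScript:
    """Derive a suffix edit script that turns `lemma` into `form`."""
    # Length of the longest common prefix of lemma and form.
    prefix_len = 0
    for a, b in zip(lemma, form):
        if a != b:
            break
        prefix_len += 1
    # Strip everything in the lemma after the shared prefix,
    # then append whatever the form has after it.
    return len(lemma) - prefix_len, form[prefix_len:]


def apply_edit_script(lemma: str, script: EditScript) -> str:
    """Apply a suffix edit script to a (possibly unseen) lemma."""
    strip, suffix = script
    return lemma[:len(lemma) - strip] + suffix


# A script learned from one lemma/form pair carries over to another lemma.
script = make_edit_script("try", "tried")     # (1, "ied")
print(apply_edit_script("cry", script))       # cried

script = make_edit_script("žena", "ženy")     # (1, "y")
print(apply_edit_script("kočka", script))     # kočky
```

Because a script records only the change and not the word itself, the same script can be applied to lemmas that never appeared in the training data, which is one plausible reading of the robustness to unseen inputs that the abstract describes.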
