Introducing Context Lexical Models To TectoMT

Speaker:
Michal Novak
Abstract:
Much of the attention in the MT community has recently been devoted to improving lexical choice in the target language by capturing context wider than a single sentence. In this talk I will present our contribution to these efforts, in particular our progress in enriching the translation models for English-to-Czech translation within the TectoMT system. First, we took a purely technical step: we replaced the MaxEnt learning module used so far with the Vowpal Wabbit learning toolkit. It allows us to train our models faster and to exploit more training examples and features, and it also offers rich parametrization options; together these can improve our MT system. The main objective of this ongoing work, however, is to explore the potential of lexical context features to improve translation. We do so by including standard bag-of-words features and by introducing novel features that represent concepts from Explicit Semantic Analysis, a method originally developed in the field of Information Retrieval.
Length:
00:53:07
Date:
19/11/2012

Attachments (video, slides, etc.):
24.3M (348 downloads)
63.8M (345 downloads)
245.8M (337 downloads)