Tackling Intrinsic Uncertainty with SCONES

Speaker:
Felix Stahlberg
Abstract:
In many natural language processing (NLP) tasks, the same input can have multiple possible outputs. In machine translation (MT), for example, a source sentence may have several acceptable translations. We describe how this kind of ambiguity (also known as intrinsic uncertainty) shapes the distributions learned by neural sequence models, and how it impacts various aspects of search, such as the inductive biases of beam search and the complexity of exact search. We show that well-known pathologies, such as a high number of beam search errors, the inadequacy of the mode, and the drop in system performance with large beam sizes (the beam search curse), apply to tasks with a high level of ambiguity such as MT, but not to less uncertain tasks such as grammatical error correction. The second part of the talk discusses SCONES, an attempt to mitigate the negative effects of intrinsic uncertainty on sequence models by framing MT as a multi-label classification problem. We demonstrate that SCONES can be tuned either to improve the translation quality or runtime of traditional softmax-based models, or to fix model pathologies, such as the beam search curse, that are connected with intrinsic uncertainty.
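
To make the multi-label framing concrete: instead of a softmax, where all vocabulary items compete for a single unit of probability mass, each target token gets an independent sigmoid classifier, so several continuations can be scored as acceptable at once. Below is a minimal PyTorch sketch of such a loss; the function name scones_loss and the alpha weight on the negative term are illustrative stand-ins, not the exact formulation presented in the talk.

import torch
import torch.nn.functional as F

def scones_loss(logits: torch.Tensor, targets: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Multi-label (sigmoid) sequence loss in the spirit of SCONES.

    logits:  (batch, seq_len, vocab) raw decoder scores.
    targets: (batch, seq_len) gold token ids.
    alpha:   weight on the negative-label term (a hypothetical knob standing
             in for the trade-off between quality, runtime, and pathologies).
    """
    # One-hot positive labels: the observed reference token at each position.
    positives = F.one_hot(targets, num_classes=logits.size(-1)).float()

    # Independent binary classifiers per vocabulary item: each token is judged
    # "acceptable here or not" rather than competing in a softmax.
    log_p = F.logsigmoid(logits)        # log P(label = 1)
    log_not_p = F.logsigmoid(-logits)   # log P(label = 0)

    loss_pos = -(positives * log_p).sum(dim=-1)              # reference token
    loss_neg = -((1.0 - positives) * log_not_p).sum(dim=-1)  # all other tokens
    return (loss_pos + alpha * loss_neg).mean()

# Toy usage: batch=2, seq_len=3, vocab=5.
logits = torch.randn(2, 3, 5)
targets = torch.randint(0, 5, (2, 3))
print(scones_loss(logits, targets, alpha=0.5))

In this sketch, beam search would rank hypotheses by summed log-sigmoid scores rather than log-softmax scores; because the per-token scores are not forced to sum to one, highly ambiguous positions need not drag down every individual hypothesis.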
Length:
00:58:55
Date:
08/09/2022
Views: 436

Attachments: (video, slides, etc.)
195 MB
437 downloads