Efficient Machine Translation

Speaker:
Nick Bogoychev
Abstract:
Neural networks are notorious for their high computational intensity and energy usage. However, recent advances in the field have made it possible to reduce their computational load to the point where machine translation systems can be run on a mobile phone.

In this talk, we will take a kaleidoscopic view of neural network optimisation, focusing on neural machine translation as a case study. We will cover model improvements, improvements specific to neural machine translation, and software improvements on both the GPU and the CPU. Combining all improvements, we manage to decrease inference time by a factor of ~600 with a tiny drop in BLEU.

Length:
00:56:21
Date:
07/09/2022