Breaking the Curse of Multilinguality in Language Models
Speaker:
Terra Blevins (University of Vienna)
Abstract:
While language models (LMs) grow larger and gain new capabilities, their performance in non-English languages increasingly lags behind. This is due to the curse of multilinguality, where each individual language's performance suffers when models are trained on more languages. In this talk, I first examine how current language models do and don't capture different languages and uncover how the curse of multilinguality develops during multilingual model training. Building on these insights, I then present a new method, Multilingual Expert Language Models (X-ELM), that breaks this curse of multilinguality by facilitating more equitable multilingual language modeling. We show that X-ELMs provide many performance and efficiency benefits over existing multilingual modeling approaches, indicating their potential to democratize multilingual NLP.