Adapting Multilingual Neural Machine Translation to Unseen Languages

Abstract

Multilingual Neural Machine Translation (MNMT) for low-resource languages (LRL) can be enhanced by the presence of related high-resource languages (HRL), but the relatedness of HRL usually relies on predefined linguistic assumptions about language similarity. Recently, adapting MNMT to an LRL has been shown to greatly improve performance. In this work, we explore the problem of adapting an MNMT model to an unseen LRL using data selection and model adaptation. In order to improve NMT for LRL, we employ perplexity to select HRL data that are most similar to the LRL on the basis of language distance. We extensively explore data selection in popular multilingual NMT settings, namely in (zero-shot) translation, and in adaptation from a multilingual pre-trained model, for both directions (LRL-en). We further show that dynamic adaptation of the model's vocabulary results in a more favourable segmentation for the LRL in comparison with direct adaptation. Experiments show reductions in training time and significant performance gains over LRL baselines, even with zero LRL data (+13.0 BLEU), up to +17.0 BLEU for pre-trained multilingual model dynamic adaptation with related data selection. Our method outperforms current approaches, such as massively multilingual models and data augmentation, on four LRL.
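
The perplexity-based selection described above can be illustrated with a minimal sketch. The abstract does not specify the language model, so the character-bigram model with add-one smoothing used here is an assumption chosen for brevity, and the function names (`train_char_bigram_lm`, `perplexity`, `select_related`) are hypothetical. The idea shown is only the general one: fit a language model on a small LRL sample, score each HRL sentence by its perplexity under that model, and keep the lowest-scoring sentences as the "related" data.

```python
import math
from collections import Counter


def train_char_bigram_lm(sentences):
    """Count character bigrams over an LRL sample (assumed model, not the paper's)."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for s in sentences:
        chars = ["<s>"] + list(s) + ["</s>"]
        vocab.update(chars)
        for a, b in zip(chars, chars[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    # +1 reserves one smoothing slot for characters never seen in the LRL sample.
    return bigrams, contexts, len(vocab) + 1


def perplexity(sentence, lm):
    """Per-character perplexity of a sentence under the add-one smoothed bigram model."""
    bigrams, contexts, vocab_size = lm
    chars = ["<s>"] + list(sentence) + ["</s>"]
    log_prob = 0.0
    for a, b in zip(chars, chars[1:]):
        p = (bigrams[(a, b)] + 1) / (contexts[a] + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(chars) - 1))


def select_related(hrl_sentences, lm, k):
    """Keep the k HRL sentences with the lowest perplexity under the LRL model."""
    return sorted(hrl_sentences, key=lambda s: perplexity(s, lm))[:k]


# Toy data: strings standing in for LRL and HRL sentences.
lrl_sample = ["abab", "baba", "abba"]
hrl_pool = ["abab", "xyzq", "baab"]

lm = train_char_bigram_lm(lrl_sample)
selected = select_related(hrl_pool, lm, k=2)
```

In this toy run, the HRL candidate whose characters never occur in the LRL sample receives the highest perplexity and is dropped. A realistic setup would likely use a stronger language model and subword units, and would select parallel sentence pairs rather than isolated strings.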
