Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary

Abstract

We propose a method to transfer knowledge across neural machine translation (NMT) models by means of a shared dynamic vocabulary. Our approach makes it possible to extend an initial model for a given language pair to cover new languages by adapting its vocabulary as new data become available (i.e., introducing new vocabulary items if they are not included in the initial model). The parameter transfer mechanism is evaluated in two scenarios: i) adapting a trained single-language-pair NMT system to work with a new language pair, and ii) continuously adding new language pairs to grow it into a multilingual NMT system. In both scenarios our goal is to improve translation performance while minimizing training convergence time. Preliminary experiments spanning five languages with different training data sizes (i.e., 5k and 50k parallel sentences) show a significant performance gain ranging from +3.85 up to +13.63 BLEU in different language directions. Moreover, when compared with training an NMT model from scratch, our transfer-learning approach allows us to reach higher performance after training for up to 4% of the total training steps.
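
The vocabulary adaptation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `extend_vocabulary`, the list-of-lists embedding table, and the uniform random initialisation of new rows are all assumptions made for illustration, since the abstract does not specify these details. The sketch only shows the general idea of keeping the parameters of known vocabulary items and adding entries for unseen ones.

```python
import random


def extend_vocabulary(vocab, embeddings, new_tokens, dim, seed=0):
    """Hypothetical helper: grow a vocabulary and its embedding table.

    vocab:      dict mapping token -> row index in `embeddings`
    embeddings: list of rows, each a list of `dim` floats
    new_tokens: tokens observed in the newly available data

    Rows of tokens already in the vocabulary are copied unchanged
    (the transferred parameters); tokens not yet covered receive a
    new, randomly initialised row (an assumed initialisation).
    """
    rng = random.Random(seed)
    new_vocab = dict(vocab)
    new_embeddings = [row[:] for row in embeddings]  # copy, do not mutate input
    for token in new_tokens:
        if token not in new_vocab:
            new_vocab[token] = len(new_embeddings)
            new_embeddings.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return new_vocab, new_embeddings


# Toy example: an initial model's vocabulary meets data from a new language.
old_vocab = {"<unk>": 0, "the": 1, "house": 2}
old_emb = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4]]

new_vocab, new_emb = extend_vocabulary(
    old_vocab, old_emb, ["the", "casa", "la"], dim=2
)
```

In this toy run, "the" keeps its original row, while "casa" and "la" are appended as new entries; training would then continue from the extended table rather than from scratch.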
