Learning Language Specific Sub-network for Multilingual Machine Translation

Abstract

Multilingual neural machine translation aims to learn a single translation model for multiple languages. These jointly trained models often suffer from performance degradation on rich-resource language pairs. We attribute this degeneration to parameter interference. In this paper, we propose LaSS to jointly train a single unified multilingual MT model. LaSS learns a Language Specific Sub-network (LaSS) for each language pair to counter parameter interference. Comprehensive experiments on IWSLT and WMT datasets with various Transformer architectures show that LaSS obtains gains on 36 language pairs by up to 1.2 BLEU. In addition, LaSS generalizes well, extending easily to new language pairs and to zero-shot translation. LaSS boosts zero-shot translation by an average of 8.3 BLEU on 30 language pairs. Code and trained models are available at https://github.com/NLP-Playground/LaSS.
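To make the idea of a per-pair sub-network concrete, the following is a minimal sketch and not the method as implemented in the paper. The abstract does not say how a sub-network is represented, so the sketch assumes one binary mask per language pair over a shared parameter vector. The names `shared`, `masks`, `sub_network`, and `masked_update`, and all numeric values, are hypothetical.

```python
# Hypothetical illustration: shared weights plus one binary mask per language pair.
# ASSUMPTION: a sub-network is modeled as a 0/1 mask; the abstract does not state this.
shared = [0.5, -1.0, 2.0, 0.25]

# Assumed masks (1 = the parameter belongs to that pair's sub-network).
masks = {
    "en-de": [1, 1, 0, 0],
    "en-fr": [0, 1, 1, 0],
}


def sub_network(weights, mask):
    """Return the weights visible to one language pair (others zeroed)."""
    return [w * m for w, m in zip(weights, mask)]


def masked_update(weights, grads, mask, lr=0.1):
    """Apply a gradient step only to parameters inside the mask."""
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]


# A made-up gradient from an en-de batch touches only en-de parameters.
grads = [1.0, 1.0, 1.0, 1.0]
updated = masked_update(shared, grads, masks["en-de"])

# The parameter at index 2 is used only by en-fr and is left unchanged,
# which is how masking of this kind would limit interference between pairs.
print(updated[2] == shared[2])
```

Under this assumption, parameters shared by both masks (index 1 here) are still trained jointly, while parameters exclusive to one pair are shielded from updates driven by the other.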
