Complete Multilingual Neural Machine Translation

  • 2020-10-20 13:03:48
  • Markus Freitag, Orhan Firat

Abstract

Multilingual Neural Machine Translation (MNMT) models are commonly trained on a joint set of bilingual corpora which is acutely English-centric (i.e. English is either the source or the target language). While direct data between two non-English languages is explicitly available at times, its use is not common. In this paper, we first take a step back and look at the commonly used bilingual corpora (WMT), and resurface the existence and importance of an implicit structure within them: multi-way alignment across examples (the same sentence in more than two languages). We set out to study the use of multi-way aligned examples to enrich the original English-centric parallel corpora. We reintroduce this direct parallel data from multi-way aligned corpora between all source and target languages. By doing so, the English-centric graph expands into a complete graph, with every language pair connected. We call MNMT with such a connectivity pattern complete Multilingual Neural Machine Translation (cMNMT) and demonstrate its utility and efficacy with a series of experiments and analyses. In combination with a novel training data sampling strategy that is conditioned on the target language only, cMNMT yields competitive translation quality for all language pairs. We further study the size effect of multi-way aligned data, its transfer learning capabilities and how it eases adding a new language in MNMT. Finally, we stress test cMNMT at scale and demonstrate that we can train a cMNMT model with up to 111*112=12,432 language pairs that provides competitive translation quality for all language pairs.
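To make the graph picture concrete, here is a minimal Python sketch of how direct non-English pairs might be recovered from English-centric bitexts, and of how the pair count grows when the graph becomes complete. It assumes that multi-way aligned examples are found by exact match on a shared English sentence; the abstract does not say how the alignment is obtained, so this matching rule, the function names, and the toy data layout are all illustrative assumptions rather than the authors' pipeline.

```python
from collections import defaultdict
from itertools import permutations


def num_directed_pairs(num_languages: int) -> int:
    """Directed language pairs in a complete graph over all languages."""
    return num_languages * (num_languages - 1)


def num_english_centric_pairs(num_languages: int) -> int:
    """Directed pairs when every pair has English as source or target."""
    return 2 * (num_languages - 1)


def multiway_pairs(bitexts):
    """Derive direct non-English pairs from English-centric bitexts.

    bitexts maps a non-English language code to a list of
    (english_sentence, foreign_sentence) tuples.
    Assumption: two examples are aligned if their English sides match exactly.
    """
    # Group every translation under the English sentence it was paired with.
    by_english = defaultdict(dict)
    for lang, pairs in bitexts.items():
        for english, foreign in pairs:
            by_english[english][lang] = foreign

    # Each English sentence seen in two or more languages yields direct
    # examples in both directions for every pair of those languages.
    direct = defaultdict(list)
    for translations in by_english.values():
        for src, tgt in permutations(sorted(translations), 2):
            direct[(src, tgt)].append((translations[src], translations[tgt]))
    return dict(direct)


# Hypothetical toy corpus: only "Hello." appears in both bitexts.
toy_bitexts = {
    "de": [("Hello.", "Hallo."), ("Thanks.", "Danke.")],
    "fr": [("Hello.", "Bonjour."), ("Good night.", "Bonne nuit.")],
}
toy_direct = multiway_pairs(toy_bitexts)
```

With 112 languages in total, `num_directed_pairs(112)` gives the 111*112=12,432 pairs quoted in the abstract, while an English-centric graph over the same languages covers only `num_english_centric_pairs(112)`, i.e. 222 directed pairs.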

 
