Multilingual Neural Machine Translation for Zero-Resource Languages

  • 2019-09-16 17:22:25
  • Surafel M. Lakew, Marcello Federico, Matteo Negri, Marco Turchi

Abstract

In recent years, Neural Machine Translation (NMT) has been shown to be more effective than phrase-based statistical methods, thus quickly becoming the state of the art in machine translation (MT). However, NMT systems are limited in translating low-resourced languages, due to the significant amount of parallel data that is required to learn useful mappings between languages. In this work, we show how the so-called multilingual NMT can help to tackle the challenges associated with low-resourced language translation. The underlying principle of multilingual NMT is to force the creation of hidden representations of words in a shared semantic space across multiple languages, thus enabling a positive parameter transfer across languages. Along this direction, we present multilingual translation experiments with three languages (English, Italian, Romanian) covering six translation directions, utilizing both recurrent neural networks and transformer (or self-attentive) neural networks. We then focus on the zero-shot translation problem, that is, how to leverage multilingual data in order to learn translation directions that are not covered by the available training material. To this aim, we introduce our recently proposed iterative self-training method, which incrementally improves a multilingual NMT on a zero-shot direction by just relying on monolingual data. Our results on TED talks data show that multilingual NMT outperforms conventional bilingual NMT, that the transformer NMT outperforms recurrent NMT, and that zero-shot NMT outperforms conventional pivoting methods and even matches the performance of a fully trained bilingual system.
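
The iterative self-training idea mentioned in the abstract can be pictured roughly as a loop that alternates inference and training. The sketch below is only an illustration under stated assumptions: the abstract does not give the procedure's details, so the function name `iterative_self_training`, the `translate` and `train` callables, the number of rounds, and the way synthetic pairs are mixed with the original parallel data are all hypothetical and not taken from the paper.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)
Translator = Callable[[str], str]


def iterative_self_training(
    translate: Translator,
    train: Callable[[List[Pair]], Translator],
    parallel_data: List[Pair],
    monolingual_src: List[str],
    rounds: int = 3,
) -> Translator:
    """Hypothetical infer-then-train loop for a zero-shot direction.

    `translate` stands in for the current multilingual model applied to the
    zero-shot direction; `train` stands in for a training step that returns
    an updated model. Both are assumptions made for illustration.
    """
    for _ in range(rounds):
        # Infer: translate monolingual sentences with the current model.
        synthetic = [(s, translate(s)) for s in monolingual_src]
        # Train: update the model on original plus synthetic sentence pairs.
        translate = train(parallel_data + synthetic)
    return translate
```

In a real system, `train` would fine-tune the multilingual NMT model and `translate` would run beam-search decoding. Only monolingual sentences are needed for the zero-shot direction, which is the property the abstract emphasizes.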

 
