Multi-Source Neural Machine Translation with Data Augmentation

Abstract

Multi-source translation systems translate from multiple languages to a single target language. By using information from these multiple sources, these systems achieve large gains in accuracy. To train these systems, it is necessary to have corpora with parallel text in multiple sources and the target language. However, these corpora are rarely complete in practice due to the difficulty of providing human translations in all of the relevant languages. In this paper, we propose a data augmentation approach to fill such incomplete parts using multi-source neural machine translation (NMT). In our experiments, results varied over different language combinations, but significant gains were observed when using a source language similar to the target language.
