Unsupervised Neural Machine Translation with Generative Language Models Only

Abstract

We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models. Our method consists of three steps: few-shot amplification, distillation, and backtranslation. We first use the zero-shot translation ability of large pre-trained language models to generate translations for a small set of unlabeled sentences. We then amplify these zero-shot translations by using them as few-shot demonstrations for sampling a larger synthetic dataset. This dataset is distilled by discarding the few-shot demonstrations and then fine-tuning. During backtranslation, we repeatedly generate translations for a set of inputs and then fine-tune a single language model on both directions of the translation task at once, ensuring cycle-consistency by swapping the roles of gold monotext and generated translations when fine-tuning. By using our method to leverage GPT-3's zero-shot translation capability, we achieve a new state-of-the-art in unsupervised translation on the WMT14 English-French benchmark, attaining a BLEU score of 42.1.
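The role swap in the backtranslation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Translator` callable, the function names, the prompt layout, and the tuple format for training examples are all hypothetical, chosen only to show how generated translations become inputs while gold monotext becomes the target, and how demonstrations might be laid out in a few-shot prompt.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: (sentence, source_lang, target_lang) -> translation.
# In practice this would wrap sampling from a language model.
Translator = Callable[[str, str, str], str]


def few_shot_prompt(demos: List[Tuple[str, str]], sentence: str,
                    src: str, tgt: str) -> str:
    """Lay out (source, translation) demonstrations followed by a new input.

    The prompt layout is an assumption made for illustration only.
    """
    blocks = [f"{src}: {s}\n{tgt}: {t}" for s, t in demos]
    blocks.append(f"{src}: {sentence}\n{tgt}:")
    return "\n\n".join(blocks)


def backtranslation_pairs(monotext: List[str], translate: Translator,
                          lang: str, other: str) -> List[Tuple[str, str]]:
    """Translate gold monotext from `lang` into `other`, then swap roles.

    Each returned pair is (generated translation, gold sentence), i.e. a
    training example for the other -> lang direction whose target is gold.
    """
    return [(translate(gold, lang, other), gold) for gold in monotext]


def backtranslation_round(mono_a: List[str], mono_b: List[str],
                          translate: Translator,
                          lang_a: str = "English",
                          lang_b: str = "French"
                          ) -> List[Tuple[str, str, str, str]]:
    """Collect examples for both directions so one model trains on both.

    Each example is (source_lang, target_lang, input_text, target_text).
    """
    examples = []
    for generated, gold in backtranslation_pairs(mono_a, translate, lang_a, lang_b):
        examples.append((lang_b, lang_a, generated, gold))
    for generated, gold in backtranslation_pairs(mono_b, translate, lang_b, lang_a):
        examples.append((lang_a, lang_b, generated, gold))
    return examples


def toy_translate(sentence: str, src: str, tgt: str) -> str:
    """Stand-in for a model: tags the sentence with the target language."""
    return f"[{tgt}] {sentence}"


round_examples = backtranslation_round(["Hello."], ["Bonjour."], toy_translate)
```

In this sketch, the combined list returned by `backtranslation_round` would be passed to a fine-tuning routine, and the round repeated with the updated model standing in for `translate`.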
