Unsupervised Neural Machine Translation with Generative Language Models Only

  • 2021-10-11 17:35:34
  • Jesse Michael Han, Igor Babuschkin, Harrison Edwards, Arvind Neelakantan, Tao Xu, Stanislas Polu, Alex Ray, Pranav Shyam, Aditya Ramesh, Alec Radford, Ilya Sutskever

Abstract

We show how to derive state-of-the-art unsupervised neural machine translation systems from generatively pre-trained language models. Our method consists of three steps: few-shot amplification, distillation, and backtranslation. We first use the zero-shot translation ability of large pre-trained language models to generate translations for a small set of unlabeled sentences. We then amplify these zero-shot translations by using them as few-shot demonstrations for sampling a larger synthetic dataset. This dataset is distilled by discarding the few-shot demonstrations and then fine-tuning. During backtranslation, we repeatedly generate translations for a set of inputs and then fine-tune a single language model on both directions of the translation task at once, ensuring cycle-consistency by swapping the roles of gold monotext and generated translations when fine-tuning. By using our method to leverage GPT-3's zero-shot translation capability, we achieve a new state-of-the-art in unsupervised translation on the WMT14 English-French benchmark, attaining a BLEU score of 42.1.
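
The role swap described for the backtranslation step can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation. The names `build_backtranslation_examples`, `toy_translate`, and the direction tags `"en->fr"` / `"fr->en"` are hypothetical, and the language model is replaced by a stub so that the example runs on its own.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: (input text, direction tag) -> generated translation.
Translator = Callable[[str, str], str]

# (direction to train on, model input, training target)
Example = Tuple[str, str, str]


def build_backtranslation_examples(
    english: List[str],
    french: List[str],
    translate: Translator,
) -> List[Example]:
    """Build fine-tuning examples for both directions from monolingual text.

    Each gold sentence is translated by the current model, then the roles
    are swapped: the generated translation becomes the input and the gold
    sentence becomes the target.
    """
    examples: List[Example] = []
    for en in english:
        fr_generated = translate(en, "en->fr")
        # Train fr->en: generated French in, gold English out.
        examples.append(("fr->en", fr_generated, en))
    for fr in french:
        en_generated = translate(fr, "fr->en")
        # Train en->fr: generated English in, gold French out.
        examples.append(("en->fr", en_generated, fr))
    return examples


def toy_translate(text: str, direction: str) -> str:
    """Stand-in for a language model; it only tags the input (assumption)."""
    return f"[{direction}] {text}"


examples = build_backtranslation_examples(["hello"], ["bonjour"], toy_translate)
for direction, source, target in examples:
    print(direction, "|", source, "->", target)
```

In a full loop, a single model would be fine-tuned on the combined examples for both directions and then used as `translate` in the next round. The targets are always gold monotext, so the model is never trained to produce its own generated output.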

 
