Putting Machine Translation in Context with the Noisy Channel Model

  • 2019-10-01 17:30:56
  • Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

Abstract

We show that Bayes' rule provides a compelling mechanism for controlling unconditional document language models, using the long-standing challenge of effectively leveraging document context in machine translation. In our formulation, we estimate the probability of a candidate translation as the product of the unconditional probability of the candidate output document and the "reverse translation probability" of translating the candidate output back into the input source language document (the so-called "noisy channel" decomposition). A particular advantage of our model is that it requires only parallel sentences to train, rather than parallel documents, which are not always available. Using a new beam search reranking approximation to solve the decoding problem, we find that document language models outperform language models that assume independence between sentences, and that using either a document or sentence language model outperforms comparable models that directly estimate the translation probability. We obtain the best published results on the NIST Chinese--English translation task, a standard task for evaluating document translation. Our model also outperforms the benchmark Transformer model by approximately 2.5 BLEU on the WMT19 Chinese--English translation task.
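
As a rough illustration of the decomposition described above, a candidate translation y of a source x can be scored by log p(y) + log p(x | y) and the highest-scoring candidate kept. The sketch below is a toy, not the paper's implementation: the function names, the candidate strings, and all probabilities are made up for illustration, and the actual system may combine further terms and weights that the abstract does not specify.

```python
import math

def noisy_channel_score(lm_logprob, channel_logprob):
    # log p(y) + log p(x | y): language model prior plus "reverse translation" likelihood
    return lm_logprob + channel_logprob

def rerank(candidates):
    # candidates: list of (translation, log p(y), log p(x | y)) tuples,
    # e.g. produced by a beam search; returns the highest-scoring tuple
    return max(candidates, key=lambda c: noisy_channel_score(c[1], c[2]))

# Hypothetical candidates with invented probabilities
candidates = [
    ("candidate A", math.log(0.02), math.log(0.30)),  # fluent-ish, faithful
    ("candidate B", math.log(0.05), math.log(0.20)),  # best product of the two
    ("candidate C", math.log(0.10), math.log(0.01)),  # very fluent, unfaithful
]

best = rerank(candidates)
print(best[0])
```

Candidate C has the highest language model probability but is penalized by the channel term, which is the sense in which the reverse translation probability keeps an unconditional language model tied to the source.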

 
