Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models

  • 2019-07-18 16:59:40
  • Gongbo Tang, Rico Sennrich, Joakim Nivre

Abstract

In this paper, we try to understand neural machine translation (NMT) via simplifying NMT architectures and training encoder-free NMT models. In an encoder-free model, the sums of word embeddings and positional embeddings represent the source. The decoder is a standard Transformer or recurrent neural network that directly attends to embeddings via attention mechanisms. Experimental results show (1) that the attention mechanism in encoder-free models acts as a strong feature extractor, (2) that the word embeddings in encoder-free models are competitive with those in conventional models, (3) that non-contextualized source representations lead to a large performance drop, and (4) that encoder-free models have different effects on alignment quality for German-English and Chinese-English.
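
To make the encoder-free setup concrete, the following is a minimal sketch, not the authors' implementation. It builds the source representation as word embedding plus positional embedding and lets a single decoder query attend to it directly. The sinusoidal positional embeddings, the random embedding table, the single attention head without learned projections, and all function names are assumptions made for illustration only.

```python
import math
import random


def sinusoidal_position(pos, dim):
    """Sinusoidal positional embedding for one position (assumed variant)."""
    vec = []
    for i in range(dim):
        angle = pos / (10000 ** (2 * (i // 2) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def encoder_free_source(token_ids, embedding_table):
    """Source representation without encoder layers: word + positional embedding."""
    dim = len(embedding_table[0])
    return [
        [w + p for w, p in zip(embedding_table[tok], sinusoidal_position(pos, dim))]
        for pos, tok in enumerate(token_ids)
    ]


def attend(query, source):
    """Scaled dot-product attention of one decoder query over the source vectors."""
    dim = len(query)
    scores = [sum(q * s for q, s in zip(query, vec)) / math.sqrt(dim) for vec in source]
    peak = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * vec[i] for w, vec in zip(weights, source)) for i in range(dim)]
    return weights, context


# Hypothetical toy setup: 10-word vocabulary, 8-dimensional embeddings.
rng = random.Random(0)
vocab_size, dim = 10, 8
embedding_table = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(vocab_size)]

source = encoder_free_source([3, 1, 4], embedding_table)  # three source tokens
query = [rng.gauss(0, 1) for _ in range(dim)]             # stand-in decoder state
weights, context = attend(query, source)
```

Each source vector here depends only on its own token and position, which is the non-contextualized representation the abstract refers to. Any mixing of information across source positions happens in the attention step.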

 
