When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?

Abstract

The performance of Neural Machine Translation (NMT) systems often suffers in low-resource scenarios where sufficiently large-scale parallel corpora cannot be obtained. Pre-trained word embeddings have proven to be invaluable for improving performance in natural language analysis tasks, which often suffer from paucity of data. However, their utility for NMT has not been extensively explored. In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. We show that such embeddings can be surprisingly effective in some cases -- providing gains of up to 20 BLEU points in the most favorable setting.
