Target Conditioning for One-to-Many Generation

Abstract

Neural Machine Translation (NMT) models often lack diversity in their generated translations, even when paired with a search algorithm such as beam search. A challenge is that the diversity in translations is caused by the variability in the target language and cannot be inferred from the source sentence alone. In this paper, we propose to explicitly model this one-to-many mapping by conditioning the decoder of an NMT model on a latent variable that represents the domain of target sentences. The domain is a discrete variable generated by a target encoder that is jointly trained with the NMT model. The predicted domain of each target sentence is given as input to the decoder during training. At inference, we can generate diverse translations by decoding with different domains. Unlike our strongest baseline (Shen et al., 2019), our method can scale to any number of domains without affecting the performance or the training time. We assess the quality and diversity of translations generated by our model with several metrics, on three different datasets.
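
The data flow described in the abstract can be illustrated with a small toy sketch. It is not the paper's implementation: the number of domains, the function names (`assign_domain`, `training_example`, `decode_all_domains`, `toy_decode`), and the character-count bucketing that stands in for the learned target encoder are all assumptions made for illustration. The sketch only shows the two phases: at training time the domain is computed from the target sentence and handed to the decoder, and at inference the same source is decoded once per domain.

```python
from typing import Callable, Dict, List

NUM_DOMAINS = 3  # hypothetical value; the abstract does not fix a number


def assign_domain(target: List[str], num_domains: int = NUM_DOMAINS) -> int:
    """Toy stand-in for the learned target encoder: sentence -> discrete domain id."""
    # Assumption: bucket by total character count. A real target encoder is learned
    # jointly with the translation model.
    return sum(len(tok) for tok in target) % num_domains


def training_example(source: List[str], target: List[str]) -> Dict[str, object]:
    """During training, the domain predicted from the target is given to the decoder."""
    return {"source": source, "domain": assign_domain(target), "target": target}


def decode_all_domains(
    source: List[str],
    decode: Callable[[List[str], int], List[str]],
    num_domains: int = NUM_DOMAINS,
) -> List[List[str]]:
    """At inference, decode the same source once per domain id."""
    return [decode(source, d) for d in range(num_domains)]


def toy_decode(source: List[str], domain: int) -> List[str]:
    """Hypothetical decoder: the domain id selects one of several valid renderings."""
    greetings = ["hello", "hi", "good day"]
    return [greetings[domain % len(greetings)]] + source[1:]


example = training_example(["bonjour", "monde"], ["hello", "world"])
hypotheses = decode_all_domains(["bonjour", "monde"], toy_decode)
```

Because the domain is an explicit input to the decoder, the number of hypotheses produced at inference equals the number of domain ids tried, independently of the search algorithm used inside each decoding pass.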
