A Bilingual Generative Transformer for Semantic Sentence Embedding

  • 2020-11-19 17:21:10
  • John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick


Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Our proposed approach differs from past work on semantic sentence encoding in two ways. First, by using a variational probabilistic framework, we introduce priors that encourage source separation, and can use our model's posterior to predict sentence embeddings for monolingual data at test time. Second, we use high-capacity transformers as both data generating distributions and inference networks -- contrasting with most past work on sentence embeddings. In experiments, our approach substantially outperforms the state-of-the-art on a standard suite of unsupervised semantic similarity evaluations. Further, we demonstrate that our approach yields the largest gains on more difficult subsets of these evaluations where simple word overlap is not a good indicator of similarity.
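
One way to read the source-separation idea above is as a factorized joint distribution over a translation pair and three latent vectors. The sketch below is an illustration under stated assumptions, not a formula quoted from the abstract: the symbols x_a and x_b (the two sentences of a pair), z_sem (the shared semantic vector), z_a and z_b (the language-specific vectors), and the factorized form of the approximate posterior q are all notation introduced here.

```latex
% Assumed notation: x_a, x_b are the two sentences of a translation pair;
% z_sem is the shared semantic latent; z_a, z_b are language-specific latents.
% Assumed generative factorization: each sentence depends on the shared
% semantic vector and on its own language-specific vector only.
\begin{align}
p(x_a, x_b, z_{sem}, z_a, z_b)
  &= p(z_{sem})\, p(z_a)\, p(z_b)\,
     p(x_a \mid z_{sem}, z_a)\, p(x_b \mid z_{sem}, z_b)
\end{align}

% Assumed factorized approximate posterior (inference networks):
%   q(z_sem, z_a, z_b | x_a, x_b) = q(z_sem | x_a, x_b) q(z_a | x_a) q(z_b | x_b)
% Under these two assumptions, the standard variational lower bound is:
\begin{align}
\log p(x_a, x_b) \ge{}
  & \mathbb{E}_{q}\big[ \log p(x_a \mid z_{sem}, z_a)
      + \log p(x_b \mid z_{sem}, z_b) \big] \nonumber \\
  & - \mathrm{KL}\big( q(z_{sem} \mid x_a, x_b) \,\|\, p(z_{sem}) \big)
    - \mathrm{KL}\big( q(z_a \mid x_a) \,\|\, p(z_a) \big)
    - \mathrm{KL}\big( q(z_b \mid x_b) \,\|\, p(z_b) \big)
\end{align}
```

Under this reading, the reconstruction terms push information needed by both sentences into z_sem, while the KL terms are where the priors act on each latent vector. The abstract does not specify the form of the priors or of the posterior, so both are left generic here.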

