An efficient framework for learning sentence representations

Abstract

In this work we propose a simple and efficient framework for learningsentence representations from unlabelled data. Drawing inspiration from thedistributional hypothesis and recent work on learning sentence representations,we reformulate the problem of predicting the context in which a sentenceappears as a classification problem. Given a sentence and its context, aclassifier distinguishes context sentences from other contrastive sentencesbased on their vector representations. This allows us to efficiently learndifferent types of encoding functions, and we show that the model learnshigh-quality sentence representations. We demonstrate that our sentencerepresentations outperform state-of-the-art unsupervised and supervisedrepresentation learning methods on several downstream NLP tasks that involveunderstanding sentence semantics while achieving an order of magnitude speedupin training time.

Quick Read (beta)

loading the full paper ...