Can Unconditional Language Models Recover Arbitrary Sentences?

Abstract

Neural network-based generative language models like ELMo and BERT can work effectively as general purpose sentence encoders in text classification without further fine-tuning. Is it possible to adapt them in a similar way for use as general-purpose decoders? For this to be possible, it would need to be the case that for any target sentence of interest, there is some continuous representation that can be passed to the language model to cause it to reproduce that sentence. We set aside the difficult problem of designing an encoder that can produce such representations and instead ask directly whether such representations exist at all. To do this, we introduce a pair of effective complementary methods for feeding representations into pretrained unconditional language models and a corresponding set of methods to map sentences into and out of this representation space, the reparametrized sentence space. We then investigate the conditions under which a language model can be made to generate a sentence through the identification of a point in such a space and find that it is possible to recover arbitrary sentences nearly perfectly with language models and representations of moderate size.
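To make the question concrete, the following is a minimal toy sketch of what "identifying a point" for a target sentence could look like. It is not the paper's implementation. Everything in it is an assumption made for illustration: the tiny random model `FrozenToyLM` stands in for a pretrained language model, the vector `z` is simply added to the hidden pre-activation, and `fit_z` searches for `z` by gradient descent with backtracking on the sentence's negative log-likelihood while the model weights stay frozen.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class FrozenToyLM:
    """Hypothetical stand-in for a pretrained LM: random weights, never updated."""

    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        self.vocab_size = vocab_size
        self.dim = dim
        self.E = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]
        self.U = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]

    def step(self, prev_token, z):
        # Assumption: z is injected by adding it to the hidden pre-activation.
        h = [math.tanh(e + zi) for e, zi in zip(self.E[prev_token], z)]
        logits = [sum(u * hi for u, hi in zip(row, h)) for row in self.U]
        return h, softmax(logits)


def nll_and_grad(lm, z, sentence, bos=0):
    """Negative log-likelihood of `sentence` given z, and its gradient w.r.t. z."""
    nll, grad, prev = 0.0, [0.0] * lm.dim, bos
    for tok in sentence:
        h, p = lm.step(prev, z)
        nll -= math.log(p[tok])
        # Gradient of cross-entropy w.r.t. logits is softmax minus one-hot target.
        dlogits = [pk - (1.0 if k == tok else 0.0) for k, pk in enumerate(p)]
        for j in range(lm.dim):
            dh = sum(dlogits[k] * lm.U[k][j] for k in range(lm.vocab_size))
            grad[j] += dh * (1.0 - h[j] ** 2)  # derivative of tanh
        prev = tok  # teacher forcing: condition on the true previous token
    return nll, grad


def fit_z(lm, sentence, steps=100, lr=0.5):
    """Search for z only; the model weights in `lm` are left untouched."""
    z = [0.0] * lm.dim
    initial_nll, grad = nll_and_grad(lm, z, sentence)
    nll = initial_nll
    for _ in range(steps):
        step = lr
        for _ in range(30):  # backtracking: halve the step until the loss drops
            cand = [zi - step * gi for zi, gi in zip(z, grad)]
            cand_nll, cand_grad = nll_and_grad(lm, cand, sentence)
            if cand_nll < nll:
                z, nll, grad = cand, cand_nll, cand_grad
                break
            step *= 0.5
    return z, initial_nll, nll


def greedy_decode(lm, z, length, bos=0):
    """Map z back to a token sequence by taking the most likely token each step."""
    out, prev = [], bos
    for _ in range(length):
        _, p = lm.step(prev, z)
        prev = max(range(lm.vocab_size), key=lambda k: p[k])
        out.append(prev)
    return out


if __name__ == "__main__":
    lm = FrozenToyLM(vocab_size=6, dim=8, seed=0)
    target = [3, 1, 4, 2, 5]  # token ids of a made-up target sentence
    z, before, after = fit_z(lm, target)
    print("NLL before/after:", before, after)
    print("exact recovery:", greedy_decode(lm, z, len(target)) == target)
```

Whether the toy model recovers the target exactly depends on its random weights and on the size of `z`, which loosely mirrors the abstract's question of how recoverability depends on model and representation size. The sketch only guarantees that the search lowers the sentence's negative log-likelihood.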
