AdaVAE: Exploring Adaptive GPT-2s in Variational Auto-Encoders for Language Modeling

Abstract

The Variational Auto-Encoder (VAE) has become the de facto learning paradigm for achieving representation learning and generation for natural language at the same time. Nevertheless, existing VAE-based language models either employ elementary RNNs, which are not powerful enough to handle complex tasks in multi-task settings, or fine-tune two pre-trained language models (PLMs) for any downstream task, which is a huge drain on resources. In this paper, we propose the first VAE framework empowered with adaptive GPT-2s (AdaVAE). Different from existing systems, we unify both the encoder and decoder of the VAE model using GPT-2s with adaptive parameter-efficient components, and further introduce a Latent Attention operation to better construct the latent space from transformer models. Experiments from multiple dimensions validate that AdaVAE is competent to effectively organize language in three related tasks (language modeling, representation modeling and guided text generation) even with less than 15% of its parameters activated in training. Our code is available at https://github.com/ImKeTT/AdaVAE.
