DIRECTOR: Generator-Classifiers For Supervised Language Modeling

  • 2022-06-15 18:44:08
  • Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston

Abstract

Current language models achieve low perplexity but their resulting generations still suffer from toxic responses, repetitiveness and contradictions. The standard language modeling setup fails to address these issues. In this paper, we introduce a new architecture, Director, that consists of a unified generator-classifier with both a language modeling and a classification head for each output token. Training is conducted jointly using both standard language modeling data, and data labeled with desirable and undesirable sequences. Experiments in several settings show that the model has competitive training and decoding speed compared to standard language models while yielding superior results, alleviating known issues while maintaining generation quality. It also outperforms existing model guiding approaches in terms of both accuracy and efficiency.
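The two-head design described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only says that each output token has a language modeling head and a classification head. The function names (`two_head_step`, `matvec`), the weight matrices `w_lm` and `w_cls`, the mixing exponent `gamma`, and in particular the rule used to combine the two heads (multiply and renormalize) are all assumptions made here for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(weights, hidden):
    # weights: one row per vocabulary token; hidden: decoder state for this step.
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]

def two_head_step(hidden, w_lm, w_cls, gamma=1.0):
    """Score next tokens with a generator head and a classifier head.

    ASSUMPTION: the heads are combined as p_lm * p_cls**gamma and then
    renormalized. The abstract does not specify the combination rule.
    """
    lm_probs = softmax(matvec(w_lm, hidden))                  # generator head
    cls_probs = [sigmoid(z) for z in matvec(w_cls, hidden)]   # per-token "desirable" score
    scores = [p * (c ** gamma) for p, c in zip(lm_probs, cls_probs)]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical 3-token vocabulary and 2-dimensional hidden state.
hidden = [1.0, 0.0]
w_lm = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]    # generator head is indifferent
w_cls = [[5.0, 0.0], [-5.0, 0.0], [0.0, 0.0]]  # classifier favors token 0, penalizes token 1
probs = two_head_step(hidden, w_lm, w_cls)
```

In this toy setup the generator head alone would pick uniformly, so any preference in `probs` comes from the classifier head. Setting `gamma=0` switches the classifier off and recovers the plain language model distribution.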

 
