Abstract
We present a method for combining multi-agent communication and traditional data-driven approaches to natural language learning, with an end goal of teaching agents to communicate with humans in natural language. Our starting point is a language model that has been trained on generic, not task-specific, language data. We then place this model in a multi-agent self-play environment that generates task-specific rewards used to adapt or modulate the model, turning it into a task-conditional language model. We introduce a new way of combining the two types of learning based on the idea of reranking language model samples, and show that this method outperforms others in communicating with humans in a visual referential communication task. Finally, we present a taxonomy of different types of language drift that can occur, alongside a set of measures to detect them.