Interactive Grounded Language Acquisition and Generalization in a 2D World

  • 2018-08-13 23:29:31
  • Haonan Yu, Haichao Zhang, Wei Xu

Abstract

We build a virtual agent for learning language in a 2D maze-like world. The agent sees images of the surrounding environment, listens to a virtual teacher, and takes actions to receive rewards. It interactively learns the teacher's language from scratch based on two language use cases: sentence-directed navigation and question answering. It learns simultaneously the visual representations of the world, the language, and the action control. By disentangling language grounding from other computational routines and sharing a concept detection function between language grounding and prediction, the agent reliably interpolates and extrapolates to interpret sentences that contain new word combinations or new words missing from training sentences. The new words are transferred from the answers of language prediction. Such a language ability is trained and evaluated on a population of over 1.6 million distinct sentences consisting of 119 object words, 8 color words, 9 spatial-relation words, and 50 grammatical words. The proposed model significantly outperforms five comparison methods for interpreting zero-shot sentences. In addition, we demonstrate human-interpretable intermediate outputs of the model in the appendix.

 
