Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents

  • 2018-09-04 18:16:40
  • Haonan Yu, Xiaochen Lian, Haichao Zhang, Wei Xu

Abstract

Recently there has been a rising interest in training agents, embodied in virtual environments, to perform language-directed tasks by deep reinforcement learning. In this paper, we propose a simple but effective neural language grounding module for embodied agents that can be trained end to end from scratch taking raw pixels, unstructured linguistic commands, and sparse rewards as the inputs. We model the language grounding process as a language-guided transformation of visual features, where latent sentence embeddings are used as the transformation matrices. In several language-directed navigation tasks that feature challenging partial observability and require simple reasoning, our module significantly outperforms the state of the art. We also release XWorld3D, an easy-to-customize 3D environment that can potentially be modified to evaluate a variety of embodied agents.
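To make the phrase "latent sentence embeddings are used as the transformation matrices" concrete, here is a minimal sketch of one way such a language-guided transformation could look. It is an illustration under stated assumptions, not the authors' implementation: the function name `guided_transform`, the projection matrix `proj`, the bias column, the ReLU, and all shapes are hypothetical choices that the abstract does not specify.

```python
import random


def guided_transform(features, sentence_emb, proj):
    """Apply a sentence-conditioned linear map at every spatial location.

    features:     D x N list of lists (D channels over N spatial locations)
    sentence_emb: length-E list, the latent sentence embedding
    proj:         (D * (D + 1)) x E list of lists, a hypothetical learned
                  projection from the embedding to a flat transformation
    """
    d = len(features)
    n = len(features[0])
    # Project the sentence embedding to D * (D + 1) numbers.
    flat = [sum(w * s for w, s in zip(row, sentence_emb)) for row in proj]
    # Reshape row-major into a D x (D + 1) matrix; the last column is a bias.
    t = [flat[i * (d + 1):(i + 1) * (d + 1)] for i in range(d)]
    # Append a row of ones so the bias is applied by the same product.
    augmented = features + [[1.0] * n]
    # Transform each location, then apply a ReLU (assumed nonlinearity).
    return [
        [max(0.0, sum(t[i][k] * augmented[k][j] for k in range(d + 1)))
         for j in range(n)]
        for i in range(d)
    ]


# Toy usage with random values; sizes are arbitrary.
rng = random.Random(0)
D, N, E = 4, 9, 6
features = [[rng.gauss(0, 1) for _ in range(N)] for _ in range(D)]
sentence_emb = [rng.gauss(0, 1) for _ in range(E)]
proj = [[rng.gauss(0, 1) for _ in range(E)] for _ in range(D * (D + 1))]
out = guided_transform(features, sentence_emb, proj)
```

The point of the sketch is that the command does not merely get concatenated with the visual features: it determines the linear map applied to them, so different sentences reshape the same image features differently.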

 
