Hierarchical Decision Making by Generating and Following Natural Language Instructions

Abstract

We explore using latent natural language instructions as an expressive andcompositional representation of complex actions for hierarchical decisionmaking. Rather than directly selecting micro-actions, our agent first generatesa latent plan in natural language, which is then executed by a separate model.We introduce a challenging real-time strategy game environment in which theactions of a large number of units must be coordinated across long time scales.We gather a dataset of 76 thousand pairs of instructions and executions fromhuman play, and train instructor and executor models. Experiments show thatmodels using natural language as a latent variable significantly outperformmodels that directly imitate human actions. The compositional structure oflanguage proves crucial to its effectiveness for action representation. We alsorelease our code, models and data.

Quick Read (beta)

loading the full paper ...