Abstract
In this work, we analyze the performance of general deep reinforcement learning algorithms on a task-oriented language grounding problem in which the language input contains multiple sub-goals and their order of execution is non-linear. We generate a simple instructional language for the GridWorld environment, built around three language elements (order connectors) that define the order of execution: one linear ("comma") and two non-linear ("but first" and "but before"). We apply one of the deep reinforcement learning baselines, Double DQN with frame stacking, and ablate several extensions such as Prioritized Experience Replay and the Gated-Attention architecture. Our results show that the introduction of non-linear order connectors improves the success rate on instructions with a higher number of sub-goals by a factor of 2-3, but it still does not exceed 20%. We also observe that Gated-Attention provides no competitive advantage over concatenation in this setting. Source code and experimental results are available at https://github.com/vkurenkov/language-grounding-multigoal
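For readers unfamiliar with the baseline, the following is a minimal sketch of the standard Double DQN bootstrap target for a single transition. It is not taken from the linked repository: the function name `double_dqn_target`, its parameters, and the default discount factor are assumptions made for illustration, and the Q-values are passed in as plain lists rather than computed by networks.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,  # assumed discount factor, not a value from the paper
) -> float:
    """Return the Double DQN bootstrap target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        # Terminal transitions do not bootstrap from the next state.
        return reward
    # The online network selects the greedy next action.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # The target network evaluates the selected action.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for a state with three actions.
    print(double_dqn_target(1.0, False, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5))
```

In this hypothetical example the online network prefers action 1, so the target bootstraps from the target network's value for action 1 rather than from the target network's own maximum. Decoupling selection from evaluation in this way is what distinguishes Double DQN from vanilla DQN.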