Abstract
Recent work on using natural language to specify commands to robots has grounded that language to linear temporal logic (LTL). However, mapping natural language task specifications to LTL task specifications using language models requires probability distributions over a finite vocabulary. Existing state-of-the-art methods have extended this finite vocabulary to include unseen terms from the input sequence to improve output generalization. However, these methods cannot generate novel out-of-vocabulary atomic propositions. To overcome this, we introduce an intermediate contextual query representation which can be learned from single positive task specification examples, associating a contextual query with an LTL template. We demonstrate that this intermediate representation allows for generalization over unseen object references, assuming accurate groundings are available. We compare our method of mapping natural language task specifications to intermediate contextual queries against state-of-the-art CopyNet models capable of translating natural language to LTL, by evaluating whether correct LTL for manipulation and navigation task specifications can be output, and show that our method outperforms the CopyNet model on unseen object references. We demonstrate that the grounded LTL our method outputs can be used for planning in a simulated OO-MDP environment. Finally, we discuss some common failure modes encountered when translating natural language task specifications to grounded LTL.