Exploiting Transformer in Reinforcement Learning for Interpretable Temporal Logic Motion Planning

Abstract

Automaton-based approaches have enabled robots to perform various complex tasks. However, most existing automaton-based algorithms rely heavily on manually customized state representations for the task at hand, which limits their applicability in deep reinforcement learning. To address this issue, we incorporate the Transformer into reinforcement learning and develop a Double-Transformer-guided Temporal Logic framework (T2TL) that exploits the structural features of the Transformer twice: first, the LTL instruction is encoded by a Transformer module so that task instructions are understood efficiently during training; second, the context variable is encoded by a Transformer again to improve task performance. In particular, the LTL instruction is specified in co-safe LTL. LTL progression, a semantics-preserving rewriting operation, is exploited to decompose the complex task into learnable sub-goals, which not only converts the non-Markovian reward decision process into a Markovian one but also improves sampling efficiency through simultaneous learning of multiple sub-tasks. An environment-agnostic LTL pre-training scheme is further incorporated to facilitate learning of the Transformer module, resulting in an improved representation of LTL. Simulation and experimental results demonstrate the effectiveness of the T2TL framework.
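
To make the role of LTL progression concrete, the following is a minimal sketch of the standard progression rules for a small co-safe fragment. It is not the authors' implementation. The tuple encoding of formulas, the function name `progress`, and the helpers `_and` and `_or` are assumptions introduced here for illustration only. After each environment step the formula is rewritten into what remains to be satisfied, so the pair (environment state, remaining formula) can serve as a Markovian state, and a reward can be issued when the formula progresses to `True`.

```python
# Illustrative sketch only: formulas are nested tuples (hypothetical encoding).
#   atom: "a" | ("not", "a") | ("and", f, g) | ("or", f, g)
#   ("next", f) | ("eventually", f) | ("until", f, g) | True | False

def _and(a, b):
    # Conjunction with constant folding.
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return a if a == b else ("and", a, b)

def _or(a, b):
    # Disjunction with constant folding.
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return a if a == b else ("or", a, b)

def progress(formula, true_props):
    """Rewrite `formula` given the set of propositions true at this step."""
    if isinstance(formula, bool):
        return formula
    if isinstance(formula, str):          # atomic proposition
        return formula in true_props
    op = formula[0]
    if op == "not":                       # negation of an atom only (co-safe)
        return formula[1] not in true_props
    if op == "and":
        return _and(progress(formula[1], true_props),
                    progress(formula[2], true_props))
    if op == "or":
        return _or(progress(formula[1], true_props),
                   progress(formula[2], true_props))
    if op == "next":
        return formula[1]
    if op == "eventually":                # F f  ==  f or X F f
        return _or(progress(formula[1], true_props), formula)
    if op == "until":                     # f U g  ==  g or (f and X (f U g))
        return _or(progress(formula[2], true_props),
                   _and(progress(formula[1], true_props), formula))
    raise ValueError(f"unknown operator: {op!r}")

# Task: eventually reach a, and after that eventually reach b.
task = ("eventually", ("and", "a", ("eventually", "b")))
remaining = task
for labels in [set(), {"a"}, set(), {"b"}]:
    remaining = progress(remaining, labels)
    reward = 1.0 if remaining is True else 0.0   # hypothetical reward rule
```

In this sketch, each distinct intermediate formula produced by `progress` corresponds to a sub-goal; how T2TL encodes those formulas with a Transformer is not shown here.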
