A Boolean Task Algebra for Reinforcement Learning

Abstract

The ability to compose learned skills to solve new tasks is an importantproperty of lifelong-learning agents. In this work, we formalise the logicalcomposition of tasks as a Boolean algebra. This allows us to formulate newtasks in terms of the negation, disjunction and conjunction of a set of basetasks. We then show that by learning goal-oriented value functions andrestricting the transition dynamics of the tasks, an agent can solve these newtasks with no further learning. We prove that by composing these valuefunctions in specific ways, we immediately recover the optimal policies for alltasks expressible under the Boolean algebra. We verify our approach in twodomains---including a high-dimensional video game environment requiringfunction approximation---where an agent first learns a set of base skills, andthen composes them to solve a super-exponential number of new tasks.

Quick Read (beta)

loading the full paper ...