Near-Optimal Representation Learning for Hierarchical Reinforcement Learning

Abstract

We study the problem of representation learning in goal-conditioned hierarchical reinforcement learning. In such hierarchical structures, a higher-level controller solves tasks by iteratively communicating goals which a lower-level policy is trained to reach. Accordingly, the choice of representation -- the mapping of observation space to goal space -- is crucial. To study this problem, we develop a notion of sub-optimality of a representation, defined in terms of expected reward of the optimal hierarchical policy using this representation. We derive expressions which bound the sub-optimality and show how these expressions can be translated to representation learning objectives which may be optimized in practice. Results on a number of difficult continuous-control tasks show that our approach to representation learning yields qualitatively better representations as well as quantitatively better hierarchical policies, compared to existing methods (see videos at https://sites.google.com/view/representation-hrl).
