PEAR: Primitive enabled Adaptive Relabeling for boosting Hierarchical Reinforcement Learning

Abstract

Hierarchical reinforcement learning (HRL) has the potential to solve complex long horizon tasks using temporal abstraction and increased exploration. However, hierarchical agents are difficult to train due to inherent non-stationarity. We present primitive enabled adaptive relabeling (PEAR), a two-phase approach where we first perform adaptive relabeling on a few expert demonstrations to generate efficient subgoal supervision, and then jointly optimize HRL agents by employing reinforcement learning (RL) and imitation learning (IL). We perform theoretical analysis to $(i)$ bound the sub-optimality of our approach, and $(ii)$ derive a generalized plug-and-play framework for joint optimization using RL and IL. PEAR uses a handful of expert demonstrations and makes minimal limiting assumptions on the task structure. Additionally, it can be easily integrated with typical model free RL algorithms to produce a practical HRL algorithm. We perform experiments on challenging robotic environments and show that PEAR is able to solve tasks that require long term decision making. We empirically show that PEAR exhibits improved performance and sample efficiency over previous hierarchical and non-hierarchical approaches. We also perform real world robotic experiments on complex tasks and demonstrate that PEAR consistently outperforms the baselines.
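
To make the two-phase structure concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract does not state the relabeling criterion, so the `reachability` score, the `threshold`, and the `il_weight` coefficient are all hypothetical stand-ins introduced here: phase one walks an expert demonstration and picks, for each starting state, the furthest later state still judged reachable, and phase two combines an RL loss with an IL loss as a weighted sum.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]


def adaptive_relabel(
    demo: Sequence[State],
    reachability: Callable[[State, State], float],  # hypothetical scoring function
    threshold: float,                                # hypothetical cutoff
) -> List[Tuple[State, State]]:
    """Return (state, subgoal) pairs selected along one expert demonstration."""
    pairs: List[Tuple[State, State]] = []
    start = 0
    last = len(demo) - 1
    while start < last:
        # Always advance at least one step so the loop terminates.
        goal = start + 1
        # Push the subgoal further while the next candidate still scores as reachable.
        while goal < last and reachability(demo[start], demo[goal + 1]) >= threshold:
            goal += 1
        pairs.append((demo[start], demo[goal]))
        start = goal
    return pairs


def joint_loss(rl_loss: float, il_loss: float, il_weight: float) -> float:
    """Assumed form of joint optimization: RL loss plus a weighted IL term."""
    return rl_loss + il_weight * il_loss


# Toy 1-D demonstration; the score decays linearly with distance (assumption).
demo = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
score = lambda s, g: 1.0 - abs(g[0] - s[0]) / 4.0
subgoal_pairs = adaptive_relabel(demo, score, threshold=0.5)
```

In this toy setup a state counts as reachable when it lies at most two steps away, so the demonstration is split into two subgoal segments. In a real agent the score would plausibly come from the current lower-level policy, which is what would make the relabeling adaptive as training progresses.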
