PEAR: Primitive Enabled Adaptive Relabeling for Boosting Hierarchical Reinforcement Learning

Abstract

Hierarchical reinforcement learning (HRL) has the potential to solve complex long horizon tasks using temporal abstraction and increased exploration. However, hierarchical agents are difficult to train due to inherent non-stationarity. We present primitive enabled adaptive relabeling (PEAR), a two-phase approach where we first perform adaptive relabeling on a few expert demonstrations to generate efficient subgoal supervision, and then jointly optimize HRL agents by employing reinforcement learning (RL) and imitation learning (IL). We perform theoretical analysis to bound the sub-optimality of our approach and derive a joint optimization framework using RL and IL. Since PEAR utilizes only a few expert demonstrations and considers minimal limiting assumptions on the task structure, it can be easily integrated with typical off-policy RL algorithms to produce a practical HRL approach. We perform extensive experiments on challenging environments and show that PEAR is able to outperform various hierarchical and non-hierarchical baselines and achieve up to $80\%$ success rates in complex sparse robotic control tasks where other baselines typically fail to show significant progress. We also perform ablations to thoroughly analyse the importance of our various design choices. Finally, we perform real world robotic experiments on complex tasks and demonstrate that PEAR consistently outperforms the baselines.
