Abstract
We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors from scratch in the presence of multiple sparse reward signals. To this end, the agent is equipped with a set of general auxiliary tasks that it attempts to learn simultaneously via off-policy RL. The key idea behind our method is that active (learned) scheduling and execution of auxiliary policies allows the agent to explore its environment efficiently, enabling it to excel at sparse-reward RL. Our experiments in several challenging robotic manipulation settings demonstrate the power of our approach.
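To make the scheduling idea concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a scheduler that tracks, for each auxiliary task, a running mean of the main-task return observed after executing that task's policy, and that samples the next task from a softmax over those means. The class `TaskScheduler`, the task names, the temperature, and the toy success rates are all hypothetical and chosen only for illustration.

```python
import math
import random


class TaskScheduler:
    """Hypothetical softmax scheduler over auxiliary tasks (illustrative only)."""

    def __init__(self, tasks, temperature=1.0, seed=0):
        self.tasks = list(tasks)
        self.temperature = temperature
        # Running mean of the main-task return seen after executing each task.
        self.value = {t: 0.0 for t in self.tasks}
        self.count = {t: 0 for t in self.tasks}
        self.rng = random.Random(seed)

    def probabilities(self):
        # Softmax over the running means; subtract the max for numerical stability.
        m = max(self.value.values())
        weights = [math.exp((self.value[t] - m) / self.temperature) for t in self.tasks]
        total = sum(weights)
        return [w / total for w in weights]

    def choose(self):
        # Sample the next task to execute according to the softmax probabilities.
        return self.rng.choices(self.tasks, weights=self.probabilities(), k=1)[0]

    def update(self, task, main_return):
        # Incremental running-mean update for the task that was just executed.
        self.count[task] += 1
        self.value[task] += (main_return - self.value[task]) / self.count[task]


def run_demo(episodes=200):
    # Hypothetical task names and success rates: a toy stand-in for an environment
    # in which some auxiliary behaviors reach the sparse main reward more often.
    success_rate = {"touch": 0.05, "reach": 0.2, "lift": 0.6}
    sched = TaskScheduler(success_rate.keys(), temperature=0.1)
    for _ in range(episodes):
        task = sched.choose()
        reward = 1.0 if sched.rng.random() < success_rate[task] else 0.0
        sched.update(task, reward)
    return sched
```

Calling `run_demo()` returns the scheduler after the toy loop; inspecting `sched.probabilities()` shows how the sampling distribution has shifted from uniform. In a full agent, each chosen task would correspond to executing a separately learned policy, and all collected experience would be shared for off-policy learning across tasks, as the abstract describes.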