Backward Curriculum Reinforcement Learning

Abstract

Current reinforcement learning algorithms train an agent using forward-generated trajectories, which provide little guidance so that the agent can explore as much as possible. Although the value of reinforcement learning comes from sufficient exploration, this approach sacrifices sample efficiency, an essential factor in algorithm performance. Previous work uses reward-shaping techniques and network structure modification to increase sample efficiency. However, these methods require many steps to implement. In this work, we propose a novel backward curriculum reinforcement learning method that begins training the agent using the backward trajectory of the episode instead of the original forward trajectory. This approach provides the agent with a strong reward signal, enabling more sample-efficient learning. Moreover, our method requires only a minor change to the algorithm, reversing the order of the trajectory before agent training, which allows straightforward application to any state-of-the-art algorithm.
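The trajectory reversal described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Transition` tuple, the `reverse_trajectory` and `sweep` helpers, the three-step toy episode, and the tabular one-step value update are all assumptions chosen for illustration. The sketch shows only why processing a sparse-reward episode from its last step to its first can carry the terminal reward to earlier states within a single pass.

```python
from typing import Dict, List, NamedTuple


class Transition(NamedTuple):
    # Hypothetical container for one environment step (not from the paper).
    state: int
    action: int
    reward: float
    next_state: int
    done: bool


def reverse_trajectory(trajectory: List[Transition]) -> List[Transition]:
    """Return the episode's transitions ordered from the last step to the first."""
    return list(reversed(trajectory))


def sweep(transitions: List[Transition], gamma: float = 0.9,
          alpha: float = 1.0) -> Dict[int, float]:
    """Apply one pass of tabular one-step value updates in the given order.

    Assumed stand-in for "agent training"; any learner could take its place.
    """
    values: Dict[int, float] = {}
    for t in transitions:
        # Bootstrap from the next state's current estimate unless the episode ended.
        bootstrap = 0.0 if t.done else values.get(t.next_state, 0.0)
        target = t.reward + gamma * bootstrap
        old = values.get(t.state, 0.0)
        values[t.state] = old + alpha * (target - old)
    return values


# Toy episode on a 3-state chain with a single reward at the final step.
episode = [
    Transition(state=0, action=1, reward=0.0, next_state=1, done=False),
    Transition(state=1, action=1, reward=0.0, next_state=2, done=False),
    Transition(state=2, action=1, reward=1.0, next_state=3, done=True),
]

backward_episode = reverse_trajectory(episode)
forward_values = sweep(episode)            # only the last state gets a nonzero value
backward_values = sweep(backward_episode)  # every state gets a nonzero value
```

In this toy setting the forward pass leaves states 0 and 1 at zero after one sweep, while the backward pass assigns them discounted shares of the final reward. The only change between the two runs is the call to `reverse_trajectory` before the update loop.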
