DisCoRL: Continual Reinforcement Learning via Policy Distillation

Abstract

In multi-task reinforcement learning there are two main challenges: at training time, the ability to learn different policies with a single model; at test time, inferring which of those policies to apply without an external signal. In the case of continual reinforcement learning a third challenge arises: learning tasks sequentially without forgetting the previous ones. In this paper, we tackle these challenges by proposing DisCoRL, an approach combining state representation learning and policy distillation. We experiment on a sequence of three simulated 2D navigation tasks with a 3-wheel omni-directional robot. Moreover, we tested our approach's robustness by transferring the final policy into a real-life setting. The policy can solve all tasks and automatically infer which one to run.
