DisCoRL: Continual Reinforcement Learning via Policy Distillation

  • 2019-07-11 09:12:42
  • René Traoré, Hugo Caselles-Dupré, Timothée Lesort, Te Sun, Guanghang Cai, Natalia Díaz-Rodríguez, David Filliat

Abstract

In multi-task reinforcement learning there are two main challenges: at training time, the ability to learn different policies with a single model; at test time, inferring which of those policies to apply without an external signal. In the case of continual reinforcement learning a third challenge arises: learning tasks sequentially without forgetting the previous ones. In this paper, we tackle these challenges by proposing DisCoRL, an approach combining state representation learning and policy distillation. We experiment on a sequence of three simulated 2D navigation tasks with a 3-wheel omni-directional robot. Moreover, we tested our approach's robustness by transferring the final policy into a real-life setting. The policy can solve all tasks and automatically infer which one to run.

 
