Kickstarting Deep Reinforcement Learning

Abstract

We present a method for using previously-trained 'teacher' agents to kickstart the training of a new 'student' agent. To this end, we leverage ideas from policy distillation and population based training. Our method places no constraints on the architecture of the teacher or student agents, and it regulates itself to allow the students to surpass their teachers in performance. We show that, on a challenging and computationally-intensive multi-task benchmark (DMLab-30), kickstarted training improves the data efficiency of new agents, making it significantly easier to iterate on their design. We also show that the same kickstarting pipeline can allow a single student agent to leverage multiple 'expert' teachers which specialize on individual tasks. In this setting kickstarting yields surprisingly large gains, with the kickstarted agent matching the performance of an agent trained from scratch in almost 10x fewer steps, and surpassing its final performance by 42 percent. Kickstarting is conceptually simple and can easily be incorporated into reinforcement learning experiments.
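
The abstract does not spell out the training objective, so the following is only a minimal sketch of how a distillation term might be combined with a student's usual reinforcement learning loss. It assumes the distillation term is a cross-entropy between the teacher's and the student's action distributions, scaled by a weight that can shrink over time so the student is free to surpass its teacher. The names `cross_entropy`, `kickstarted_loss`, and `distill_weight` are hypothetical, and how the weight is actually adapted is not stated here.

```python
import math

def cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """H(teacher || student) = -sum_a teacher(a) * log student(a)."""
    return -sum(t * math.log(max(s, eps))
                for t, s in zip(teacher_probs, student_probs))

def kickstarted_loss(rl_loss, teacher_probs, student_probs, distill_weight):
    """Student RL loss plus a weighted distillation term (assumed form)."""
    return rl_loss + distill_weight * cross_entropy(teacher_probs, student_probs)

# Toy action distributions over three actions (illustrative values only).
teacher = [0.7, 0.2, 0.1]
student = [0.4, 0.4, 0.2]

# As the weight goes to zero, only the student's own RL loss remains.
for weight in (1.0, 0.5, 0.0):
    print(weight, kickstarted_loss(1.0, teacher, student, weight))
```

With the weight at zero the teacher has no influence on the objective, which is one simple way a scheme like this could let the student outgrow its teacher.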
