Abstract
Reinforcement learning provides a general framework for flexible decision making and control, but requires extensive data collection for each new task that an agent needs to learn. In other machine learning fields, such as natural language processing or computer vision, pre-training on large, previously collected datasets to bootstrap learning for new tasks has emerged as a powerful paradigm to reduce data requirements when learning a new task. In this paper, we ask the following question: how can we enable similarly useful pre-training for RL agents? We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials from a wide range of previously seen tasks, and we show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors. We demonstrate the effectiveness of our approach in challenging robotic manipulation domains involving image observations and sparse reward functions, where our method outperforms prior works by a substantial margin.