A Survey of Meta-Reinforcement Learning

Abstract

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
