A Survey of Generalisation in Deep Reinforcement Learning

Abstract

The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real-world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We provide a unifying formalism and terminology for discussing different generalisation problems, building upon previous works. We go on to categorise existing benchmarks for generalisation, as well as current methods for tackling the generalisation problem. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in generalisation, we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for generalisation, and we recommend building benchmarks in underexplored problem settings such as offline RL generalisation and reward-function variation.
