A Survey of Zero-shot Generalisation in Deep Reinforcement Learning

Abstract

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We rely on a unifying formalism and terminology for discussing different ZSG problems, building upon previous works. We go on to categorise existing benchmarks for ZSG, as well as current methods for tackling these problems. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in ZSG, we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for ZSG, and we recommend building benchmarks in underexplored problem settings such as offline RL ZSG and reward-function variation.
