Automatic Data Augmentation for Generalization in Deep Reinforcement Learning

Abstract

Deep reinforcement learning (RL) agents often fail to generalize to unseen scenarios, even when they are trained on many instances of semantically similar environments. Data augmentation has recently been shown to improve the sample efficiency and generalization of RL agents. However, different tasks tend to benefit from different kinds of data augmentation. In this paper, we compare three approaches for automatically finding an appropriate augmentation. These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for certain actor-critic algorithms. We evaluate our methods on the Procgen benchmark, which consists of 16 procedurally generated environments, and show that they improve test performance by ~40% relative to standard RL algorithms. Our agent outperforms other baselines specifically designed to improve generalization in RL. In addition, we show that our agent learns policies and representations that are more robust to changes in the environment that do not affect the agent, such as the background. Our implementation is available at https://github.com/rraileanu/auto-drac.
