Abstract
Deep reinforcement learning (RL) agents often fail to generalize to unseenenvironments (yet semantically similar to trained agents), particularly whenthey are trained on high-dimensional state spaces, such as images. In thispaper, we propose a simple technique to improve a generalization ability ofdeep RL agents by introducing a randomized (convolutional) neural network thatrandomly perturbs input observations. It enables trained agents to adapt to newdomains by learning robust features invariant across varied and randomizedenvironments. Furthermore, we consider an inference method based on the MonteCarlo approximation to reduce the variance induced by this randomization. Wedemonstrate the superiority of our method across 2D CoinRun, 3D DeepMind Labexploration and 3D robotics control tasks: it significantly outperforms variousregularization and data augmentation methods for the same purpose.