Time Matters in Using Data Augmentation for Vision-based Deep Reinforcement Learning

Abstract

Data augmentation techniques from computer vision have been widely considered as a regularization method to improve data efficiency and generalization performance in vision-based reinforcement learning. We vary the timing of augmentation and find that it is critical, depending on the tasks to be solved in training and testing. According to our experiments on the OpenAI Procgen Benchmark, if the regularization imposed by an augmentation is helpful only in testing, it is better to postpone the augmentation until after training than to use it during training, in terms of sample and computation complexity. We note that some such augmentations can disturb the training process. Conversely, an augmentation that provides regularization useful in training needs to be used throughout the whole training period to fully realize its benefit, in terms of not only generalization but also data efficiency. These phenomena suggest that controlling the timing of data augmentation is useful in reinforcement learning.
