Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum

Abstract

Despite considerable advances in deep reinforcement learning, it has been shown to be highly vulnerable to adversarial perturbations to state observations. Recent efforts that have attempted to improve adversarial robustness of reinforcement learning can nevertheless tolerate only very small perturbations, and remain fragile as perturbation size increases. We propose Bootstrapped Opportunistic Adversarial Curriculum Learning (BCL), a novel flexible adversarial curriculum learning framework for robust reinforcement learning. Our framework combines two ideas: conservatively bootstrapping each curriculum phase with highest quality solutions obtained from multiple runs of the previous phase, and opportunistically skipping forward in the curriculum. In our experiments, we show that the proposed BCL framework enables dramatic improvements in robustness of learned policies to adversarial perturbations. The greatest improvement is for Pong, where our framework yields robustness to perturbations of up to 25/255; in contrast, the best existing approach can only tolerate adversarial noise up to 5/255. Our code is available at: https://github.com/jlwu002/BCL.
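The two ideas named in the abstract can be illustrated with a minimal sketch of a curriculum loop. This is not the authors' implementation (that is in the repository linked above). The function `bootstrapped_curriculum`, its parameters `train_phase`, `evaluate`, `runs_per_phase`, and `threshold`, and the toy policy model are all hypothetical names introduced here for illustration. The abstract does not state how solution quality is measured or when a phase may be skipped, so both rules below are assumptions.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

P = TypeVar("P")  # placeholder type for a policy


def bootstrapped_curriculum(
    init_policy: P,
    epsilons: Sequence[float],
    train_phase: Callable[[P, float, int], P],
    evaluate: Callable[[P, float], float],
    runs_per_phase: int = 3,
    threshold: float = 0.0,
) -> Tuple[P, List[float]]:
    """Hypothetical sketch of a bootstrapped, opportunistic curriculum.

    epsilons: increasing perturbation sizes, one per curriculum phase.
    train_phase(policy, eps, run): one training run at level eps,
        started from `policy` (assumed interface).
    evaluate(policy, eps): quality score at level eps (assumed interface).
    """
    policy = init_policy
    trained_levels: List[float] = []
    for eps in epsilons:
        # Opportunistic skipping (assumed rule): if the current policy is
        # already good enough at this level, move on without training.
        if evaluate(policy, eps) >= threshold:
            continue
        # Bootstrapping: launch several runs from the best policy of the
        # previous phase and keep the highest-scoring result.
        candidates = [train_phase(policy, eps, run) for run in range(runs_per_phase)]
        policy = max(candidates, key=lambda p: evaluate(p, eps))
        trained_levels.append(eps)
    return policy, trained_levels


# Toy stand-ins (assumptions): a "policy" is just the largest perturbation
# it tolerates, and later runs happen to overshoot the target level more.
def toy_train(policy: float, eps: float, run: int) -> float:
    return eps + run


def toy_evaluate(policy: float, eps: float) -> float:
    return policy - eps  # non-negative margin means "robust at eps"


final_policy, trained = bootstrapped_curriculum(
    0.0, [1.0, 2.0, 3.0, 4.0, 5.0], toy_train, toy_evaluate
)
print(final_policy, trained)
```

In this toy setting the loop trains only at the levels where the bootstrapped policy falls short and skips the rest, which is the behavior the abstract describes at a high level. A real setup would replace the toy functions with adversarial training runs and robustness evaluation.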
