Adversarial Robust Deep Reinforcement Learning Requires Redefining Robustness

Abstract

Learning from raw high-dimensional data via interaction with a given environment has been effectively achieved through the utilization of deep neural networks. Yet the observed degradation in policy performance caused by imperceptible worst-case policy-dependent translations along high-sensitivity directions (i.e. adversarial perturbations) raises concerns about the robustness of deep reinforcement learning policies. In our paper, we show that these high-sensitivity directions do not lie only along particular worst-case directions, but rather are more abundant in the deep neural policy landscape and can be found via more natural means in a black-box setting. Furthermore, we show that vanilla training techniques intriguingly result in learning more robust policies compared to the policies learnt via the state-of-the-art adversarial training techniques. We believe our work lays out intriguing properties of the deep reinforcement learning policy manifold and our results can help to build robust and generalizable deep reinforcement learning policies.
