Certified Adversarial Robustness for Deep Reinforcement Learning

Abstract

Deep Neural Network-based systems are now the state of the art in many robotics tasks, but their application in safety-critical domains remains dangerous without formal guarantees on network robustness. Small perturbations to sensor inputs (from noise or adversarial examples) are often enough to change network-based decisions, which has already been shown to cause an autonomous vehicle to swerve into oncoming traffic. In light of these dangers, numerous algorithms have been developed as defensive mechanisms against these adversarial inputs, some of which provide formal robustness guarantees or certificates. This work leverages research on certified adversarial robustness to develop an online certified defense for deep reinforcement learning algorithms. The proposed defense computes guaranteed lower bounds on state-action values during execution to identify and choose the optimal action under a worst-case deviation in input space due to possible adversaries or noise. The approach is demonstrated on a Deep Q-Network policy and is shown to increase robustness to noise and adversaries in pedestrian collision avoidance scenarios and a classic control task.
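
The idea of choosing the action with the best guaranteed lower bound on its state-action value can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a small fully connected ReLU Q-network, an L-infinity perturbation of radius eps around the observed state, and plain interval bound propagation as a stand-in for whatever certification method is actually used. The function names, the toy weights, and the value of eps are all hypothetical.

```python
import numpy as np

def interval_linear(lower, upper, W, b):
    """Propagate an axis-aligned box [lower, upper] through y = W x + b."""
    center = (lower + upper) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius  # worst-case spread of each output
    return out_center - out_radius, out_center + out_radius

def q_values(state, layers):
    """Nominal Q-values of a ReLU network given as a list of (W, b) pairs."""
    x = state
    for i, (W, b) in enumerate(layers):
        x = W @ x + b
        if i < len(layers) - 1:      # ReLU on hidden layers only
            x = np.maximum(x, 0.0)
    return x

def q_lower_bounds(state, eps, layers):
    """Guaranteed lower bound on each Q-value for any input within eps
    (L-infinity) of `state`, via interval bound propagation (assumed here)."""
    lower, upper = state - eps, state + eps
    for i, (W, b) in enumerate(layers):
        lower, upper = interval_linear(lower, upper, W, b)
        if i < len(layers) - 1:      # ReLU is monotone, so bounds pass through
            lower, upper = np.maximum(lower, 0.0), np.maximum(upper, 0.0)
    return lower

def robust_action(state, eps, layers):
    """Pick the action whose worst-case Q-value is largest."""
    return int(np.argmax(q_lower_bounds(state, eps, layers)))

# Hypothetical toy network: 1-D state, 2 hidden units, 2 actions.
# Action 0 has a constant value; action 1 is sensitive to the state.
toy_layers = [
    (np.array([[1.0], [-1.0]]), np.array([0.0, 0.0])),
    (np.array([[0.0, 0.0], [5.0, 0.0]]), np.array([1.0, 0.0])),
]
toy_state = np.array([0.25])

if __name__ == "__main__":
    print("nominal Q:", q_values(toy_state, toy_layers))
    print("lower bounds (eps=0.1):", q_lower_bounds(toy_state, 0.1, toy_layers))
    print("greedy action:", int(np.argmax(q_values(toy_state, toy_layers))))
    print("robust action:", robust_action(toy_state, 0.1, toy_layers))
```

In this toy setup the greedy policy prefers the state-sensitive action, while the robust rule falls back to the action whose value cannot be degraded by a perturbation of the assumed size. Interval bounds are loose for deeper networks, so a tighter certification method would change the bounds but not the selection rule.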
