Optimal Transport Perturbations for Safe Reinforcement Learning with Robustness Guarantees

Abstract

Robustness and safety are critical for the trustworthy deployment of deep reinforcement learning. Real-world decision making applications require algorithms that can guarantee robust performance and safety in the presence of general environment disturbances, while making limited assumptions on the data collection process during training. In order to accomplish this goal, we introduce a safe reinforcement learning framework that incorporates robustness through the use of an optimal transport cost uncertainty set. We provide an efficient implementation based on applying Optimal Transport Perturbations to construct worst-case virtual state transitions, which does not impact data collection during training and does not require detailed simulator access. In experiments on continuous control tasks with safety constraints, our approach demonstrates robust performance while significantly improving safety at deployment time compared to standard safe reinforcement learning.
