Wasserstein Robust Reinforcement Learning

Abstract

Reinforcement learning algorithms, though successful, tend to over-fit to training environments, hampering their application to the real world. This paper proposes $\text{W}\text{R}^{2}\text{L}$, a robust reinforcement learning algorithm with significant robust performance on low- and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a Wasserstein constraint for a correct and convergent solver. Apart from the formulation, we also propose an efficient and scalable solver following a novel zero-order optimisation method that we believe can be useful to numerical optimisation in general. We empirically demonstrate significant gains compared to standard and robust state-of-the-art algorithms on high-dimensional MuJoCo environments.
