Distributionally Robust Model-based Reinforcement Learning with Large State Spaces

Abstract

Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics at deployment from the training environment. To overcome these issues, we study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets. We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics, leveraging access to a generative model (i.e., simulator). We further demonstrate the statistical sample complexity of the proposed method for different uncertainty sets. These complexity bounds are independent of the number of states and extend beyond linear dynamics, ensuring the effectiveness of our approach in identifying near-optimal distributionally robust policies. The proposed method can be further combined with other model-free distributionally robust reinforcement learning methods to obtain a near-optimal robust policy. Experimental results demonstrate the robustness of our algorithm to distributional shifts and its superior performance in terms of the number of samples needed.
