UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning

Abstract

Offline reinforcement learning (RL) provides a framework for learning decision-making from offline data and therefore constitutes a promising approach for real-world applications such as automated driving. Self-driving vehicles (SDVs) learn a policy that can potentially even outperform the behavior in the sub-optimal data set. Especially in safety-critical applications such as automated driving, explainability and transferability are key to success. This motivates the use of model-based offline RL approaches, which leverage planning. However, current state-of-the-art methods often neglect the influence of aleatoric uncertainty arising from the stochastic behavior of multi-agent systems. This work proposes a novel approach for Uncertainty-aware Model-Based Offline REinforcement Learning Leveraging plAnning (UMBRELLA), which solves the prediction, planning, and control problem of the SDV jointly in an interpretable learning-based fashion. A trained action-conditioned stochastic dynamics model captures distinctively different future evolutions of the traffic scene. The analysis provides empirical evidence for the effectiveness of our approach in challenging automated driving simulations and on a real-world public dataset.
