Estimating Risk and Uncertainty in Deep Reinforcement Learning

Abstract

We demonstrate a method for separately estimating aleatoric risk and epistemic uncertainty in deep reinforcement learning. Aleatoric risk, which arises from inherently stochastic environments or agents, must be accounted for in the design of risk-sensitive algorithms. Epistemic uncertainty, which stems from limited data, is important both for risk-sensitivity and for efficient exploration of an environment. We present a Bayesian framework for learning the return distribution in reinforcement learning, which provides theoretical foundations for quantifying both types of uncertainty. The variance of the return distribution yields the aleatoric uncertainty, and our Bayesian formulation provides the epistemic uncertainty. Based on our framework, we show that the disagreement between only two neural networks is sufficient to produce an estimate of the epistemic uncertainty on the expected return, thus providing a simple and computationally cheap uncertainty metric. We present experiments that illustrate our method and some of its applications.
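
As a rough illustration of the two quantities described above, the following is a minimal sketch and not the paper's implementation. It assumes that two independently trained networks each output a set of equally weighted samples (for example, quantile estimates) of the return for a given state-action pair; the function name `uncertainty_estimates` and its inputs are hypothetical. The aleatoric term is taken as the variance of each predicted return distribution, averaged over the two networks, and the epistemic term as the unbiased sample variance of the two expected-return estimates, which for two samples reduces to half the squared difference of the means.

```python
import numpy as np


def uncertainty_estimates(returns_a, returns_b):
    """Sketch of aleatoric and epistemic estimates from two networks.

    returns_a, returns_b: equally weighted samples (e.g. quantile
    estimates) of the return distribution predicted by two networks.
    These inputs are an assumption made for illustration only.
    """
    ra = np.asarray(returns_a, dtype=float)
    rb = np.asarray(returns_b, dtype=float)

    # Expected return according to each network.
    mean_a, mean_b = ra.mean(), rb.mean()

    # Aleatoric: variance of the return distribution itself,
    # averaged over the two networks' predictions.
    aleatoric = 0.5 * (ra.var() + rb.var())

    # Epistemic: disagreement between the two expected returns,
    # i.e. the unbiased sample variance of {mean_a, mean_b}.
    epistemic = 0.5 * (mean_a - mean_b) ** 2

    return aleatoric, epistemic


# Toy inputs: both networks predict the same spread, shifted by one.
aleatoric, epistemic = uncertainty_estimates([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
print(aleatoric, epistemic)
```

In this toy case the two predicted distributions have the same spread, so the aleatoric term reflects that spread alone, while the epistemic term is driven entirely by the gap between the two mean predictions and would shrink to zero if the networks agreed.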
