Distributional Reinforcement Learning with Dual Expectile-Quantile Regression

Abstract

Successful applications of distributional reinforcement learning with quantile regression prompt a natural question: can we use other statistics to represent the distribution of returns? In particular, expectile regression is known to be more efficient than quantile regression for approximating distributions, especially on extreme values, and by providing a straightforward estimator of the mean it is a natural candidate for reinforcement learning. Prior work has answered this question positively in the case of expectiles, with the major caveat that expensive computations must be performed to ensure convergence. In this work, we propose a dual expectile-quantile approach which solves the shortcomings of previous work while leveraging the complementary properties of expectiles and quantiles. Our method outperforms both quantile-based and expectile-based baselines on the MuJoCo continuous control benchmark.
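
To make the two statistics concrete, the following is a minimal sketch of the standard quantile (pinball) loss and the standard expectile (asymmetric squared) loss. It is not the dual method proposed in the paper. The function names, the grid search, and the synthetic exponential "returns" are assumptions introduced purely for illustration. The sketch shows the property the abstract relies on: at level 0.5, the expectile coincides with the mean, whereas the quantile gives the median.

```python
import numpy as np


def quantile_loss(theta, samples, tau):
    """Pinball loss: asymmetric absolute error, minimized by the tau-quantile."""
    u = samples - theta
    return np.mean(np.where(u >= 0, tau, 1.0 - tau) * np.abs(u))


def expectile_loss(theta, samples, tau):
    """Asymmetric squared error, minimized by the tau-expectile."""
    u = samples - theta
    return np.mean(np.where(u >= 0, tau, 1.0 - tau) * u ** 2)


def argmin_on_grid(loss, samples, tau, grid):
    """Brute-force minimizer over candidate values (hypothetical helper, illustration only)."""
    values = [loss(theta, samples, tau) for theta in grid]
    return grid[int(np.argmin(values))]


# Synthetic, skewed "returns" (an assumption; not data from the paper).
rng = np.random.default_rng(0)
returns = rng.exponential(scale=1.0, size=5_000)
grid = np.linspace(0.0, 5.0, 1001)

median_hat = argmin_on_grid(quantile_loss, returns, 0.5, grid)
mean_hat = argmin_on_grid(expectile_loss, returns, 0.5, grid)

print("0.5-quantile estimate:", median_hat, "sample median:", np.median(returns))
print("0.5-expectile estimate:", mean_hat, "sample mean:", returns.mean())
```

On a skewed distribution such as this one, the two estimates differ noticeably, which is one way to see that quantiles and expectiles carry complementary information about the same return distribution.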
