An Analysis of Categorical Distributional Reinforcement Learning

  • 2018-02-22 16:50:08
  • Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh


Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramér distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.
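
To make the projected distributional Bellman operator concrete, the following is a minimal sketch of a single categorical update for one sampled transition, assuming an evenly spaced support and the clip-and-split projection heuristic used by C51 [Bellemare et al., 2017]. The function names, the five-atom support, and the reward and discount values are illustrative assumptions and are not taken from the paper.

```python
import math


def project_onto_support(atoms, probs, support):
    """Project a discrete distribution onto a fixed, evenly spaced support.

    Atoms outside [support[0], support[-1]] are clipped to the nearest
    endpoint. Each remaining atom's mass is split between its two
    neighbouring support points in proportion to proximity.
    """
    v_min, v_max = support[0], support[-1]
    delta = (v_max - v_min) / (len(support) - 1)
    out = [0.0] * len(support)
    for z, p in zip(atoms, probs):
        z = min(max(z, v_min), v_max)       # clip to the support range
        b = (z - v_min) / delta             # fractional index on the grid
        lo = int(math.floor(b))
        hi = min(lo + 1, len(support) - 1)
        w_hi = b - lo                       # share given to the upper neighbour
        out[lo] += p * (1.0 - w_hi)
        out[hi] += p * w_hi
    return out


def categorical_bellman_target(support, next_probs, reward, gamma):
    """Shift and scale the next-state atoms by (reward, gamma), then project."""
    shifted = [reward + gamma * z for z in support]
    return project_onto_support(shifted, next_probs, support)


# Hypothetical example: all next-state mass sits on the atom at 2.0.
support = [0.0, 1.0, 2.0, 3.0, 4.0]
next_probs = [0.0, 0.0, 1.0, 0.0, 0.0]
target = categorical_bellman_target(support, next_probs, reward=0.5, gamma=0.5)
print(target)  # [0.0, 0.5, 0.5, 0.0, 0.0]
```

In this example the shifted atom lands at 0.5 + 0.5 * 2.0 = 1.5, midway between the support points 1.0 and 2.0, so its mass is divided equally between them. The projected distribution keeps the same mean (1.5) because no clipping occurred.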

