Causally Correct Partial Models for Reinforcement Learning

  • 2020-02-07 15:18:15
  • Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing

Abstract

In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this paper, we show that partial models can be causally incorrect: they are confounded by the observations they don't model, and can therefore lead to incorrect planning. To address this, we introduce a general family of partial models that are provably causally correct, yet remain fast because they do not need to fully model future observations.

 
