Model-based Reinforcement Learning: A Survey

  • 2021-02-25 15:09:38
  • Thomas M. Moerland, Joost Broekens, Catholijn M. Jonker

Abstract

Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is a key challenge in artificial intelligence. Two key approaches to this problem are reinforcement learning (RL) and planning. This paper presents a survey of the integration of both fields, better known as model-based reinforcement learning. Model-based RL has two main steps. First, we systematically cover approaches to dynamics model learning, including challenges like dealing with stochasticity, uncertainty, partial observability, and temporal abstraction. Second, we present a systematic categorization of planning-learning integration, including aspects like: where to start planning, what budgets to allocate to planning and real data collection, how to plan, and how to integrate planning in the learning and acting loop. After these two sections, we also discuss implicit model-based RL as an end-to-end alternative for model learning and planning, and we cover the potential benefits of model-based RL, like enhanced data efficiency, targeted exploration, and improved stability. The survey also draws connections to several related RL fields, like hierarchical RL and transfer. Altogether, the survey presents a broad conceptual overview of planning-learning combinations for MDP optimization.

 
