Driving Reinforcement Learning with Models

Abstract

Over the years, Reinforcement Learning (RL) has established itself as a convenient paradigm for learning optimal policies from data. However, most RL algorithms reach optimal policies by exploring all possible actions, which in real-world scenarios is often infeasible or impractical due to, e.g., safety constraints. Motivated by this, in this paper we propose to augment RL with Model Predictive Control (MPC), a popular model-based control algorithm that optimally controls a system while satisfying a set of constraints. The result is the MPC-augmented RL algorithm (MPCaRL), which uses MPC both to drive how RL explores the actions and to modify the corresponding rewards. We demonstrate the effectiveness of MPCaRL by letting it play the Atari game Pong. The results highlight the ability of the algorithm to learn general tasks with essentially no training.
