A Tour of Reinforcement Learning: The View from Continuous Control

Abstract

This manuscript surveys reinforcement learning from the perspective of optimization and control with a focus on continuous control applications. It surveys the general formulation, terminology, and typical experimental implementations of reinforcement learning and reviews competing solution paradigms. In order to compare the relative merits of various techniques, this survey presents a case study of the Linear Quadratic Regulator (LQR) with unknown dynamics, perhaps the simplest and best-studied problem in optimal control. The manuscript describes how merging techniques from learning theory and control can provide non-asymptotic characterizations of LQR performance and shows that these characterizations tend to match experimental behavior. In turn, when revisiting more complex applications, many of the observed phenomena in LQR persist. In particular, theory and experiment demonstrate the role and importance of models and the cost of generality in reinforcement learning algorithms. This survey concludes with a discussion of some of the challenges in designing learning systems that safely and reliably interact with complex and uncertain environments and how tools from reinforcement learning and control might be combined to approach these challenges.
