Physics-Informed Model-Based Reinforcement Learning

Abstract

We apply reinforcement learning (RL) to robotics tasks. One of the drawbacks of traditional RL algorithms has been their poor sample efficiency. One approach to improve the sample efficiency is model-based RL. In our model-based RL algorithm, we learn a model of the environment, essentially its transition dynamics and reward function, use it to generate imaginary trajectories and backpropagate through them to update the policy, exploiting the differentiability of the model. Intuitively, learning more accurate models should lead to better model-based RL performance. Recently, there has been growing interest in developing better deep neural network based dynamics models for physical systems, by utilizing the structure of the underlying physics. We focus on robotic systems undergoing rigid body motion without contacts. We compare two versions of our model-based RL algorithm, one which uses a standard deep neural network based dynamics model and the other which uses a much more accurate, physics-informed neural network based dynamics model. We show that, in model-based RL, model accuracy mainly matters in environments that are sensitive to initial conditions, where numerical errors accumulate fast. In these environments, the physics-informed version of our algorithm achieves significantly better average-return and sample efficiency. In environments that are not sensitive to initial conditions, both versions of our algorithm achieve similar average-return, while the physics-informed version achieves better sample efficiency. We also show that, in challenging environments, physics-informed model-based RL achieves better average-return than state-of-the-art model-free RL algorithms such as Soft Actor-Critic, as it computes the policy-gradient analytically, while the latter estimates it through sampling.
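
The claim that model accuracy matters most in environments sensitive to initial conditions can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the logistic map stands in for a sensitive system, a simple contraction stands in for an insensitive one, and the "learned model" is just the true map scaled by a small relative error `eps`. The helper names (`rollout`, `with_model_error`, `max_gap`) are hypothetical.

```python
def rollout(step, x0, horizon):
    """Iterate a one-step map from x0 and return all visited states."""
    states = [x0]
    for _ in range(horizon):
        states.append(step(states[-1]))
    return states


# Toy stand-ins for the true dynamics (assumptions, not the paper's robots).
def chaotic_true(x):
    # Logistic map at r = 4: sensitive to initial conditions.
    return 4.0 * x * (1.0 - x)


def stable_true(x):
    # Contraction toward the fixed point 0.5: not sensitive to initial conditions.
    return 0.5 * x + 0.25


def with_model_error(step, eps):
    """Hypothetical imperfect model: the true map with a small relative error."""
    return lambda x: (1.0 - eps) * step(x)


def max_gap(step, x0=0.3, eps=1e-6, horizon=60):
    """Largest gap between a true rollout and an imagined (model) rollout."""
    true_traj = rollout(step, x0, horizon)
    model_traj = rollout(with_model_error(step, eps), x0, horizon)
    return max(abs(a - b) for a, b in zip(true_traj, model_traj))


if __name__ == "__main__":
    print("stable system, max rollout error: ", max_gap(stable_true))
    print("chaotic system, max rollout error:", max_gap(chaotic_true))
```

With the same one-step model error, the imagined trajectory of the contraction stays close to the true one over the whole horizon, while the imagined trajectory of the logistic map drifts away from the true one within a few dozen steps. This is only meant to convey the intuition behind the accuracy argument, under the stated toy assumptions.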
