Lipschitz Continuity in Model-based Reinforcement Learning

Abstract

Model-based reinforcement-learning methods learn transition and reward models and use them to guide behavior. We analyze the impact of learning models that are Lipschitz continuous: the distance between function values for two inputs is bounded by a linear function of the distance between the inputs. Our first result shows a tight bound on model errors for multi-step predictions with Lipschitz continuous models. We go on to prove an error bound for the value-function estimate arising from such models and show that the estimated value function is itself Lipschitz continuous. We conclude with empirical results that demonstrate significant benefits to enforcing Lipschitz continuity of neural net models during reinforcement learning.
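
To make the definition concrete: a function f is K-Lipschitz if |f(x) - f(y)| <= K * |x - y| for every pair of inputs x and y. The sketch below is only an illustration of that inequality and is not taken from the paper. It estimates a lower bound on K for a scalar function by sampling random input pairs; the helper name `empirical_lipschitz` and its parameters are hypothetical.

```python
import math
import random


def empirical_lipschitz(f, low, high, n_pairs=10_000, seed=0):
    """Return the largest observed ratio |f(x) - f(y)| / |x - y|.

    This is a sampled lower bound on the true Lipschitz constant of f
    on [low, high], not a proof of Lipschitz continuity.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_pairs):
        x = rng.uniform(low, high)
        y = rng.uniform(low, high)
        if x == y:
            continue  # the ratio is undefined for identical inputs
        best = max(best, abs(f(x) - f(y)) / abs(x - y))
    return best


# sin has derivative bounded by 1, so its Lipschitz constant is 1.
k_sin = empirical_lipschitz(math.sin, -3.0, 3.0)

# Scaling the function by 5 scales its Lipschitz constant by 5.
k_scaled = empirical_lipschitz(lambda x: 5.0 * math.sin(x), -3.0, 3.0)

print(f"sin: {k_sin:.3f}, 5*sin: {k_scaled:.3f}")
```

The sampled estimate can only approach the true constant from below, so it is a diagnostic and not a guarantee. Enforcing a bound on a learned model, as the abstract describes, requires constraining the model itself.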
