Abstract
We examine the impact of learning Lipschitz continuous models in the context of model-based reinforcement learning. We provide a novel bound on multi-step prediction error of Lipschitz models where we quantify the error using the Wasserstein metric. We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Lipschitz. We conclude with empirical results that show the benefits of controlling the Lipschitz constant of neural-network models.