Abstract
Model-based reinforcement learning (MBRL) approaches rely on discrete-time state transition models, whereas physical systems and the vast majority of control tasks operate in continuous time. To avoid time-discretization approximation of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. Our approach also infers the unknown state evolution differentials with Bayesian neural ordinary differential equations (ODEs) to account for epistemic uncertainty. We implement and test our method on a new ODE-RL suite that explicitly solves continuous-time control systems. Our experiments illustrate that the model is robust against irregular and noisy data, is sample-efficient, and can solve control problems that pose challenges to discrete-time MBRL methods.
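To make the contrast between discrete-time transition models and continuous-time state differentials concrete, the following is a minimal sketch and not the method proposed here. It assumes a hand-written harmonic oscillator as a hypothetical stand-in for the learned differential (no Bayesian neural ODE, no actor-critic), a zero action, and a fixed-step RK4 integrator. All function names are illustrative. It compares a one-step Euler transition per observation interval against integrating the differential between irregularly spaced observation times.

```python
import math

def dynamics(state, action):
    """Hypothetical stand-in for a learned differential ds/dt = f(s, a).
    Here: a harmonic oscillator with an additive control force."""
    x, v = state
    return (v, -x + action)

def euler_step(f, state, action, h):
    """One explicit Euler step of size h."""
    return tuple(s + h * k for s, k in zip(state, f(state, action)))

def rk4_step(f, state, action, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state, action)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), action)
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), action)
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)), action)
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(f, state, action, t0, t1, step_fn, max_h):
    """Advance the state from t0 to t1 using substeps no larger than max_h."""
    n = max(1, math.ceil((t1 - t0) / max_h))
    h = (t1 - t0) / n
    for _ in range(n):
        state = step_fn(f, state, action, h)
    return state

def rollout(f, state, times, step_fn, max_h):
    """Predict the state at each (possibly irregular) observation time."""
    states = [state]
    for t0, t1 in zip(times[:-1], times[1:]):
        action = 0.0  # assumption: zero control, so the exact solution is known
        states.append(integrate(f, states[-1], action, t0, t1, step_fn, max_h))
    return states

# Irregularly spaced observation times (illustrative values).
times = [0.0, 0.13, 0.5, 0.62, 1.4, 2.0]
s0 = (1.0, 0.0)

# Discrete-time view: a single Euler transition per observation interval.
euler_states = rollout(dynamics, s0, times, euler_step, max_h=math.inf)
# Continuous-time view: integrate the differential finely between observations.
ode_states = rollout(dynamics, s0, times, rk4_step, max_h=0.01)

# Exact solution for zero action: x(t) = cos(t), v(t) = -sin(t).
exact = (math.cos(times[-1]), -math.sin(times[-1]))
err_euler = math.dist(euler_states[-1], exact)
err_ode = math.dist(ode_states[-1], exact)
print(f"final-state error, one-step Euler: {err_euler:.3e}")
print(f"final-state error, ODE solve:      {err_ode:.3e}")
```

In this toy setting the single-transition model accumulates error that depends on the spacing of the observations, whereas the integrated differential can be queried at arbitrary times. A learned model would replace `dynamics` with a trained network and would typically use an adaptive solver.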