Neural Ordinary Differential Equations

Abstract

We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
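
As a rough illustration of the central idea, the hidden state can be read as the solution of dh/dt = f(h(t), t, θ), where f is a small network and the model output is h evaluated at the end time. The sketch below is an assumption-laden toy, not the authors' implementation: the names `dynamics`, `rk4_step`, and `odeint_fixed` are hypothetical, f is a single tanh layer, and a fixed-step Runge-Kutta (RK4) integrator stands in for the adaptive black-box solver the abstract refers to. It covers the forward pass only and does not implement the backpropagation method mentioned above.

```python
import math


def dynamics(h, t, params):
    """Hypothetical one-layer network f(h, t, theta) = tanh(W h + b).

    Returns dh/dt. This toy f ignores t; a real model may depend on it.
    """
    W, b = params
    return [
        math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i)
        for row, b_i in zip(W, b)
    ]


def rk4_step(f, h, t, dt, params):
    """Advance the state h by one classical fourth-order Runge-Kutta step."""
    k1 = f(h, t, params)
    k2 = f([hi + 0.5 * dt * ki for hi, ki in zip(h, k1)], t + 0.5 * dt, params)
    k3 = f([hi + 0.5 * dt * ki for hi, ki in zip(h, k2)], t + 0.5 * dt, params)
    k4 = f([hi + dt * ki for hi, ki in zip(h, k3)], t + dt, params)
    return [
        hi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for hi, a, b, c, d in zip(h, k1, k2, k3, k4)
    ]


def odeint_fixed(f, h0, t0, t1, params, steps=20):
    """Integrate dh/dt = f(h, t, params) from t0 to t1 with a fixed step count.

    `steps` is the precision/speed knob in this sketch: more steps cost more
    evaluations of f and give a more accurate h(t1).
    """
    h, t = list(h0), t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        h = rk4_step(f, h, t, dt, params)
        t += dt
    return h


if __name__ == "__main__":
    # Illustrative parameters only; in a trained model these would be learned.
    W = [[0.0, 1.0], [-1.0, 0.0]]
    b = [0.0, 0.0]
    coarse = odeint_fixed(dynamics, [1.0, 0.0], 0.0, 1.0, (W, b), steps=5)
    fine = odeint_fixed(dynamics, [1.0, 0.0], 0.0, 1.0, (W, b), steps=100)
    print("coarse:", coarse)
    print("fine:  ", fine)
```

In this toy, changing `steps` plays the role of the solver tolerance: the same parameters can be evaluated cheaply and approximately or slowly and accurately, without retraining. Memory use is also independent of `steps`, because only the current state is kept.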
