Differential Dynamic Programming Neural Optimizer

Abstract

Interpreting the training of Deep Neural Networks (DNNs) as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from the trajectory optimization perspective. We first show that most widely used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that accepts batch optimization for training feedforward networks, while integrating naturally with recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies, which improve the convergence rate and reduce sensitivity to hyper-parameters compared with existing methods. We show that the algorithm is competitive against state-of-the-art first- and second-order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.
