Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes

  • 2021-04-22 13:22:12
  • James Lucas, Juhan Bae, Michael R. Zhang, Stanislav Fort, Richard Zemel, Roger Grosse

Abstract

Linear interpolation between initial neural network parameters and converged parameters after training with stochastic gradient descent (SGD) typically leads to a monotonic decrease in the training objective. This Monotonic Linear Interpolation (MLI) property, first observed by Goodfellow et al. (2014), persists in spite of the non-convex objectives and highly non-linear training dynamics of neural networks. Extending this work, we evaluate several hypotheses for this property that, to our knowledge, have not yet been explored. Using tools from differential geometry, we draw connections between the interpolated paths in function space and the monotonicity of the network, providing sufficient conditions for the MLI property under mean squared error. While the MLI property holds under various settings (e.g. network architectures and learning problems), we show in practice that networks violating the MLI property can be produced systematically, by encouraging the weights to move far from initialization. The MLI property raises important questions about the loss landscape geometry of neural networks and highlights the need to further study their global properties.
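
As a rough illustration of what checking the MLI property involves, the following is a minimal sketch, not the authors' code. It evaluates a loss along the straight line between initial and final parameters and tests whether the values never increase. The helper names (interpolation_losses, is_monotonic_decreasing), the tolerance, and the toy least-squares problem trained with full-batch gradient descent are all assumptions made for illustration; the paper itself studies neural networks trained with SGD, where monotonicity is not guaranteed.

```python
import numpy as np

def interpolation_losses(loss_fn, theta_init, theta_final, num_points=21):
    """Evaluate loss_fn at theta(a) = (1 - a) * theta_init + a * theta_final, a in [0, 1]."""
    alphas = np.linspace(0.0, 1.0, num_points)
    losses = np.array([loss_fn((1.0 - a) * theta_init + a * theta_final) for a in alphas])
    return alphas, losses

def is_monotonic_decreasing(losses, tol=1e-12):
    """True if the sequence never rises by more than tol between consecutive points."""
    diffs = np.diff(np.asarray(losses, dtype=float))
    return bool(np.all(diffs <= tol))

# Toy stand-in for a network: linear least squares (hypothetical setup, convex objective).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

def mse(theta):
    return float(np.mean((X @ theta - y) ** 2))

def mse_grad(theta):
    return 2.0 / len(y) * X.T @ (X @ theta - y)

theta_init = rng.normal(size=3)   # "initial parameters"
theta = theta_init.copy()
for _ in range(500):              # plain full-batch gradient descent, step size 0.1
    theta = theta - 0.1 * mse_grad(theta)
theta_final = theta               # "converged parameters"

alphas, losses = interpolation_losses(mse, theta_init, theta_final)
print("MLI holds on this toy problem:", is_monotonic_decreasing(losses))
```

For a real network, theta_init and theta_final would be the flattened weight vectors before and after training, and loss_fn would evaluate the training objective at a given weight vector.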

 
