Non-Convex Optimization with Spectral Radius Regularization

  • 2021-02-22 17:39:05
  • Adam Sandler, Diego Klabjan, Yuan Luo


We develop a regularization method that finds flat minima during the training of deep neural networks and other machine learning models. These minima generalize better than sharp minima, allowing models to generalize better to real-world test data, which may be distributed differently from the training data. Specifically, we propose a regularized optimization method that reduces the spectral radius of the Hessian of the loss function. Additionally, we derive algorithms to perform this optimization efficiently on neural networks and prove convergence results for these algorithms. Furthermore, we demonstrate that our algorithm works effectively on multiple real-world applications in multiple domains, including healthcare. To show that our models generalize well, we introduce different methods of testing generalizability.
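The abstract does not state the objective or the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions. It estimates the spectral radius of the Hessian by power iteration on finite-difference Hessian-vector products, then adds a hinge penalty on that estimate to the loss. The toy quadratic loss, the penalty form `mu * max(0, rho - K)`, and the names `mu`, `K`, `hvp`, `spectral_radius`, and `regularized_loss` are illustrative assumptions, not the authors' formulation or implementation.

```python
import math

def loss(w):
    # Toy loss (assumption): f(w) = 0.5 * (4*w0^2 + w1^2), whose Hessian is diag(4, 1).
    return 0.5 * (4.0 * w[0] ** 2 + w[1] ** 2)

def loss_grad(w):
    # Analytic gradient of the toy loss above.
    return [4.0 * w[0], 1.0 * w[1]]

def hvp(grad_fn, w, v, eps=1e-4):
    # Hessian-vector product via central differences of the gradient:
    # H v ~ (g(w + eps*v) - g(w - eps*v)) / (2*eps). The Hessian is never formed.
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    g_plus = grad_fn(w_plus)
    g_minus = grad_fn(w_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]

def spectral_radius(grad_fn, w, iters=100):
    # Power iteration: repeatedly apply H and renormalize; the absolute
    # Rayleigh quotient |v^T H v| approaches the largest |eigenvalue|.
    n = len(w)
    v = [1.0 / math.sqrt(n)] * n
    rho = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        norm = math.sqrt(sum(x * x for x in hv))
        if norm == 0.0:
            return 0.0
        rho = abs(sum(a * b for a, b in zip(v, hv)))
        v = [x / norm for x in hv]
    return rho

def regularized_loss(w, mu=0.1, K=1.0):
    # Hypothetical penalty: only curvature above the threshold K is penalized.
    rho = spectral_radius(loss_grad, w)
    return loss(w) + mu * max(0.0, rho - K)

if __name__ == "__main__":
    w = [1.0, 2.0]
    print(spectral_radius(loss_grad, w))
    print(regularized_loss(w))
```

For a real network the gradient would come from automatic differentiation rather than a closed form, and the Hessian-vector product would typically be computed exactly instead of by finite differences. The sketch keeps everything in the standard library so the mechanics stay visible.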


