A Gradient Flow Framework For Analyzing Network Pruning

  • 2020-09-24 17:37:32
  • Ekdeep Singh Lubana, Robert P. Dick
  • 17

Abstract

Recent network pruning methods focus on pruning models early-on in training. To estimate the impact of removing a parameter, these methods use importance measures that were originally designed for pruning trained models. Despite lacking justification for their use early-on in training, models pruned using such measures result in surprisingly minimal accuracy loss. To better explain this behavior, we develop a general, gradient-flow based framework that relates state-of-the-art importance measures through an order of time-derivative of the norm of model parameters. We use this framework to determine the relationship between pruning measures and evolution of model parameters, establishing several findings related to pruning models early-on in training: (i) magnitude-based pruning removes parameters that contribute least to reduction in loss, resulting in models that converge faster than magnitude-agnostic methods; (ii) loss-preservation based pruning preserves first-order model evolution dynamics and is well-motivated for pruning minimally trained models; and (iii) gradient-norm based pruning affects second-order model evolution dynamics, and increasing gradient norm via pruning can produce poorly performing models. We validate our claims on several VGG-13, MobileNet-V1, and ResNet-56 models trained on CIFAR-10 and CIFAR-100. Code available at https://github.com/EkdeepSLubana/flowandprune.
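To make the three families of importance measures named in the abstract concrete, the following is a minimal NumPy sketch on a toy quadratic loss. It is not taken from the authors' repository. The score formulas (squared magnitude, |theta * grad|, and theta * (H grad)) are commonly used forms and are assumptions here, as are the function names, the toy loss, and the keep ratio. The last lines numerically check the first-order relation that holds under gradient flow (theta' = -grad): d/dt ||theta||^2 = -2 theta^T grad.

```python
import numpy as np

def quadratic_loss_grad_hess(theta, A, b):
    """Toy loss L = 0.5 * theta^T A theta - b^T theta (hypothetical stand-in for a network loss)."""
    grad = A @ theta - b   # gradient of the toy loss
    hess = A               # Hessian of a quadratic loss is constant
    return grad, hess

def importance_scores(theta, grad, hess):
    """Per-parameter scores; these forms are assumptions, not the paper's exact definitions."""
    return {
        "magnitude": theta ** 2,                    # magnitude-based
        "loss_preservation": np.abs(theta * grad),  # first-order, loss-preservation style
        "gradient_norm": theta * (hess @ grad),     # second-order, gradient-norm style (signed)
    }

def prune_mask(scores, keep_ratio):
    """Boolean mask keeping the top `keep_ratio` fraction of parameters by score."""
    k = max(1, int(round(keep_ratio * scores.size)))
    keep_idx = np.argsort(scores)[-k:]
    mask = np.zeros(scores.shape, dtype=bool)
    mask[keep_idx] = True
    return mask

rng = np.random.default_rng(0)
n = 6
M = rng.normal(size=(n, n))
A = M @ M.T + np.eye(n)    # symmetric positive definite Hessian
b = rng.normal(size=n)
theta = rng.normal(size=n)

grad, hess = quadratic_loss_grad_hess(theta, A, b)
scores = importance_scores(theta, grad, hess)
masks = {name: prune_mask(s, keep_ratio=0.5) for name, s in scores.items()}

# One small explicit-Euler step of gradient flow, used as a finite-difference check.
dt = 1e-6
theta_next = theta - dt * grad
fd = (theta_next @ theta_next - theta @ theta) / dt
analytic = -2.0 * (theta @ grad)

for name, mask in masks.items():
    print(name, mask.astype(int))
print("d||theta||^2/dt:", fd, "vs -2 theta^T grad:", analytic)
```

The three masks generally differ, which is the point of comparing the measures. How the signed gradient-norm score should be ranked varies between methods; keeping the largest values here is purely illustrative.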

 
