Structured Model Pruning of Convolutional Networks on Tensor Processing Units

Abstract

The deployment of convolutional neural networks is often hindered by high computational and storage requirements. Structured model pruning is a promising approach to alleviate these requirements. Using the VGG-16 model as an example, we measure the accuracy-efficiency trade-off for various structured model pruning methods and datasets (CIFAR-10 and ImageNet) on Tensor Processing Units (TPUs). To measure the actual performance of models, we develop a structured model pruning library for TensorFlow 2 that modifies models in place (instead of adding mask layers). We show that structured model pruning can significantly improve model memory usage and speed on TPUs without losing accuracy, especially for small datasets (e.g., CIFAR-10).
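To illustrate the difference between pruning a model in place and adding mask layers, the following is a minimal NumPy sketch, not the library described above. It assumes an L1-norm filter ranking, which is one common structured pruning criterion and is not stated in the abstract. The function name `prune_conv_filters` and the kernel layout `(height, width, in_channels, out_channels)` are also assumptions made for the example.

```python
import numpy as np


def prune_conv_filters(kernel, bias, next_kernel, keep_ratio):
    """Physically remove the lowest-scoring output filters of a conv layer.

    kernel:      (kh, kw, c_in, c_out)   weights of the layer being pruned
    bias:        (c_out,)                bias of the layer being pruned
    next_kernel: (kh, kw, c_out, c_next) weights of the following conv layer
    keep_ratio:  fraction of output filters to keep (hypothetical parameter)
    """
    c_out = kernel.shape[-1]
    n_keep = max(1, int(round(c_out * keep_ratio)))
    # Assumed ranking criterion: L1 norm of each output filter.
    scores = np.abs(kernel).sum(axis=(0, 1, 2))
    # Indices of the n_keep highest-scoring filters, in their original order.
    keep = np.sort(np.argsort(scores)[-n_keep:])
    # Slice the output channels of this layer and the matching input
    # channels of the next layer, so both tensors actually shrink.
    return kernel[..., keep], bias[keep], next_kernel[:, :, keep, :], keep


rng = np.random.default_rng(0)
k1 = rng.normal(size=(3, 3, 3, 64))
b1 = np.zeros(64)
k2 = rng.normal(size=(3, 3, 64, 128))

pk1, pb1, pk2, keep = prune_conv_filters(k1, b1, k2, keep_ratio=0.5)
print(pk1.shape, pb1.shape, pk2.shape)  # (3, 3, 3, 32) (32,) (3, 3, 32, 128)

# Mask-based alternative: the same filters are zeroed, but the tensor keeps
# its full shape, so memory use and compute are unchanged.
mask = np.zeros(64)
mask[keep] = 1.0
masked_k1 = k1 * mask
print(masked_k1.shape)  # (3, 3, 3, 64)
```

In the sliced version the stored tensors are smaller, so any measured memory or speed change reflects the pruned model itself. In the masked version the shapes stay the same, and only the values change.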
