Abstract
Deep neural networks (DNNs) have achieved great success on a variety of challenging tasks. However, most successful DNNs are structurally complex, which leads to large storage requirements and many floating-point operations. This paper proposes a novel technique, named Drop Pruning, to compress DNNs by pruning the weights of a dense, high-accuracy baseline model without accuracy loss. Drop Pruning follows the standard iterative prune-retrain procedure, with a \emph{drop} strategy applied at each pruning step: \emph{drop out}, which stochastically deletes some unimportant weights, and \emph{drop in}, which stochastically recovers some pruned weights. \emph{Drop out} and \emph{drop in} are intended to address two drawbacks of traditional pruning methods, respectively: the local judgment of importance and the irretrievable pruning process. Suitably chosen \emph{drop} probabilities decrease the model size during the pruning process and drive it toward the target sparsity. Drop Pruning is also similar in spirit to dropout, to a stochastic algorithm in integer optimization, and to the Dense-Sparse-Dense training technique. Drop Pruning can significantly reduce overfitting while compressing the model. Experimental results demonstrate that Drop Pruning can achieve state-of-the-art performance on many benchmark pruning tasks, such as about ${11.1\times}$ compression of VGG-16 on CIFAR10 and ${14.3\times}$ compression of LeNet-5 on MNIST without accuracy loss, which may offer new insights into model compression.
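To make the \emph{drop out} and \emph{drop in} steps concrete, the following is a minimal sketch of a single pruning step on a flat list of weights. It is an illustration under stated assumptions, not the paper's implementation: the function name `drop_pruning_step`, the parameters `p_out`, `p_in`, and `quantile`, and the use of weight magnitude as the importance criterion are all hypothetical choices made here, since the abstract does not specify them. Retraining between steps is omitted.

```python
import random


def drop_pruning_step(weights, mask, p_out, p_in, quantile=0.5, rng=None):
    """Return an updated keep-mask (True = kept) after one hypothetical drop step.

    Assumption: a kept weight counts as "unimportant" when its magnitude is at or
    below the magnitude found at the given quantile position among kept weights.
    """
    if rng is None:
        rng = random.Random()

    # Magnitudes of the weights that are currently kept, in ascending order.
    kept_mags = sorted(abs(w) for w, m in zip(weights, mask) if m)
    threshold = kept_mags[int(quantile * (len(kept_mags) - 1))] if kept_mags else 0.0

    new_mask = []
    for w, m in zip(weights, mask):
        if m:
            # Drop out: delete an unimportant kept weight with probability p_out.
            unimportant = abs(w) <= threshold
            dropped = unimportant and rng.random() < p_out
            new_mask.append(not dropped)
        else:
            # Drop in: recover a previously pruned weight with probability p_in.
            new_mask.append(rng.random() < p_in)
    return new_mask
```

In this sketch, choosing `p_out` larger than `p_in` makes the expected number of kept weights shrink from step to step, which mirrors the abstract's remark that suitable \emph{drop} probabilities move the model toward the target sparsity. Because pruned weights can return through `p_in`, an early pruning decision is not final.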