Abstract
Model compression is an effective technique for deploying neural network models on mobile devices that have limited computation resources and a tight power budget. However, conventional model compression techniques use hand-crafted features and require domain experts to explore the large design space trading off model size, speed, and accuracy, a process that is usually sub-optimal and time-consuming. In this paper, we propose Automated Deep Compression (ADC), which leverages reinforcement learning to efficiently sample the design space and greatly improve model compression quality. We achieved state-of-the-art model compression results in a fully automated way, without any human effort. Under a 4x FLOPs reduction, we achieved 2.7% better accuracy than a hand-crafted model compression method for VGG-16 on ImageNet. We applied this automated, push-the-button compression pipeline to MobileNet and achieved a 2x reduction in FLOPs, with a speedup of 1.49x on a Titan Xp and 1.65x on an Android phone (Samsung Galaxy S7), at negligible loss of accuracy.