Learning Strict Identity Mappings in Deep Residual Networks

Abstract

A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation. The ability to train very deep networks naturally pushed researchers to use enormous resources to achieve the best performance. Consequently, in many applications super deep residual networks were employed for just a marginal improvement in performance. In this paper, we propose $\epsilon$-ResNet, which allows us to automatically discard redundant layers, i.e., layers that produce responses smaller than a threshold $\epsilon$, without any loss in performance. The $\epsilon$-ResNet architecture can be achieved using a few additional rectified linear units in the original ResNet. Our method uses neither additional variables nor numerous trials, unlike other hyper-parameter optimization techniques. The layer selection is achieved using a single training process, and the evaluation is performed on the CIFAR-10, CIFAR-100, SVHN, and ImageNet datasets. In some instances, we achieve about 80\% reduction in the number of parameters.
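
To make the thresholding idea concrete, the following is a minimal sketch of how a residual response could be gated to zero using only rectified linear units. It is an illustration under stated assumptions, not the construction from the full paper: the function names, the plain-list representation of a layer response, and the constant `scale` are all hypothetical, and the exact arrangement of ReLUs in $\epsilon$-ResNet may differ.

```python
def relu(v):
    """Rectified linear unit on a single float."""
    return max(v, 0.0)


def epsilon_gate(response, eps, scale=1e6):
    """Return 0.0 if no element of `response` exceeds eps in magnitude, else 1.0.

    `scale` is an assumed large constant that makes the gate saturate;
    for a total excess between 0 and 1/scale the gate lies between 0 and 1.
    """
    # relu(r - eps) + relu(-r - eps) equals max(|r| - eps, 0) for each element.
    excess = sum(relu(r - eps) + relu(-r - eps) for r in response)
    # 1 - relu(1 - scale * excess) is 0 when excess == 0 and 1 when excess >= 1/scale.
    return 1.0 - relu(1.0 - scale * excess)


def epsilon_residual_block(x, residual_fn, eps):
    """Compute x + gate * F(x), which reduces to the identity when F(x) is small."""
    response = residual_fn(x)
    gate = epsilon_gate(response, eps)
    return [xi + gate * ri for xi, ri in zip(x, response)]


if __name__ == "__main__":
    x = [1.0, -2.0]
    # A residual branch with a tiny response is suppressed: the block is an identity.
    print(epsilon_residual_block(x, lambda v: [0.01 * t for t in v], eps=0.1))
    # A residual branch with a large response passes through unchanged.
    print(epsilon_residual_block(x, lambda v: [0.5 * t for t in v], eps=0.1))
```

In this sketch, a block whose gate is zero for every input behaves as a strict identity mapping, which is what would make the corresponding layer removable after training.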
