Abstract
ImageNet pre-training has long been regarded as essential for training accurate object detectors. Recently, it has been shown that object detectors trained from randomly initialized weights can be on par with those fine-tuned from ImageNet pre-trained models. However, the effects of pre-training and the differences it causes are still not fully understood. In this paper, we analyze the eigenspectrum dynamics of the covariance matrix of each feature map in object detectors. Based on our analysis of ResNet-50, Faster R-CNN with FPN, and Mask R-CNN, we show that object detectors trained from ImageNet pre-trained models and those trained from scratch behave differently from each other, even when both have similar accuracy. Furthermore, we propose a method for automatically determining the widths (the numbers of channels) of object detectors based on the eigenspectrum. We train Faster R-CNN with FPN from randomly initialized weights and show that our method can reduce the number of parameters of ResNet-50 by ~27% without increasing multiply-accumulate operations or losing accuracy. Our results indicate that we should develop more appropriate methods for transferring knowledge from image classification to object detection (or other tasks).
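
To make the central quantity concrete, the following is a minimal sketch of how the eigenspectrum of a feature map's channel covariance matrix might be computed with NumPy. It is an illustration only, not the procedure used in this paper: the function names `channel_eigenspectrum` and `suggested_width`, the (N, C, H, W) tensor layout, the choice to treat each spatial position as one sample, and the 99% cumulative-variance threshold are all assumptions made for the example. The abstract does not specify the actual width-selection criterion, so the threshold rule is a stand-in.

```python
import numpy as np


def channel_eigenspectrum(feature_map):
    """Eigenvalues (descending) of the channel covariance of an (N, C, H, W) array."""
    n, c, h, w = feature_map.shape
    # Assumption: every spatial position of every image is one C-dimensional sample.
    samples = feature_map.transpose(0, 2, 3, 1).reshape(-1, c)
    cov = np.atleast_2d(np.cov(samples, rowvar=False))  # (C, C), symmetric
    eigvals = np.linalg.eigvalsh(cov)  # real eigenvalues in ascending order
    # Reverse to descending order and clip tiny negative values from round-off.
    return np.clip(eigvals[::-1], 0.0, None)


def suggested_width(eigvals, energy=0.99):
    """Hypothetical heuristic: fewest components reaching `energy` of total variance."""
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(eigvals))


# Synthetic example: 64 channels driven by only 8 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.standard_normal((4 * 16 * 16, 8))
mixing = rng.standard_normal((8, 64))
noise = 1e-3 * rng.standard_normal((4 * 16 * 16, 64))
fmap = (latent @ mixing + noise).reshape(4, 16, 16, 64).transpose(0, 3, 1, 2)

eigvals = channel_eigenspectrum(fmap)
width = suggested_width(eigvals)
print("channels:", fmap.shape[1], "suggested width:", width)
```

In this synthetic case the feature map has 64 channels but only 8 directions of meaningful variance, so the heuristic suggests a width of at most 8. A steeply decaying eigenspectrum of this kind is the sort of signal that could indicate a layer is wider than it needs to be.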