ResNet strikes back: An improved training procedure in timm

Abstract

The influential Residual Networks designed by He et al. remain the gold-standard architecture in numerous scientific publications. They typically serve as the default architecture in studies, or as baselines when new architectures are proposed. Yet there has been significant progress on best practices for training neural networks since the inception of the ResNet architecture in 2015. Novel optimization & data-augmentation have increased the effectiveness of the training recipes. In this paper, we re-evaluate the performance of the vanilla ResNet-50 when trained with a procedure that integrates such advances. We share competitive training settings and pre-trained models in the timm open-source library, with the hope that they will serve as better baselines for future work. For instance, with our more demanding training setting, a vanilla ResNet-50 reaches 80.4% top-1 accuracy at resolution 224x224 on ImageNet-val without extra data or distillation. We also report the performance achieved with popular models with our training procedure.
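For readers less familiar with the metric quoted above, the following is a minimal sketch of how top-1 accuracy is computed: a sample counts as correct when the class with the highest score equals the ground-truth label. The function name `top1_accuracy` and the toy scores are illustrative assumptions, not part of timm or of the paper's evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class equals the label.

    Hypothetical helper for illustration only; not the paper's evaluation code.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class (first index wins ties).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 4 samples and 3 classes (made-up scores, not ImageNet data).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 2, 2]
print(top1_accuracy(toy_scores, toy_labels))  # 3 of 4 correct -> 75.0
```

On ImageNet-val the same computation runs over 50,000 validation images and 1,000 classes, so a reported 80.4% corresponds to roughly 40,200 correctly classified images.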
