Abstract
We present a simple but powerful convolutional neural network architecture, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolutions and ReLU, while the training-time model has a multi-branch topology. Such decoupling of the training-time and inference-time architectures is realized by a structural re-parameterization technique, so the model is named RepVGG. On ImageNet, RepVGG reaches over 80% top-1 accuracy, which is the first time for a plain model, to the best of our knowledge. On an NVIDIA 1080Ti GPU, RepVGG models run 83% faster than ResNet-50 or 101% faster than ResNet-101 with higher accuracy and show a favorable accuracy-speed trade-off compared to state-of-the-art models like EfficientNet and RegNet. The code and trained models are available at https://github.com/megvii-model/RepVGG.
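
To make the idea of structural re-parameterization concrete, the following is a minimal sketch rather than the released implementation. It assumes the training-time block sums a 3x3 convolution, a 1x1 convolution, and an identity shortcut with equal input and output channels (the abstract does not spell out the branches), and it ignores any normalization layers. The helper names `conv2d`, `merge_branches`, and `multi_branch` are hypothetical and written in plain Python for illustration only.

```python
import random

def conv2d(x, w, pad):
    """Naive stride-1 convolution. x: [C_in][H][W], w: [C_out][C_in][k][k].
    Output keeps the input size (valid for k=3, pad=1 and k=1, pad=0)."""
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    out = []
    for o in range(len(w)):
        plane = [[0.0] * wd for _ in range(h)]
        for i in range(h):
            for j in range(wd):
                s = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            ii, jj = i + di - pad, j + dj - pad
                            if 0 <= ii < h and 0 <= jj < wd:  # zero padding
                                s += w[o][c][di][dj] * x[c][ii][jj]
                plane[i][j] = s
        out.append(plane)
    return out

def merge_branches(w3, w1):
    """Fold a 1x1 branch and an identity branch into one 3x3 kernel.
    Assumes C_in == C_out (required for the identity branch)."""
    c = len(w3)
    merged = [[[row[:] for row in w3[o][i]] for i in range(c)] for o in range(c)]
    for o in range(c):
        for i in range(c):
            merged[o][i][1][1] += w1[o][i][0][0]  # 1x1 kernel acts on the center tap
        merged[o][o][1][1] += 1.0                 # identity: center tap of 1 on the same channel
    return merged

def multi_branch(x, w3, w1):
    """Assumed training-time block (before ReLU): 3x3 conv + 1x1 conv + identity."""
    y3, y1 = conv2d(x, w3, 1), conv2d(x, w1, 0)
    return [[[y3[c][i][j] + y1[c][i][j] + x[c][i][j]
              for j in range(len(x[0][0]))]
             for i in range(len(x[0]))]
            for c in range(len(x))]

random.seed(0)
C, H, W = 2, 4, 4
rnd = lambda: random.uniform(-1.0, 1.0)
x = [[[rnd() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w3 = [[[[rnd() for _ in range(3)] for _ in range(3)] for _ in range(C)] for _ in range(C)]
w1 = [[[[rnd()]] for _ in range(C)] for _ in range(C)]

y_train = multi_branch(x, w3, w1)                 # three branches
y_infer = conv2d(x, merge_branches(w3, w1), 1)    # single 3x3 convolution

# Largest absolute difference between the two outputs.
max_diff = max(abs(y_train[c][i][j] - y_infer[c][i][j])
               for c in range(C) for i in range(H) for j in range(W))
print(max_diff < 1e-9)
```

Because convolution is linear, the summed branches collapse into a single 3x3 kernel before the ReLU, so under these assumptions the inference-time model needs only the merged weights.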