Abstract
The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a "warm-up obstacle": the inability of standard warm-up stages to train high-quality feature extractors and avert memorization of noisy labels. We propose "Contrast to Divide" (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage's susceptibility to the noise level, shortening its duration, and improving the quality of the extracted features. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high-noise regime, where we get a boost of more than 27% for CIFAR-100 with 90% noise over the previous state of the art. In real-life noise settings, C2D trained on mini-WebVision outperforms previous works on both the WebVision and ImageNet validation sets by 3% top-1 accuracy. We perform an in-depth analysis of the framework, including investigating the performance of different pre-training approaches and estimating the effective upper bound of the LNL performance with semi-supervised learning. Code for reproducing our experiments is available at https://github.com/ContrastToDivide/C2D
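To make the idea of label-free pre-training concrete, the following is a minimal sketch of one common contrastive objective, the NT-Xent loss popularized by SimCLR. The abstract does not state which self-supervised objective is used, so the choice of loss, the function name `nt_xent_loss`, and the default `temperature` are all assumptions made for illustration only. The point of the sketch is that the loss depends only on pairs of augmented views of the same image and never touches the (noisy) labels.

```python
import math


def _normalize(vec):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """Illustrative NT-Xent loss over two lists of embeddings.

    views_a[i] and views_b[i] are assumed to be embeddings of two
    augmentations of the same image. No class labels are used.
    The temperature default is a hypothetical value, not taken from the paper.
    """
    n = len(views_a)
    z = [_normalize(v) for v in list(views_a) + list(views_b)]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Scaled cosine similarities to every other embedding (self excluded).
        logits = [
            sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(2 * n)
            if j != i
        ]
        pos_logit = sum(a * b for a, b in zip(z[i], z[pos])) / temperature
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - pos_logit
    return total / (2 * n)


# Matching views should give a lower loss than mismatched views.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In a pipeline of the kind described above, a feature extractor trained with such an objective would then initialize the warm-up stage of an existing LNL method; that wiring is not shown here.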