Unveiling the Power of Mixup for Stronger Classifiers

Abstract

Mixup-based data augmentations have achieved great success as regularizers for deep neural networks. However, existing methods rely on deliberately handcrafted mixup policies, which ignore or oversell the semantic matching between mixed samples and labels. Driven by their prior assumptions, early methods attempt to smooth decision boundaries by random linear interpolation, while others focus on maximizing class-related information via offline saliency optimization. As a result, the issue of label mismatch has not been well addressed. Additionally, the optimization stability of mixup training is constantly troubled by the label mismatch. To address these challenges, we first reformulate mixup for supervised classification as two sub-tasks, mixup sample generation and classification, then propose Automatic Mixup (AutoMix), a revolutionary mixup framework. Specifically, a learnable lightweight Mix Block (MB) with a cross-attention mechanism is proposed to generate a mixed sample by modeling a fair relationship between the pair of samples under direct supervision of the corresponding mixed label. Moreover, the proposed Momentum Pipeline (MP) enhances training stability and accelerates convergence on top of making the Mix Block fully trained end-to-end. Extensive experiments on five popular classification benchmarks show that the proposed approach consistently outperforms leading methods by a large margin.
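
For context, the random linear interpolation used by early mixup methods can be sketched as below. This is a minimal pure-Python illustration of vanilla mixup only, not of AutoMix's learnable Mix Block or Momentum Pipeline. The function name `mixup`, the Beta-distribution parameter `alpha`, and the toy vectors are assumptions introduced for illustration.

```python
import random


def mixup(x_a, x_b, y_a, y_b, alpha=1.0, rng=None):
    """Vanilla mixup: convex combination of two samples and their one-hot labels.

    Hypothetical helper for illustration; samples are flat lists of floats.
    """
    rng = rng or random.Random()
    # Mixing ratio lam in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # The same ratio is applied to the inputs and to the labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy example: two 3-dimensional "images" from two different classes.
x_mix, y_mix, lam = mixup(
    [0.0, 1.0, 2.0], [2.0, 1.0, 0.0],
    [1.0, 0.0], [0.0, 1.0],
    rng=random.Random(0),
)
```

Because the ratio is sampled without looking at the image content, the mixed label can disagree with what is actually visible in the mixed sample. This is the label mismatch the abstract refers to.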
