Online Distillation with Mixed Sample Augmentation

  • 2022-06-24 17:44:06
  • Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo

Abstract

Mixed Sample Regularization (MSR), such as MixUp or CutMix, is a powerful data augmentation strategy for improving the generalization of convolutional neural networks. Previous empirical analysis has shown that MSR and conventional offline Knowledge Distillation (KD) yield orthogonal performance gains. More specifically, student networks can be enhanced by applying MSR during the training stage of sequential distillation. Yet the interplay between MSR and online knowledge distillation, a stronger distillation paradigm in which an ensemble of peer students learn mutually from each other, remains unexplored. To bridge the gap, we make the first attempt at incorporating CutMix into online distillation, where we empirically observe a significant improvement. Encouraged by this fact, we propose an even stronger MSR specifically for online distillation, named Cut^nMix. Furthermore, a novel online distillation framework is designed upon Cut^nMix to enhance the distillation with feature-level mutual learning and a self-ensemble teacher. Comprehensive evaluations on CIFAR10 and CIFAR100 with six network architectures show that our approach consistently outperforms state-of-the-art distillation methods.
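
For readers unfamiliar with the augmentation the abstract builds on, the following is a minimal NumPy sketch of standard CutMix, not of the proposed Cut^nMix, whose details are not given here. The function name `cutmix`, the channels-last batch layout, and the default `alpha=1.0` are assumptions made for illustration only.

```python
import numpy as np

def cutmix(images, labels, num_classes, alpha=1.0, rng=None):
    """Standard CutMix on a batch shaped (N, H, W, C); illustrative sketch only."""
    rng = np.random.default_rng() if rng is None else rng
    n, h, w, _ = images.shape

    # Sample the mixing ratio and a random pairing of samples within the batch.
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(n)

    # Box whose area is roughly (1 - lam) of the image, centred at a random pixel.
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y0, y1 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x0, x1 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    # Paste the box from the paired image into a copy of each original image.
    mixed = images.copy()
    mixed[:, y0:y1, x0:x1, :] = images[perm, y0:y1, x0:x1, :]

    # Recompute the ratio from the clipped box so labels match the pixel share.
    lam_adj = 1.0 - (y1 - y0) * (x1 - x0) / (h * w)
    onehot = np.eye(num_classes)[labels]
    soft_labels = lam_adj * onehot + (1.0 - lam_adj) * onehot[perm]
    return mixed, soft_labels
```

In an online distillation setting, each peer student would be trained on such mixed inputs; how the peers exchange knowledge on them is specific to the paper and is not sketched here.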

 
