Mixup Without Hesitation

Abstract

Mixup linearly interpolates pairs of examples to form new samples, which is easy to implement and has been shown to be effective in image classification tasks. However, mixup has two drawbacks: first, more training epochs are needed to obtain a well-trained model; second, mixup requires tuning a hyper-parameter to gain appropriate capacity, which is a difficult task. In this paper, we find that mixup constantly explores the representation space, and, inspired by the exploration-exploitation dilemma in reinforcement learning, we propose mixup Without hesitation (mWh), a concise, effective, and easy-to-use training algorithm. We show that mWh strikes a good balance between exploration and exploitation by gradually replacing mixup with basic data augmentation. It can achieve a strong baseline with less training time than the original mixup and without searching for the optimal hyper-parameter, i.e., mWh acts as mixup without hesitation. mWh can also transfer to CutMix, and gains consistent improvement on other machine learning and computer vision tasks such as object detection. Our code is open-source and available at https://github.com/yuhao318/mwh
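
To make the interpolation concrete, the following is a minimal NumPy sketch of standard mixup on a batch, together with one possible way to shift gradually from mixup to basic augmentation. It is an illustration under stated assumptions, not the mWh algorithm itself: the function names, the Beta(alpha, alpha) sampling of the mixing coefficient, and the linear decay in `mixup_probability` are all chosen here for illustration, and the abstract does not specify the actual schedule.

```python
import numpy as np


def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Linearly interpolate each example with a randomly chosen partner.

    Assumption: the mixing coefficient is drawn from Beta(alpha, alpha),
    as in the standard mixup formulation; alpha is the hyper-parameter
    that normally needs tuning.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))         # partner index for each example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam


def mixup_probability(epoch, total_epochs):
    """Hypothetical schedule: probability of applying mixup at a given epoch.

    A plain linear decay from 1 to 0, used only to illustrate "gradually
    replacing mixup with basic data augmentation"; the paper's real
    schedule may differ.
    """
    return max(0.0, 1.0 - epoch / total_epochs)
```

In a training loop, one could draw a uniform random number per mini-batch and call `mixup_batch` only when it falls below `mixup_probability(epoch, total_epochs)`, using the unmixed (basically augmented) batch otherwise.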
