Can We Break Free from Strong Data Augmentations in Self-Supervised Learning?

Abstract

Self-supervised learning (SSL) has emerged as a promising solution for addressing the challenge of limited labeled data in deep neural networks (DNNs), offering scalability potential. However, the impact of design dependencies within the SSL framework remains insufficiently investigated. In this study, we comprehensively explore SSL behavior across a spectrum of augmentations, revealing their crucial role in shaping SSL model performance and learning mechanisms. Leveraging these insights, we propose a novel learning approach that integrates prior knowledge, with the aim of curtailing the need for extensive data augmentations and thereby amplifying the efficacy of learned representations. Notably, our findings underscore that SSL models imbued with prior knowledge exhibit reduced texture bias, diminished reliance on shortcuts and augmentations, and improved robustness against both natural and adversarial corruptions. These findings not only illuminate a new direction in SSL research, but also pave the way for enhancing DNN performance while concurrently alleviating the imperative for intensive data augmentation, thereby enhancing scalability and real-world problem-solving capabilities.
