Robust and Generalizable Visual Representation Learning via Random Convolutions

Abstract

While successful for various computer vision tasks, deep neural networks have been shown to be vulnerable to texture style shifts and small perturbations to which humans are robust. In this work, we show that the robustness of neural networks can be greatly improved through the use of random convolutions as data augmentation. Random convolutions are approximately shape-preserving and may distort local textures. Intuitively, randomized convolutions create an infinite number of new domains with similar global shapes but random local textures. Therefore, we explore using outputs of multi-scale random convolutions as new images or mixing them with the original images during training. When applying a network trained with our approach to unseen domains, our method consistently improves the performance on domain generalization benchmarks and is scalable to ImageNet. In particular, in the challenging scenario of generalizing to the sketch domain in PACS and to ImageNet-Sketch, our method outperforms state-of-the-art methods by a large margin. More interestingly, our method can benefit downstream tasks by providing a more robust pretrained visual representation.
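To make the augmentation described above concrete, here is a minimal, dependency-free sketch of a multi-scale random convolution applied to a single-channel image, with optional mixing with the original. The abstract does not give these details, so the function name `random_conv_augment`, the candidate kernel sizes, the Gaussian weight initialization with standard deviation 1/k, the zero padding, and the uniformly sampled mixing coefficient are all assumptions made for illustration; this is not the authors' implementation.

```python
import random


def random_conv_augment(image, kernel_sizes=(1, 3, 5, 7), mix=True, rng=None):
    """Apply one freshly sampled random convolution to a 2D image.

    image: list of rows of floats (H x W, single channel).
    kernel_sizes, the 1/k weight scale, and the mixing rule are
    illustrative assumptions, not values taken from the abstract.
    """
    if rng is None:
        rng = random.Random()

    # Multi-scale: pick an odd kernel size at random for this call.
    k = rng.choice(kernel_sizes)
    pad = k // 2

    # Sample new random weights every call (assumed Gaussian, std 1/k).
    kernel = [[rng.gauss(0.0, 1.0 / k) for _ in range(k)] for _ in range(k)]

    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - pad, x + j - pad
                    # Zero padding keeps the output the same size as the input.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc

    if not mix:
        # Use the random-convolution output directly as the new image.
        return out

    # Otherwise blend with the original image using a random coefficient.
    alpha = rng.random()
    return [
        [alpha * image[y][x] + (1.0 - alpha) * out[y][x] for x in range(w)]
        for y in range(h)
    ]
```

In a training loop, a function like this would be called on each image (or batch) with freshly sampled weights, so the network sees a different local texture each time while the small kernels leave the global shape largely intact. A practical version would operate on multi-channel tensors with a deep learning library's convolution rather than nested Python loops.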
