Ensembling with Deep Generative Views

Abstract

Recent generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose, simply by learning from unlabeled image collections. Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification. Using a pretrained generator, we first find the latent code corresponding to a given real input image. Applying perturbations to the code creates natural variations of the image, which can then be ensembled together at test time. We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars. Critically, we find that several design decisions are required to make this process work: the perturbation procedure, the weighting between the augmentations and the original image, and training the classifier on synthesized images can all impact the result. Currently, we find that while test-time ensembling with GAN-based augmentations can offer some small improvements, the remaining bottlenecks are the efficiency and accuracy of the GAN reconstructions, coupled with classifier sensitivities to artifacts in GAN-generated images.
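The test-time procedure described in the abstract (invert, perturb, generate, ensemble) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the paper's implementation: `perturb_latent` uses isotropic Gaussian jitter as one possible perturbation, `alpha` is a hypothetical weight on the original image's prediction, and `toy_generator` and `toy_classifier` are random linear stand-ins rather than StyleGAN2 or a trained classifier. The latent code `w` is assumed to have already been obtained by GAN inversion.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def perturb_latent(w, n_views, sigma, rng):
    # Assumed perturbation: isotropic Gaussian jitter around the latent code.
    return w[None, :] + sigma * rng.standard_normal((n_views, w.shape[0]))

def ensemble_predict(classifier, generator, image, w,
                     n_views=8, sigma=0.1, alpha=0.5, seed=0):
    # alpha (hypothetical) weights the original image against the mean over views.
    rng = np.random.default_rng(seed)
    views = generator(perturb_latent(w, n_views, sigma, rng))
    p_orig = softmax(classifier(image[None]))[0]
    p_views = softmax(classifier(views)).mean(axis=0)
    return alpha * p_orig + (1.0 - alpha) * p_views

# Toy stand-ins so the sketch runs end to end (not a real GAN or classifier).
rng = np.random.default_rng(0)
G = rng.standard_normal((4, 16))   # latent dim 4 -> "image" dim 16
C = rng.standard_normal((16, 3))   # "image" dim 16 -> 3 classes

def toy_generator(ws):
    return np.tanh(ws @ G)

def toy_classifier(xs):
    return xs @ C

w = rng.standard_normal(4)          # stands in for an inverted latent code
image = toy_generator(w[None])[0]   # stands in for the real input image
probs = ensemble_predict(toy_classifier, toy_generator, image, w)
print(probs)
```

Setting `alpha=1.0` recovers the plain classifier prediction on the original image, which makes it a convenient baseline when checking whether the generated views help at all.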
