Generative Models as a Data Source for Multiview Representation Learning

Abstract

Generative models are now capable of producing highly realistic images that look nearly indistinguishable from the data on which they are trained. This raises the question: if we have good enough generative models, do we still need datasets? We investigate this question in the setting of learning general-purpose visual representations from a black-box generative model rather than directly from data. Given an off-the-shelf image generator without any access to its training data, we train representations from the samples output by this generator. We compare several representation learning methods that can be applied to this setting, using the latent space of the generator to generate multiple "views" of the same semantic content. We show that for contrastive methods, this multiview data can naturally be used to identify positive pairs (nearby in latent space) and negative pairs (far apart in latent space). We find that the resulting representations rival those learned directly from real data, but that good performance requires care in the sampling strategy applied and the training method. Generative models can be viewed as a compressed and organized copy of a dataset, and we envision a future where more and more "model zoos" proliferate while datasets become increasingly unwieldy, missing, or private. This paper suggests several techniques for dealing with visual representation learning in such a future. Code is released on our project page: https://ali-design.github.io/GenRep/
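
To make the positive/negative pairing concrete, here is a minimal NumPy sketch of one way it could be set up. It is an illustration under stated assumptions, not the released GenRep code: the function names, the Gaussian perturbation with scale `sigma`, the temperature, and the InfoNCE-style loss are all choices made here for the example. In practice each latent would be passed through the generator and then an image encoder; both steps are omitted, and the latents themselves stand in for embeddings.

```python
import numpy as np


def sample_latent_views(batch_size, latent_dim, sigma=0.2, seed=0):
    """Sample anchor latents and nearby latents that serve as positive views.

    Assumption: the generator's latent prior is a standard Gaussian and
    "nearby" means a small Gaussian perturbation of scale `sigma`.
    """
    rng = np.random.default_rng(seed)
    anchors = rng.standard_normal((batch_size, latent_dim))
    positives = anchors + sigma * rng.standard_normal((batch_size, latent_dim))
    return anchors, positives


def info_nce(emb_a, emb_b, temperature=0.1):
    """InfoNCE-style loss: row i of emb_a should match row i of emb_b.

    All other rows in the batch come from independently sampled latents
    (far apart in latent space) and act as negatives.
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # pairwise cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))    # positives sit on the diagonal


if __name__ == "__main__":
    anchors, positives = sample_latent_views(batch_size=8, latent_dim=128)
    # Hypothetical pipeline: embeddings = encoder(generator(latents)).
    # Here the latents are used directly as stand-in embeddings.
    print("loss with matched pairs:", info_nce(anchors, positives))
```

The perturbation scale is the knob the abstract alludes to when it says the sampling strategy requires care: too small and the two views are nearly identical, too large and they may no longer share semantic content.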
