HoloGAN: Unsupervised learning of 3D representations from natural images

Abstract

We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world and learns to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still being able to generate images with similar or higher visual quality than other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. In particular, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner.
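
As an illustration of what a rigid-body transformation of 3D features can look like, the following is a minimal sketch, not the authors' implementation. It rotates a feature volume about its vertical axis with nearest-neighbour resampling. The function name `rotate_features_y`, the `(C, D, H, W)` layout, and the choice of nearest-neighbour resampling are all assumptions made for brevity; the abstract does not specify any of them.

```python
import numpy as np

def rotate_features_y(features, angle_rad):
    """Rotate a (C, D, H, W) feature volume about its vertical (H) axis.

    Hypothetical helper for illustration only. Uses nearest-neighbour
    resampling; output voxels whose source falls outside the grid stay zero.
    """
    c, d, h, w = features.shape
    # Grid centre along depth (z) and width (x); rotation is about this point.
    cz, cx = (d - 1) / 2.0, (w - 1) / 2.0
    z, x = np.meshgrid(np.arange(d), np.arange(w), indexing="ij")

    # For every output voxel (z, x), compute the source location it samples.
    cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)
    src_x = cos_a * (x - cx) + sin_a * (z - cz) + cx
    src_z = -sin_a * (x - cx) + cos_a * (z - cz) + cz
    xi = np.rint(src_x).astype(int)
    zi = np.rint(src_z).astype(int)
    valid = (xi >= 0) & (xi < w) & (zi >= 0) & (zi < d)

    # Move the rotated axes (D, W) to the front so indexing is unambiguous.
    f = features.transpose(1, 3, 0, 2)          # (D, W, C, H)
    out = np.zeros_like(f)
    out[z[valid], x[valid]] = f[zi[valid], xi[valid]]
    return out.transpose(2, 0, 3, 1)            # back to (C, D, H, W)


# Usage: rotate a random feature volume by 90 degrees.
volume = np.random.default_rng(0).normal(size=(8, 4, 4, 4))
rotated = rotate_features_y(volume, np.pi / 2)
```

Only the feature grid is resampled here, and the channel values themselves are untouched, which is one way to read how pose could be controlled separately from identity. A differentiable interpolation scheme would be needed for end-to-end training; nearest-neighbour is used only to keep the sketch short.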
