Self-supervised Learning of 3D Objects from Natural Images

Abstract

We present a method to learn single-view reconstruction of the 3D shape, pose, and texture of objects from categorized natural images in a self-supervised manner. Since this is a severely ill-posed problem, carefully designing a training method and introducing constraints are essential. To avoid the difficulty of training all elements at the same time, we propose training category-specific base shapes with a fixed pose distribution and simple textures first, and subsequently training poses and textures using the obtained shapes. Another difficulty is that shapes and backgrounds sometimes become excessively complicated, mistakenly reconstructing what should be textures on object surfaces. To suppress this, we propose using strong regularization and constraints on object surfaces and background images. With these two techniques, we demonstrate that natural image collections such as CIFAR-10 and PASCAL objects can be used for training, which indicates the possibility of realizing 3D object reconstruction on diverse object categories beyond synthetic datasets.
