Abstract
Numerous methods have been proposed for probabilistic generative modelling of 3D objects. However, none of these is able to produce textured objects, which renders them of limited use for practical tasks. In this work, we present the first generative model of textured 3D meshes. Training such a model would traditionally require a large dataset of textured meshes, but unfortunately, existing datasets of meshes lack detailed textures. We instead propose a new training methodology that allows learning from collections of 2D images without any 3D information. To do so, we train our model to explain a distribution of images by modelling each image as a 3D foreground object placed in front of a 2D background. Thus, it learns to generate meshes that, when rendered, produce images similar to those in its training set. A well-known problem when generating meshes with deep networks is the emergence of self-intersections, which are problematic for many use cases. As a second contribution, we therefore introduce a new generation process for 3D meshes that guarantees no self-intersections arise, based on the physical intuition that faces should push one another out of the way as they move. We conduct extensive experiments on our approach, reporting quantitative and qualitative results on both synthetic data and natural images. These show that our method successfully learns to generate plausible and diverse textured 3D samples for five challenging object classes.