Bringing Diversity from Diffusion Models to Semantic-Guided Face Asset Generation

Abstract

Digital modeling and reconstruction of human faces serve various applications. However, their availability is often hindered by the requirements of data capturing devices, manual labor, and suitable actors. This situation restricts the diversity, expressiveness, and control over the resulting models. This work aims to demonstrate that a semantically controllable generative network can provide enhanced control over the digital face modeling process. To enhance diversity beyond the limited human faces scanned in a controlled setting, we introduce a novel data generation pipeline that creates a high-quality 3D face database using a pre-trained diffusion model. Our proposed normalization module converts synthesized data from the diffusion model into high-quality scanned data. Using the 44,000 face models we obtained, we further developed an efficient GAN-based generator. This generator accepts semantic attributes as input and generates geometry and albedo. It also allows continuous post-editing of attributes in the latent space. Our asset refinement component subsequently creates physically-based facial assets. We introduce a comprehensive system designed for creating and editing high-quality face assets. Our proposed model has undergone extensive experiments, comparisons, and evaluations. We also integrate everything into a web-based interactive tool. We aim to make this tool publicly available with the release of the paper.
