Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images

Abstract

3D GAN inversion aims to project a single image into the latent space of a 3D Generative Adversarial Network (GAN), thereby achieving 3D geometry reconstruction. While there exist encoders that achieve good results in 3D GAN inversion, they are predominantly built on EG3D, which specializes in synthesizing near-frontal views and is limiting in synthesizing comprehensive 3D scenes from diverse viewpoints. In contrast to existing approaches, we propose a novel framework built on PanoHead, which excels in synthesizing images from a 360-degree perspective. To achieve realistic 3D modeling of the input image, we introduce a dual encoder system tailored for high-fidelity reconstruction and realistic generation from different viewpoints. Accompanying this, we propose a stitching framework on the triplane domain to get the best predictions from both. To achieve seamless stitching, both encoders must output consistent results despite being specialized for different tasks. For this reason, we carefully train these encoders using specialized losses, including an adversarial loss based on our novel occlusion-aware triplane discriminator. Experiments reveal that our approach surpasses the existing encoder training methods qualitatively and quantitatively. Please visit the project page: https://berkegokmen1.github.io/dual-enc-3d-gan-inv.
