NormalGAN: Learning Detailed 3D Human from a Single RGB-D Image

Abstract

We propose NormalGAN, a fast adversarial learning-based method to reconstruct a complete and detailed 3D human from a single RGB-D image. Given a single front-view RGB-D image, NormalGAN performs two steps: front-view RGB-D rectification and back-view RGB-D inference. The final model is then generated by simply combining the front-view and back-view RGB-D information. However, inferring a back-view RGB-D image with high-quality geometric details and plausible texture is not trivial. Our key observation is that normal maps generally encode much more information about 3D surface details than RGB and depth images. Therefore, learning geometric details from normal maps is superior to learning them from other representations. In NormalGAN, we introduce an adversarial learning framework conditioned on normal maps, which not only improves front-view depth denoising but also infers the back-view depth image with surprising geometric detail. Moreover, for texture recovery, we remove shading information from the front-view RGB image based on the refined normal map, which further improves the quality of the back-view color inference. Experimental results on both a test dataset and real captured data demonstrate the superior performance of our approach. Given a consumer RGB-D sensor, NormalGAN can generate complete and detailed 3D human reconstructions at 20 fps, which further enables convenient interactive experiences in telepresence, AR/VR and gaming scenarios.
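
To make the key observation concrete, the following is a minimal sketch of how a normal map can be derived from a depth image with finite differences. It is an illustration only and not the NormalGAN implementation: the function name `depth_to_normals`, the `pixel_size` parameter, and the orthographic-camera assumption are all hypothetical choices made here for simplicity.

```python
import math


def depth_to_normals(depth, pixel_size=1.0):
    """Estimate per-pixel unit normals from a depth map given as a list of rows.

    Assumes an orthographic camera and a regular pixel grid. Both are
    simplifying assumptions for illustration, not the paper's setup.
    """
    h, w = len(depth), len(depth[0])
    normals = [[(0.0, 0.0, 1.0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, falling back to one-sided at the border.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = 0.0
            if x1 > x0:
                dzdx = (depth[y][x1] - depth[y][x0]) / ((x1 - x0) * pixel_size)
            dzdy = 0.0
            if y1 > y0:
                dzdy = (depth[y1][x] - depth[y0][x]) / ((y1 - y0) * pixel_size)
            # The surface z = f(x, y) has the (unnormalized) normal
            # (-df/dx, -df/dy, 1); normalize it to unit length.
            nx, ny, nz = -dzdx, -dzdy, 1.0
            norm = math.sqrt(nx * nx + ny * ny + nz * nz)
            normals[y][x] = (nx / norm, ny / norm, nz / norm)
    return normals


# A flat surface and a ramp whose depth grows by one unit per pixel in x.
flat = [[2.0] * 3 for _ in range(3)]
ramp = [[float(x) for x in range(3)] for _ in range(3)]
flat_normals = depth_to_normals(flat)
ramp_normals = depth_to_normals(ramp)
```

Because the normal depends on depth differences rather than absolute depth, small surface variations that are barely visible in the depth values show up as clear changes in normal direction, which is consistent with the abstract's motivation for conditioning on normal maps. With NumPy available, the same derivatives could be computed with `numpy.gradient`.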
