Abstract
We present EgoRenderer, a system for rendering full-body neural avatars of a person captured by a wearable, egocentric fisheye camera that is mounted on a cap or a VR headset. Our system renders photorealistic novel views of the actor and her motion from arbitrary virtual camera locations. Rendering full-body avatars from such egocentric images comes with unique challenges due to the top-down view and large distortions. We tackle these challenges by decomposing the rendering process into several steps, including texture synthesis, pose construction, and neural image translation. For texture synthesis, we propose Ego-DPNet, a neural network that infers dense correspondences between the input fisheye images and an underlying parametric body model and extracts textures from the egocentric inputs. In addition, to encode dynamic appearances, our approach learns an implicit texture stack that captures detailed appearance variation across poses and viewpoints. For correct pose generation, we first estimate body pose from the egocentric view using a parametric model. We then synthesize an external free-viewpoint pose image by projecting the parametric model to the user-specified target viewpoint. Next, we combine the target pose image and the textures into a feature image, which is transformed into the output color image using a neural image translation network. Experimental evaluations show that EgoRenderer is capable of generating realistic free-viewpoint avatars of a person wearing an egocentric camera. Comparisons to several baselines demonstrate the advantages of our approach.
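
To make the texture and feature-image steps concrete, the following is a minimal NumPy sketch, not the EgoRenderer implementation. It assumes the dense correspondences are already available as a per-pixel UV map (in the paper this is the role of Ego-DPNet) and that the target pose image likewise provides a UV coordinate per pixel. The function names `extract_texture` and `sample_texture`, the single square texture layout, and the convention that u indexes columns and v indexes rows are all hypothetical choices made for illustration.

```python
import numpy as np


def extract_texture(image, uv, mask, tex_size=4):
    """Scatter foreground pixel colors into a UV texture map.

    image: (H, W, 3) float array of colors from the egocentric view.
    uv:    (H, W, 2) float array of (u, v) body-surface coordinates in [0, 1].
    mask:  (H, W) bool array, True where a pixel lies on the body.
    Returns the averaged texture and a bool map of texels that received data.
    """
    texture = np.zeros((tex_size, tex_size, 3), dtype=np.float64)
    counts = np.zeros((tex_size, tex_size), dtype=np.float64)
    # Convert continuous UV coordinates to integer texel indices.
    texel = np.clip((uv[mask] * tex_size).astype(int), 0, tex_size - 1)
    rows, cols = texel[:, 1], texel[:, 0]  # assumed: v -> row, u -> column
    # np.add.at accumulates correctly when several pixels hit the same texel.
    np.add.at(texture, (rows, cols), image[mask])
    np.add.at(counts, (rows, cols), 1.0)
    filled = counts > 0
    texture[filled] /= counts[filled][:, None]  # average the contributions
    return texture, filled


def sample_texture(texture, target_uv, target_mask):
    """Look up texture colors for every body pixel of a target-view UV image."""
    tex_size = texture.shape[0]
    out = np.zeros(target_uv.shape[:2] + (3,), dtype=np.float64)
    texel = np.clip((target_uv[target_mask] * tex_size).astype(int),
                    0, tex_size - 1)
    out[target_mask] = texture[texel[:, 1], texel[:, 0]]
    return out


# Toy input: a 2x2 "egocentric" image with three body pixels.
image = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                  [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]])
uv = np.array([[[0.1, 0.1], [0.1, 0.1]],
               [[0.9, 0.6], [0.0, 0.0]]])
mask = np.array([[True, True], [True, False]])
texture, filled = extract_texture(image, uv, mask)

# Toy target view: one row of two body pixels with known UV coordinates.
target_uv = np.array([[[0.1, 0.1], [0.9, 0.6]]])
target_mask = np.array([[True, True]])
feature_image = sample_texture(texture, target_uv, target_mask)
```

In this sketch the two input pixels that map to the same texel are averaged, and texels that receive no observation stay empty. The sampled `feature_image` stands in for the input that would be passed to an image translation network; the learned implicit texture stack and the network itself are outside the scope of the example.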