Abstract
Creating relightable and animatable human avatars from monocular videos is a rising research topic with a range of applications, e.g. virtual reality, sports, and video games. Previous works utilize neural fields together with physically based rendering (PBR) to estimate geometry and disentangle appearance properties of human avatars. However, one drawback of these methods is the slow rendering speed due to expensive Monte Carlo ray tracing. To tackle this problem, we propose to distill the knowledge from implicit neural fields (teacher) into an explicit 2D Gaussian splatting (student) representation to take advantage of the fast rasterization property of Gaussian splatting. To avoid ray tracing, we employ the split-sum approximation for PBR appearance. We also propose novel part-wise ambient occlusion probes for shadow computation. Shadow prediction is achieved by querying these probes only once per pixel, which paves the way for real-time relighting of avatars. Combined, these techniques give high-quality relighting results with realistic shadow effects. Our experiments demonstrate that the proposed student model achieves relighting results comparable to or even better than those of our teacher model while being 370 times faster at inference time, achieving a rendering speed of 67 FPS.