Abstract
We present a novel point-based, differentiable neural rendering pipeline for scene refinement and novel view synthesis. The inputs are an initial estimate of the point cloud and the camera parameters. The outputs are synthesized images from arbitrary camera poses. The point cloud rendering is performed by a differentiable renderer using multi-resolution one-pixel point rasterization. Spatial gradients of the discrete rasterization are approximated by the novel concept of ghost geometry. After rendering, the neural image pyramid is passed through a deep neural network for shading calculations and hole-filling. A differentiable, physically-based tonemapper then converts the intermediate output to the target image. Since all stages of the pipeline are differentiable, we optimize all of the scene's parameters, i.e., camera model, camera pose, point position, point color, environment map, rendering network weights, vignetting, camera response function, per-image exposure, and per-image white balance. We show that our system is able to synthesize sharper and more consistent novel views than existing approaches because the initial reconstruction is refined during training. The efficient one-pixel point rasterization allows us to use arbitrary camera models and display scenes with well over 100M points in real time.
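
To make the idea of multi-resolution one-pixel point rasterization concrete, the following is a minimal NumPy sketch of the forward pass only. It is not the implementation described here: it is not differentiable, it omits the ghost-geometry gradient approximation, and it assumes a simple pinhole camera. The function names (`rasterize_one_pixel`, `render_pyramid`), the per-point `features` array, and the rule of scaling the intrinsics by 0.5 per pyramid level are all illustrative assumptions.

```python
import numpy as np

def rasterize_one_pixel(points_cam, features, fx, fy, cx, cy, width, height):
    """Write each camera-space point to exactly one pixel and keep the
    nearest point per pixel. Assumes a pinhole model (hypothetical choice)."""
    points_cam = np.asarray(points_cam, dtype=float)
    features = np.asarray(features, dtype=float)
    image = np.zeros((height, width, features.shape[1]))
    depth = np.full((height, width), np.inf)

    # Discard points behind (or on) the camera plane.
    keep = points_cam[:, 2] > 1e-6
    pts, feats = points_cam[keep], features[keep]
    z = pts[:, 2]

    # Pinhole projection, rounded to the nearest pixel (one pixel per point).
    u = np.rint(fx * pts[:, 0] / z + cx).astype(int)
    v = np.rint(fy * pts[:, 1] / z + cy).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z, feats = u[inside], v[inside], z[inside], feats[inside]

    # Depth test: sort by pixel index, then by depth, and keep the first
    # (nearest) entry of every pixel group.
    pix = v * width + u
    order = np.lexsort((z, pix))  # last key (pix) is the primary sort key
    pix_sorted = pix[order]
    first = np.ones(pix_sorted.shape[0], dtype=bool)
    first[1:] = pix_sorted[1:] != pix_sorted[:-1]
    win = order[first]

    image[v[win], u[win]] = feats[win]
    depth[v[win], u[win]] = z[win]
    return image, depth

def render_pyramid(points_cam, features, fx, fy, cx, cy, width, height, levels=3):
    """Rasterize the same points at successively halved resolutions.
    Scaling the intrinsics by 0.5 per level is an assumed convention."""
    pyramid = []
    for level in range(levels):
        s = 0.5 ** level
        img, _ = rasterize_one_pixel(points_cam, features,
                                     fx * s, fy * s, cx * s, cy * s,
                                     width >> level, height >> level)
        pyramid.append(img)
    return pyramid

# Usage: two points fall on the same pixel; the nearer one (z = 1) wins.
points = [[0.0, 0.0, 2.0], [0.0, 0.0, 1.0], [0.5, 0.0, 1.0]]
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image, depth = rasterize_one_pixel(points, feats, 2.0, 2.0, 2.0, 2.0, 4, 4)
pyramid = render_pyramid(points, feats, 2.0, 2.0, 2.0, 2.0, 4, 4, levels=2)
```

In this sketch, pixels that no point reaches stay empty at fine resolution but tend to be covered at coarser levels, which is one intuition for why the pyramid is handed to a network for hole-filling.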