NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

Abstract

We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location $(x,y,z)$ and viewing direction $(\theta, \phi)$) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays and use classic volume rendering techniques to project the output colors and densities into an image. Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses. We describe how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrate results that outperform prior work on neural rendering and view synthesis. View synthesis results are best viewed as videos, so we urge readers to view our supplementary video for convincing comparisons.
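
To make the rendering step concrete, the following is a minimal sketch of how colors and densities sampled along one camera ray might be composited into a single pixel, using the standard alpha-compositing quadrature from classic volume rendering. It is an illustration under stated assumptions, not the implementation described in the paper. The names `toy_field`, `render_ray`, and `direction_from_angles` are hypothetical, `toy_field` is a hand-written stand-in for the trained network, and the angle convention ($\theta$ measured from the $+z$ axis, $\phi$ as azimuth) and the uniform midpoint sampling are choices made here for simplicity.

```python
import math

def direction_from_angles(theta, phi):
    """Convert viewing angles to a unit direction (assumed convention:
    theta is the polar angle from +z, phi the azimuth in the x-y plane)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def toy_field(x, y, z, theta, phi):
    """Hypothetical stand-in for the trained network: maps a 5D coordinate
    to (rgb, density). Here: a unit sphere of constant density whose color
    depends only on the viewing direction."""
    sigma = 5.0 if x * x + y * y + z * z <= 1.0 else 0.0
    dx, dy, dz = direction_from_angles(theta, phi)
    rgb = (0.5 + 0.5 * dx, 0.5 + 0.5 * dy, 0.5 + 0.5 * dz)
    return rgb, sigma

def render_ray(field, origin, theta, phi, t_near, t_far, n_samples=64):
    """Composite samples along one ray; returns (rgb, accumulated opacity)."""
    d = direction_from_angles(theta, phi)
    delta = (t_far - t_near) / n_samples      # uniform step length
    transmittance = 1.0                       # fraction of light not yet absorbed
    color = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = t_near + (i + 0.5) * delta        # midpoint of the i-th interval
        px, py, pz = (o + t * di for o, di in zip(origin, d))
        rgb, sigma = field(px, py, pz, theta, phi)
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
    return tuple(color), 1.0 - transmittance

# A ray looking down +z through the sphere, and one that misses it.
hit_color, hit_opacity = render_ray(toy_field, (0.0, 0.0, -3.0), 0.0, 0.0, 0.0, 6.0)
miss_color, miss_opacity = render_ray(toy_field, (5.0, 5.0, -3.0), 0.0, 0.0, 0.0, 6.0)
```

Every operation in the loop is a differentiable function of the field's outputs, which is the property the abstract relies on: a pixel-wise error against the input images can be propagated back to the parameters of whatever function replaces `toy_field`.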
