Abstract
We consider the problem of novel view synthesis (NVS) for dynamic scenes. Recent neural approaches have achieved exceptional NVS results for static 3D scenes, but extensions to 4D time-varying scenes remain non-trivial. Prior efforts often encode dynamics by learning a canonical space plus implicit or explicit deformation fields, which struggle in challenging scenarios such as sudden movements or high-fidelity rendering. In this paper, we introduce 4D Gaussian Splatting (4DGS), a novel method that represents dynamic scenes with anisotropic 4D XYZT Gaussians, inspired by the success of 3D Gaussian Splatting in static scenes. We model dynamics at each timestamp by temporally slicing the 4D Gaussians, which naturally yields dynamic 3D Gaussians that can be seamlessly projected into images. As an explicit spatial-temporal representation, 4DGS demonstrates powerful capabilities for modeling complicated dynamics and fine details, especially for scenes with abrupt motions. We further implement our temporal slicing and splatting techniques in a highly optimized CUDA acceleration framework, achieving real-time inference rendering speeds of up to 277 FPS on an RTX 3090 GPU and 583 FPS on an RTX 4090 GPU. Rigorous evaluations on scenes with diverse motions showcase the superior efficiency and effectiveness of 4DGS, which consistently outperforms existing methods both quantitatively and qualitatively.
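
To make the idea of temporal slicing concrete, the following is a minimal sketch under a stated assumption: that slicing a 4D XYZT Gaussian at a timestamp t can be read as the standard conditioning of a multivariate Gaussian on its time coordinate. The function name slice_4d_gaussian, the plain mean/covariance parameterization, and the example values are hypothetical illustrations; they are not taken from the 4DGS implementation, which may parameterize and slice its Gaussians differently.

```python
import math


def slice_4d_gaussian(mean, cov, t):
    """Condition a 4D (x, y, z, t) Gaussian on a fixed time t.

    Assumption: "temporal slicing" is modeled here as ordinary Gaussian
    conditioning; this is an illustrative sketch, not the paper's code.

    mean: list of 4 floats (x, y, z, t)
    cov:  4x4 symmetric positive-definite matrix as nested lists
    Returns (mean3, cov3, weight): the 3D mean, the 3D covariance, and a
    temporal weight in (0, 1] that decays away from the Gaussian's time center.
    """
    mu_t = mean[3]
    var_t = cov[3][3]
    # Covariance between each spatial axis and time.
    cross = [cov[i][3] for i in range(3)]
    dt = t - mu_t

    # Conditional mean: the spatial center shifts linearly with dt.
    mean3 = [mean[i] + cross[i] * dt / var_t for i in range(3)]
    # Conditional covariance: Schur complement of the time variance.
    cov3 = [
        [cov[i][j] - cross[i] * cross[j] / var_t for j in range(3)]
        for i in range(3)
    ]
    # Marginal temporal density (unnormalized), usable as an opacity scale.
    weight = math.exp(-0.5 * dt * dt / var_t)
    return mean3, cov3, weight


# Hypothetical example: a Gaussian whose x position is correlated with time.
mean = [0.0, 0.0, 0.0, 0.5]
cov = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.5, 0.0, 0.0, 1.0],
]
mean3, cov3, weight = slice_4d_gaussian(mean, cov, t=1.0)
```

Under this reading, the space-time correlation terms act as a per-Gaussian linear motion: the sliced 3D center drifts with t, while the temporal weight fades the Gaussian out away from its time center, so each timestamp yields an ordinary set of 3D Gaussians ready for splatting.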