Abstract
Recently, photo-realistic novel view synthesis methods based on multi-view images, such as neural radiance fields (NeRF) and 3D Gaussian Splatting (3DGS), have garnered widespread attention due to their superior performance. However, most works rely on low dynamic range (LDR) images, which limits the capture of richer scene details. Prior works on high dynamic range (HDR) scene reconstruction typically require capturing sharp multi-view images with different exposure times, with the camera held at a fixed position during each exposure, which is time-consuming and challenging in practice. For more flexible data acquisition, we propose a one-stage method, \textbf{CasualHDRSplat}, to easily and robustly reconstruct a 3D HDR scene from casually captured videos with auto-exposure enabled, even in the presence of severe motion blur and varying, unknown exposure times. \textbf{CasualHDRSplat} contains a unified differentiable physical imaging model which first applies a continuous-time trajectory constraint to the imaging process, so that we can jointly optimize the exposure times, camera response function (CRF), camera poses, and the sharp 3D HDR scene. Extensive experiments demonstrate that our approach outperforms existing methods in terms of robustness and rendering quality. Our source code will be available at https://github.com/WU-CVGL/CasualHDRSplat
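To make the idea of a physical imaging model concrete, the following is a minimal sketch of how a blurry LDR pixel could be formed from an HDR scene: radiance is averaged over virtual camera positions sampled along the trajectory during the exposure, scaled by the exposure time, and mapped through a CRF. It is an illustration under stated assumptions, not the CasualHDRSplat implementation. The names `toy_radiance`, `interpolate_position`, `gamma_crf`, and `simulate_ldr_pixel` are hypothetical, the scene is a toy 1D ramp rather than a 3DGS render, the trajectory is linear rather than a learned continuous-time curve, and the CRF is a fixed gamma curve rather than a learned function.

```python
def toy_radiance(position: float) -> float:
    """Hypothetical stand-in for an HDR renderer: radiance seen from a 1D camera position."""
    return 2.0 + position  # simple linear ramp, unbounded like HDR radiance


def interpolate_position(start: float, end: float, u: float) -> float:
    """Camera position at normalized time u in [0, 1] (assumed linear trajectory)."""
    return (1.0 - u) * start + u * end


def gamma_crf(exposure: float, gamma: float = 2.2) -> float:
    """Assumed CRF: clip the sensor exposure to [0, 1], then apply a gamma curve."""
    clipped = min(max(exposure, 0.0), 1.0)
    return clipped ** (1.0 / gamma)


def simulate_ldr_pixel(start: float, end: float, exposure_time: float,
                       num_samples: int = 9) -> float:
    """Form one blurry LDR pixel value from the toy HDR scene."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    # Motion blur: average the radiance over positions visited during the exposure.
    samples = [
        toy_radiance(interpolate_position(start, end, i / (num_samples - 1)))
        for i in range(num_samples)
    ]
    mean_radiance = sum(samples) / num_samples
    # Exposure scaling followed by the camera response function.
    return gamma_crf(exposure_time * mean_radiance)


short_exposure = simulate_ldr_pixel(0.0, 1.0, exposure_time=0.1)
long_exposure = simulate_ldr_pixel(0.0, 1.0, exposure_time=1.0)
```

In this toy setup the long exposure saturates while the short one does not, which is the kind of ambiguity that a joint optimization over exposure time, CRF, poses, and scene would have to resolve. Every step is a simple arithmetic operation, so a version written in an automatic differentiation framework could pass gradients back to all of these quantities.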