Abstract
This work presents Prior Depth Anything, a framework that combines incomplete but precise metric information in depth measurement with relative but complete geometric structures in depth prediction, generating accurate, dense, and detailed metric depth maps for any scene. To this end, we design a coarse-to-fine pipeline to progressively integrate the two complementary depth sources. First, we introduce pixel-level metric alignment and distance-aware weighting to pre-fill diverse metric priors by explicitly using depth prediction. It effectively narrows the domain gap between prior patterns, enhancing generalization across varying scenarios. Second, we develop a conditioned monocular depth estimation (MDE) model to refine the inherent noise of depth priors. By conditioning on the normalized pre-filled prior and prediction, the model further implicitly merges the two complementary depth sources. Our model showcases impressive zero-shot generalization across depth completion, super-resolution, and inpainting over 7 real-world datasets, matching or even surpassing previous task-specific methods. More importantly, it performs well on challenging, unseen mixed priors and enables test-time improvements by switching prediction models, providing a flexible accuracy-efficiency trade-off while evolving with advancements in MDE models.
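To make the coarse pre-fill stage concrete, the following is a minimal sketch of one plausible reading of pixel-level metric alignment with distance-aware weighting. The abstract does not specify the exact formulation, so everything here is an assumption for illustration: the function names `fit_scale_shift` and `prefill_prior`, the use of the `k` nearest measured pixels, the inverse-distance weights, and the per-pixel scale-and-shift fit. It is written in plain Python for readability rather than speed.

```python
import math


def fit_scale_shift(rel, metric, weights):
    """Weighted least squares for metric ~= scale * rel + shift (closed form)."""
    sw = sum(weights)
    mean_rel = sum(w * r for w, r in zip(weights, rel)) / sw
    mean_met = sum(w * m for w, m in zip(weights, metric)) / sw
    var = sum(w * (r - mean_rel) ** 2 for w, r in zip(weights, rel))
    if var < 1e-12:
        # Neighbours share one relative depth, so the scale is unobservable.
        # Assumed fallback: keep scale 1 and fit only the shift.
        return 1.0, mean_met - mean_rel
    cov = sum(w * (r - mean_rel) * (m - mean_met)
              for w, r, m in zip(weights, rel, metric))
    scale = cov / var
    return scale, mean_met - scale * mean_rel


def prefill_prior(pred, prior, k=5, eps=1e-6):
    """Fill missing metric depth from a dense relative prediction.

    pred  : list of rows, dense relative depth prediction
    prior : list of rows, metric depth with None where no measurement exists
    k     : number of nearest measured pixels used per missing pixel (assumed)
    """
    known = [(y, x) for y, row in enumerate(prior)
             for x, v in enumerate(row) if v is not None]
    if not known:
        raise ValueError("need at least one measured pixel")
    filled = [list(row) for row in prior]
    for y, row in enumerate(prior):
        for x, v in enumerate(row):
            if v is not None:
                continue  # measured pixels are kept as they are
            # Brute-force nearest measured pixels in image space.
            nearest = sorted(
                known, key=lambda p: math.hypot(p[0] - y, p[1] - x))[:k]
            # Distance-aware weighting: closer measurements count more.
            weights = [1.0 / (math.hypot(py - y, px - x) + eps)
                       for py, px in nearest]
            rel = [pred[py][px] for py, px in nearest]
            metric = [prior[py][px] for py, px in nearest]
            scale, shift = fit_scale_shift(rel, metric, weights)
            # Pixel-level alignment: map this pixel's prediction to metric units.
            filled[y][x] = scale * pred[y][x] + shift
    return filled
```

Under these assumptions, a prior that is an exact affine function of the prediction is recovered everywhere, whereas on real data the pre-filled map would remain noisy, which is the role the abstract assigns to the conditioned MDE model in the second stage.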