Relative Pose Estimation through Affine Corrections of Monocular Depth Priors

  • 2025-03-24 18:14:43
  • Yifan Yu, Shaohui Liu, Rémi Pautrat, Marc Pollefeys, Viktor Larsson
Monocular depth estimation (MDE) models have undergone significant advancements over recent years. Many MDE models aim to predict affine-invariant relative depth from monocular images, while recent developments in large-scale training and vision foundation models enable reasonable estimation of metric (absolute) depth. However, effectively leveraging these predictions for geometric vision tasks, in particular relative pose estimation, remains relatively underexplored. While depths provide rich constraints for cross-view image alignment, the intrinsic noise and ambiguity of the monocular depth priors present practical challenges to improving upon classic keypoint-based solutions. In this paper, we develop three solvers for relative pose estimation that explicitly account for independent affine (scale and shift) ambiguities, covering both calibrated and uncalibrated conditions. We further propose a hybrid estimation pipeline that combines our proposed solvers with classic point-based solvers and epipolar constraints. We find that the affine correction modeling is beneficial not only to the relative depth priors but also, surprisingly, to the "metric" ones. Results across multiple datasets demonstrate large improvements of our approach over classic keypoint-based baselines and PnP-based solutions, under both calibrated and uncalibrated setups. We also show that our method improves consistently with different feature matchers and MDE models, and can further benefit from very recent advances on both modules. Code is available at
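To make the "affine (scale and shift) ambiguity" concrete, the following is a minimal sketch, not the paper's solvers. It assumes a hypothetical setting in which reference depths are available for a few pixels, fits the scale and shift that map a depth prior onto them by least squares, and lifts a pixel to 3D with the corrected depth. The function names `fit_affine` and `backproject`, the pinhole intrinsics, and all numbers are illustrative assumptions.

```python
def fit_affine(pred, ref):
    """Least-squares scale s and shift b such that s * pred + b approximates ref.

    Illustrative only: assumes reference depths exist, which is not the
    setting of the solvers described in the abstract.
    """
    n = len(pred)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    s = cov / var_p
    b = mean_r - s * mean_p
    return s, b


def backproject(u, v, depth, f, cx, cy):
    """Lift pixel (u, v) with the given depth to a 3D point (pinhole camera)."""
    return ((u - cx) * depth / f, (v - cy) * depth / f, depth)


# Hypothetical data: the prior is the true depth up to an unknown scale and shift.
true_depth = [2.0, 3.0, 5.0, 8.0]
prior = [0.75, 1.25, 2.25, 3.75]  # constructed so that true = 2.0 * prior + 0.5

s, b = fit_affine(prior, true_depth)
corrected = [s * d + b for d in prior]

# A pixel at the principal point lies on the optical axis after lifting.
point = backproject(320.0, 240.0, corrected[0], f=500.0, cx=320.0, cy=240.0)
print(s, b, point)
```

In the relative pose setting no reference depth is available, so a scale and shift of this kind would have to be treated as unknowns alongside the pose, with each image carrying its own independent correction; the sketch only illustrates what such a correction does to a depth prior.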


