RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild

  • 2022-08-11 18:59:59
  • Jason Y. Zhang, Deva Ramanan, Shubham Tulsiani

Abstract

We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object. This task is a core component of classic geometric pipelines such as SfM and SLAM, and also serves as a vital pre-processing requirement for contemporary neural approaches (e.g. NeRF) to object reconstruction and view synthesis. In contrast to existing correspondence-driven methods that do not perform well given sparse views, we propose a top-down prediction based approach for estimating camera viewpoints. Our key technical insight is the use of an energy-based formulation for representing distributions over relative camera rotations, thus allowing us to explicitly represent multiple camera modes arising from object symmetries or views. Leveraging these relative predictions, we jointly estimate a consistent set of camera rotations from multiple images. We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories. Further, our probabilistic approach significantly outperforms directly regressing relative poses, suggesting that modeling multimodality is important for coherent joint reconstruction. We demonstrate that our system can be a stepping stone toward in-the-wild reconstruction from multi-view datasets. The project page with code and videos can be found at https://jasonyzhang.com/relpose.
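
To make the energy-based idea concrete, here is a minimal sketch of how scores over candidate rotations can be normalized into a distribution that keeps more than one mode. It is not the authors' implementation: `toy_score` is a hypothetical stand-in for a learned network, the candidates are restricted to rotations about the z-axis, and the `scale` parameter and the two symmetric modes are assumptions chosen for illustration.

```python
import numpy as np

def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z-axis."""
    t = np.radians(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def toy_score(R, modes):
    """Hypothetical stand-in for a learned score: high when R is near any mode."""
    return max(np.trace(M.T @ R) for M in modes)

def rotation_distribution(candidates, modes, scale=5.0):
    """Softmax over scores, i.e. p(R) proportional to exp(scale * score(R))."""
    scores = np.array([scale * toy_score(R, modes) for R in candidates])
    scores -= scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

# Candidate relative rotations: a coarse grid of angles about the z-axis.
angles = np.arange(0, 360, 10)
candidates = [rot_z(a) for a in angles]

# Assumed ambiguity: an object with 180-degree symmetry gives two plausible answers.
modes = [rot_z(30), rot_z(210)]

probs = rotation_distribution(candidates, modes)
top_two = sorted(int(a) for a in angles[np.argsort(probs)[-2:]])
print(top_two)
```

In this toy setup the two most probable candidates are the two symmetric modes. A direct regressor trained on the same ambiguity would have to commit to a single rotation instead.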

 
