SpatialTrackerV2: 3D Point Tracking Made Easy

  • 2025-07-16 17:59:03
  • Yuxi Xiao, Jianyuan Wang, Nan Xue, Nikita Karaev, Yuri Makarov, Bingyi Kang, Xing Zhu, Hujun Bao, Yujun Shen, Xiaowei Zhou

Abstract

We present SpatialTrackerV2, a feed-forward 3D point tracking method for monocular videos. Going beyond modular pipelines built on off-the-shelf components for 3D tracking, our approach unifies the intrinsic connections between point tracking, monocular depth, and camera pose estimation into a high-performing and feed-forward 3D point tracker. It decomposes world-space 3D motion into scene geometry, camera ego-motion, and pixel-wise object motion, with a fully differentiable and end-to-end architecture, allowing scalable training across a wide range of datasets, including synthetic sequences, posed RGB-D videos, and unlabeled in-the-wild footage. By learning geometry and motion jointly from such heterogeneous data, SpatialTrackerV2 outperforms existing 3D tracking methods by 30%, and matches the accuracy of leading dynamic 3D reconstruction approaches while running 50$\times$ faster.
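
To make the decomposition concrete, the following is a minimal sketch of how a 2D track, a per-frame depth value, and a camera pose combine into a world-space 3D trajectory. It uses standard pinhole-camera geometry only and is not the paper's network or code. The function names (`unproject`, `camera_to_world`, `lift_track`), the camera-to-world pose convention, and the sample intrinsics are all assumptions made for illustration.

```python
# Illustrative only: standard pinhole geometry, not the SpatialTrackerV2 model.
# All names and conventions here are hypothetical.

def unproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def camera_to_world(p_cam, rotation, translation):
    """Apply an assumed camera-to-world pose: p_world = R @ p_cam + t."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def lift_track(track_2d, depths, poses, intrinsics):
    """Turn a 2D track into a world-space trajectory, one point per frame.

    track_2d:   list of (u, v) pixel positions
    depths:     list of depth values at those pixels (scene geometry)
    poses:      list of (R, t) camera-to-world poses (camera ego-motion)
    intrinsics: (fx, fy, cx, cy)
    """
    fx, fy, cx, cy = intrinsics
    trajectory = []
    for (u, v), d, (rot, trans) in zip(track_2d, depths, poses):
        p_cam = unproject(u, v, d, fx, fy, cx, cy)
        trajectory.append(camera_to_world(p_cam, rot, trans))
    return trajectory


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A static point seen by a camera that moves 1 unit along +x between frames:
# the pixel shifts, but the lifted world-space point stays put.
trajectory = lift_track(
    track_2d=[(420.0, 240.0), (170.0, 240.0)],
    depths=[2.0, 2.0],
    poses=[(IDENTITY, (0.0, 0.0, 0.0)), (IDENTITY, (1.0, 0.0, 0.0))],
    intrinsics=(500.0, 500.0, 320.0, 240.0),
)
```

In this toy case the two world-space points coincide, so all of the apparent 2D motion is explained by camera ego-motion. Any residual displacement between consecutive world-space points would correspond to object motion.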

 
