Abstract
We propose Deep Patch Visual Odometry (DPVO), a new deep learning system for monocular Visual Odometry (VO). DPVO uses a novel recurrent network architecture designed for tracking image patches across time. Recent approaches to VO have significantly improved the state-of-the-art accuracy by using deep networks to predict dense flow between video frames. However, using dense flow incurs a large computational cost, making these previous methods impractical for many use cases. Despite this, it has been assumed that dense flow is important as it provides additional redundancy against incorrect matches. DPVO disproves this assumption, showing that it is possible to get the best accuracy and efficiency by exploiting the advantages of sparse patch-based matching over dense flow. DPVO introduces a novel recurrent update operator for patch-based correspondence coupled with differentiable bundle adjustment. On standard benchmarks, DPVO outperforms all prior work, including the learning-based state-of-the-art VO system (DROID), using a third of the memory while running 3x faster on average. Code is available at https://github.com/princeton-vl/DPVO