Abstract
In this paper, we propose ProTracker, a novel framework for robust and accurate long-term dense tracking of arbitrary points in videos. The key idea of our method is to use probabilistic integration to refine multiple predictions from both optical flow and semantic features, enabling robust short-term and long-term tracking. Specifically, we integrate optical flow estimations in a probabilistic manner, producing smooth and accurate trajectories by maximizing the likelihood of each prediction. To effectively re-localize challenging points that disappear and reappear due to occlusion, we further incorporate long-term feature correspondence into our flow predictions for continuous trajectory generation. Extensive experiments show that ProTracker achieves state-of-the-art performance among unsupervised and self-supervised approaches, and even outperforms supervised methods on several benchmarks. Our code and model will be made publicly available upon publication.
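
To give a concrete feel for what probabilistic integration of several predictions can look like, the following is a minimal sketch and not the ProTracker implementation. It assumes each prediction of a point's position is an independent, isotropic 2D Gaussian with a known standard deviation; under that assumption the maximum-likelihood fused position is the inverse-variance weighted mean. The function name `fuse_predictions` and its parameters are hypothetical and chosen only for illustration.

```python
from math import sqrt
from typing import List, Tuple

Point = Tuple[float, float]


def fuse_predictions(points: List[Point], sigmas: List[float]) -> Tuple[Point, float]:
    """Fuse 2D position predictions by inverse-variance weighting.

    Assumption (not taken from the paper): each prediction is an independent
    isotropic Gaussian N(point, sigma^2 * I). The maximum-likelihood estimate
    of the shared position is then the precision-weighted mean.
    """
    if not points or len(points) != len(sigmas):
        raise ValueError("points and sigmas must be non-empty and of equal length")
    if any(s <= 0 for s in sigmas):
        raise ValueError("every sigma must be positive")

    # Precision (inverse variance) of each prediction acts as its weight.
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)

    # Precision-weighted mean of the predicted coordinates.
    x = sum(w * p[0] for w, p in zip(weights, points)) / total
    y = sum(w * p[1] for w, p in zip(weights, points)) / total

    # The fused estimate has precision equal to the sum of the precisions.
    fused_sigma = sqrt(1.0 / total)
    return (x, y), fused_sigma


if __name__ == "__main__":
    # Two hypothetical predictions for the same point; the second is less certain.
    position, sigma = fuse_predictions([(0.0, 0.0), (2.0, 2.0)], [1.0, 3.0])
    print(position, sigma)
```

In this sketch a confident prediction (small sigma) dominates the fused position, while an uncertain one contributes little, and the fused uncertainty is never larger than that of the most confident input.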