Abstract
Direct methods have shown excellent performance in visual odometry and SLAM. In this work we propose to leverage their effectiveness for the task of 3D multi-object tracking. To this end, we propose DirectTracker, a framework that effectively combines direct image alignment for short-term tracking and sliding-window photometric bundle adjustment for 3D object detection. Object proposals are estimated based on the sparse sliding-window pointcloud and further refined using an optimization-based cost function that carefully combines 3D and 2D cues to ensure consistency in image and world space. We propose to evaluate 3D tracking using the recently introduced higher-order tracking accuracy (HOTA) metric and the generalized intersection over union similarity measure to mitigate the limitations of the conventional use of intersection over union for the evaluation of vision-based trackers. We perform evaluation on the KITTI Tracking benchmark for the Car class and show competitive performance in tracking objects both in 2D and 3D.
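
To illustrate why generalized intersection over union (GIoU) can be a more informative similarity measure than plain IoU, the following minimal sketch computes GIoU for two axis-aligned boxes in any number of dimensions. It is an illustration under stated assumptions, not the evaluation code used in this work: the function name `giou` and the `(mins, maxs)` box format are hypothetical, boxes are assumed to have positive volume, and the yaw angle carried by KITTI 3D boxes is ignored.

```python
from math import prod


def giou(box_a, box_b):
    """GIoU of two axis-aligned boxes, each given as (mins, maxs).

    Illustrative only: assumes positive-volume boxes and ignores rotation.
    """
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    axes = list(zip(a_min, a_max, b_min, b_max))

    vol_a = prod(ah - al for al, ah, _, _ in axes)
    vol_b = prod(bh - bl for _, _, bl, bh in axes)

    # Per-axis overlap, clamped at zero so disjoint boxes give zero volume.
    inter = prod(max(0.0, min(ah, bh) - max(al, bl)) for al, ah, bl, bh in axes)
    union = vol_a + vol_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    hull = prod(max(ah, bh) - min(al, bl) for al, ah, bl, bh in axes)

    # IoU minus the fraction of the enclosing box not covered by the union.
    return inter / union - (hull - union) / hull


if __name__ == "__main__":
    a = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
    near = ((2.0, 0.0, 0.0), (3.0, 1.0, 1.0))
    far = ((9.0, 0.0, 0.0), (10.0, 1.0, 1.0))
    # IoU is 0 for both pairs; GIoU is lower for the more distant box.
    print(giou(a, near), giou(a, far))
```

For non-overlapping boxes IoU is always zero, whereas GIoU decreases toward -1 as the boxes move apart, so a slightly misplaced 3D estimate is still distinguished from a grossly wrong one.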