Abstract
This paper presents a Multi-Object Tracking (MOT) framework that fuses radar and camera data to enhance tracking efficiency while minimizing manual intervention. Many studies underutilize radar and assign it a supplementary role, despite its ability to provide accurate range/depth information for targets in a 3D world coordinate system; our approach instead gives radar a central role. The framework also uses features common to both sensors to enable online calibration, which autonomously associates radar and camera detections. The main contributions of this work are: (1) a radar-camera fusion MOT framework that exploits online radar-camera calibration to simplify the integration of detection results from the two sensors, (2) the use of common features between radar and camera data to accurately derive the real-world positions of detected objects, and (3) the adoption of feature matching and category-consistency checking to overcome the limitations of position-only matching and improve sensor association accuracy. To the best of our knowledge, we are the first to investigate the integration of radar-camera common features and their use in online calibration for MOT. The efficacy of the framework is demonstrated by its ability to streamline the radar-camera mapping process and improve tracking precision, as evidenced by real-world experiments conducted in both controlled environments and actual traffic scenarios. Code is available at https://github.com/radar-lab/Radar_Camera_MOT
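To illustrate contribution (3), the following is a minimal sketch of how feature matching and category-consistency checking might gate radar-camera association. It is not the implementation in the linked repository. The detection layout (dicts with `category` and `feature` keys), the cosine-similarity measure, the `min_similarity` threshold, and the greedy one-to-one matching are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def associate(radar_dets, camera_dets, min_similarity=0.5):
    """Greedy one-to-one association of radar and camera detections.

    Each detection is a dict with hypothetical keys:
      'category': class label (e.g. 'car', 'person')
      'feature' : common-feature vector shared by both sensors
    Returns a sorted list of (radar_index, camera_index) pairs.
    """
    candidates = []
    for i, r in enumerate(radar_dets):
        for j, c in enumerate(camera_dets):
            # Category-consistency check: never pair different classes.
            if r["category"] != c["category"]:
                continue
            # Feature matching: keep only sufficiently similar pairs.
            sim = cosine_similarity(r["feature"], c["feature"])
            if sim >= min_similarity:
                candidates.append((sim, i, j))

    # Take the most similar pairs first, using each detection at most once.
    candidates.sort(reverse=True)
    matches, used_radar, used_camera = [], set(), set()
    for sim, i, j in candidates:
        if i in used_radar or j in used_camera:
            continue
        matches.append((i, j))
        used_radar.add(i)
        used_camera.add(j)
    return sorted(matches)


radar = [
    {"category": "car", "feature": [1.0, 0.0]},
    {"category": "person", "feature": [0.0, 1.0]},
]
camera = [
    {"category": "person", "feature": [0.1, 0.9]},
    {"category": "car", "feature": [0.9, 0.1]},
    {"category": "car", "feature": [0.0, 1.0]},
]
print(associate(radar, camera))
```

In this toy input, the third camera detection stays unmatched: its category agrees with the radar car, but its feature vector does not. Greedy selection is used here only for brevity. An optimal assignment solver such as `scipy.optimize.linear_sum_assignment` could replace it under the same gating.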