SM3D: Simultaneous Monocular Mapping and 3D Detection

Abstract

Mapping and 3D detection are two major problems in vision-based robotics and self-driving. While previous works address each task separately, we present an innovative and efficient multi-task deep learning framework (SM3D) for Simultaneous Mapping and 3D Detection that, for the first time, bridges the gap between the two tasks with robust depth estimation and a "Pseudo-LiDAR" point cloud. The Mapping module takes consecutive monocular frames and generates depth and pose estimates. In the 3D Detection module, the depth estimate is projected into 3D space to generate a "Pseudo-LiDAR" point cloud, on which a LiDAR-based 3D detector can be applied for vehicular 3D detection and localization. With end-to-end training of both modules, the proposed mapping and 3D detection method outperforms the state-of-the-art baseline by 10.0% and 13.2% in accuracy, respectively. While achieving better accuracy, our monocular multi-task SM3D is more than 2 times faster than a pure stereo 3D detector and 18.3% faster than using the two modules separately.
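
The projection of a depth estimate into a "Pseudo-LiDAR" point cloud can be illustrated with a minimal sketch. It assumes a standard pinhole camera model, which the abstract does not state explicitly. The function name `depth_to_pseudo_lidar`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_depth` cutoff are hypothetical names chosen for illustration, not the authors' implementation. Points are returned in the camera frame (x right, y down, z forward).

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_lidar(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff in metres, not from the paper
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3D points.

    Assumes a pinhole camera: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            # Skip pixels with missing (non-positive) or too-distant depth.
            if z <= 0.0 or z > max_depth:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Toy 2x2 depth map; the zero entry marks an invalid pixel.
    toy_depth = [[10.0, 10.0], [0.0, 20.0]]
    cloud = depth_to_pseudo_lidar(toy_depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
    for point in cloud:
        print(point)
```

In a pipeline of the kind described, a point list like this would be what is handed to a LiDAR-based 3D detector. A real system would typically vectorize this step and convert the points into the detector's expected coordinate frame.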
