Abstract
We investigate pseudo-labeling for semi-supervised monocular 3D object detection (SSM3OD) and identify two primary issues: a misalignment between the prediction quality of 3D and 2D attributes, and the tendency of depth supervision derived from pseudo-labels to be noisy, which leads to significant optimization conflicts with other, more reliable forms of supervision. We introduce a novel decoupled pseudo-labeling (DPL) approach for SSM3OD. Our approach features a Decoupled Pseudo-label Generation (DPG) module, designed to efficiently generate pseudo-labels by processing 2D and 3D attributes separately. This module incorporates a homography-based method for identifying dependable pseudo-labels in BEV space, specifically for 3D attributes. Additionally, we present a Depth Gradient Projection (DGP) module to mitigate the optimization conflicts caused by noisy depth supervision from pseudo-labels, effectively decoupling the depth gradient and removing conflicting gradients. This dual decoupling strategy, applied at both the pseudo-label generation level and the gradient level, significantly improves the utilization of pseudo-labels in SSM3OD. Our comprehensive experiments on the KITTI benchmark demonstrate the superiority of our method over existing approaches.
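To make the idea of "removing conflicting gradients" concrete, the following is a minimal sketch of one common projection rule, in the style of gradient-surgery methods such as PCGrad. The abstract does not state the exact rule used by the DGP module, so this should be read as an illustrative assumption rather than the authors' formulation. The names `project_conflicting`, `g_depth`, and `g_ref` are hypothetical: `g_depth` stands for the gradient of the pseudo-label depth loss, and `g_ref` for the gradient of the supervision treated as reliable.

```python
def dot(u, v):
    """Inner product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def project_conflicting(g_depth, g_ref, eps=1e-12):
    """Hypothetical helper: drop the part of g_depth that opposes g_ref.

    Assumption (not taken from the abstract): two gradients "conflict" when
    their inner product is negative. In that case g_depth is projected onto
    the plane orthogonal to g_ref; otherwise it is returned unchanged.
    """
    d = dot(g_depth, g_ref)
    if d >= 0.0:
        # No conflict: keep the depth gradient as it is.
        return list(g_depth)
    # Conflict: subtract the component of g_depth along g_ref.
    scale = d / (dot(g_ref, g_ref) + eps)
    return [a - scale * b for a, b in zip(g_depth, g_ref)]


# Toy example: the depth gradient points partly against the reference gradient.
g_ref = [1.0, 0.0]
g_depth = [-1.0, 1.0]
g_proj = project_conflicting(g_depth, g_ref)
```

In this toy case the projected gradient keeps only the component of `g_depth` that is orthogonal to `g_ref`, so a depth update no longer pushes against the reference direction. In a real detector the two vectors would be flattened gradients of shared parameters, not two-element lists.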