Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection

Abstract

Semi-supervised object detection (SSOD) based on pseudo-labeling significantly reduces dependence on large labeled datasets by effectively leveraging both labeled and unlabeled data. However, real-world applications of SSOD often face critical challenges, including class imbalance, label noise, and labeling errors. We present an in-depth analysis of SSOD under real-world conditions, uncovering causes of suboptimal pseudo-labeling and key trade-offs between label quality and quantity. Based on our findings, we propose four building blocks that can be seamlessly integrated into an SSOD framework. Rare Class Collage (RCC): a data augmentation method that enhances the representation of rare classes by creating collages of rare objects. Rare Class Focus (RCF): a stratified batch sampling strategy that ensures a more balanced representation of all classes during training. Ground Truth Label Correction (GLC): a label refinement method that identifies and corrects false, missing, and noisy ground truth labels by leveraging the consistency of teacher model predictions. Pseudo-Label Selection (PLS): a selection method for removing low-quality pseudo-labeled images, guided by a novel metric estimating the missing detection rate while accounting for class rarity. We validate our methods through comprehensive experiments on autonomous driving datasets, resulting in up to a 6% increase in SSOD performance. Overall, our investigation and novel, data-centric, and broadly applicable building blocks enable robust and effective SSOD in complex, real-world scenarios. Code is available at https://mos-ks.github.io/publications.
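
To make the idea of class-balanced batch sampling concrete, the following is a minimal sketch of rarity-weighted image sampling. It is a simple stand-in and not the RCF procedure itself, whose details are not given in the abstract. The function names (`image_weights`, `sample_batch`), the weighting rule (inverse frequency of an image's rarest class), and the toy labels are all assumptions made for illustration.

```python
import random
from collections import Counter


def image_weights(image_labels):
    """Weight each image by the inverse frequency of its rarest class.

    Assumption for illustration: class frequency is the number of images
    that contain the class at least once.
    """
    class_counts = Counter(c for labels in image_labels for c in set(labels))
    weights = []
    for labels in image_labels:
        if labels:
            rarest = min(class_counts[c] for c in labels)
            weights.append(1.0 / rarest)
        else:
            # Images without objects get the weight of the most common class.
            weights.append(1.0 / max(class_counts.values(), default=1))
    return weights


def sample_batch(image_labels, batch_size, rng=None):
    """Draw image indices with replacement, favoring images with rare classes."""
    rng = rng or random.Random()
    weights = image_weights(image_labels)
    return rng.choices(range(len(image_labels)), weights=weights, k=batch_size)


# Toy driving-style labels (hypothetical): "car" is common, the others are rare.
image_labels = [["car"], ["car", "pedestrian"], ["car"], ["car", "tram"], ["car"]]
weights = image_weights(image_labels)
batch = sample_batch(image_labels, batch_size=4, rng=random.Random(0))
```

In this toy setup, images containing a rare class receive five times the sampling weight of car-only images, so rare classes appear in batches more often than under uniform sampling.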
