ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection

  • 2025-10-17 16:06:06
  • Haowei Zhu, Tianxiang Pan, Rui Qin, Jun-Hai Yong, Bin Wang

Abstract

The scale and quality of datasets are crucial for training robust perception models. However, obtaining large-scale annotated data is both costly and time-consuming. Generative models have emerged as a powerful tool for data augmentation by synthesizing samples that adhere to desired distributions. However, current generative approaches often rely on complex post-processing or extensive fine-tuning on massive datasets to achieve satisfactory results, and they remain prone to content-position mismatches and semantic leakage. To overcome these limitations, we introduce ReCon, a novel augmentation framework that enhances the capacity of structure-controllable generative models for object detection. ReCon integrates region-guided rectification into the diffusion sampling process, using feedback from a pre-trained perception model to rectify misgenerated regions during sampling. We further propose region-aligned cross-attention to enforce spatial-semantic alignment between image regions and their textual cues, thereby improving both semantic consistency and overall image fidelity. Extensive experiments demonstrate that ReCon substantially improves the quality and trainability of generated data, achieving consistent performance gains across various datasets, backbone architectures, and data scales. Our code is available at https://github.com/haoweiz23/ReCon.

 
