Abstract
This paper addresses the inherent limitations of conventional bottleneck structures (diminished instance discriminability due to overemphasis on batch statistics) and decoupled heads (computational redundancy) in object detection frameworks by proposing two novel modules: the Instance-Specific Bottleneck with full-channel global self-attention (ISB) and the Instance-Specific Asymmetric Decoupled Head (ISADH). The ISB module innovatively reconstructs feature maps to establish an efficient full-channel global attention mechanism through synergistic fusion of batch-statistical and instance-specific features. Complementing this, the ISADH module pioneers an asymmetric decoupled architecture enabling hierarchical multi-dimensional feature integration via dual-stream batch-instance representation fusion. Extensive experiments on the MS-COCO benchmark demonstrate that the coordinated deployment of ISB and ISADH in the YOLO-PRO framework achieves state-of-the-art performance across all computational scales. Specifically, YOLO-PRO surpasses YOLOv8 by 1.0-1.6% AP (N/S/M/L/X scales) and outperforms YOLO11 by 0.1-0.5% AP in critical N/M/L/X groups, while maintaining competitive computational efficiency. This work provides practical insights for developing high-precision detectors deployable on edge devices.