Abstract
Active learning strategies aim to train high-performance models with minimal labeled data by selecting the most informative instances for labeling. However, existing methods for assessing data informativeness often fail to align directly with task model performance metrics, such as mean average precision (mAP) in object detection. This paper introduces Mean-AP Guided Reinforced Active Learning for Object Detection (MGRAL), a novel approach that uses expected model output change as the measure of informativeness for deep detection networks and directly optimizes the sampling strategy using mAP. MGRAL employs a reinforcement learning agent based on an LSTM architecture to efficiently navigate the combinatorial challenge of batch sample selection and the non-differentiable relationship between the selected batches and model performance. The agent optimizes selection using policy gradient with mAP improvement as the reward signal. To address the computational intensity of mAP estimation with unlabeled samples, we implement fast look-up tables, ensuring real-world feasibility. We evaluate MGRAL on the PASCAL VOC and MS COCO benchmarks across various backbone architectures. Our approach demonstrates strong performance, establishing a new paradigm in reinforcement learning-based active learning for object detection.
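
The policy-gradient idea summarized above can be illustrated with a minimal sketch. It is not the MGRAL implementation. It makes several assumptions: the LSTM agent is replaced by one learnable logit per pool sample, the mAP improvement is replaced by a hypothetical `toy_reward` that averages made-up per-sample gains, and the names `sample_batch` and `train_selector` are illustrative only. What it does show is how a reward that is not differentiable with respect to the selection can still drive batch selection through REINFORCE.

```python
import math
import random


def sample_batch(logits, batch_size, rng):
    """Sample `batch_size` distinct indices sequentially from a softmax policy.

    Returns the chosen indices and the gradient of the batch log-probability
    with respect to each logit.
    """
    remaining = list(range(len(logits)))
    chosen = []
    grad = [0.0] * len(logits)
    for _ in range(batch_size):
        # Softmax over the samples that are still available.
        peak = max(logits[i] for i in remaining)
        weights = [math.exp(logits[i] - peak) for i in remaining]
        total = sum(weights)
        probs = [w / total for w in weights]
        pick = rng.choices(remaining, weights=probs, k=1)[0]
        # d log p(pick) / d logit_j = 1[j == pick] - p_j for each remaining j.
        for i, p in zip(remaining, probs):
            grad[i] += (1.0 if i == pick else 0.0) - p
        chosen.append(pick)
        remaining.remove(pick)
    return chosen, grad


def toy_reward(batch, gains):
    """Hypothetical stand-in for the mAP improvement of a selected batch."""
    return sum(gains[i] for i in batch) / len(batch)


def train_selector(gains, batch_size, steps=2000, lr=0.2, seed=0):
    """Run REINFORCE with a moving-average baseline; return learned logits."""
    rng = random.Random(seed)
    logits = [0.0] * len(gains)
    baseline = 0.0
    for _ in range(steps):
        batch, grad = sample_batch(logits, batch_size, rng)
        reward = toy_reward(batch, gains)
        advantage = reward - baseline
        # Policy-gradient ascent: raise the log-probability of batches
        # whose reward beats the running baseline.
        logits = [w + lr * advantage * g for w, g in zip(logits, grad)]
        baseline = 0.9 * baseline + 0.1 * reward
    return logits
```

Calling `train_selector([1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0], batch_size=3)` should shift probability mass toward the three samples with nonzero gain. In this sketch the reward is only ever evaluated, never differentiated, which is the property that lets a metric such as mAP serve as the training signal.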