Abstract
Accurate segmentation of nodules in both 2D breast ultrasound (BUS) and 3D automated breast ultrasound (ABUS) is crucial for clinical diagnosis and treatment planning. Therefore, developing an automated system for nodule segmentation can enhance user independence and expedite clinical analysis. Unlike fully-supervised learning, weakly-supervised segmentation (WSS) can streamline the laborious and intricate annotation process. However, current WSS methods face challenges in achieving precise nodule segmentation, as many of them depend on inaccurate activation maps or inefficient pseudo-mask generation algorithms. In this study, we introduce a novel multi-agent reinforcement learning-based WSS framework called Flip Learning, which relies solely on 2D/3D boxes for accurate segmentation. Specifically, multiple agents are employed to erase the target from the box to facilitate classification tag flipping, with the erased region serving as the predicted segmentation mask. The key contributions of this research are as follows: (1) Adoption of a superpixel/supervoxel-based approach to encode the standardized environment, capturing boundary priors and expediting the learning process. (2) Introduction of three meticulously designed rewards, comprising a classification score reward and two intensity distribution rewards, to steer the agents' erasing process precisely, thereby avoiding both under- and over-segmentation. (3) Implementation of a progressive curriculum learning strategy to enable agents to interact with the environment in a progressively challenging manner, thereby enhancing learning efficiency. Extensively validated on large in-house BUS and ABUS datasets, our Flip Learning method outperforms state-of-the-art WSS methods and foundation models, and achieves performance comparable to that of fully-supervised learning algorithms.
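
The erase-until-flip idea can be illustrated with a toy sketch. This is not the Flip Learning implementation. It replaces the multi-agent reinforcement learning policy with a plain greedy search, uses regular grid cells in place of superpixels, and omits the intensity distribution rewards and the curriculum. The function `nodule_score` is a hypothetical stand-in for a trained classifier, and the names `flip_by_erasing`, `cell`, and `tau` are assumptions introduced only for this illustration.

```python
import numpy as np

def nodule_score(img, dark_thresh=0.3):
    # Hypothetical stand-in for a trained classifier:
    # the "nodule" score is the fraction of dark pixels in the box.
    return float((img < dark_thresh).mean())

def flip_by_erasing(img, cell=4, tau=0.02):
    """Greedily erase grid cells until the toy classifier tag flips.

    Assumption: greedy search replaces the paper's RL agents, and
    square grid cells replace superpixels.
    """
    h, w = img.shape
    work = img.copy()
    mask = np.zeros((h, w), dtype=bool)
    fill = np.median(img)  # erase by painting with a background-like value
    cells = [(r, c) for r in range(0, h, cell) for c in range(0, w, cell)]

    def score_if_erased(rc):
        trial = work.copy()
        trial[rc[0]:rc[0] + cell, rc[1]:rc[1] + cell] = fill
        return nodule_score(trial)

    # The tag counts as "nodule" while the score exceeds tau.
    while nodule_score(work) > tau and cells:
        # Pick the cell whose erasure lowers the score the most.
        r, c = min(cells, key=score_if_erased)
        work[r:r + cell, c:c + cell] = fill
        mask[r:r + cell, c:c + cell] = True
        cells.remove((r, c))
    return mask  # the erased region is the predicted segmentation

# Toy box: bright background with a dark 8x8 "nodule" in the centre.
box = np.full((16, 16), 0.8)
box[4:12, 4:12] = 0.1
mask = flip_by_erasing(box)
```

In this toy case the erased region coincides with the dark block because the grid happens to align with its boundary. With a learned classifier alone, erasing could stop too early or continue too far, which is the under- and over-segmentation that the abstract's additional rewards are meant to address.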