Abstract
Convolutional neural networks (CNNs) for medical imaging are constrained by the amount of annotated data required in the training stage. Manual annotation is usually considered the "gold standard". However, medical imaging datasets that include expert manual segmentation are scarce because this step is time-consuming and therefore expensive. Moreover, data-driven approaches most often rely on single-rater manual annotation, which makes the network optimal with respect to that single expert only. In this work, we propose a CNN for brain extraction, also known as skull stripping (SS), in magnetic resonance (MR) imaging that is trained entirely with what we refer to as "silver standard" masks. Our method consists of 1) developing a dataset with silver standard masks as input, and implementing both 2) a tri-planar method using parallel 2D U-Net-based CNNs (referred to as CONSNet) and 3) an auto-context implementation of CONSNet. The term CONSNet refers to our integrated approach, i.e., training with silver standard masks and using a 2D U-Net-based architecture. Our results show that CONSNet outperformed (i.e., achieved larger Dice coefficients than) the current state-of-the-art SS methods. Our use of silver standard masks reduced the cost of manual annotation, decreased inter- and intra-rater variability, and avoided over-specialization of the CNN segmentation towards one specific manual annotation guideline, which can occur when gold standard masks are used. Moreover, the use of silver standard masks greatly enlarges the volume of annotated input data, because labels can be generated relatively easily for unlabeled data. In addition, once trained, our method takes only a few seconds to process a typical brain image volume on modern hardware, such as a high-end graphics processing unit. In contrast, many of the other competitive methods have processing times on the order of minutes.
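As an illustration of the evaluation metric named above, the following minimal sketch computes the Dice coefficient between two binary masks. It also shows one simple way a consensus mask could be formed from several automatic segmentations. The abstract does not state how the silver standard masks were generated, so the majority-vote rule, the function names, and the toy data here are assumptions made purely for illustration, not the authors' procedure.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, ref).sum() / total


def majority_vote_consensus(masks):
    """Hypothetical stand-in for silver standard generation (an assumption):
    a voxel is labelled brain when more than half of the input masks agree."""
    stacked = np.stack([np.asarray(m, dtype=bool) for m in masks])
    return stacked.sum(axis=0) * 2 > stacked.shape[0]


# Toy 1D "volumes" standing in for the outputs of three automatic methods.
auto_masks = [
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
]
silver = majority_vote_consensus(auto_masks)

# Toy manual mask used only to demonstrate the metric.
manual = np.array([0, 1, 1, 1, 0, 0], dtype=bool)
score = dice_coefficient(silver, manual)
```

A Dice coefficient of 1 indicates perfect overlap and 0 indicates none, which is why larger values are reported as better segmentation performance.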