Abstract
Artificial intelligence (AI) is transforming medicine and showing promise in improving clinical diagnosis. In breast cancer screening, several recent studies show that AI has the potential to improve radiologists' accuracy, subsequently helping in early cancer diagnosis and reducing unnecessary workup. As the number of proposed models and their complexity grows, it is becoming increasingly difficult to re-implement them in order to reproduce the results and to compare different approaches. To enable reproducibility of research in this application area and to enable comparison between different methods, we release a meta-repository containing deep learning models for classification of screening mammograms. This meta-repository creates a framework that enables the evaluation of machine learning models on any private or public screening mammography data set. At its inception, our meta-repository contains five state-of-the-art models with open-source implementations and cross-platform compatibility. We compare their performance on five international data sets: two private New York University breast cancer screening data sets as well as three public (DDSM, INbreast and Chinese Mammography Database) data sets. Our framework has a flexible design that can be generalized to other medical image analysis tasks. The meta-repository is available at https://www.github.com/nyukat/mammography_metarepository.