Abstract
Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features that can later be adapted to specific clinical tasks with little additional supervision. In this review, we examine how FMs are being developed and applied in pathology, radiology, and ophthalmology, drawing on evidence from over 150 studies. We explain the core components of FM pipelines, including model architectures, self-supervised learning methods, and strategies for downstream adaptation. We also review how FMs are being used in each imaging domain and compare design choices across applications. Finally, we discuss key challenges and open questions to guide future research.