Foundation Models -- A Panacea for Artificial Intelligence in Pathology?

  • 2025-03-03 10:35:23
  • Nita Mulliqi, Anders Blilie, Xiaoyi Ji, Kelvin Szolnoky, Henrik Olsson, Sol Erika Boman, Matteo Titus, Geraldine Martinez Gonzalez, Julia Anna Mielcarz, Masi Valkonen, Einar Gudlaugsson, Svein R. Kjosavik, José Asenjo, Marcello Gambacorta, Paolo Libretti, Marcin Braun, Radzislaw Kordek, Roman Łowicki, Kristina Hotakainen, Päivi Väre, Bodil Ginnerup Pedersen, Karina Dalsgaard Sørensen, Benedicte Parm Ulhøi, Pekka Ruusuvuori, Brett Delahunt, Hemamali Samaratunga, Toyonori Tsuzuki, Emilius A. M. Janssen, Lars Egevad, Martin Eklund, Kimmo Kartasalo

Abstract

The role of artificial intelligence (AI) in pathology has evolved from aiding diagnostics to uncovering predictive morphological patterns in whole slide images (WSIs). Recently, foundation models (FMs) leveraging self-supervised pre-training have been widely advocated as a universal solution for diverse downstream tasks. However, open questions remain about their clinical applicability and generalization advantages over end-to-end learning using task-specific (TS) models. Here, we focused on AI with clinical-grade performance for prostate cancer diagnosis and Gleason grading. We present the largest validation of AI for this task, using over 100,000 core needle biopsies from 7,342 patients across 15 sites in 11 countries. We compared two FMs with a fully end-to-end TS model in a multiple instance learning framework. Our findings challenge assumptions that FMs universally outperform TS models. While FMs demonstrated utility in data-scarce scenarios, their performance converged with, and was in some cases surpassed by, TS models when sufficient labeled training data were available. Notably, extensive task-specific training markedly reduced clinically significant misgrading, misdiagnosis of challenging morphologies, and variability across different WSI scanners. Additionally, FMs used up to 35 times more energy than the TS model, raising concerns about their sustainability. Our results underscore that while FMs offer clear advantages for rapid prototyping and research, their role as a universal solution for clinically applicable medical AI remains uncertain. For high-stakes clinical applications, rigorous validation and consideration of task-specific training remain critically important. We advocate for integrating the strengths of FMs and end-to-end learning to achieve robust and resource-efficient AI pathology solutions fit for clinical use.

 
