Explainable AI Methods for Neuroimaging: Systematic Failures of Common Tools, the Need for Domain-Specific Validation, and a Proposal for Safe Application

Abstract

Trustworthy interpretation of deep learning models is critical for neuroimaging applications, yet commonly used Explainable AI (XAI) methods lack rigorous validation, risking misinterpretation. We performed the first large-scale, systematic comparison of XAI methods on ~45,000 structural brain MRIs using a novel XAI validation framework. This framework establishes verifiable ground truth by constructing prediction tasks with known signal sources, from localized anatomical features to subject-specific clinical lesions, without artificially altering input images. Our analysis reveals systematic failures in two of the most widely used methods: GradCAM consistently failed to localize predictive features, while Layer-wise Relevance Propagation generated extensive, artifactual explanations that suggest incompatibility with neuroimaging data characteristics. Our results indicate that these failures stem from a domain mismatch, where methods with design principles tailored to natural images require substantial adaptation for neuroimaging data. In contrast, the simpler, gradient-based method SmoothGrad, which makes fewer assumptions about data structure, proved consistently accurate, suggesting its conceptual simplicity makes it more robust to this domain shift. These findings highlight the need for domain-specific adaptation and validation of XAI methods, suggest that interpretations from prior neuroimaging studies using standard XAI methodology warrant re-evaluation, and provide urgent guidance for practical application of XAI in neuroimaging.
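For readers unfamiliar with SmoothGrad, the following is a minimal illustrative sketch, not the authors' implementation. SmoothGrad averages input gradients over several noise-perturbed copies of the input. The toy model here (a tanh of a weighted voxel sum), the volume size, and the names `smoothgrad`, `toy_grad`, `n_samples`, and `noise_level` are assumptions made for illustration. The toy model draws its signal from a known voxel block, which loosely mirrors the idea of checking an attribution map against a known signal source.

```python
import numpy as np

def smoothgrad(grad_fn, x, n_samples=50, noise_level=0.1, seed=0):
    """Average input gradients over noisy copies of x (SmoothGrad).

    grad_fn: callable returning d(model output)/d(input) for one input array.
    noise_level: Gaussian noise std as a fraction of the input's value range.
    """
    rng = np.random.default_rng(seed)
    sigma = noise_level * (x.max() - x.min())
    total = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        total += grad_fn(noisy)
    return total / n_samples

# Hypothetical toy "model": f(x) = tanh(sum(w * x)), where only a small
# block of voxels carries weight, so the true signal location is known.
shape = (8, 8, 8)
w = np.zeros(shape)
w[2:4, 2:4, 2:4] = 1.0

def toy_grad(x):
    # Analytic gradient of tanh(sum(w * x)) with respect to x.
    return (1.0 - np.tanh(np.sum(w * x)) ** 2) * w

x = np.random.default_rng(1).normal(size=shape)  # stand-in for an MRI volume
saliency = np.abs(smoothgrad(toy_grad, x))
```

In this toy setting the saliency map is nonzero only inside the weighted block, so it can be compared directly with the known signal location. With a real network, `grad_fn` would instead come from automatic differentiation in the deep learning framework in use.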
