From "Where" to "What": Towards Human-Understandable Explanations through Concept Relevance Propagation

Abstract

The emerging field of eXplainable Artificial Intelligence (XAI) aims to bring transparency to today's powerful but opaque deep learning models. While local XAI methods explain individual predictions in the form of attribution maps, thereby identifying where important features occur (but not providing information about what they represent), global explanation techniques visualize what concepts a model has generally learned to encode. Both types of methods thus only provide partial insights and leave the burden of interpreting the model's reasoning to the user. Only a few contemporary techniques aim at combining the principles behind both local and global XAI for obtaining more informative explanations. Those methods, however, are often limited to specific model architectures or impose additional requirements on training regimes or data and label availability, which renders the post-hoc application to arbitrarily pre-trained models practically impossible. In this work we introduce the Concept Relevance Propagation (CRP) approach, which combines the local and global perspectives of XAI and thus allows answering both the "where" and "what" questions for individual predictions, without additional constraints imposed. We further introduce the principle of Relevance Maximization for finding representative examples of encoded concepts based on their usefulness to the model. We thereby lift the dependency on the common practice of Activation Maximization and its limitations. We demonstrate the capabilities of our methods in various settings, showcasing that Concept Relevance Propagation and Relevance Maximization lead to more human-interpretable explanations and provide deep insights into the model's representations and reasoning through concept atlases, concept composition analyses, and quantitative investigations of concept subspaces and their role in fine-grained decision making.
