Shapley explainability on the data manifold

Abstract

Explainability in AI is crucial for model development, compliance with regulation, and providing operational nuance to predictions. The Shapley framework for explainability attributes a model's predictions to its input features in a mathematically principled and model-agnostic way. However, general implementations of Shapley explainability make an untenable assumption: that the model's features are uncorrelated. In this work, we demonstrate unambiguous drawbacks of this assumption and develop two solutions to Shapley explainability that respect the data manifold. One solution, based on generative modelling, provides flexible access to data imputations; the other directly learns the Shapley value-function, providing performance and stability at the cost of flexibility. While "off-manifold" Shapley values can (i) give rise to incorrect explanations, (ii) hide implicit model dependence on sensitive attributes, and (iii) lead to unintelligible explanations in higher-dimensional data, on-manifold explainability overcomes these problems.
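
To make the off-manifold versus on-manifold distinction concrete, the following is a minimal sketch on a hypothetical two-feature toy problem. It is not the paper's method: in place of a generative model or a learned value function, it conditions directly on a tiny discrete dataset. The names `DATA`, `model`, `value_off_manifold`, `value_on_manifold`, and `shapley` are illustrative assumptions. Feature 0 plays the role of a sensitive attribute and feature 1 is an exact proxy for it; the model reads only the proxy.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy dataset: feature 1 is an exact copy of feature 0,
# so the two features are perfectly correlated.
DATA = [(0, 0), (1, 1)]


def model(x):
    # The model reads only feature 1 (the proxy), never feature 0.
    return float(x[1])


def value_off_manifold(x, subset):
    # Fix the features in `subset` to their values in x and fill the rest
    # from every data row, ignoring correlations (marginal imputation).
    total = 0.0
    for row in DATA:
        hybrid = tuple(x[i] if i in subset else row[i] for i in range(len(x)))
        total += model(hybrid)
    return total / len(DATA)


def value_on_manifold(x, subset):
    # Average the model over rows that agree with x on `subset`
    # (empirical conditional expectation). Assumes at least one row matches.
    matches = [row for row in DATA if all(row[i] == x[i] for i in subset)]
    return sum(model(row) for row in matches) / len(matches)


def shapley(x, value_fn):
    # Exact Shapley values by enumerating all coalitions (fine for tiny n).
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value_fn(x, set(s) | {i})
                without_i = value_fn(x, set(s))
                contrib += weight * (with_i - without_i)
        phi.append(contrib)
    return phi


x = (1, 1)
off = shapley(x, value_off_manifold)
on = shapley(x, value_on_manifold)
print("off-manifold:", off)  # [0.0, 0.5]
print("on-manifold: ", on)   # [0.25, 0.25]
```

In this toy setting the off-manifold attribution assigns zero credit to feature 0, even though the prediction is fully determined by it through the proxy, while the on-manifold attribution splits the credit equally between the two features. This is a small instance of point (ii) above, under the stated assumptions.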
