Abstract
We describe how answer-set programs can be used to declaratively specify counterfactual interventions on entities under classification, and reason about them. In particular, they can be used to define and compute responsibility scores as attribution-based explanations for outcomes from classification models. The approach allows for the inclusion of domain knowledge and supports query answering. A detailed example with a naive-Bayes classifier is presented.