Restricting the Flow: Information Bottlenecks for Attribution

Abstract

Attribution methods provide insights into the decision-making of machine learning models like artificial neural networks. For a given input sample, they assign a relevance score to each individual input variable, such as the pixels of an image. In this work, we adapt the information bottleneck concept for attribution. By adding noise to intermediate feature maps, we restrict the flow of information and can quantify (in bits) how much information image regions provide. We compare our method against ten baselines using three different metrics on VGG-16 and ResNet-50, and find that our methods outperform all baselines in five out of six settings. The method's information-theoretic foundation provides an absolute frame of reference for attribution values (bits) and a guarantee that regions scored close to zero are not necessary for the network's decision.
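
To make the noise-injection idea concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes details the abstract does not state: a mask `lam` in [0, 1] that linearly blends each feature with Gaussian noise, noise statistics taken from the feature map itself, and a Gaussian reference distribution for the information estimate. The function name `noisy_bottleneck` and all parameter names are hypothetical.

```python
import numpy as np

def noisy_bottleneck(features, lam, rng):
    """Blend a feature map with Gaussian noise and estimate the information kept.

    Assumptions (illustrative, not taken from the abstract):
      - lam in [0, 1]: 1 passes a feature through, 0 replaces it with noise.
      - The noise matches the mean and std of `features` (scalar statistics).
      - Information is the KL divergence between the noisy feature's
        distribution and the pure-noise distribution, converted to bits.
    """
    mu = features.mean()
    sigma = features.std() + 1e-8
    noise = rng.normal(mu, sigma, size=features.shape)

    # Restrict the flow: the smaller lam is, the more signal is replaced by noise.
    z = lam * features + (1.0 - lam) * noise

    # In normalized units, z given the features is N(lam * r_norm, (1 - lam)^2),
    # and the pure-noise reference is N(0, 1).
    r_norm = (features - mu) / sigma
    lam_c = np.clip(lam, 0.0, 1.0 - 1e-6)  # avoid log(0) when lam == 1
    kl_nats = -np.log(1.0 - lam_c) + 0.5 * ((1.0 - lam_c) ** 2 + (lam_c * r_norm) ** 2) - 0.5
    return z, kl_nats / np.log(2.0)  # nats -> bits

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8, 8))        # stand-in for an intermediate feature map
lam = np.full(features.shape, 0.9)           # let most of the signal through
z, bits = noisy_bottleneck(features, lam, rng)
print("total information passed (bits):", bits.sum())
```

Under these assumptions, a location with `lam = 0` transmits exactly zero bits, because the downstream layers see only noise there. This mirrors the claim that regions scored close to zero cannot be necessary for the network's decision.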
