Sampling Prediction-Matching Examples in Neural Networks: A Probabilistic Programming Approach

Abstract

Though neural network models demonstrate impressive performance, we do not understand exactly how these black-box models make individual predictions. This drawback has led to substantial research devoted to understanding these models in areas such as robustness, interpretability, and generalization ability. In this paper, we consider the problem of exploring the prediction level sets of a classifier using probabilistic programming. We define a prediction level set to be the set of examples for which the predictor has the same specified prediction confidence with respect to some arbitrary data distribution. Notably, our sampling-based method does not require the classifier to be differentiable, making it compatible with arbitrary classifiers. As a specific instantiation, if we take the classifier to be a neural network and the data distribution to be that of the training data, we can obtain examples that will result in specified predictions by the neural network. We demonstrate this technique with experiments on a synthetic dataset and MNIST. Such level sets in classification may facilitate human understanding of classification behaviors.
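To make the notion of a prediction level set concrete, the following is a minimal sketch of gradient-free sampling from one. It is not the method of the paper: in place of a probabilistic programming system it uses a plain random-walk Metropolis sampler, and every name in it (`classifier_confidence`, `log_prior`, `sample_level_set`, the tolerance `tol`) is a hypothetical stand-in. The classifier is a toy two-dimensional function whose output is rounded, so it is piecewise constant and offers no useful gradient; the data distribution is assumed to be a standard normal. The sampler targets the data density multiplied by a soft constraint that keeps the confidence close to the requested value.

```python
import math
import random


def classifier_confidence(x):
    """Hypothetical black-box classifier: class-1 confidence at a 2-D point.

    The output is rounded to two decimals, so it is piecewise constant
    and has no useful gradient.
    """
    return round(1.0 / (1.0 + math.exp(-(x[0] + x[1]))), 2)


def log_prior(x):
    """Assumed data distribution: 2-D standard normal (log density up to a constant)."""
    return -0.5 * (x[0] ** 2 + x[1] ** 2)


def log_target(x, target_conf, tol):
    """Unnormalized log density: prior times a soft constraint on the confidence."""
    gap = classifier_confidence(x) - target_conf
    return log_prior(x) - 0.5 * (gap / tol) ** 2


def sample_level_set(target_conf, n_steps=20000, step=0.3, tol=0.02, seed=0):
    """Random-walk Metropolis; only evaluates the classifier, never differentiates it."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    lp = log_target(x, target_conf, tol)
    samples = []
    for i in range(n_steps):
        proposal = [xi + rng.gauss(0.0, step) for xi in x]
        lp_new = log_target(proposal, target_conf, tol)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            x, lp = proposal, lp_new
        if i >= n_steps // 4:  # discard the first quarter as burn-in
            samples.append(list(x))
    return samples


if __name__ == "__main__":
    draws = sample_level_set(target_conf=0.9)
    confs = [classifier_confidence(d) for d in draws]
    print("mean confidence of samples:", sum(confs) / len(confs))
```

The soft constraint is a design choice of this sketch: a hard constraint (accept only points whose confidence equals the target exactly) would define the level set more literally but makes a random walk hard to start, whereas shrinking `tol` tightens the samples around the level set at the cost of a lower acceptance rate.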
