Distribution-Free, Risk-Controlling Prediction Sets

Abstract

To communicate instance-wise uncertainty for prediction tasks, we show how to generate set-valued predictions for black-box predictors that control the expected loss on future test points at a user-specified level. Our approach provides explicit finite-sample guarantees for any dataset by using a holdout set to calibrate the size of the prediction sets. This framework enables simple, distribution-free, rigorous error control for many tasks, and we demonstrate it in five large-scale machine learning problems: (1) classification problems where some mistakes are more costly than others; (2) multi-label classification, where each observation has multiple associated labels; (3) classification problems where the labels have a hierarchical structure; (4) image segmentation, where we wish to predict a set of pixels containing an object of interest; and (5) protein structure prediction. Lastly, we discuss extensions to uncertainty quantification for ranking, metric learning and distributionally robust learning.
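
The holdout calibration idea can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so everything here is an assumption made for illustration: the sets are indexed by a single threshold `lam` (larger `lam` gives larger sets), the loss is 0/1 miscoverage of the true label, the risk is bounded from above with a Hoeffding inequality at a hypothetical confidence parameter `delta`, and the data are synthetic. The function names (`hoeffding_ucb`, `calibrate_threshold`, `prediction_set`) are likewise hypothetical.

```python
import numpy as np


def hoeffding_ucb(losses, delta):
    """Upper confidence bound on the mean of losses in [0, 1], per column.

    Assumption: a Hoeffding bound is used here for simplicity; other
    concentration bounds could be substituted.
    """
    n = losses.shape[0]
    return losses.mean(axis=0) + np.sqrt(np.log(1.0 / delta) / (2.0 * n))


def calibrate_threshold(scores, labels, lambdas, alpha, delta):
    """Pick the smallest lam whose risk bound stays below alpha.

    scores:  (n, K) class scores from a black-box predictor, each in [0, 1]
    labels:  (n,) integer true labels for the holdout set
    lambdas: increasing grid of candidate thresholds
    Returns None if no candidate satisfies the bound.
    """
    true_score = scores[np.arange(len(labels)), labels]
    # The set at lam holds every class with score >= 1 - lam, so the loss
    # is 1 exactly when the true label's score falls below that cutoff.
    losses = (true_score[:, None] < 1.0 - lambdas[None, :]).astype(float)
    ucb = hoeffding_ucb(losses, delta)

    # Scan from the largest sets downward and stop at the first violation,
    # so the bound holds for the chosen lam and every larger one.
    chosen = None
    for lam, bound in zip(lambdas[::-1], ucb[::-1]):
        if bound >= alpha:
            break
        chosen = lam
    return chosen


def prediction_set(score_row, lam):
    """Indices of the classes included in the set at threshold lam."""
    return np.flatnonzero(score_row >= 1.0 - lam)


# Synthetic holdout data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
n, K = 2000, 10
labels = rng.integers(0, K, size=n)
logits = rng.normal(size=(n, K))
logits[np.arange(n), labels] += 2.0  # give the true class a higher score
scores = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

alpha, delta = 0.1, 0.1
lambdas = np.linspace(0.0, 1.0, 101)
lam_hat = calibrate_threshold(scores, labels, lambdas, alpha, delta)
print("calibrated threshold:", lam_hat)
print("set for first example:", prediction_set(scores[0], lam_hat))
```

Scanning from the largest sets downward keeps the selection conservative: the chosen threshold is only accepted if every larger threshold also satisfies the bound, which relies on the loss shrinking as the sets grow.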
