Submodular Information Selection for Hypothesis Testing with Misclassification Penalties

Abstract

We consider the problem of selecting an optimal subset of information sourcesfor a hypothesis testing/classification task where the goal is to identify thetrue state of the world from a finite set of hypotheses, based on finiteobservation samples from the sources. In order to characterize the learningperformance, we propose a misclassification penalty framework, which enablesnonuniform treatment of different misclassification errors. In a centralizedBayesian learning setting, we study two variants of the subset selectionproblem: (i) selecting a minimum cost information set to ensure that themaximum penalty of misclassifying the true hypothesis is below a desired boundand (ii) selecting an optimal information set under a limited budget tominimize the maximum penalty of misclassifying the true hypothesis. Undercertain assumptions, we prove that the objective (or constraints) of thesecombinatorial optimization problems are weak (or approximate) submodular, andestablish high-probability performance guarantees for greedy algorithms.Further, we propose an alternate metric for information set selection which isbased on the total penalty of misclassification. We prove that this metric issubmodular and establish near-optimal guarantees for the greedy algorithms forboth the information set selection problems. Finally, we present numericalsimulations to validate our theoretical results over several randomly generatedinstances.

Quick Read (beta)

loading the full paper ...