Active Learning with Safety Constraints

Abstract

Active learning methods have shown great promise in reducing the number of samples necessary for learning. As automated learning systems are adopted into real-time, real-world decision-making pipelines, it is increasingly important that such algorithms are designed with safety in mind. In this work we investigate the complexity of learning the best safe decision in interactive environments. We reduce this problem to a constrained linear bandits problem, where our goal is to find the best arm satisfying certain (unknown) safety constraints. We propose an adaptive experimental design-based algorithm, which we show efficiently trades off between the difficulty of showing an arm is unsafe versus suboptimal. To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints. In practice, we demonstrate that this approach performs well on synthetic and real-world datasets.
