Gradient-Regularized Out-of-Distribution Detection

Abstract

One of the challenges for neural networks in real-life applications is the overconfident errors these models make when the data is not from the original training distribution. Addressing this issue is known as Out-of-Distribution (OOD) detection. Many state-of-the-art OOD methods employ an auxiliary dataset as a surrogate for OOD data during training to achieve improved performance. However, these methods fail to fully exploit the local information embedded in the auxiliary dataset. In this work, we propose the idea of leveraging the information embedded in the gradient of the loss function during training to enable the network to not only learn a desired OOD score for each sample but also to exhibit similar behavior in a local neighborhood around each sample. We also develop a novel energy-based sampling method to allow the network to be exposed to more informative OOD samples during the training phase. This is especially important when the auxiliary dataset is large. We demonstrate the effectiveness of our method through extensive experiments on several OOD benchmarks, improving the existing state-of-the-art FPR95 by 4% on our ImageNet experiment. We further provide a theoretical analysis through the lens of certified robustness and Lipschitz analysis to showcase the theoretical foundation of our work. We will publicly release our code after the review process.
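To make the two ingredients named above more concrete, the following is a minimal toy sketch, not the method of this paper. It assumes the common free-energy OOD score, E(x) = -log sum_i exp(f_i(x)), computed from a classifier's logits, and it assumes that the more informative auxiliary samples are those with the lowest energy, meaning the ones the model currently treats as in-distribution. The abstract does not specify either choice. The function names `energy_score`, `select_informative`, and `grad_norm_penalty` are hypothetical, and the gradient term is approximated with finite differences on a plain scoring function rather than computed by backpropagation through a network.

```python
import math


def energy_score(logits):
    """Free-energy score E(x) = -logsumexp(logits).

    Assumption: lower energy means the model treats the sample as more
    in-distribution-like. Uses the max-shift trick for numerical stability.
    """
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))


def select_informative(aux_logits, k):
    """Return indices of the k auxiliary samples with the lowest energy.

    Assumption (not taken from the paper): low-energy auxiliary samples are
    the ones the model confuses with in-distribution data, so they are the
    most useful to train on.
    """
    order = sorted(range(len(aux_logits)),
                   key=lambda i: energy_score(aux_logits[i]))
    return order[:k]


def grad_norm_penalty(score_fn, x, eps=1e-4):
    """Finite-difference estimate of the squared gradient norm of score_fn at x.

    Penalizing this quantity encourages the score to change slowly in a
    local neighborhood of x. A real implementation would use automatic
    differentiation instead of finite differences.
    """
    base = score_fn(x)
    total = 0.0
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        total += ((score_fn(bumped) - base) / eps) ** 2
    return total
```

In a training loop, a term like `grad_norm_penalty` would be added to the usual OOD loss with some weight, and `select_informative` would choose which auxiliary samples enter each batch. Both the weight and the selection rule are design choices that this sketch leaves open.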
