Bagged Regularized $k$-Distances for Anomaly Detection

  • 2025-08-01 16:12:08
  • Yuchao Cai, Hanfang Yang, Yuheng Ma, Hanyuan Hang

Abstract

We consider unsupervised anomaly detection, which involves identifying anomalies within a dataset in the absence of labeled examples. Although distance-based methods are among the top-performing approaches for unsupervised anomaly detection, they are highly sensitive to the choice of the number of nearest neighbors. In this paper, we propose a new distance-based algorithm called bagged regularized $k$-distances for anomaly detection (BRDAD), which converts the unsupervised anomaly detection problem into a convex optimization problem. Our BRDAD algorithm selects the weights by minimizing the surrogate risk, i.e., the finite-sample bound of the empirical risk of the bagged weighted $k$-distances for density estimation (BWDDE). This approach enables us to address the sensitivity to hyperparameter choice in distance-based algorithms. Moreover, when dealing with large-scale datasets, efficiency issues can be addressed by the bagging technique incorporated in our BRDAD algorithm. On the theoretical side, we establish fast convergence rates of the AUC regret of our algorithm and demonstrate that the bagging technique significantly reduces the computational complexity. On the practical side, we conduct numerical experiments to illustrate that our algorithm is less sensitive to parameter selection than other state-of-the-art distance-based methods. Furthermore, with the introduced bagging technique, our method achieves superior performance on real-world datasets compared to other approaches.
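To make the idea of bagged weighted $k$-distances concrete, the following is a minimal Python sketch, not the authors' implementation. It draws subsamples without replacement, computes the first `k_max` nearest-neighbor distances of every point within each subsample, combines them with a weight vector, and averages over the bags. The function name, its parameters, and in particular the uniform default weights are assumptions made for illustration; BRDAD instead obtains the weights by minimizing a surrogate risk, which the abstract does not specify and which is not reproduced here.

```python
import numpy as np


def bagged_weighted_k_distances(X, n_bags=10, subsample_size=None,
                                k_max=5, weights=None, seed=0):
    """Illustrative anomaly scores: larger means more anomalous.

    Hypothetical helper. The uniform default weights are a placeholder
    assumption, not the weights BRDAD selects by convex optimization.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Subsample size; must leave at least k_max neighbors for in-bag points.
    s = subsample_size if subsample_size is not None else max(k_max + 1, n // 2)
    if weights is None:
        weights = np.full(k_max, 1.0 / k_max)  # placeholder: uniform weights
    weights = np.asarray(weights, dtype=float)

    rng = np.random.default_rng(seed)
    scores = np.zeros(n)
    for _ in range(n_bags):
        idx = rng.choice(n, size=s, replace=False)  # one bag, no replacement
        # Pairwise distances from every point to the points in this bag.
        d = np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2)
        # A point in the bag must not count itself as its own neighbor.
        d[idx, np.arange(s)] = np.inf
        # The k-distances for k = 1..k_max are the k_max smallest entries.
        k_dists = np.sort(d, axis=1)[:, :k_max]
        scores += k_dists @ weights
    return scores / n_bags
```

Points with a large score lie far from their nearest neighbors in most subsamples, so ranking by score gives a simple anomaly ranking. Each bag only requires distances to `s` points rather than to all `n`, which is the source of the computational saving that bagging offers on large datasets.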

 
