A BIC-based Mixture Model Defense against Data Poisoning Attacks on Classifiers

  • 2022-05-12 17:42:32
  • Xi Li, David J. Miller, Zhen Xiang, George Kesidis

Abstract

Data Poisoning (DP) is an effective attack that causes trained classifiers to misclassify their inputs. DP attacks significantly degrade a classifier's accuracy by covertly injecting attack samples into the training set. Broadly applicable to different classifier structures, without strong assumptions about the attacker, an unsupervised Bayesian Information Criterion (BIC)-based mixture model defense against "error generic" DP attacks is herein proposed that: 1) addresses the most challenging embedded DP scenario wherein, if DP is present, the poisoned samples are an a priori unknown subset of the training set, and with no clean validation set available; 2) applies a mixture model both to well-fit potentially multi-modal class distributions and to capture poisoned samples within a small subset of the mixture components; 3) jointly identifies poisoned components and samples by minimizing the BIC cost defined over the whole training set, with the identified poisoned data removed prior to classifier training. Our experimental results, for various classifier structures and benchmark datasets, demonstrate the effectiveness and universality of our defense under strong DP attacks, as well as its superiority over other works.

 
