Attacking Convolutional Neural Network using Differential Evolution

Abstract

The output of Convolutional Neural Networks (CNNs) has been shown to be discontinuous, which can make a CNN image classifier vulnerable to small, well-tuned artificial perturbations. That is, images modified by adding such perturbations (i.e., adversarial perturbations), which make little difference to human eyes, can completely alter the CNN classification results. In this paper, we propose a practical attack using differential evolution (DE) for generating effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DE for conducting the attack on different network structures. The proposed method is a black-box attack which only requires the oracle feedback of the target CNN systems. The results show that, under strict constraints which simultaneously control the number of pixels changed and the overall perturbation strength, the attack achieves 72.29%, 78.24% and 61.28% non-targeted attack success rates, with 88.68%, 99.85% and 73.07% confidence on average, on three common types of CNNs. The attack only requires modifying 5 pixels, with distortions of 20.44, 14.76 and 22.98 pixel values, respectively. Thus, the results show that current DNNs are also vulnerable to such simpler black-box attacks, even under very limited attack conditions.
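
To make the idea of a few-pixel, black-box DE attack concrete, the following is a minimal sketch and not the authors' implementation. It assumes a grayscale image stored as a list of rows, a candidate encoded as (row, column, value) triples, and a basic DE variant with mutation only and no crossover; the abstract does not specify any of these choices. The names `apply_perturbation`, `toy_predict` and `de_attack`, and all parameter defaults, are hypothetical. The toy classifier stands in for a real CNN and is queried only through its output probabilities. The sketch limits only the number of changed pixels, not the overall perturbation strength.

```python
import random


def apply_perturbation(image, candidate):
    """Return a copy of `image` with the candidate's pixels overwritten.

    `candidate` is a flat list [row, col, value, row, col, value, ...].
    Coordinates and values are rounded and clipped to valid ranges.
    """
    height, width = len(image), len(image[0])
    perturbed = [row[:] for row in image]
    for i in range(0, len(candidate), 3):
        r = min(max(int(round(candidate[i])), 0), height - 1)
        c = min(max(int(round(candidate[i + 1])), 0), width - 1)
        v = min(max(int(round(candidate[i + 2])), 0), 255)
        perturbed[r][c] = v
    return perturbed


def toy_predict(image):
    """Hypothetical black-box classifier returning [p_class0, p_class1].

    The probability of class 1 grows with mean brightness. This is only a
    stand-in for a CNN so that the sketch runs without any dependencies.
    """
    total = sum(sum(row) for row in image)
    p1 = total / (len(image) * len(image[0]) * 255.0)
    return [1.0 - p1, p1]


def de_attack(image, true_label, predict, n_pixels=5, pop_size=20,
              generations=30, scale=0.5, seed=0):
    """Non-targeted attack: minimise the probability of `true_label`.

    Only `predict` outputs are used (no gradients), so the attack is
    black-box. At most `n_pixels` pixels differ from the original image.
    """
    rng = random.Random(seed)
    height, width = len(image), len(image[0])

    def random_candidate():
        cand = []
        for _ in range(n_pixels):
            cand += [rng.uniform(0, height - 1),
                     rng.uniform(0, width - 1),
                     rng.uniform(0, 255)]
        return cand

    def fitness(cand):
        return predict(apply_perturbation(image, cand))[true_label]

    population = [random_candidate() for _ in range(pop_size)]
    scores = [fitness(cand) for cand in population]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current parent.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            # DE mutation: base vector plus a scaled difference vector.
            child = [population[a][k]
                     + scale * (population[b][k] - population[c][k])
                     for k in range(3 * n_pixels)]
            child_score = fitness(child)
            # Greedy selection: the child replaces its parent if not worse.
            if child_score <= scores[i]:
                population[i], scores[i] = child, child_score

    best = min(range(pop_size), key=scores.__getitem__)
    return apply_perturbation(image, population[best]), scores[best]


if __name__ == "__main__":
    original = [[140] * 4 for _ in range(4)]
    adversarial, prob = de_attack(original, true_label=1, predict=toy_predict)
    print("true-class probability before:", toy_predict(original)[1])
    print("true-class probability after: ", prob)
```

Replacing `toy_predict` with a function that returns the class probabilities of a real model would leave the search loop unchanged, which is the practical appeal of a black-box formulation.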
