Generative Adversarial Perturbations

Abstract

In this paper, we propose novel generative models for creating adversarial examples, slightly perturbed images resembling natural images but maliciously crafted to fool pre-trained models. We present trainable deep neural networks for transforming images to adversarial perturbations. Our proposed models can produce image-agnostic and image-dependent perturbations for both targeted and non-targeted attacks. We also demonstrate that similar architectures can achieve impressive results in fooling classification and semantic segmentation models, obviating the need for hand-crafting attack methods for each task. Using extensive experiments on challenging high-resolution datasets such as ImageNet and Cityscapes, we show that our perturbations achieve high fooling rates with small perturbation norms. Moreover, our attacks are considerably faster than current iterative methods at inference time.
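To make the terms "perturbation norm" and "fooling rate" concrete, the following is a minimal sketch and not the paper's implementation. It assumes an L-infinity norm budget `eps`, pixel values in [0, 255], and a fooling rate defined as the fraction of inputs whose predicted label changes after perturbation; the abstract does not specify any of these, and all function names are hypothetical. The raw perturbation stands in for the output of a trained generator network.

```python
import numpy as np

def scale_to_linf(raw, eps):
    """Rescale a raw perturbation so its L-infinity norm is at most eps.

    Assumption: the norm budget is L-infinity; the abstract only says
    that perturbation norms are small.
    """
    raw = np.asarray(raw, dtype=float)
    max_abs = np.max(np.abs(raw))
    if max_abs == 0:
        return raw
    # Shrink only when the raw perturbation exceeds the budget.
    return raw * min(1.0, eps / max_abs)

def apply_perturbation(images, delta, lo=0.0, hi=255.0):
    """Add a perturbation and clip back to the valid pixel range.

    A single delta broadcasts over a batch (image-agnostic case);
    a per-image delta of the same shape as images also works
    (image-dependent case).
    """
    return np.clip(np.asarray(images, dtype=float) + delta, lo, hi)

def fooling_rate(clean_preds, adv_preds):
    """Fraction of inputs whose predicted label changed (assumed definition)."""
    clean_preds = np.asarray(clean_preds)
    adv_preds = np.asarray(adv_preds)
    return float(np.mean(clean_preds != adv_preds))
```

In this sketch, a targeted attack would be scored differently (by how often the prediction equals the chosen target label), so `fooling_rate` as written covers only the non-targeted case.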
