Abstract
In this paper, we propose novel generative models for creating adversarial examples, slightly perturbed images resembling natural images but maliciously crafted to fool pre-trained models. We present trainable deep neural networks for transforming images to adversarial perturbations. Our proposed models can produce image-agnostic and image-dependent perturbations for both targeted and non-targeted attacks. We also demonstrate that similar architectures can achieve impressive results in fooling classification and semantic segmentation models, obviating the need for hand-crafting attack methods for each task. Using extensive experiments on challenging high-resolution datasets such as ImageNet and Cityscapes, we show that our perturbations achieve high fooling rates with small perturbation norms. Moreover, our attacks are considerably faster than current iterative methods at inference time.
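To make the idea of a norm-bounded, generator-produced perturbation concrete, the following is a minimal sketch and not the paper's implementation. It assumes an L-infinity budget `epsilon`, pixel values in [0, 1], and a flattened image. The names `scale_to_linf`, `perturb`, and `toy_generator` are hypothetical, and `toy_generator` is a hand-written stand-in for a trained network, whose architecture and training loss the abstract does not specify.

```python
from typing import Callable, List

Image = List[float]  # flattened image, pixel values assumed to lie in [0, 1]


def scale_to_linf(raw: Image, epsilon: float) -> Image:
    """Rescale a raw perturbation so its largest absolute entry equals epsilon."""
    peak = max(abs(v) for v in raw)
    if peak == 0.0:
        return list(raw)  # an all-zero perturbation is left unchanged
    return [v * epsilon / peak for v in raw]


def perturb(image: Image, generator: Callable[[Image], Image], epsilon: float) -> Image:
    """Apply a generator-produced perturbation under an L-infinity budget."""
    raw = generator(image)               # a single forward pass, no iterative search
    delta = scale_to_linf(raw, epsilon)  # enforce the (assumed) norm constraint
    # Clip so the result is still a valid image.
    return [min(1.0, max(0.0, x + d)) for x, d in zip(image, delta)]


def toy_generator(image: Image) -> Image:
    """Hypothetical stand-in for a trained network (image-dependent case)."""
    return [0.5 - x for x in image]


image = [0.0, 0.25, 0.5, 1.0]
epsilon = 0.1
adversarial = perturb(image, toy_generator, epsilon)
```

In this sketch an image-agnostic perturbation would correspond to a generator that ignores its input and returns the same pattern for every image, while the image-dependent case uses the input as above. Either way, inference costs one call to the generator, which is the property that separates this style of attack from iterative methods.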