Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance

  • 2019-11-06 16:27:32
  • Zhengyu Zhao, Zhuoran Liu, Martha Larson

Abstract

The success of image perturbations that are designed to fool image classification is assessed in terms of both adversarial effect and visual imperceptibility. In this work, we investigate the contribution of human color perception to perturbations that are not noticeable. Our basic insight is that perceptual color distance makes it possible to drop the conventional assumption that imperceptible perturbations should strive for small $L_p$ norms in RGB space. Our first approach, Perceptual Color distance C&W (PerC-C&W), extends the widely-used C&W approach and produces larger RGB perturbations. PerC-C&W is able to maintain adversarial strength, while contributing to imperceptibility. Our second approach, Perceptual Color distance Alternating Loss (PerC-AL), achieves the same outcome, but does so more efficiently by alternating between the classification loss and perceptual color difference when updating perturbations. Experimental evaluation shows PerC approaches improve robustness and transferability of perturbations over conventional approaches and also demonstrates that the PerC distance can provide added value on top of existing structure-based approaches to creating image perturbations.

 


Zhengyu Zhao, Zhuoran Liu, Martha Larson
Radboud University, Nijmegen, Netherlands
{z.zhao, z.liu, m.larson}@cs.ru.nl

Figure 1: Comparison of (a) C&W [carlini2017towards] with (b) our PerC-C&W. Perceptual color (PerC) distance allows larger RGB perturbations (cf. L2 and L∞ norms in middle row), while also contributing to imperceptibility (bottom row). (Setting: untargeted with κ=40; classifier Inception v3.)

1 Introduction

Research on creating adversarial examples for deep visual classifiers has focused on perturbations that cause misclassification while being imperceptible to the human eye [szegedy2013intriguing, papernot2016limitations, carlini2017towards]. Larger image perturbations are known to improve adversarial strength (i.e., the ability to fool a classifier), but are also associated with visually noticeable changes in the image. A commonly agreed-upon assumption is that tight Lp-norm constraints on the size of adversarial perturbations in RGB space are a good guarantee of imperceptibility. Evaluation of adversarial examples has conventionally followed this assumption, considering perturbations with smaller Lp norms to be better (e.g., L∞ [goodfellow2014explaining, kurakin2016adversarial, carlini2017towards], L2 [szegedy2013intriguing, moosavi2016deepfool, carlini2017towards] and L0 [papernot2016limitations, carlini2017towards]). In keeping with this assumption, defense approaches are designed to be effective against adversarial perturbations under a specific Lp bound [tramer2017ensemble, madry2017towards, wong2018provable, cohen2019certified]. Our research is motivated by the importance of questioning the necessity of small RGB perturbations for imperceptibility.

In this work, we propose to create adversarial examples by perturbing images with respect to perceptual color (PerC) distance. Using PerC distance makes it possible to move away from the assumption that it is necessary to tightly constrain the Lp norm of the perturbations in RGB space. Fig. 1 illustrates the difference between C&W [carlini2017towards], a well-known approach that perturbs with respect to an Lp norm in RGB space, and our own extension, PerC-C&W, which perturbs with respect to a perceptual color distance. PerC perturbations are less perceptible, especially in smooth regions of saturated color (cf. Fig. 1, bottom row). Also, they are distributed strategically over the RGB color channels (cf. downsized perturbation images in the middle row). PerC distance effectively allows us to hide large perturbations in RGB space, in a way not readily noticeable to the human eye. Our PerC-based approaches can increase the Lp norm substantially (cf. Fig. 1, L2 and L∞ in middle row), leading to a strong adversarial effect that maintains imperceptibility.

Fig. 2 motivates the use of perceptual color distance for creating adversarial images. Here, we have taken a solid color image (left) and added the same perturbations to the green channel (middle) and to the blue channel (right). Although both channels were perturbed identically, the perturbations are only visible in the green channel. The reason is that color as it is perceived by the human eye does not change uniformly over distance in RGB space. Relatively small perturbations in RGB space may correspond to a large difference in perceptual color space. Conversely, relatively large changes in RGB space may remain unnoticeable if they lead to a small perceived color difference.

Our work is in line with a growing awareness in the literature on adversarial examples that the difference between two images as measured by an Lp norm in RGB space is actually quite poorly aligned with human perception [sharif2018suitability]. Building on this observation, researchers have attempted to address imperceptibility by exploiting similarity defined with respect to semantics [engstrom2017rotation, hosseini2018semantic, sharif2019adversarial, joshi2019semantic, eykholt2017robust] or structural information [luo2018towards, gragnaniello2019perceptual, zhang2019smooth, Croce_2019_ICCV] in the image. However, little work on adversarial examples has questioned the wisdom of optimizing perturbations with respect to distance in RGB space. The exceptions are a handful of approaches that have proposed allowing only luminance change when perturbing pixels [gragnaniello2019perceptual, Croce_2019_ICCV]. The approach that is closest to our own is [athalye2018synthesizing], which perturbs in CIELAB color space, but carries out no investigation of the potential and limitations of the idea. Our work is distinct from this initial effort because we use a more accurate polar form (known as CIELCH) of the CIELAB color space, and more importantly, use an actual perceptual color distance. The distance is CIEDE2000 [luo2001development, ciede2000], and will be discussed in detail in Section 2. To our knowledge, ours is the first work that proposes optimizing adversarial image perturbations directly with respect to a perceptual color distance.

In order to fully appreciate our proposal, it is necessary to understand its scope. We do not claim that PerC approaches will always yield dramatically less perceptible perturbations than conventional RGB approaches. For cases in which the perturbations are small, the difference may not be so great. However, we find that there are two cases in which PerC approaches are particularly important. First, our experimental results (see Section 5.2.2) show that as we attempt to create high-confidence adversarial examples that contain larger and larger perturbations, it becomes important to perturb with respect to perceptual color distance. Second, we demonstrate that the effect of PerC approaches is additive and can be used in combination with existing structural approaches to improve imperceptibility.

Figure 2: Left: Original image (a 20×20 8-bit RGB image patch with color (15,240,15)). Middle: Image perturbed by adding noise in the G channel, sampled from a uniform distribution in the range [-15,15]. Right: Image perturbed by adding the identical noise, but in the B channel. The B-channel perturbations are imperceptible.
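The setting of Fig. 2 can be reproduced numerically. The sketch below is illustrative only and is not the authors' code: it assumes sRGB with a D65 white point, draws integer noise in [-15, 15], and uses plain Euclidean distance in CIELAB as a simple stand-in for a perceptual distance. The helper names `srgb_to_lab` and `mean_lab_distance` are hypothetical.

```python
import numpy as np

# Assumed sRGB (D65) to XYZ matrix and reference white.
M = np.array([[0.4124564, 0.3575761, 0.1804375],
              [0.2126729, 0.7151522, 0.0721750],
              [0.0193339, 0.1191920, 0.9503041]])
WHITE = np.array([0.95047, 1.0, 1.08883])

def srgb_to_lab(rgb8):
    """Convert an (..., 3) array of 8-bit sRGB values to CIELAB."""
    c = np.asarray(rgb8, dtype=np.float64) / 255.0
    # Undo the sRGB gamma to obtain linear RGB.
    lin = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    xyz = lin @ M.T / WHITE
    eps = (6 / 29) ** 3
    f = np.where(xyz > eps, np.cbrt(xyz), xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def mean_lab_distance(img_a, img_b):
    """Mean per-pixel Euclidean distance in CIELAB (a proxy, not CIEDE2000)."""
    diff = srgb_to_lab(img_a) - srgb_to_lab(img_b)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))

rng = np.random.default_rng(0)
patch = np.tile(np.array([15, 240, 15]), (20, 20, 1))   # 20x20 solid color patch
noise = rng.integers(-15, 16, size=(20, 20))            # integer noise in [-15, 15]

green = patch.copy()
green[..., 1] += noise    # perturb the G channel
blue = patch.copy()
blue[..., 2] += noise     # identical noise in the B channel

rgb_l2_green = np.linalg.norm((green - patch).astype(float))
rgb_l2_blue = np.linalg.norm((blue - patch).astype(float))
lab_green = mean_lab_distance(patch, green)
lab_blue = mean_lab_distance(patch, blue)
print(rgb_l2_green, rgb_l2_blue, lab_green, lab_blue)
```

The two RGB L2 norms are identical by construction, while the CIELAB distance of the G-channel version is the larger of the two, which is the asymmetry the figure illustrates.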

The contributions of this paper are as follows:

  • An in-depth study of the use of perceptual color (PerC) distance to hide large RGB perturbations in images.

  • PerC-C&W: a method for creating adversarial images that introduces perceptual color distance into the joint optimization of C&W.

  • PerC-AL: an efficient method that optimizes alternating loss (AL) functions, switching between classification loss and perceptual color difference.

  • Experimental validation demonstrating that PerC perturbations at high-confidence settings yield more robust and transferable adversarial examples, without sacrificing imperceptibility.

  • Experimental results showing that PerC perturbations can be used in combination with structural information for further improvement of imperceptibility.

We release the code, including a differentiable solution compatible with PyTorch's autograd to efficiently implement perceptual color distance (CIEDE2000). The code is available at https://github.com/ZhengyuZhao/PerC-Adversarial.

2 Background on Perceptual Color Distance

Conventionally, computer vision research has intensively explored color and human perception, but has paid surprisingly little attention to distance in perceptual color spaces. Here, we mention some key points about color in computer vision history. Early on, research focused on intensity-based descriptors, which then evolved to also capture color information. Unsurprisingly, color boosted the performance of object and scene recognition [khan2012color, van2009evaluating] and semantic segmentation [cheng2001color]. Researchers extracted descriptors from opponent color spaces, most notably HSV and CIELAB, which separate luminance and chrominance. Most recently, color is attracting more attention in the area of image synthesis. Notable examples, such as style transfer [gatys2016preserving] and cross-domain image generation [taigman2016unsupervised], find that color plays an important role in preserving the look of an image. In general, we observe that until now the focus has been on the color space itself, and not on color distance, which we explore here.

The perceptual color distance that we use is CIEDE2000 [luo2001development, ciede2000], which is the latest ΔE standard formula developed by the CIE (International Commission on Illumination), and has been experimentally demonstrated to have strong agreement with human perception. Specifically, the perceptual color distance between two pixels in the CIELCH space can be calculated as:

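$$\Delta E_{00}=\sqrt{\left(\frac{\Delta L}{k_L S_L}\right)^2+\left(\frac{\Delta C}{k_C S_C}\right)^2+\left(\frac{\Delta H}{k_H S_H}\right)^2+\Delta R},\qquad \Delta R=R_T\left(\frac{\Delta C}{k_C S_C}\right)\left(\frac{\Delta H}{k_H S_H}\right), \tag{1}$$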

where ΔL, ΔC, ΔH denote the differences between the pixel values in the three channels, L (lightness), C (chroma) and H (hue), and ΔR is an interaction term between chroma and hue differences [luo2001development]. The weighting functions SL, SC, SH and RT are determined based on large-scale human studies and act as compensations to better simulate human color perception. The parametric factors kL, kC and kH are usually set to unity for graphic arts applications. Detailed definitions of all the parameters and relevant explanations can be found in [luo2001development]. We note that it is also possible to use an Lp norm to measure distance in CIELAB space. However, this distance is not as close to human perceptual distance as CIEDE2000 is.
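For concreteness, Eq. (1) can be evaluated for a single pair of CIELAB colors as follows. This is a scalar, non-differentiable sketch written from the standard CIEDE2000 definition; it is not the differentiable PyTorch implementation released by the authors, and the function name `ciede2000` and its argument layout are assumptions made for illustration.

```python
import math

def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIEDE2000 difference between two (L, a, b) colors (scalar sketch)."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # Chroma-dependent rescaling of the a axis.
    C_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    G = 0.5 * (1 - math.sqrt(C_bar**7 / (C_bar**7 + 25.0**7)))
    a1p, a2p = (1 + G) * a1, (1 + G) * a2
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360
    h2p = math.degrees(math.atan2(b2, a2p)) % 360

    # Lightness, chroma and hue differences.
    dL = L2 - L1
    dC = C2p - C1p
    dh = h2p - h1p
    if C1p * C2p == 0:
        dh = 0.0
    elif dh > 180:
        dh -= 360
    elif dh < -180:
        dh += 360
    dH = 2 * math.sqrt(C1p * C2p) * math.sin(math.radians(dh / 2))

    # Mean lightness, chroma and hue.
    L_bar = (L1 + L2) / 2
    Cp_bar = (C1p + C2p) / 2
    h_sum = h1p + h2p
    if C1p * C2p == 0:
        h_bar = h_sum
    elif abs(h1p - h2p) <= 180:
        h_bar = h_sum / 2
    elif h_sum < 360:
        h_bar = (h_sum + 360) / 2
    else:
        h_bar = (h_sum - 360) / 2

    # Weighting functions S_L, S_C, S_H and rotation term R_T.
    T = (1 - 0.17 * math.cos(math.radians(h_bar - 30))
         + 0.24 * math.cos(math.radians(2 * h_bar))
         + 0.32 * math.cos(math.radians(3 * h_bar + 6))
         - 0.20 * math.cos(math.radians(4 * h_bar - 63)))
    S_L = 1 + 0.015 * (L_bar - 50) ** 2 / math.sqrt(20 + (L_bar - 50) ** 2)
    S_C = 1 + 0.045 * Cp_bar
    S_H = 1 + 0.015 * Cp_bar * T
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    R_C = 2 * math.sqrt(Cp_bar**7 / (Cp_bar**7 + 25.0**7))
    R_T = -math.sin(math.radians(2 * d_theta)) * R_C

    # Eq. (1): three weighted squared terms plus the interaction term.
    tL, tC, tH = dL / (kL * S_L), dC / (kC * S_C), dH / (kH * S_H)
    dR = R_T * tC * tH
    return math.sqrt(tL**2 + tC**2 + tH**2 + dR)

print(ciede2000((50.0, 2.6772, -79.7751), (50.0, 0.0, -82.7485)))
```

The image-level quantity used later in the paper would then be the L2 norm of the vector of such per-pixel values.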

We point out that a limited amount of previous research has also adopted CIEDE2000. However, the goal has been to evaluate the color similarity of image pairs. Examples of such research include work on image quality assessment [yang2012color] and image super-resolution [liu2010colorization]. In contrast, in our work we use CIEDE2000 directly for optimization with back propagation and not only for evaluation.

3 Related work

In this section, we cover the existing literature, which focuses on creating Lp norm-bounded adversarial examples, and we also mention recent approaches that attempt to move beyond Lp norms. We preface our discussion with a short definition of an ‘adversary’, i.e., an approach that generates an adversarial image example. Given a classifier f(𝒙): 𝒙 ∈ ℝⁿ → y that predicts a label y for an image 𝒙, the adversary attempts to induce a misclassification by modifying the original 𝒙 to create a new 𝒙′. In the untargeted setting, the adversary is successful if the image is classified into an arbitrary class other than y, i.e., meets the condition f(𝒙′) ≠ y. In the targeted setting, the adversary must ensure that the image is classified into a class with a pre-defined label t, i.e., meets the condition f(𝒙′) = t. The untargeted case is generally recognized to be less challenging than the targeted case [carlini2017towards].
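The two success conditions can be stated compactly in code. The helper below is a minimal sketch; the name `is_adversarial` and the convention of passing raw logits are assumptions for illustration.

```python
import numpy as np

def is_adversarial(logits, true_label, target_label=None):
    """Check the untargeted (f(x') != y) or targeted (f(x') == t) condition."""
    pred = int(np.argmax(logits))
    if target_label is None:
        return pred != true_label      # untargeted: any class other than y
    return pred == target_label        # targeted: exactly the class t

print(is_adversarial([0.1, 2.0, 0.3], true_label=0))
print(is_adversarial([0.1, 2.0, 0.3], true_label=0, target_label=2))
```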

3.1 Lp norm-bounded Adversarial Examples

Typically, adversaries [szegedy2013intriguing, kurakin2016adversarial, goodfellow2014explaining, moosavi2016deepfool, papernot2016limitations, carlini2017towards, rony2019decoupling] create an adversarial image, 𝒙′, by adding a perturbation vector 𝜹 ∈ ℝⁿ that is constrained by an Lp norm to the original image, 𝒙. The first Lp norm-bounded approach [szegedy2013intriguing] optimized an objective combining the classification loss and the L2 norm of the perturbations, balanced by a constant λ. Formally, the solution is expressed as:

$$\underset{\boldsymbol{\delta}}{\mathrm{minimize}}\;\; \lambda\|\boldsymbol{\delta}\|_2 - J(\boldsymbol{x}', y),\quad \mathrm{s.t.}\;\; \boldsymbol{x}'\in[0,1]^n, \tag{2}$$

where J(𝒙′, y) is the cross-entropy loss w.r.t. 𝒙′. The authors of [szegedy2013intriguing] solved the problem by using the box-constrained L-BFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno) method [liu1989limited].

The C&W method [carlini2017towards] improves on [szegedy2013intriguing] by introducing a new variable using the tanh function to eliminate the box constraint. Additionally, it introduces a more sophisticated objective function that optimizes differences between logits Z, which are output before the softmax layer. This can be formulated as:

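$$\underset{\boldsymbol{w}}{\mathrm{minimize}}\;\; \|\boldsymbol{x}'-\boldsymbol{x}\|_2^2+\lambda f(\boldsymbol{x}'), \tag{3}$$
$$\text{where}\;\; f(\boldsymbol{x}')=\max\left(\max\{Z(\boldsymbol{x}')_i : i\neq t\}-Z(\boldsymbol{x}')_t,\,-\kappa\right),$$
$$\text{and}\;\; \boldsymbol{x}'=\tfrac{1}{2}\left(\tanh(\mathrm{arctanh}(\boldsymbol{x})+\boldsymbol{w})+1\right),$$

where 𝒘 is the new variable and Z(𝒙′)_i denotes the logit with respect to the i-th class. In an untargeted setting, the definition of f is modified to:

$$f(\boldsymbol{x}')=\max\left(Z(\boldsymbol{x}')_y-\max\{Z(\boldsymbol{x}')_i : i\neq y\},\,-\kappa\right). \tag{4}$$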

The parameter κ controls the confidence level of the misclassification. The first approach that we propose, PerC-C&W, is built on C&W. In our experiments, we will vary κ in order to assess the ability of an adversary to create strong adversarial images, i.e., images that are misclassified with high confidence.
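The margin function f of Eqs. (3) and (4) and the tanh change of variable can be sketched as below. The function names are hypothetical, and the code assumes pixel values in [0,1] that are rescaled to (-1,1) before applying arctanh, a detail the equations above leave implicit.

```python
import numpy as np

def cw_margin(logits, label, kappa=0.0, targeted=True):
    """f(x') from Eq. (3) (targeted, label = t) or Eq. (4) (untargeted, label = y)."""
    logits = np.asarray(logits, dtype=np.float64)
    best_other = np.delete(logits, label).max()
    if targeted:
        diff = best_other - logits[label]    # want the target logit to dominate
    else:
        diff = logits[label] - best_other    # want another logit to dominate
    return max(diff, -kappa)                 # stop rewarding beyond margin kappa

def perturbed_image(x, w, eps=1e-6):
    """x' = 0.5 * (tanh(arctanh(x_scaled) + w) + 1), always inside [0, 1]."""
    # Assumption: x in [0, 1] is mapped to (-1, 1) so that arctanh is finite.
    x_scaled = np.clip(2.0 * np.asarray(x) - 1.0, -1 + eps, 1 - eps)
    return 0.5 * (np.tanh(np.arctanh(x_scaled) + w) + 1.0)

x = np.array([0.2, 0.5, 0.9])
print(perturbed_image(x, np.zeros(3)))          # w = 0 recovers x
print(cw_margin([1.0, 5.0, 2.0], label=0, kappa=40.0, targeted=False))
```

With a larger κ, f keeps decreasing until the logit gap reaches κ, which is how the confidence level is raised.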

Due to the need for line search in order to find the optimal constant, λ, such an optimization approach is inevitably time-consuming. For this reason, [goodfellow2014explaining, kurakin2016adversarial, rony2019decoupling] propose a more efficient solution that does not impose a penalty during optimization. Instead, the norm constraint is enforced by projecting perturbations onto an ϵ-sphere around the original image. Specifically, the fast gradient sign method (FGSM) [goodfellow2014explaining] was first proposed to achieve adversarial effect with only one step, formulated as:

$$\boldsymbol{x}'=\boldsymbol{x}+\epsilon\cdot\mathrm{sign}\left(\nabla_{\boldsymbol{x}}J(\boldsymbol{x},y)\right), \tag{5}$$

where the perturbation size is implicitly constrained by specifying a small ϵ.

Subsequently, an extension of this method referred to as I-FGSM [kurakin2016adversarial] was introduced for leveraging finer gradient information by iteratively updating the perturbations with a smaller step size α:

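$$\boldsymbol{x}'_0=\boldsymbol{x},\qquad \boldsymbol{x}'_k=\boldsymbol{x}'_{k-1}+\alpha\cdot\mathrm{sign}\left(\nabla_{\boldsymbol{x}}J(\boldsymbol{x}'_{k-1},y)\right), \tag{6}$$

where the intermediate perturbed image 𝒙′_k is projected onto an ϵ-sphere around the original 𝒙, to satisfy the L∞-norm constraint. I-FGSM can also generalize to the L2 norm by changing the sign operation to:

$$\frac{\nabla_{\boldsymbol{x}}J(\boldsymbol{x}'_{k-1},y)}{\|\nabla_{\boldsymbol{x}}J(\boldsymbol{x}'_{k-1},y)\|_2}, \tag{7}$$

where the projection is implemented by:

$$\boldsymbol{x}'_k=\boldsymbol{x}+\epsilon\,\frac{\boldsymbol{x}'_k-\boldsymbol{x}}{\|\boldsymbol{x}'_k-\boldsymbol{x}\|_2}. \tag{8}$$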
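Eqs. (6) to (8) can be exercised on a toy model. The sketch below assumes a linear softmax classifier with weight matrix `W` (hypothetical, chosen only because its input gradient has a closed form) and applies the L2 rescaling of Eq. (8) only when the perturbation lies outside the ϵ-ball, which is a common variant rather than a literal transcription.

```python
import numpy as np

def loss_and_grad(W, x, y):
    """Cross-entropy loss of a linear softmax model and its gradient w.r.t. x."""
    z = W @ x
    p = np.exp(z - z.max())
    p /= p.sum()
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    return -np.log(p[y]), W.T @ (p - onehot)

def i_fgsm(W, x, y, eps, alpha, steps, norm="inf"):
    """Iterative FGSM (untargeted) under an L_inf or L2 bound of size eps."""
    x_adv = x.copy()
    for _ in range(steps):
        _, g = loss_and_grad(W, x_adv, y)
        if norm == "inf":
            x_adv = x_adv + alpha * np.sign(g)                       # Eq. (6)
            delta = np.clip(x_adv - x, -eps, eps)                    # L_inf projection
        else:
            x_adv = x_adv + alpha * g / (np.linalg.norm(g) + 1e-12)  # Eq. (7)
            delta = x_adv - x
            n = np.linalg.norm(delta)
            if n > eps:                                              # Eq. (8), applied
                delta = eps * delta / n                              # only outside the ball
        x_adv = np.clip(x + delta, 0.0, 1.0)                         # keep a valid image
    return x_adv

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))
x = rng.uniform(0.2, 0.8, size=8)
y = int(np.argmax(W @ x))
x_inf = i_fgsm(W, x, y, eps=0.1, alpha=0.02, steps=10, norm="inf")
x_l2 = i_fgsm(W, x, y, eps=0.1, alpha=0.02, steps=10, norm="l2")
print(np.abs(x_inf - x).max(), np.linalg.norm(x_l2 - x))
```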

Recently, an efficient method called Decoupled Direction and Norm (DDN) [rony2019decoupling] was proposed and yielded the best performance (smallest L2 norm) in the untargeted track of the NIPS 2018 Adversarial Vision Challenge [brendel2018adversarial], with substantially fewer iterations than the conventional C&W. This method is essentially L2 norm-based I-FGSM with ϵ adjusted in each iteration based on whether the perturbed image is adversarial or not, leading to a finer-grained search for the minimal norm. Our second approach, PerC-AL, follows a strategy similar to that of DDN, improving efficiency by decoupling the joint optimization.
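The norm adjustment in DDN can be sketched in a few lines. The multiplicative form below reflects our reading of [rony2019decoupling] and should be taken as an assumption; the function name `update_epsilon` is hypothetical.

```python
def update_epsilon(eps, is_adv, gamma=0.05):
    """Shrink the L2 budget if the current image is adversarial, else grow it."""
    return (1.0 - gamma) * eps if is_adv else (1.0 + gamma) * eps

# Hypothetical trace: two failed steps followed by three successful ones.
eps = 1.0
for is_adv in [False, False, True, True, True]:
    eps = update_epsilon(eps, is_adv)
print(eps)
```

The budget therefore oscillates around the smallest norm at which the image is still adversarial.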

3.2 Adversarial examples beyond Lp norms

Our work is part of the current movement away from tight Lp norms and towards conceptualization of image similarity in terms of semantics or perceptual properties. Research that defines similarity in terms of semantics requires the adversarial image to have the same content as the original image from the point of view of the human viewer. Some of the first work in this direction has explored geometric transformation [engstrom2017rotation, xiao2018spatially], global color shift [hosseini2018semantic, Laidlaw2019functional, bhattad2019big], and image filters [choi2017geo].

Such approaches are interesting, but we do not pursue them here because they tend to be limited in their adversarial strength, due to the restricted size of the search space for possible adversarial image transformations.

Research that investigates similarity with respect to texture and structure [luo2018towards, gragnaniello2019perceptual, zhang2019smooth, Croce_2019_ICCV] has focused on hiding perturbations in image regions with visual variation. In [luo2018towards, Croce_2019_ICCV], image regions with high variance are used to hide image perturbations. In [gragnaniello2019perceptual], additional supervision of structural similarity (SSIM) [wang2004image] is used to guide the perturbation updates. Other work [zhang2019smooth] has applied Laplacian smoothing to obtain image structure, which is used to modify the image while maintaining the original structure. All of these approaches share a common challenge: they have difficulties in dealing with smooth regions (e.g., sky, ground and artificial objects), which appear frequently in images taken in commonly occurring real-world settings (referred to as natural images). In contrast, our PerC perturbations are applicable in smooth regions in the case of saturated color. Our experiments show that PerC distance can be combined productively with a structure-based approach.

4 Proposed approaches

In this section, we present two approaches to using perceptual color (PerC) distance for adversarial image perturbations. We focus on image-level accumulated perceptual color difference, i.e., the L2 norm of the color distance vector, in which each component represents the perceptual color distance (ΔE00 in Eq. (1)) calculated for the corresponding image pixel.

4.1 Perceptual color distance penalty (PerC-C&W)

Our first approach, PerC-C&W, adopts the joint optimization framework of the well-known C&W, but replaces the original penalty on the L2 norm with a new one based on perceptual color difference. It can be formally expressed as:

$$\underset{\boldsymbol{w}}{\mathrm{minimize}}\;\; \|\Delta E_{00}(\boldsymbol{x},\boldsymbol{x}')\|_2+\lambda f(\boldsymbol{x}'), \tag{9}$$

where 𝒘 is the newly introduced variable, as in Eq. (3) of C&W. Like the original C&W, the optimization problem is solved by binary search over the constant λ. By using the gradient information from perceptual color difference, the perturbation updating is translated into the perceptually uniform color space. Large RGB perturbations, which have a strong adversarial effect, remain hidden from the human eye, as will be shown in Section 5.
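The binary search over λ can be sketched independently of the inner optimization. In the sketch below, `run_attack` is a hypothetical callable standing in for one full gradient-descent run of Eq. (9) at a fixed λ; the growth factor of 10 used before an upper bound is known is a common choice and an assumption here.

```python
def search_lambda(run_attack, lam_init=1.0, search_steps=9):
    """Binary search for a small lambda that still yields an adversarial image.

    run_attack(lam) must return (success: bool, distance: float).
    Returns (lam, distance) of the best successful run, or None.
    """
    lo, hi = 0.0, None
    lam, best = lam_init, None
    for _ in range(search_steps):
        success, dist = run_attack(lam)
        if success:
            if best is None or dist < best[1]:
                best = (lam, dist)
            hi = lam              # try a smaller weight on the adversarial term
        else:
            lo = lam              # need a larger weight on the adversarial term
        lam = (lo + hi) / 2 if hi is not None else lam * 10
    return best

# Toy stand-in: the attack succeeds once lambda reaches 3, and the
# resulting distance grows with lambda.
best = search_lambda(lambda lam: (lam >= 3.0, lam))
print(best)
```

Each search step costs one complete inner optimization, which is why the budget is reported as N(search steps) × N(iterations).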

Algorithm 1: Alternating Classification Loss and Perceptual Color Differences (PerC-AL)

Input:
  𝒙: original image, t: target label, K: number of iterations
  α_l: step size in minimizing classification loss
  α_c: step size in minimizing perceptual color difference
Output:
  𝒙′: adversarial image

Initialize 𝒙′_0 ← 𝒙, 𝜹_0 ← 𝟎
for k ← 1 to K do
    if 𝒙′_{k-1} is not adversarial then
        𝒈 ← −∇_𝒙 J(𝒙′_{k-1}, t)
        𝒈 ← α_l · 𝒈 / ‖𝒈‖_2
        𝜹_k ← 𝜹_{k-1} + 𝒈          ▷ Update 𝜹 in the direction of 𝒈
    else
        C_2 ← −‖ΔE00(𝒙, 𝒙′_{k-1})‖_2
        𝒈_c ← ∇_𝒙 C_2
        𝒈_c ← α_c · 𝒈_c / ‖𝒈_c‖_2
        𝜹_k ← 𝜹_{k-1} + 𝒈_c        ▷ Update 𝜹 in the direction of 𝒈_c
    end if
    𝒙′_k ← clip(𝒙 + 𝜹_k, 0, 1)
    𝒙′_k ← quantize(𝒙′_k)          ▷ Ensure 𝒙′_k is valid
end for
return 𝒙′ ← 𝒙′_k that is adversarial and has smallest C_2

4.2 Perceptual color distance alternating loss (PerC-AL)

Although Eq. (9) enjoys a concise expression, the two-term joint optimization of PerC-C&W faces difficulties in practice. Adversarial training [kurakin2016adversarial], for example, presents challenges. The reason is that PerC-C&W requires time-consuming binary search in order to find an optimal λ, which normally varies substantially among different images [rony2019decoupling]. To address the inefficiency, we propose PerC-AL, which decouples the joint optimization by alternately updating the perturbations with respect to either classification loss or perceptual color difference. Our strategy is inspired by DDN, which is basically a projected gradient descent (PGD) method with a dynamic L2-norm bound. However, PerC-AL goes beyond this idea to alternate two gradient descents.

The full PerC-AL method is described in Algorithm 1. We start from an original image 𝒙 with the perturbation 𝜹 initialized as 𝟎, and iteratively update it to create an adversarial image. In each iteration, the perturbation is either enlarged to achieve stronger adversarial effect based on the gradients from the classification loss, or shrunk to minimize perceptual color differences. These two operations are alternated based on whether the intermediate perturbed image 𝒙′_k is adversarial or not. To ensure the final adversarial image is valid, the output is clipped into the range [0,1] and quantized into 256 levels (corresponding to 8-bit image encoding).
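The alternating structure can be sketched on a toy problem. The code below is not the authors' implementation: it assumes a linear softmax classifier `W`, and it swaps CIEDE2000 for a stand-in distance, a weighted L2 norm with hypothetical per-component weights `s`, so that the gradient has a closed form. Only the control flow (alternate, clip, quantize, keep the best adversarial iterate) is meant to mirror the method.

```python
import numpy as np

def perc_al_sketch(W, x, t, s, K=100, alpha_l=0.1, alpha_c=0.05):
    """Alternate between a classification step and a distance-shrinking step."""
    delta = np.zeros_like(x)
    x_adv = x.copy()
    best, best_dist = None, np.inf
    for _ in range(K):
        if np.argmax(W @ x_adv) != t:
            # Not adversarial yet: descend the cross-entropy loss w.r.t. target t.
            z = W @ x_adv
            p = np.exp(z - z.max())
            p /= p.sum()
            onehot = np.zeros_like(p)
            onehot[t] = 1.0
            g = -(W.T @ (p - onehot))
            delta = delta + alpha_l * g / (np.linalg.norm(g) + 1e-12)
        else:
            # Adversarial: move against the gradient of the stand-in distance
            # ||s * (x_adv - x)||_2, which is proportional to s^2 * (x_adv - x).
            g_c = -(s ** 2) * (x_adv - x)
            delta = delta + alpha_c * g_c / (np.linalg.norm(g_c) + 1e-12)
        x_adv = np.clip(x + delta, 0.0, 1.0)       # stay in the valid range
        x_adv = np.round(x_adv * 255) / 255        # 8-bit quantization
        if np.argmax(W @ x_adv) == t:
            dist = np.linalg.norm(s * (x_adv - x))
            if dist < best_dist:
                best, best_dist = x_adv.copy(), dist
    return best, best_dist

W = 5.0 * np.tile(np.eye(3), (1, 2))               # toy 3-class linear model
x = np.array([0.6, 0.4, 0.4, 0.6, 0.4, 0.4])       # initially classified as class 0
s = np.array([1.0, 1.0, 0.2, 1.0, 1.0, 0.2])       # hypothetical sensitivity weights
best, best_dist = perc_al_sketch(W, x, t=1, s=s)
print(best, best_dist)
```

Because no constant λ has to be tuned, a single run of K iterations replaces the repeated inner optimizations needed by the joint formulation.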

5 Experiments

In this section, we first provide a picture of the differences between RGB and PerC approaches (Section 5.2). Then, we carry out experiments that compare different approaches in terms of robustness (Section 5.3) and transferability (Section 5.4), by considering the case of high-confidence adversarial examples. Finally, in Section 5.5, we show that structural information can be elegantly integrated into our efficient decoupled approach, PerC-AL, for further improvement in the imperceptibility of images that contain areas with rich visual variation.

5.1 Experimental setup

Dataset and Networks. Following recent work [xiao2018spatially, zhang2019smooth, dong2019evading], we conduct our experiments on the development set (1000 RGB natural images with the size of 299×299) of the ImageNet-Compatible dataset (https://github.com/tensorflow/cleverhans/tree/master/examples/nips17_adversarial_competition/dataset). This dataset was introduced by the NIPS 2017 Competition on Adversarial Attacks and Defenses [kurakin2018adversarial] and consists of 6000 images labeled with 1000 ImageNet classes. We choose this dataset because we would like to study imperceptibility under real-world conditions. In contrast, some other work [luo2018towards, Croce_2019_ICCV] on addressing imperceptibility mainly focuses on the tiny images from MNIST [lecun1998gradient] and CIFAR-10 [krizhevsky2009learning]. As in the competition, the Inception V3 [szegedy2016rethinking] model pre-trained on ImageNet is used as the target classifier.

Baselines. Three well-known baselines, namely, I-FGSM [kurakin2016adversarial], C&W [carlini2017towards], and the state-of-the-art DDN [rony2019decoupling], are compared with our approaches. Among them, I-FGSM targets minimum L∞ norm, while C&W and DDN target minimum L2 norm.

Parameters. I-FGSM is run repeatedly, with the L∞-norm bound increased by a step of α=1/255 each time, until it succeeds.

C&W and PerC-C&W use the Adam optimizer [kingma2014adam] with a learning rate of 0.01 for updating the perturbations. We impose a budget on the number of search steps used to find the optimal λ. The initialization of λ is particularly important for small budgets. We perform grid search for the initialization value of λ over the values {0.01, 0.1, 1, 10, 100}, and adopt the value that yields the smallest average perturbation size. The selected initialization values are given in the supplementary material.

For DDN and PerC-AL, we decrease the step size (α in DDN and α_l in PerC-AL) that is used for updating the perturbations with respect to the classification loss from 1 to 0.01 with cosine annealing. The L2-norm constraint ϵ in DDN is initialized to 1 and adjusted iteratively by γ=0.05, as in the original DDN work [rony2019decoupling]. The α_c in PerC-AL is gradually reduced from 0.5 to 0.05 with cosine annealing.
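A cosine annealing schedule of this kind can be written as follows. The exact indexing used in the experiments is not stated here, so the formula below (standard half-cosine decay with k running from 0 to K) is an assumption, as is the name `cosine_annealing`.

```python
import math

def cosine_annealing(start, end, k, K):
    """Step size at iteration k of K, decaying from start (k=0) to end (k=K)."""
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * k / K))

K = 100
alpha_l = [cosine_annealing(1.0, 0.01, k, K) for k in range(K + 1)]   # classification step
alpha_c = [cosine_annealing(0.5, 0.05, k, K) for k in range(K + 1)]   # color-distance step
print(alpha_l[0], alpha_l[K // 2], alpha_l[K])
```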

Evaluation Protocol. We investigate a set of reasonable operating points, based on pre-defined budgets. Note that our goal is to show the relative behavior of PerC vs. RGB approaches. For this purpose, we only need to create a fair comparison, and it is not necessary to drive all approaches to an absolute optimum. For each image, an approach is considered successful if the perturbed image can achieve adversarial effect with the given budget. Specifically, I-FGSM requires a different number of repetitions for different images. For C&W and PerC-C&W, the budget refers to N(search steps) × N(iterations of gradient descent). We apply a relatively high budget (9×1000), and are also interested in lower budgets (5×200 and 3×100), which are more directly comparable with more efficient approaches, namely, DDN and PerC-AL. We test DDN and our PerC-AL with three different iteration budgets (100, 300 and 1000), adopted from the original work [rony2019decoupling].

Adversarial strength is evaluated by the success rate, i.e., the proportion of successful cases over the whole dataset. The averaged perturbation size over all successful images is reported. It is measured in terms of the L2 and L∞ norms in RGB space ($\overline{L_2}$ and $\overline{L_\infty}$) and also in terms of image-level accumulated perceptual color difference ($\overline{C_2}$).
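The RGB part of this protocol can be sketched as below. The function name `summarize` and the array layout are assumptions; the perceptual term is omitted because it requires a per-pixel CIEDE2000 computation.

```python
import numpy as np

def summarize(originals, adversarials, success):
    """Success rate (%) and mean L2 / L_inf perturbation size over successes."""
    originals = np.asarray(originals, dtype=np.float64)
    adversarials = np.asarray(adversarials, dtype=np.float64)
    success = np.asarray(success, dtype=bool)
    # One flattened perturbation per image, successful cases only.
    delta = (adversarials - originals).reshape(len(originals), -1)[success]
    if len(delta) == 0:
        return {"success_rate": 0.0, "mean_l2": float("nan"), "mean_linf": float("nan")}
    return {
        "success_rate": 100.0 * float(success.mean()),
        "mean_l2": float(np.linalg.norm(delta, axis=1).mean()),
        "mean_linf": float(np.abs(delta).max(axis=1).mean()),
    }

orig = np.zeros((2, 2, 2))
adv = np.zeros((2, 2, 2))
adv[0] = 0.5                                  # first image perturbed, second untouched
stats = summarize(orig, adv, success=[True, False])
print(stats)
```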

| Approach | Budget | Success rate (%) | $\overline{L_2}$ | $\overline{L_\infty}$ | $\overline{C_2}$ |
|---|---|---|---|---|---|
| I-FGSM [kurakin2016adversarial] | - | 100.0 | 2.51 | 1.59 | 317.96 |
| C&W [carlini2017towards] | 3×100 | 100.0 | 1.32 | 8.84 | 159.85 |
| | 5×200 | 100.0 | 1.09 | 8.20 | 132.86 |
| | 9×1000 | 100.0 | 0.92 | 8.45 | 114.36 |
| PerC-C&W (ours) | 3×100 | 100.0 | 2.77 | 14.29 | 150.44 |
| | 5×200 | 100.0 | 1.48 | 12.06 | 83.93 |
| | 9×1000 | 100.0 | 1.22 | 15.57 | 67.79 |
| DDN [rony2019decoupling] | 100 | 100.0 | 1.00 | 7.84 | 136.11 |
| | 300 | 100.0 | 0.88 | 7.58 | 120.12 |
| | 1000 | 100.0 | 0.82 | 7.62 | 111.65 |
| PerC-AL (ours) | 100 | 100.0 | 1.30 | 11.98 | 69.49 |
| | 300 | 100.0 | 1.17 | 13.97 | 61.21 |
| | 1000 | 100.0 | 1.13 | 17.04 | 57.10 |

Table 1: Success rates and perturbation sizes on the 1000 images from the ImageNet-Compatible dataset, with varied budgets in the targeted setting. Perturbation size is quantified in terms of L2 and L∞ norms of the perturbations in RGB space ($\overline{L_2}$ and $\overline{L_\infty}$) and also in terms of image-level accumulated perceptual color difference ($\overline{C_2}$). Note that C&W and PerC-C&W actually need more (here, 5×) iterations to find the optimal initialization of λ. The budget for I-FGSM varies across images.

5.2 Adversarial strength and imperceptibility

In this section, we investigate the adversarial strength and imperceptibility of the perturbed images by different approaches in a white-box scenario, where the full information of the network is accessible.

5.2.1 Sufficient-confidence adversarial examples

We first present, in Table 1, a comparison demonstrating how PerC approaches relax Lp norms. Our comparison uses adversarial examples created under a commonly used condition where the aim is to achieve a just sufficient adversarial effect. Sufficient-confidence adversarial examples just cross the decision boundary without pursuing a higher confidence score for the adversarial label. As expected, all approaches achieve 100% success rate and the resulting perturbation size gets smaller as the budget increases.

Table 1 confirms that PerC approaches, PerC-C&W and PerC-AL, show the behavior they are designed for, i.e., decreasing the average accumulated perceptual color difference $\overline{C_2}$. More importantly, PerC approaches do this without tightly constraining the Lp norms in RGB space as the other approaches do, as reflected by $\overline{L_2}$ and $\overline{L_\infty}$. Moreover, PerC-AL achieves lower $\overline{C_2}$ than PerC-C&W (57.10 vs. 67.79) with notably fewer iterations. For comparison, we provide $\overline{C_2}$ for the RGB approaches. The untargeted results follow a similar pattern and can be found in the supplementary material.

| Approach | κ=20: Suc. (%) | κ=20: $\overline{C_2}$ | κ=40: Suc. (%) | κ=40: $\overline{C_2}$ |
|---|---|---|---|---|
| I-FGSM [kurakin2016adversarial] | 100.0 | 375.74 | 99.9 | 576.06 |
| C&W [carlini2017towards] | 100.0 | 159.00 | 100.0 | 241.92 |
| DDN [rony2019decoupling] | 100.0 | 150.68 | 98.1 | 238.37 |
| PerC-C&W (ours) | 100.0 | 90.86 | 100.0 | 136.22 |
| PerC-AL (ours) | 100.0 | 75.43 | 100.0 | 115.17 |

Table 2: Evaluation of the success rate and perceptual color difference achieved by different approaches under the high-confidence condition.

Figure 3: Examples of adversarial images generated by five different approaches with high confidence level κ=40.

5.2.2 High-confidence adversarial examples

In order to gain deeper insight into the performance of our approaches, we investigate adversarial examples that have a high confidence score for the adversarial label. High confidence was initially investigated by [carlini2017towards] in order to achieve more transferable adversarial examples, and has also been explored in the “Unrestricted Adversarial Example” contest [brown2018unrestricted]. An approach is regarded as successful only if the logit with respect to the original class becomes lower than the maximum of the other logits by a pre-defined margin κ. For C&W and our PerC-C&W, this requirement can be directly implemented by specifying the factor κ in Eq. (4). For I-FGSM, DDN and PerC-AL, this can be achieved by running the iterations until the required logit difference is satisfied. For this experiment, we adopt the settings generating the smallest perturbations for each approach in Section 5.2.1.

Fig. 3 shows some adversarial examples generated by different approaches at κ=40. The images produced by our PerC approaches look more visually acceptable than those of the other approaches. More examples can be found in our GitHub repository (https://github.com/ZhengyuZhao/PerC-Adversarial). The good visual appearance of the PerC examples is consistent with their low averaged aggregated perceptual color difference, $\overline{C_2}$, as seen in Table 2, which shows both κ=40 and κ=20 values. The challenge of the high-confidence setting is seen in the success rates, which are no longer perfect for all conditions.

Figure 4: Evaluation of robustness of high-confidence adversarial examples at (a) κ=20 and (b) κ=40, against two types of image transformations: JPEG compression (top row) and bit-depth reduction (bottom row).

5.3 Robustness

In order to gain additional practical insight, we test the robustness of the adversarial examples against two commonly studied image transformation-based defense methods, i.e., JPEG compression [dziugaite2016study, guo2017countering, das2018shield, dong2019evading] and bit-depth reduction [xu2017feature, guo2017countering, he2017adversarial].
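Of the two transformations, bit-depth reduction is simple enough to sketch directly. The rounding rule below is the one commonly used for this transformation and is an assumption about the exact variant used here; images are taken to be float arrays in [0,1], and `reduce_bit_depth` is a hypothetical name.

```python
import numpy as np

def reduce_bit_depth(x, bits):
    """Quantize values in [0, 1] to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    return np.round(np.asarray(x, dtype=np.float64) * levels) / levels

x = np.array([0.0, 0.1, 0.4, 0.6, 0.9, 1.0])
squeezed = reduce_bit_depth(x, bits=1)      # only the values 0 and 1 remain
print(squeezed)
```

An adversarial example is considered robust to the transformation if it is still misclassified after being passed through it.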

The results are shown in Fig. 4. Overall, increasing κ from 20 to 40 leads to improved robustness. For a specific κ, unsurprisingly, I-FGSM outperforms other approaches by a large margin since it greedily perturbs all the pixels, but at the cost of worse image quality (see Fig. 3). Among the other four approaches that target minimal image-level accumulated image difference with very sparse perturbations, the best results are consistently achieved by either our PerC-C&W or PerC-AL. Specifically, PerC-C&W outperforms the original C&W in all cases, while PerC-AL consistently outperforms DDN. Recall that our PerC approaches cause fewer visual distortions, as shown in Fig. 3, contributing to imperceptibility.

5.4 Transferability

Existing research [tramer2017ensemble, liu2016delving] has demonstrated that the adversarial effect of some examples optimized for a specific network may transfer to another network. We test the transferability of different approaches from the original Inception V3 to three other pre-trained networks, namely, GoogLeNet [szegedy2016rethinking], ResNet-152 [he2016deep], and VGG-16 [simonyan2014very]. Specifically, an untargeted adversarial example generated for the original model is regarded as transferable to a new model if it also induces misclassification by that model.

It is less meaningful to analyze adversarial perturbations when the original image, without any added perturbation, already receives a different prediction from the new model. We therefore consider only the images for which all four studied networks yield the same original prediction.
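The evaluation protocol described above can be sketched as follows. This is an illustrative outline only, assuming predictions are stored as lists of class labels keyed by model name; the names `eligible_indices` and `transfer_success_rate` and the toy labels are hypothetical.

```python
def eligible_indices(clean_preds):
    """Indices of images on which all models agree for the clean input.

    `clean_preds` maps a model name to its list of predicted labels.
    """
    models = list(clean_preds)
    num_images = len(clean_preds[models[0]])
    return [i for i in range(num_images)
            if len({clean_preds[m][i] for m in models}) == 1]


def transfer_success_rate(clean_preds, adv_preds, target_model):
    """Percentage of eligible images whose adversarial version is
    misclassified by `target_model` (untargeted transfer)."""
    indices = eligible_indices(clean_preds)
    if not indices:
        return 0.0
    fooled = sum(adv_preds[target_model][i] != clean_preds[target_model][i]
                 for i in indices)
    return 100.0 * fooled / len(indices)


# Toy example: the fourth image is excluded because the models disagree on it.
clean = {"source": [1, 2, 3, 4], "target": [1, 2, 3, 9]}
adversarial = {"target": [1, 7, 8, 0]}
rate = transfer_success_rate(clean, adversarial, "target")
```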

The success rates under transferability for different approaches on all the eligible images (494 in total) are reported in Table 3. I-FGSM again outperforms the other approaches, but uses excessive perturbations. Among the other approaches, we can observe that the best results are always achieved by one of our two PerC approaches.

| Approach | GoogLeNet κ=20 | GoogLeNet κ=40 | VGG-16 κ=20 | VGG-16 κ=40 | ResNet-152 κ=20 | ResNet-152 κ=40 |
|---|---|---|---|---|---|---|
| I-FGSM [kurakin2016adversarial] | 3.4 | 5.3 | 6.5 | 11.9 | 7.5 | 9.9 |
| C&W [carlini2017towards] | 1.8 | 2.8 | 3.9 | 5.9 | 4.5 | 5.1 |
| DDN [rony2019decoupling] | 1.0 | 2.0 | 4.5 | 6.7 | 4.3 | 5.1 |
| PerC-C&W (ours) | 2.2 | 3.9 | 4.3 | 8.1 | 5.5 | 6.5 |
| PerC-AL (ours) | 1.6 | 3.4 | 5.1 | 7.9 | 5.3 | 7.3 |

Table 3: Success rates (%) of adversarial examples at two high confidence levels κ=20 and κ=40, achieved by different approaches under transferability from the original Inception V3 to three other networks.

5.5 Assembling structural information

We explore the possibility of incorporating structural information to further improve imperceptibility without impacting the adversarial strength. Specifically, we introduce a texture complexity matrix 𝝈 as a weighting term into our efficient PerC-AL framework. Following existing work [luo2018towards, Croce_2019_ICCV] on addressing imperceptibility with respect to image structures, this matrix is obtained from the standard deviation of the values in the neighbourhood (here a 3×3 square) of each image coordinate. The top 5% highest values in the map are clipped for stability, and the map is normalized into the range [0,1] before use.
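The construction of the texture complexity map can be sketched as below. Several details are assumptions made for illustration rather than taken from the paper: the image is treated as a single-channel 2D list, border pixels use only the in-bounds part of the 3×3 window, "clipping the top 5%" is read as capping values at the 95th-percentile value, and normalization is min-max. The function name `texture_complexity_map` is hypothetical.

```python
import math


def texture_complexity_map(gray, clip_fraction=0.05):
    """Local standard deviation in a 3x3 window, clipped and scaled to [0, 1]."""
    height, width = len(gray), len(gray[0])
    sigma = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Assumed border handling: keep only in-bounds neighbours.
            window = [gray[i][j]
                      for i in range(max(0, r - 1), min(height, r + 2))
                      for j in range(max(0, c - 1), min(width, c + 2))]
            mean = sum(window) / len(window)
            variance = sum((v - mean) ** 2 for v in window) / len(window)
            sigma[r][c] = math.sqrt(variance)

    # Cap the highest `clip_fraction` of values at the cut-off value.
    flat = sorted(v for row in sigma for v in row)
    cap = flat[max(0, math.ceil(len(flat) * (1 - clip_fraction)) - 1)]
    low = flat[0]
    span = cap - low
    if span == 0:
        return [[0.0] * width for _ in range(height)]  # flat image
    return [[(min(v, cap) - low) / span for v in row] for row in sigma]


# Example: a vertical edge between a dark half and a bright half.
edge_image = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
edge_map = texture_complexity_map(edge_image)
```

In this toy image the map is 0 in the uniform columns and reaches 1 next to the edge, which is the behaviour the weighting term relies on.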

Concretely, this approach adjusts step 8 in Algorithm 4.1 to:

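$$C_2 \leftarrow \left\| (\mathbf{1}-\boldsymbol{\sigma})\,\Delta E_{00}(\boldsymbol{x}, \boldsymbol{x}_{k-1}) \right\|_2, \tag{10}$$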

where $C_2$ also becomes sensitive to image differences in terms of local visual variation. As shown in Fig. 5, with the help of additional structural information, perturbations in the smooth regions are suppressed, while more changes, which are hardly perceived, are triggered in areas with rich visual variation. It is worthwhile for future work to investigate the effectiveness of this combined approach in more detail.
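As a rough illustration of the weighted quantity in Eq. (10), the sketch below computes the L2 norm of a per-pixel color difference map after element-wise weighting by (1 − 𝝈). It assumes that the ΔE00 map has already been computed elsewhere and is passed in as a 2D list, and that the weighting is applied element-wise; the function name `weighted_c2` is hypothetical.

```python
import math


def weighted_c2(delta_e, sigma):
    """L2 norm of the color difference map, down-weighted by (1 - sigma).

    `delta_e` is a precomputed per-pixel Delta E00 map and `sigma` a
    texture complexity map in [0, 1], both 2D lists of the same shape.
    """
    total = 0.0
    for de_row, sigma_row in zip(delta_e, sigma):
        for de, s in zip(de_row, sigma_row):
            # Differences in highly textured pixels (s near 1) count less.
            total += ((1.0 - s) * de) ** 2
    return math.sqrt(total)


# Example: the second pixel lies in a fully textured region and is ignored.
unweighted = weighted_c2([[3.0, 4.0]], [[0.0, 0.0]])
weighted = weighted_c2([[3.0, 4.0]], [[0.0, 1.0]])
```

With 𝝈 equal to zero everywhere the value reduces to the plain L2 norm of the map, so a larger perturbation in textured regions is penalized less than the same perturbation in smooth regions.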

Figure 5: Adversarial examples at κ=40 of an image that contains both smooth and textured regions, generated by PerC-AL (top) and PerC-AL plus structure (bottom).

6 Conclusion and Outlook

This paper has demonstrated the usefulness of perceptual color distance for creating large but imperceptible adversarial image perturbations. We have proposed two approaches to creating adversarial images, PerC-C&W and PerC-AL. Our experimental investigation of these approaches shows that perceptual color distance is able to improve imperceptibility, especially in smooth, saturated regions. We show that these approaches have perturbations with larger RGB Lp norms than approaches that perturb directly in RGB space. This effect translates into adversarial strength, i.e., the ability of the perturbations to fool a classifier.

Our work contributes to recent research that seeks to create adversarial images that are imperceptible to the human eye. Such research has been carried out in the areas of security [carlini2017towards, eykholt2017robust, gragnaniello2019perceptual, kurakin2016adversarial, papernot2016limitations] (defending the inference of a legitimate classifier) and privacy [mirjalili2018gender, oh2017adversarial, choi2017geo, liu2019s] (preventing the inference of an illegitimate classifier). In the security area, imperceptible perturbations can mean that adversarial images can poison the training data without being noticed by human annotators. In the privacy area, imperceptible perturbations mean wider acceptance of the use of adversarial images to protect against classification attacks.

In the future, we will continue to consider perceptual color in adversarial images from both the privacy and the security angle. Our first direction will be related to the fact that neither conventional RGB perturbations nor PerC perturbations perform well in smooth regions with low saturation. We would like to develop techniques that can make perturbations imperceptible, or unnecessary, in such regions. Our second direction will be related to robustness. Here, we have looked at robustness as it is conventionally studied in the literature on adversarial image examples. However, since PerC-based approaches use perceptual color distance, it could be possible to mitigate PerC-based perturbations by limiting bit depth in perceptual color space. With regard to this possibility, we point out that in order to counteract the effect of PerC perturbations in this way, it is necessary to be able to infer that they have been applied to an image. For this reason, our future work will also investigate ways to detect that an image contains PerC perturbations, and new varieties of PerC perturbations that minimize the effectiveness of such detection.

References

Supplementary Material

| Approach | Budget | λ (targeted) | λ (untargeted) |
|---|---|---|---|
| C&W [carlini2017towards] | 3×100 | 1 | 0.1 |
| | 5×200 | 1 | 1 |
| | 9×1000 | 1 | 1 |
| PerC-C&W (ours) | 3×100 | 10 | 100 |
| | 5×200 | 10 | 100 |
| | 9×1000 | 10 | 10 |

Table 4: Selected initializations of λ via grid search.

| Approach | Budget | Success Rate (%) | $\overline{L_2}$ | $\overline{L_\infty}$ | $\overline{C_2}$ |
|---|---|---|---|---|---|
| I-FGSM [kurakin2016adversarial] | - | 100.0 | 1.94 | 1.02 | 255.92 |
| C&W [carlini2017towards] | 3×100 | 100.0 | 0.69 | 3.61 | 88.76 |
| | 5×200 | 100.0 | 0.45 | 3.79 | 59.88 |
| | 9×1000 | 100.0 | 0.41 | 3.74 | 54.17 |
| PerC-C&W (ours) | 3×100 | 100.0 | 1.47 | 6.78 | 78.25 |
| | 5×200 | 100.0 | 0.90 | 6.71 | 51.35 |
| | 9×1000 | 100.0 | 0.56 | 6.58 | 33.00 |
| DDN [rony2019decoupling] | 100 | 100.0 | 0.35 | 4.03 | 49.43 |
| | 300 | 100.0 | 0.33 | 4.08 | 47.58 |
| | 1000 | 100.0 | 0.32 | 4.11 | 46.51 |
| PerC-AL (ours) | 100 | 100.0 | 0.53 | 5.58 | 30.39 |
| | 300 | 100.0 | 0.50 | 6.93 | 27.65 |
| | 1000 | 100.0 | 0.51 | 8.92 | 26.62 |

Table 5: Success rates and perturbation sizes on the 1000 images from the ImageNet-Compatible dataset, with varied budgets in the targeted setting. Perturbation size is quantified in terms of the $L_2$ and $L_\infty$ norms of the perturbations in RGB space ($\overline{L_2}$ and $\overline{L_\infty}$) and also in terms of image-level accumulated perceptual color difference ($\overline{C_2}$). For this relatively easy untargeted case, PerC-AL is initialized with αc=0.1.