Towards Large yet Imperceptible Adversarial Image Perturbations with Perceptual Color Distance
Abstract
The success of image perturbations that are designed to fool image classification is assessed in terms of both adversarial effect and visual imperceptibility. In this work, we investigate the contribution of human color perception to perturbations that are not noticeable. Our basic insight is that perceptual color distance makes it possible to drop the conventional assumption that imperceptible perturbations should strive for small ${L}_{p}$ norms in RGB space. Our first approach, Perceptual Color distance C&W (PerC-C&W), extends the widely used C&W approach and produces larger RGB perturbations. PerC-C&W is able to maintain adversarial strength while contributing to imperceptibility. Our second approach, Perceptual Color distance Alternating Loss (PerC-AL), achieves the same outcome, but does so more efficiently by alternating between the classification loss and perceptual color difference when updating perturbations. Experimental evaluation shows that PerC approaches improve the robustness and transferability of perturbations over conventional approaches, and also demonstrates that PerC distance can provide added value on top of existing structure-based approaches to creating image perturbations.
1 Introduction
Research on creating adversarial examples for deep visual classifiers has focused on perturbations that cause misclassification while being imperceptible to the human eye [szegedy2013intriguing, papernot2016limitations, carlini2017towards]. Larger image perturbations are known to improve adversarial strength (i.e., the ability to fool a classifier), but are also associated with visually noticeable changes in the image. A commonly agreed-upon assumption is that tight ${L}_{p}$-norm constraints on the size of adversarial perturbations in RGB space are a good guarantee of imperceptibility. Evaluation of adversarial examples has conventionally followed this assumption, considering perturbations with smaller ${L}_{p}$ norms to be better (e.g., ${L}_{\mathrm{\infty}}$ [goodfellow2014explaining, kurakin2016adversarial, carlini2017towards], ${L}_{2}$ [szegedy2013intriguing, moosavi2016deepfool, carlini2017towards] and ${L}_{0}$ [papernot2016limitations, carlini2017towards]). In keeping with this assumption, defense approaches are designed to be effective against adversarial perturbations under a specific ${L}_{p}$ bound [tramer2017ensemble, madry2017towards, wong2018provable, cohen2019certified]. Our research is motivated by the importance of questioning whether small RGB perturbations are necessary for imperceptibility.
In this work, we propose to create adversarial examples by perturbing images with respect to perceptual color (PerC) distance. Using PerC distance makes it possible to move away from the assumption that it is necessary to tightly constrain the ${L}_{p}$ norm of the perturbations in RGB space. Fig. 1 illustrates the difference between C&W [carlini2017towards], a well-known approach that perturbs with respect to an ${L}_{p}$ norm in RGB space, and our own extension, PerC-C&W, which perturbs with respect to a perceptual color distance. PerC perturbations are less perceptible, especially in smooth regions of saturated color (cf. bottom row of Fig. 1). Also, they are distributed strategically over the RGB color channels (cf. downsized perturbation images in the middle row). PerC distance effectively allows us to hide large perturbations in RGB space in a way that is not readily noticeable to the human eye. Our PerC-based approaches can increase the ${L}_{p}$ norm substantially (cf. Fig. 1, ${L}_{2}$ and ${L}_{\mathrm{\infty}}$ in middle row), leading to a strong adversarial effect that maintains imperceptibility.
Fig. 2 motivates the use of perceptual color distance for creating adversarial images. Here, we have taken a solid color image (left) and added the same perturbations to the green channel (middle) and to the blue channel (right). Although both RGB channels were perturbed identically, the perturbations are only visible in the green channel. The reason is that color as it is perceived by the human eye does not change uniformly over distance in RGB space. Relatively small perturbations in RGB space may correspond to a large difference in perceptual color space. Conversely, relatively large changes in RGB space may remain unnoticeable if they lead to a small perceived color difference.
Our work is in line with a growing awareness in the literature on adversarial examples that the difference between two images as measured by an ${L}_{p}$ norm in RGB space is actually quite poorly aligned with human perception [sharif2018suitability]. Building on this observation, researchers have attempted to address imperceptibility by exploiting similarity defined with respect to semantics [engstrom2017rotation, hosseini2018semantic, sharif2019adversarial, joshi2019semantic, eykholt2017robust] or structural information [luo2018towards, gragnaniello2019perceptual, zhang2019smooth, Croce_2019_ICCV] in the image. However, little work on adversarial examples has questioned the wisdom of optimizing perturbations with respect to distance in RGB space. The exceptions are a handful of approaches that have proposed allowing only luminance change when perturbing pixels [gragnaniello2019perceptual, Croce_2019_ICCV]. The approach that is closest to our own is [athalye2018synthesizing], which perturbs in CIELAB color space, but carries out no investigation of the potential and limitations of the idea. Our work is distinct from this initial effort because we use a more accurate polar form (known as CIELCH) of the CIELAB color space, and more importantly, use an actual perceptual color distance. The distance is CIEDE2000 [luo2001development, ciede2000], and will be discussed in detail in Section 2. To our knowledge, ours is the first work that proposes optimizing adversarial image perturbations directly with respect to a perceptual color distance.
To fully appreciate our proposal, it is important to understand its scope. We do not claim that PerC approaches will always yield dramatically less perceptible perturbations than conventional RGB approaches. When the perturbations are small, the difference may not be great. However, we find that there are two cases in which PerC approaches are particularly important. First, our experimental results (see Section 5.2.2) show that as we attempt to create high-confidence adversarial examples that contain larger and larger perturbations, it becomes important to perturb with respect to perceptual color distance. Second, we demonstrate that the effect of PerC approaches is additive and can be used in combination with existing structural approaches to improve imperceptibility.
The contributions of this paper are as follows:

• An in-depth study of the use of perceptual color (PerC) distance to hide large RGB perturbations in images.

• PerC-C&W: a method for creating adversarial images that introduces perceptual color distance into the joint optimization of C&W.

• PerC-AL: an efficient method that optimizes alternating loss (AL) functions, switching between classification loss and perceptual color difference.

• Experimental validation demonstrating that PerC perturbations at high-confidence settings yield more robust and transferable adversarial examples, without sacrificing imperceptibility.

• Experimental results showing that PerC perturbations can be used in combination with structural information for further improvement of imperceptibility.
We release the code, including a differentiable implementation of perceptual color distance (CIEDE2000) that is compatible with PyTorch's autograd (code available at https://github.com/ZhengyuZhao/PerCAdversarial).
2 Background on Perceptual Color Distance
Computer vision research has intensively explored color and human perception, but has paid surprisingly little attention to distance in perceptual color spaces. Here, we mention some key points about color in computer vision history. Early on, research focused on intensity-based descriptors, which then evolved to also capture color information. Unsurprisingly, color boosted the performance of object and scene recognition [khan2012color, van2009evaluating] and semantic segmentation [cheng2001color]. Researchers extracted descriptors from opponent color spaces, most notably HSV and CIELAB, which separate luminance and chrominance. Most recently, color has been attracting more attention in the area of image synthesis. Notable examples, such as style transfer [gatys2016preserving] and cross-domain image generation [taigman2016unsupervised], find that color plays an important role in preserving the look of an image. In general, we observe that until now the focus has been on the color space itself, and not on color distance, which we explore here.
The perceptual color distance that we use is CIEDE2000 [luo2001development, ciede2000], which is the latest $\mathrm{\Delta}E$ standard formula developed by the CIE (International Commission on Illumination), and has been experimentally demonstrated to have strong agreement with human perception. Specifically, the perceptual color distance between two pixels in the CIELCH space can be calculated as:
$$\begin{array}{c}\hfill \mathrm{\Delta}{E}_{00}=\sqrt{{(\frac{\mathrm{\Delta}{L}^{\prime}}{{k}_{L}{S}_{L}})}^{2}+{(\frac{\mathrm{\Delta}{C}^{\prime}}{{k}_{C}{S}_{C}})}^{2}+{(\frac{\mathrm{\Delta}{H}^{\prime}}{{k}_{H}{S}_{H}})}^{2}+\mathrm{\Delta}R},\hfill \\ \hfill \mathrm{\Delta}R={R}_{T}(\frac{\mathrm{\Delta}{C}^{\prime}}{{k}_{C}{S}_{C}})(\frac{\mathrm{\Delta}{H}^{\prime}}{{k}_{H}{S}_{H}}),\hfill \end{array}$$  (1) 
where $\mathrm{\Delta}{L}^{\prime}$, $\mathrm{\Delta}{C}^{\prime}$ and $\mathrm{\Delta}{H}^{\prime}$ denote the differences between the pixel values in the three channels, L (lightness), C (chroma) and H (hue), and $\mathrm{\Delta}R$ is an interaction term between the chroma and hue differences [luo2001development]. The weighting functions ${S}_{L}$, ${S}_{C}$, ${S}_{H}$ and ${R}_{T}$ are determined based on large-scale human studies and act as compensations to better simulate human color perception. The parametric factors ${k}_{L}$, ${k}_{C}$ and ${k}_{H}$ are usually set to unity for graphic arts applications. Detailed definitions of all the parameters and relevant explanations can be found in [luo2001development]. We note that it is also possible to use an ${L}_{p}$ norm to measure distance in CIELAB space. However, this distance is not as close to human perceptual distance as CIEDE2000 is.
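To make the simpler alternative mentioned above concrete, the following is a minimal NumPy sketch of the Euclidean (${L}_{2}$) distance in CIELAB space. It is not CIEDE2000 and is not taken from the released code: it omits the weighting functions ${S}_{L}$, ${S}_{C}$, ${S}_{H}$ and the rotation term ${R}_{T}$. The function names `srgb_to_lab` and `lab_l2_distance` are illustrative, and the sketch assumes sRGB input in [0, 1] with a D65 reference white.

```python
import numpy as np

# Standard sRGB (D65) to XYZ matrix and D65 reference white.
_RGB2XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                     [0.2126729, 0.7151522, 0.0721750],
                     [0.0193339, 0.1191920, 0.9503041]])
_WHITE = np.array([0.95047, 1.0, 1.08883])


def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] (last axis = R, G, B) to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma to obtain linear RGB.
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear RGB -> XYZ, normalized by the reference white.
    xyz = lin @ _RGB2XYZ.T / _WHITE
    # CIELAB companding function (cube root with a linear toe).
    eps = (6 / 29) ** 3
    f = np.where(xyz > eps, np.cbrt(xyz), xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)


def lab_l2_distance(rgb1, rgb2):
    """Per-pixel Euclidean distance in CIELAB (a stand-in, not CIEDE2000)."""
    return np.linalg.norm(srgb_to_lab(rgb1) - srgb_to_lab(rgb2), axis=-1)
```

Applied to two images of shape (H, W, 3), `lab_l2_distance` returns an (H, W) map of per-pixel distances, which is the same shape of output that a per-pixel $\mathrm{\Delta}{E}_{00}$ computation would produce.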
We point out that a limited amount of previous research has also adopted CIEDE2000. However, the goal has been to evaluate the color similarity of image pairs. Examples of such research include work on image quality assessment [yang2012color] and image super-resolution [liu2010colorization]. In contrast, in our work we use CIEDE2000 directly for optimization with backpropagation and not only for evaluation.
3 Related work
In this section, we cover the existing literature, which focuses on creating ${L}_{p}$ normbounded adversarial examples, and we also mention recent approaches that attempt to move beyond ${L}_{p}$ norms. We preface our discussion with a short definition of an ‘adversary’, i.e., an approach that generates an adversarial image example. Given a classifier $f(\bm{x}):\bm{x}\in {\mathbb{R}}^{n}\to y\in \mathbb{R}$ that predicts a label $y$ for an image $\bm{x}$, the adversary attempts to induce a misclassification by modifying the original $\bm{x}$ to create a new ${\bm{x}}^{\prime}$. In the untargeted setting, the adversary is successful if the image is classified into an arbitrary class other than $y$, i.e., meets the condition $f({\bm{x}}^{\prime})\ne y$. In the targeted setting, the adversary must ensure that the image is classified into a class with a predefined label $t$, i.e., meets the condition $f({\bm{x}}^{\prime})=t$. The untargeted case is generally recognized to be less challenging than the targeted case [carlini2017towards].
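The two success conditions defined above can be written as a small check. This is a sketch only; the function name `is_successful` and the convention of passing `target_label=None` for the untargeted setting are assumptions made for illustration.

```python
import numpy as np


def is_successful(logits, true_label, target_label=None):
    """Return True if the prediction meets the adversary's goal.

    Untargeted (target_label is None): prediction differs from true_label.
    Targeted: prediction equals target_label.
    """
    pred = int(np.argmax(logits))
    if target_label is None:
        return pred != true_label
    return pred == target_label
```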
3.1 ${L}_{p}$ norm-bounded Adversarial Examples
Typically, adversaries [szegedy2013intriguing, kurakin2016adversarial, goodfellow2014explaining, moosavi2016deepfool, papernot2016limitations, carlini2017towards, rony2019decoupling] create an adversarial image, ${\bm{x}}^{\prime}$, by adding a perturbation vector $\bm{\delta}\in {\mathbb{R}}^{n}$ that is constrained by an ${L}_{p}$ norm to the original image, $\bm{x}$. The first ${L}_{p}$ norm-bounded approach [szegedy2013intriguing] optimized an objective combining the classification loss and the ${L}_{2}$ norm of the perturbations, balanced by a constant $\lambda $. Formally, the solution is expressed as:
$$\underset{\bm{\delta}}{\mathrm{minimize}}\ \lambda {\parallel \bm{\delta}\parallel}_{2}-J({\bm{x}}^{\prime},y),\quad \text{s.t.}\ {\bm{x}}^{\prime}\in {[0,1]}^{n},$$  (2) 
where $J({\bm{x}}^{\prime},y)$ is the cross-entropy loss w.r.t. ${\bm{x}}^{\prime}$. The authors of [szegedy2013intriguing] solved the problem using the box-constrained L-BFGS (limited-memory Broyden-Fletcher-Goldfarb-Shanno) method [liu1989limited].
The C&W method [carlini2017towards] improves on [szegedy2013intriguing] by introducing a new variable using the tanh function to eliminate the box constraint. Additionally, it introduces a more sophisticated objective function that optimizes differences between logits $Z$, which are output before the softmax layer. This can be formulated as:
$$\begin{array}{c}\hfill \underset{\bm{w}}{\mathrm{minimize}}\ {\parallel {\bm{x}}^{\prime}-\bm{x}\parallel}_{2}^{2}+\lambda f({\bm{x}}^{\prime}),\hfill \\ \hfill \text{where}\ f({\bm{x}}^{\prime})=\mathrm{max}(\mathrm{max}\{Z{({\bm{x}}^{\prime})}_{i}:i\ne t\}-Z{({\bm{x}}^{\prime})}_{t},-\kappa ),\hfill \\ \hfill \text{and}\ {\bm{x}}^{\prime}=\frac{1}{2}(\mathrm{tanh}(\mathrm{arctanh}(\bm{x})+\bm{w})+1),\hfill \end{array}$$  (3) 
where $\bm{w}$ is the new variable and $Z{({\bm{x}}^{\prime})}_{i}$ denotes the logit with respect to the $i$-th class. In an untargeted setting, the definition of $f$ is modified to:
$$f({\bm{x}}^{\prime})=\mathrm{max}(Z{({\bm{x}}^{\prime})}_{y}-\mathrm{max}\{Z{({\bm{x}}^{\prime})}_{i}:i\ne y\},-\kappa ).$$  (4) 
The parameter $\kappa $ controls the confidence level of the misclassification. The first approach that we propose, PerC-C&W, is built on C&W. In our experiments, we will vary $\kappa $ in order to assess the ability of an adversary to create strong adversarial images, i.e., images that are misclassified with high confidence.
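As an illustration of the logit-margin term $f$ in Eqs. (3) and (4), the sketch below evaluates it for a single logit vector with NumPy. It follows the standard C&W formulation, in which the margin is clamped from below at $-\kappa$; the function name `cw_margin_loss` and its argument layout are assumptions for this example, not the authors' code.

```python
import numpy as np


def cw_margin_loss(logits, label, kappa=0.0, targeted=True):
    """Logit-margin term f of C&W for one example.

    targeted:   label is the target class t; f = max(max_{i != t} Z_i - Z_t, -kappa)
    untargeted: label is the true class y;   f = max(Z_y - max_{i != y} Z_i, -kappa)
    """
    z = np.asarray(logits, dtype=np.float64)
    others = np.delete(z, label)  # logits of all classes except `label`
    if targeted:
        margin = others.max() - z[label]
    else:
        margin = z[label] - others.max()
    # The loss stops decreasing once the margin reaches -kappa.
    return max(float(margin), -float(kappa))
```

A larger $\kappa$ lowers the floor of $f$, so the optimizer keeps pushing the logits apart until the adversarial class leads by at least $\kappa$.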
Because a line search is needed to find the optimal constant $\lambda $, this kind of optimization is inevitably time-consuming. For this reason, [goodfellow2014explaining, kurakin2016adversarial, rony2019decoupling] propose more efficient solutions that do not impose a penalty during optimization. Instead, the norm constraint is enforced by projecting perturbations onto an $\epsilon$-sphere around the original image. Specifically, the fast gradient sign method (FGSM) [goodfellow2014explaining] was first proposed to achieve adversarial effect with only one step, formulated as:
$${\bm{x}}^{\prime}=\bm{x}+\epsilon \cdot \mathrm{sign}({\nabla}_{\bm{x}}J(\bm{x},y)),$$  (5) 
where the perturbation size is implicitly constrained by specifying a small $\epsilon$.
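A minimal sketch of the single FGSM step in Eq. (5) follows. It assumes the gradient of the loss with respect to the input has already been computed by some framework and is passed in as an array. The final clipping to [0, 1] is an addition that keeps the result a valid image and is not part of Eq. (5); the name `fgsm_step` is illustrative.

```python
import numpy as np


def fgsm_step(x, grad, eps):
    """One FGSM step: move every pixel by eps in the sign of the gradient."""
    x_adv = np.asarray(x, dtype=np.float64) + eps * np.sign(grad)
    # Assumption: images live in [0, 1], so clip to stay in the valid range.
    return np.clip(x_adv, 0.0, 1.0)
```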
Subsequently, an extension of this method referred to as I-FGSM [kurakin2016adversarial] was introduced for leveraging finer gradient information by iteratively updating the perturbations with a smaller step size $\alpha $:
$${\bm{x}}_{0}^{\prime}=\bm{x},\quad {\bm{x}}_{k}^{\prime}={\bm{x}}_{k-1}^{\prime}+\alpha \cdot \mathrm{sign}({\nabla}_{\bm{x}}J({\bm{x}}_{k-1}^{\prime},y)),$$  (6) 
where the intermediate perturbed image ${\bm{x}}_{k}^{\prime}$ is projected onto an $\epsilon$-sphere around the original $\bm{x}$, to satisfy the ${L}_{\mathrm{\infty}}$-norm constraint. I-FGSM can also generalize to the ${L}_{2}$ norm by changing the $\mathrm{sign}$ operation to:
$$\frac{{\nabla}_{\bm{x}}J({\bm{x}}_{k-1}^{\prime},y)}{{\parallel {\nabla}_{\bm{x}}J({\bm{x}}_{k-1}^{\prime},y)\parallel}_{2}},$$  (7) 
where the projection is implemented by:
$${\bm{x}}_{k}^{\prime}=\bm{x}+\epsilon \frac{{\bm{x}}_{k}^{\prime}-\bm{x}}{{\parallel {\bm{x}}_{k}^{\prime}-\bm{x}\parallel}_{2}}.$$  (8) 
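The ${L}_{2}$ variant in Eqs. (7) and (8) can be sketched as two small NumPy helpers: a normalized gradient step and a rescaling of the perturbation to norm $\epsilon$. The names `l2_step` and `project_l2` and the small constant added to the denominators (to avoid division by zero) are assumptions for this example.

```python
import numpy as np


def l2_step(x_adv, grad, alpha):
    """Step of size alpha along the L2-normalized gradient (Eq. 7)."""
    return x_adv + alpha * grad / (np.linalg.norm(grad) + 1e-12)


def project_l2(x_adv, x, eps):
    """Rescale the perturbation x_adv - x to have L2 norm eps (Eq. 8)."""
    delta = x_adv - x
    return x + eps * delta / (np.linalg.norm(delta) + 1e-12)
```

As written in Eq. (8), the projection always places the perturbed image on the sphere of radius $\epsilon$, whether the perturbation was previously shorter or longer than $\epsilon$.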
Recently, an efficient method called Decoupled Direction and Norm (DDN) [rony2019decoupling] was proposed and yielded the best performance (smallest ${L}_{2}$ norm) in the untargeted track of the NIPS 2018 Adversarial Vision Challenge [brendel2018adversarial], with substantially fewer iterations than the conventional C&W. This method is basically ${L}_{2}$ norm-based I-FGSM with $\epsilon$ being adjusted in each iteration based on whether the perturbed image is adversarial or not, leading to a finer-grained search for the minimal norm. Our second approach, PerC-AL, follows a strategy similar to that of DDN to improve efficiency by decoupling the joint optimization.
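The adjustment of the norm bound in DDN can be illustrated with a one-line rule. The multiplicative form below (shrink by a factor $1-\gamma$ when the current image is adversarial, grow by $1+\gamma$ otherwise) reflects our reading of the DDN paper and should be treated as an assumption here, since the text above only states that the bound is adjusted in each iteration. The name `update_eps` is illustrative.

```python
def update_eps(eps, is_adversarial, gamma=0.05):
    """Tighten the L2 bound after a success, loosen it after a failure."""
    if is_adversarial:
        return eps * (1.0 - gamma)
    return eps * (1.0 + gamma)
```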
3.2 Adversarial examples beyond ${L}_{p}$ norms
Our work is part of the current movement away from tight ${L}_{p}$ norms and towards a conceptualization of image similarity in terms of semantics or perceptual properties. Research that defines similarity in terms of semantics requires the adversarial image to have the same content as the original image from the point of view of the human viewer. Some of the first work in this direction has explored geometric transformation [engstrom2017rotation, xiao2018spatially], global color shift [hosseini2018semantic, Laidlaw2019functional, bhattad2019big], and image filters [choi2017geo].
Such approaches are interesting, but we do not pursue them here because they tend to be limited in their adversarial strength, due to the restricted size of the search space for possible adversarial image transformations.
Research that investigates similarity with respect to texture and structure [luo2018towards, gragnaniello2019perceptual, zhang2019smooth, Croce_2019_ICCV] has focused on hiding perturbations in image regions with visual variation. In [luo2018towards, Croce_2019_ICCV], image regions with high variance are used to hide image perturbations. In [gragnaniello2019perceptual], additional supervision of structural similarity (SSIM) [wang2004image] is used to guide the perturbation updates. Other work [zhang2019smooth] has applied Laplacian smoothing to obtain image structure, which is used to modify the image while maintaining the original structure. All of these approaches share a common challenge: they have difficulties in dealing with smooth regions (e.g., sky, ground and artificial objects), which appear frequently in images taken in commonly occurring real-world settings (referred to as natural images). In contrast, our PerC perturbations are applicable in smooth regions in the case of saturated color. Our experiments show that PerC distance can be combined productively with a structure-based approach.
4 Proposed approaches
In this section, we present two approaches to using perceptual color (PerC) distance for adversarial image perturbations. We focus on image-level accumulated perceptual color difference, i.e., the ${L}_{2}$ norm of the color distance vector, in which each component represents the perceptual color distance ($\mathrm{\Delta}{E}_{00}$ in Eq. (1)) calculated for the corresponding image pixel.
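The image-level quantity described above reduces to an ${L}_{2}$ norm over a per-pixel distance map. The sketch below assumes such a map (one $\mathrm{\Delta}{E}_{00}$ value per pixel) has already been computed elsewhere; the function name `accumulated_color_difference` is illustrative.

```python
import numpy as np


def accumulated_color_difference(delta_e_map):
    """L2 norm of the per-pixel color distances, flattened over the image."""
    d = np.asarray(delta_e_map, dtype=np.float64)
    return float(np.sqrt(np.sum(d ** 2)))
```

For a 2 x 2 map with per-pixel distances 3, 0, 0 and 4, the accumulated value is 5.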
4.1 Perceptual color distance penalty (PerC-C&W)
Our first approach, PerC-C&W, adopts the joint optimization framework of the well-known C&W, but replaces the original penalty on the ${L}_{2}$ norm with a new one based on perceptual color difference. It can be formally expressed as:
$$\underset{\bm{w}}{\mathrm{minimize}}{\parallel \mathrm{\Delta}{E}_{00}(\bm{x},{\bm{x}}^{\prime})\parallel}_{2}+\lambda f({\bm{x}}^{\prime}),$$  (9) 
where $\bm{w}$ is the newly introduced variable, as in Eq. (3) of C&W. As in the original C&W, the optimization problem is solved by binary search over the constant $\lambda $. By using the gradient information from the perceptual color difference, the perturbation update is carried out with respect to a perceptually uniform color space. Large RGB perturbations, which have a strong adversarial effect, remain hidden from the human eye, as will be shown in Section 5.
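The binary search over $\lambda$ can be sketched as follows. The inner attack is abstracted as a hypothetical callback `attack(lam)` that runs the gradient-descent iterations for a fixed $\lambda$ and returns whether it succeeded together with the resulting distance. The bracketing rule (halve the interval after a success, multiply by 10 while no successful value is known) is the usual C&W-style heuristic and is an assumption here, not a description of the released code.

```python
def binary_search_lambda(attack, lam_init=1.0, search_steps=9):
    """Search for a small lambda that still yields a successful attack.

    attack(lam) -> (success: bool, distance: float) is a hypothetical callback.
    Returns (best_lambda, best_distance), or (None, inf) if nothing succeeded.
    """
    lo, hi = 0.0, None
    lam = lam_init
    best_lam, best_dist = None, float("inf")
    for _ in range(search_steps):
        success, dist = attack(lam)
        if success:
            if dist < best_dist:
                best_lam, best_dist = lam, dist
            hi = lam  # success: try putting less weight on the adversarial term
            lam = (lo + hi) / 2
        else:
            lo = lam  # failure: put more weight on the adversarial term
            lam = lam * 10 if hi is None else (lo + hi) / 2
    return best_lam, best_dist
```

Each search step costs a full run of the inner optimization, which is why the total budget is the product of search steps and gradient-descent iterations.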
Algorithm 4.1: Perceptual color distance alternating loss (PerC-AL)
Input:
$\bm{x}$: original image, $t$: target label, $K$: number of iterations
${\alpha}_{l}$: step size in minimizing classification loss
${\alpha}_{c}$: step size in minimizing perceptual color difference
Output:
${\bm{x}}^{\prime}$: adversarial image
Initialize ${\bm{x}}_{0}^{\prime}\leftarrow \bm{x}$, ${\bm{\delta}}_{0}\leftarrow \bm{0}$
for $k\leftarrow 1$ to $K$ do
    if ${\bm{x}}_{k-1}^{\prime}$ is not adversarial then
        $\bm{g}\leftarrow -{\nabla}_{\bm{x}}J({\bm{x}}_{k-1}^{\prime},t)$
        $\bm{g}\leftarrow {\alpha}_{l}\cdot \frac{\bm{g}}{{\parallel \bm{g}\parallel}_{2}}$
        ${\bm{\delta}}_{k}\leftarrow {\bm{\delta}}_{k-1}+\bm{g}$   ▷ Update $\bm{\delta}$ in the direction of $\bm{g}$
    else
        ${C}_{2}\leftarrow {\parallel \mathrm{\Delta}{E}_{00}(\bm{x},{\bm{x}}_{k-1}^{\prime})\parallel}_{2}$
        ${\bm{g}}_{c}\leftarrow -{\nabla}_{\bm{x}}{C}_{2}$
        ${\bm{g}}_{c}\leftarrow {\alpha}_{c}\cdot \frac{{\bm{g}}_{c}}{{\parallel {\bm{g}}_{c}\parallel}_{2}}$
        ${\bm{\delta}}_{k}\leftarrow {\bm{\delta}}_{k-1}+{\bm{g}}_{c}$   ▷ Update $\bm{\delta}$ in the direction of ${\bm{g}}_{c}$
    end if
    ${\bm{x}}_{k}^{\prime}\leftarrow \mathrm{clip}(\bm{x}+{\bm{\delta}}_{k},0,1)$
    ${\bm{x}}_{k}^{\prime}\leftarrow \mathrm{quantize}({\bm{x}}_{k}^{\prime})$   ▷ Ensure ${\bm{x}}_{k}^{\prime}$ is valid
end for
return ${\bm{x}}^{\prime}\leftarrow {\bm{x}}_{k}^{\prime}$ that is adversarial and has smallest ${C}_{2}$
4.2 Perceptual color distance alternating loss (PerC-AL)
Although Eq. (9) enjoys a concise expression, the two-term joint optimization of PerC-C&W faces difficulties in practice. Adversarial training [kurakin2016adversarial], for example, presents challenges. The reason is that PerC-C&W requires a time-consuming binary search in order to find an optimal $\lambda $, which normally varies substantially among different images [rony2019decoupling]. To address this inefficiency, we propose PerC-AL, which decouples the joint optimization by alternately updating the perturbations with respect to either classification loss or perceptual color difference. Our strategy is inspired by DDN, which is basically a projected gradient descent (PGD) method with a dynamic ${L}_{2}$-norm bound. However, PerC-AL goes beyond this idea by alternating between two gradient descents.
The full PerC-AL method is described in Algorithm 4.1. We start from an original image $\bm{x}$ with the perturbation $\bm{\delta}$ initialized as $\bm{0}$, and iteratively update it to create an adversarial image. In each iteration, the perturbation is either enlarged to achieve a stronger adversarial effect based on the gradients from the classification loss, or shrunk to minimize perceptual color differences. These two operations are alternated based on whether the intermediate perturbed image ${\bm{x}}_{k}^{\prime}$ is adversarial or not. To ensure that the final adversarial image is valid, the output is clipped into the range [0,1] and quantized into 256 levels (corresponding to 8-bit image encoding).
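The alternating update described above can be sketched in NumPy with the model and the color distance abstracted as callbacks. This is a simplified illustration, not the released implementation: the callbacks `grad_loss`, `grad_color`, `color_dist` and `is_adv` are hypothetical, step sizes are held constant rather than annealed, and the toy demo replaces CIEDE2000 with a plain ${L}_{2}$ distance and the network with a two-class linear classifier.

```python
import numpy as np


def perc_al(x, grad_loss, grad_color, color_dist, is_adv,
            num_iter=100, alpha_l=0.1, alpha_c=0.05):
    """Alternate between classification-loss and color-distance descent."""
    delta = np.zeros_like(x)
    x_adv = x.copy()
    best, best_c2 = None, np.inf
    for _ in range(num_iter):
        if not is_adv(x_adv):
            g, step = -grad_loss(x_adv), alpha_l      # enlarge: descend on loss
        else:
            g, step = -grad_color(x, x_adv), alpha_c  # shrink: descend on color distance
        delta = delta + step * g / (np.linalg.norm(g) + 1e-12)
        # Keep the image valid: clip to [0, 1], then round to 8-bit levels.
        x_adv = np.round(np.clip(x + delta, 0.0, 1.0) * 255) / 255
        if is_adv(x_adv):
            c2 = color_dist(x, x_adv)
            if c2 < best_c2:
                best, best_c2 = x_adv.copy(), c2
    return best, best_c2


# Toy demo (all of this is illustrative): a linear 2-class "model" on 4 pixels.
W = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])
x0 = np.array([0.6, 0.6, 0.4, 0.4])
target = 1


def toy_is_adv(x_adv):
    return int(np.argmax(W @ x_adv)) == target


def toy_grad_loss(x_adv):
    # Gradient of the cross-entropy w.r.t. the input for logits W @ x.
    z = W @ x_adv
    p = np.exp(z - z.max())
    p /= p.sum()
    p[target] -= 1.0
    return W.T @ p


def toy_color_dist(x, x_adv):
    return float(np.linalg.norm(x_adv - x))  # stand-in for accumulated Delta E_00


def toy_grad_color(x, x_adv):
    d = x_adv - x
    return d / (np.linalg.norm(d) + 1e-12)


best, best_c2 = perc_al(x0, toy_grad_loss, toy_grad_color, toy_color_dist, toy_is_adv)
```

Because the update direction switches as soon as the image crosses the decision boundary, the iterate oscillates around the boundary, and the sketch keeps the adversarial iterate with the smallest distance seen so far.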
5 Experiments
In this section, we first provide a picture of the differences between RGB and PerC approaches (Section 5.2). Then, we carry out experiments that compare different approaches in terms of robustness (Section 5.3) and transferability (Section 5.4), by considering the case of high-confidence adversarial examples. Finally, in Section 5.5, we show that structural information can be elegantly integrated into our efficient decoupled approach, PerC-AL, for further improvement in the imperceptibility of images that contain areas with rich visual variation.
5.1 Experimental setup
Dataset and Networks. Following recent work [xiao2018spatially, zhang2019smooth, dong2019evading], we conduct our experiments on the development set (1000 RGB natural images with the size of $299\times 299$) of the ImageNet-Compatible dataset (https://github.com/tensorflow/cleverhans/tree/master/examples/nips17_adversarial_competition/dataset). This dataset was introduced by the NIPS 2017 Competition on Adversarial Attacks and Defenses [kurakin2018adversarial] and consists of 6000 images labeled with 1000 ImageNet classes. We choose this dataset because we would like to study imperceptibility under real-world conditions. In contrast, some other work [luo2018towards, Croce_2019_ICCV] on addressing imperceptibility mainly focuses on the tiny images from MNIST [lecun1998gradient] and CIFAR-10 [krizhevsky2009learning]. As in the competition, the Inception V3 [szegedy2016rethinking] model pretrained on ImageNet is used as the target classifier.
Baselines. Three well-known baselines, namely, I-FGSM [kurakin2016adversarial], C&W [carlini2017towards], and the state-of-the-art DDN [rony2019decoupling], are compared with our approaches. Among them, I-FGSM targets minimum ${L}_{\mathrm{\infty}}$ norm, while C&W and DDN target minimum ${L}_{2}$ norm.
Parameters. I-FGSM is repeated multiple times, with the ${L}_{\mathrm{\infty}}$-norm bound increased by a step size of $\alpha =1/255$ each time, until success.
C&W and PerC-C&W use the Adam optimizer [kingma2014adam] with a learning rate of 0.01 for updating the perturbations. We impose a budget on the number of search steps used to find the optimal $\lambda $. The initialization of $\lambda $ is particularly important for small budgets. We perform a grid search for the initialization value of $\lambda $ over the values {0.01, 0.1, 1, 10, 100}, and adopt the value that yields the smallest average perturbation size. The selected initialization values are given in the supplementary material.
For DDN and PerC-AL, we decrease the step size ($\alpha $ in DDN and ${\alpha}_{l}$ in PerC-AL) that is used for updating the perturbations with respect to the classification loss from 1 to 0.01 with cosine annealing. The ${L}_{2}$-norm constraint $\epsilon$ in DDN is initialized to 1 and adjusted iteratively by $\gamma =0.05$, as in the original DDN work [rony2019decoupling]. The ${\alpha}_{c}$ in PerC-AL is gradually reduced from 0.5 to 0.05 with cosine annealing.
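The step-size schedules above can be illustrated with the standard cosine annealing formula. The exact schedule used in the experiments is not spelled out in the text, so the form below is an assumption, as is the function name `cosine_anneal`.

```python
import math


def cosine_anneal(start, end, k, num_iter):
    """Step size at iteration k (0 <= k <= num_iter), decaying from start to end."""
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * k / num_iter))
```

With `start=1.0` and `end=0.01`, the schedule begins at 1, passes through 0.505 halfway, and reaches 0.01 at the final iteration.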
Evaluation Protocol. We investigate a set of reasonable operating points, based on predefined budgets. Note that our goal is to show the relative behavior of PerC vs. RGB approaches. For this purpose, we only need to create a fair comparison, and it is not necessary to drive all approaches to an absolute optimum. For each image, an approach is considered successful if the perturbed image can achieve adversarial effect within the given budget. Specifically, I-FGSM requires a varying number of repetitions for different images. For C&W and PerC-C&W, the budget refers to N(search steps) $\times $ N(iterations of gradient descent). We apply a relatively high budget ($9\times 1000$), and are also interested in lower budgets ($5\times 200$ and $3\times 100$), which are more directly comparable with the more efficient approaches, namely, DDN and PerC-AL. We test DDN and our PerC-AL with three different iteration budgets (100, 300 and 1000), adopted from the original work [rony2019decoupling].
Adversarial strength is evaluated by the success rate, i.e., the proportion of successful cases over the whole dataset. The averaged perturbation size over all successful images is reported. It is measured in terms of the ${L}_{2}$ and ${L}_{\mathrm{\infty}}$ norm in RGB space ($\overline{{L}_{2}}$ and $\overline{{L}_{\mathrm{\infty}}}$) and also in terms of image-level accumulated perceptual color difference ($\overline{{C}_{2}}$).
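The RGB-space metrics described above can be computed as in the following sketch. It assumes the original and perturbed images are given as arrays of identical shape and that a boolean success flag per image is available; the pixel scale used when reporting the norms is not stated in this section, so the function simply works in whatever scale the inputs use. The name `summarize` is illustrative.

```python
import numpy as np


def summarize(originals, adversarials, success):
    """Success rate (%) and mean L2 / L-infinity norms over successful images.

    originals, adversarials: arrays of shape (N, ...); success: bool array (N,).
    At least one image is assumed to be successful.
    """
    originals = np.asarray(originals, dtype=np.float64)
    adversarials = np.asarray(adversarials, dtype=np.float64)
    success = np.asarray(success, dtype=bool)
    # Flatten each perturbation and keep only the successful cases.
    delta = (adversarials - originals).reshape(len(originals), -1)[success]
    return {
        "success_rate": 100.0 * float(success.mean()),
        "mean_l2": float(np.linalg.norm(delta, axis=1).mean()),
        "mean_linf": float(np.abs(delta).max(axis=1).mean()),
    }
```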
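Table 1: Success rate and average perturbation size ($\overline{{L}_{2}}$, $\overline{{L}_{\mathrm{\infty}}}$, $\overline{{C}_{2}}$) for each approach and budget.

| Approach | Budget | Success rate (%) | $\overline{{L}_{2}}$ | $\overline{{L}_{\mathrm{\infty}}}$ | $\overline{{C}_{2}}$ |
|---|---|---|---|---|---|
| I-FGSM [kurakin2016adversarial] | - | 100.0 | 2.51 | 1.59 | 317.96 |
| C&W [carlini2017towards] | 3$\times $100 | 100.0 | 1.32 | 8.84 | 159.85 |
| C&W [carlini2017towards] | 5$\times $200 | 100.0 | 1.09 | 8.20 | 132.86 |
| C&W [carlini2017towards] | 9$\times $1000 | 100.0 | 0.92 | 8.45 | 114.36 |
| PerC-C&W (ours) | 3$\times $100 | 100.0 | 2.77 | 14.29 | 150.44 |
| PerC-C&W (ours) | 5$\times $200 | 100.0 | 1.48 | 12.06 | 83.93 |
| PerC-C&W (ours) | 9$\times $1000 | 100.0 | 1.22 | 15.57 | 67.79 |
| DDN [rony2019decoupling] | 100 | 100.0 | 1.00 | 7.84 | 136.11 |
| DDN [rony2019decoupling] | 300 | 100.0 | 0.88 | 7.58 | 120.12 |
| DDN [rony2019decoupling] | 1000 | 100.0 | 0.82 | 7.62 | 111.65 |
| PerC-AL (ours) | 100 | 100.0 | 1.30 | 11.98 | 69.49 |
| PerC-AL (ours) | 300 | 100.0 | 1.17 | 13.97 | 61.21 |
| PerC-AL (ours) | 1000 | 100.0 | 1.13 | 17.04 | 57.10 |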
5.2 Adversarial strength and imperceptibility
In this section, we investigate the adversarial strength and imperceptibility of the images perturbed by different approaches in a white-box scenario, where the full information of the network is accessible.
5.2.1 Sufficient-confidence adversarial examples
We first present, in Table 1, a comparison demonstrating how PerC approaches relax ${L}_{p}$ norms. Our comparison uses adversarial examples created under a commonly used condition where the aim is to achieve a just sufficient adversarial effect. Sufficient-confidence adversarial examples just cross the decision boundary without pursuing a higher confidence score for the adversarial label. As expected, all approaches achieve a 100% success rate and the resulting perturbation size gets smaller as the budget increases.
Table 1 confirms that the PerC approaches, PerC-C&W and PerC-AL, show the behavior they are designed for, i.e., decreasing the average accumulated perceptual color difference $\overline{{C}_{2}}$. More importantly, PerC approaches do this without tightly constraining the ${L}_{p}$ norms in RGB space as the other approaches do, as reflected by $\overline{{L}_{2}}$ and $\overline{{L}_{\mathrm{\infty}}}$. Moreover, PerC-AL achieves lower $\overline{{C}_{2}}$ than PerC-C&W (57.10 vs. 67.79) with notably fewer iterations. For comparison, we provide $\overline{{C}_{2}}$ for the RGB approaches. The untargeted results follow a similar pattern and can be found in the supplementary material.
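Table 2: Success rate and $\overline{{C}_{2}}$ of high-confidence adversarial examples at $\kappa =20$ and $\kappa =40$.

| Approach | Suc. (%), $\kappa =20$ | $\overline{{C}_{2}}$, $\kappa =20$ | Suc. (%), $\kappa =40$ | $\overline{{C}_{2}}$, $\kappa =40$ |
|---|---|---|---|---|
| I-FGSM [kurakin2016adversarial] | 100.0 | 375.74 | 99.9 | 576.06 |
| C&W [carlini2017towards] | 100.0 | 159.00 | 100.0 | 241.92 |
| DDN [rony2019decoupling] | 100.0 | 150.68 | 98.1 | 238.37 |
| PerC-C&W (ours) | 100.0 | 90.86 | 100.0 | 136.22 |
| PerC-AL (ours) | 100.0 | 75.43 | 100.0 | 115.17 |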
5.2.2 High-confidence adversarial examples
In order to gain deeper insight into the performance of our approaches, we investigate adversarial examples that have a high confidence score for the adversarial label. High confidence was initially investigated by [carlini2017towards] in order to achieve more transferable adversarial examples, and has also been explored in the “Unrestricted Adversarial Example” contest [brown2018unrestricted]. An approach is regarded as successful only if the logit with respect to the original class becomes lower than the maximum of the other logits by a predefined margin $\kappa $. For C&W and our PerC-C&W, this requirement can be directly implemented by specifying the factor $\kappa $ in Eq. (4). For I-FGSM, DDN and PerC-AL, this can be achieved by running the iterations until the required logit difference is satisfied. For this experiment, we adopt the settings generating the smallest perturbations for each approach in Section 5.2.1.
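The high-confidence success criterion stated above can be expressed as a stopping test on the logits. The sketch assumes an untargeted setting with a known original class; the function name `is_high_confidence_adversarial` is illustrative.

```python
import numpy as np


def is_high_confidence_adversarial(logits, original_label, kappa):
    """True if the best other logit exceeds the original-class logit by >= kappa."""
    z = np.asarray(logits, dtype=np.float64)
    others = np.delete(z, original_label)  # logits of all other classes
    return bool(others.max() - z[original_label] >= kappa)
```

An iterative approach can evaluate this test after every update and continue until it returns `True` or the budget is exhausted.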
Fig. 3 shows some adversarial examples generated by different approaches at $\kappa =40$. The images produced by our PerC approaches look more visually acceptable than those of the other approaches. More examples can be found in our GitHub repository (https://github.com/ZhengyuZhao/PerCAdversarial). The good visual appearance of the PerC examples is consistent with their low average accumulated perceptual color difference, $\overline{{C}_{2}}$, as seen in Table 2, which shows values for both $\kappa =20$ and $\kappa =40$. The challenge of the high-confidence setting is reflected in the success rates, which are no longer perfect for all conditions.
5.3 Robustness
In order to gain additional practical insight, we test the robustness of the adversarial examples against two commonly studied image transformation-based defense methods, i.e., JPEG compression [dziugaite2016study, guo2017countering, das2018shield, dong2019evading] and bit-depth reduction [xu2017feature, guo2017countering, he2017adversarial].
The results are shown in Fig. 4. Overall, increasing $\kappa $ from 20 to 40 leads to improved robustness. For a specific $\kappa $, unsurprisingly, IFGSM outperforms the other approaches by a large margin since it greedily perturbs all the pixels, but at the cost of worse image quality (see Fig. 3). Among the other four approaches, which target minimal image-level accumulated image difference with very sparse perturbations, the best results are consistently achieved by either our PerCC&W or PerCAL. Specifically, PerCC&W outperforms the original C&W in all cases, while PerCAL consistently outperforms DDN. Recall that our PerC approaches also cause fewer visual distortions, as shown in Fig. 3, contributing to imperceptibility.
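To make the bit-depth reduction defense concrete, the following is a minimal sketch of uniform quantization. It assumes images scaled to [0, 1]; the function name `reduce_bit_depth` and the 3-bit setting are illustrative choices, and the exact defense settings used in the experiments are not stated in this section. JPEG compression is not shown.

```python
import numpy as np


def reduce_bit_depth(image: np.ndarray, bits: int) -> np.ndarray:
    """Quantize an image with values in [0, 1] to `bits` bits per channel."""
    levels = 2 ** bits - 1  # number of quantization steps between 0 and 1
    return np.round(image * levels) / levels


# Toy 2x3 "image" (made-up values) squeezed to 3 bits, i.e. 8 gray levels.
image = np.array([[0.0, 0.2, 0.4], [0.6, 0.8, 1.0]])
squeezed = reduce_bit_depth(image, bits=3)
print(squeezed)
```

An adversarial example would count as robust to this defense if the classifier still misclassifies the quantized image; small perturbations that fall within one quantization step are the ones most likely to be removed.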
5.4 Transferability
Existing research [tramer2017ensemble, liu2016delving] has demonstrated that the adversarial effect of some examples optimized for a specific network may transfer to another network. We test the transferability of different approaches from the original Inception V3 to three other pretrained networks, namely, GoogLeNet [szegedy2016rethinking], ResNet152 [he2016deep], and VGG16 [simonyan2014very]. Specifically, an untargeted adversarial example generated for the original model is regarded as transferable to a new model if it also induces misclassification by that model.
It is less meaningful to analyze the adversarial perturbations when the original image, without any added perturbations, already yields a different prediction under a new model. We therefore consider only the images that yield the same original prediction for all four studied networks.
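The evaluation protocol described above can be sketched as follows. The function name `transfer_success_rate`, the model keys, and the toy predictions are hypothetical, and misclassification is taken here to mean that the adversarial prediction differs from the shared clean prediction, which is an assumption about the protocol.

```python
from typing import Dict, List


def transfer_success_rate(
    clean_preds: Dict[str, List[int]],
    adv_preds: Dict[str, List[int]],
    target_model: str,
) -> float:
    """Percentage of eligible images whose adversarial version fools `target_model`.

    An image is eligible only if all models agree on its clean prediction.
    """
    models = list(clean_preds)
    num_images = len(clean_preds[models[0]])
    eligible = [
        i for i in range(num_images)
        if len({clean_preds[m][i] for m in models}) == 1
    ]
    if not eligible:
        return 0.0
    fooled = sum(
        adv_preds[target_model][i] != clean_preds[target_model][i]
        for i in eligible
    )
    return 100.0 * fooled / len(eligible)


# Toy predictions for four images (made-up labels, not from the paper).
clean = {"source": [1, 2, 3, 4], "target": [1, 2, 0, 4]}
adv = {"source": [9, 9, 9, 9], "target": [1, 5, 7, 4]}
print(transfer_success_rate(clean, adv, "target"))  # image 2 is excluded; 1 of 3 transfers
```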
The success rates under transferability for different approaches on all the eligible images (494 in total) are reported in Table 3. IFGSM again outperforms the other approaches, but uses excessive perturbations. Among the other approaches, we can observe that the best results are always achieved by one of our two PerC approaches.
Approach  GoogLeNet ($\kappa =20$)  GoogLeNet ($\kappa =40$)  VGG16 ($\kappa =20$)  VGG16 ($\kappa =40$)  ResNet152 ($\kappa =20$)  ResNet152 ($\kappa =40$)
IFGSM [kurakin2016adversarial]  3.4  5.3  6.5  11.9  7.5  9.9 
C&W [carlini2017towards]  1.8  2.8  3.9  5.9  4.5  5.1 
DDN [rony2019decoupling]  1.0  2.0  4.5  6.7  4.3  5.1 
PerCC&W (ours)  2.2  3.9  4.3  8.1  5.5  6.5 
PerCAL (ours)  1.6  3.4  5.1  7.9  5.3  7.3 
5.5 Assembling structural information
We explore the possibility of incorporating structural information to further improve imperceptibility without impacting the adversarial strength. Specifically, we introduce a texture complexity matrix $\bm{\sigma}$ as a weighting term into our efficient PerCAL framework. Following existing work [luo2018towards, Croce_2019_ICCV] on addressing imperceptibility with respect to image structures, this matrix is obtained from the standard deviation of the values in the neighbourhood (here a $3\times 3$ square) of each image coordinate. The top 5% highest values in the map are clipped for stability, and the map is normalized into the range [0,1] before use.
Concretely, this approach adjusts step 8 in Algorithm 4.1 to:
$${C}_{2}\leftarrow {\parallel (\mathbf{1}-\bm{\sigma})\cdot \mathrm{\Delta}{E}_{00}(\bm{x},{\bm{x}}_{k-1}^{\prime})\parallel}_{2},$$  (10)
where ${C}_{2}$ also becomes sensitive to image differences in terms of local visual variation. As shown in Fig. 5, with the help of additional structural information, perturbations in the smooth regions are suppressed, while more changes, which are hardly perceived, are triggered in the areas with rich visual variation. Future work should investigate the effectiveness of this combined approach in more detail.
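The texture weighting can be sketched in a few lines, as a rough illustration rather than the authors' implementation. The sketch assumes a single-channel image, reflect padding at the borders, min-max normalization, and a per-pixel $\mathrm{\Delta}{E}_{00}$ map that has already been computed elsewhere; the function names `texture_complexity_map` and `weighted_c2` are hypothetical, and the weight is read as $(\mathbf{1}-\bm{\sigma})$.

```python
import numpy as np


def texture_complexity_map(gray: np.ndarray, clip_percentile: float = 95.0) -> np.ndarray:
    """Local 3x3 standard deviation, clipped at a percentile and scaled to [0, 1]."""
    padded = np.pad(gray.astype(np.float64), 1, mode="reflect")
    h, w = gray.shape
    # Stack the nine shifted views that form each pixel's 3x3 neighbourhood.
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    sigma = windows.std(axis=0)
    # Clip the top 5% of values for stability.
    sigma = np.minimum(sigma, np.percentile(sigma, clip_percentile))
    lo, hi = sigma.min(), sigma.max()
    return (sigma - lo) / (hi - lo) if hi > lo else np.zeros_like(sigma)


def weighted_c2(delta_e: np.ndarray, sigma: np.ndarray) -> float:
    """L2 norm of the per-pixel color difference, down-weighted where texture is rich."""
    return float(np.linalg.norm((1.0 - sigma) * delta_e))


# Toy image: flat on the left, checkerboard texture on the right.
rows, cols = np.indices((6, 6))
gray = np.where(cols >= 3, (rows + cols) % 2, 0).astype(np.float64)
sigma = texture_complexity_map(gray)
delta_e = np.full((6, 6), 2.0)  # hypothetical uniform per-pixel color difference
print(weighted_c2(delta_e, sigma), np.linalg.norm(delta_e))
```

In this toy case the weighted value is smaller than the unweighted norm because differences in the textured right half are discounted, so an optimizer minimizing it is pushed to place perturbations there rather than in the flat left half.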
6 Conclusion and Outlook
This paper has demonstrated the usefulness of perceptual color distance for creating large but imperceptible adversarial image perturbations. We have proposed two approaches to creating adversarial images, PerCC&W and PerCAL. Our experimental investigation of these approaches shows that perceptual color distance is able to improve imperceptibility, especially in smooth, saturated regions. We show that these approaches have perturbations with larger RGB ${L}_{p}$ norms than approaches that perturb directly in RGB space. This effect translates into adversarial strength, i.e., the ability of the perturbations to fool a classifier.
Our work has made a contribution to recent work that seeks to create adversarial images that are imperceptible to the eye of the human observer. This work has been carried out in the area of security [carlini2017towards, eykholt2017robust, gragnaniello2019perceptual, kurakin2016adversarial, papernot2016limitations] (defend inference of a legitimate classifier) and privacy [mirjalili2018gender, oh2017adversarial, choi2017geo, liu2019s] (prevent inference of an illegitimate classifier). In the security area, imperceptible perturbations can mean that adversarial images can poison the training data without being noticed by human annotators. In the privacy area, imperceptible perturbations mean wider acceptance of the use of adversarial images to protect against classification attacks.
In the future, we will continue to consider perceptual color in adversarial images from both the privacy and the security angle. Our first direction will be related to the fact that neither conventional RGB perturbations nor PerC perturbations perform well in smooth regions with low saturation. We would like to develop techniques that can make perturbations imperceptible, or unnecessary, in such regions. Our second direction will be related to robustness. Here, we have looked at robustness as it is conventionally studied in the literature on adversarial image examples. However, since PerC-based approaches use perceptual color distance, it could be possible to mitigate PerC-based perturbations by limiting bit depth in perceptual color space. With regard to this possibility, we point out that in order to counteract the effect of PerC perturbations in this way, it is necessary to be able to infer that they have been applied to an image. For this reason, our future work will also investigate ways to detect that an image contains PerC perturbations, and new varieties of PerC perturbations that minimize the effectiveness of such detection.
References
Supplementary Material
Approach  Budget  $\lambda $ (targeted)  $\lambda $ (untargeted)
C&W [carlini2017towards]  3$\times $100  1  0.1
C&W [carlini2017towards]  5$\times $200  1  1
C&W [carlini2017towards]  9$\times $1000  1  1
PerCC&W (ours)  3$\times $100  10  100
PerCC&W (ours)  5$\times $200  10  100
PerCC&W (ours)  9$\times $1000  10  10
Approach  Budget  Success Rate (%)  $\overline{{L}_{2}}$  $\overline{{L}_{\mathrm{\infty}}}$  $\overline{{C}_{2}}$
IFGSM [kurakin2016adversarial]  -  100.0  1.94  1.02  255.92
C&W [carlini2017towards]  3$\times $100  100.0  0.69  3.61  88.76
C&W [carlini2017towards]  5$\times $200  100.0  0.45  3.79  59.88
C&W [carlini2017towards]  9$\times $1000  100.0  0.41  3.74  54.17
PerCC&W (ours)  3$\times $100  100.0  1.47  6.78  78.25
PerCC&W (ours)  5$\times $200  100.0  0.90  6.71  51.35
PerCC&W (ours)  9$\times $1000  100.0  0.56  6.58  33.00
DDN [rony2019decoupling]  100  100.0  0.35  4.03  49.43
DDN [rony2019decoupling]  300  100.0  0.33  4.08  47.58
DDN [rony2019decoupling]  1000  100.0  0.32  4.11  46.51
PerCAL (ours)  100  100.0  0.53  5.58  30.39
PerCAL (ours)  300  100.0  0.50  6.93  27.65
PerCAL (ours)  1000  100.0  0.51  8.92  26.62