Prakash A, Moran N, Garber S, et al. Deflecting adversarial attacks with pixel deflection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 8571-8580.
1. Overview
1.1. Motivation
- image classifiers tend to be robust to natural noise
- adversarial attacks tend to be agnostic to object location
- most attacks search the entire image plane for adversarial perturbations, without regard for the location of the image content
This paper proposes pixel deflection + wavelet denoising to defend against adversarial attacks
- force the image to match natural image statistics
- use semantic maps to obtain a better pixel to update
- locally corrupt the image by redistributing pixel values, a process the authors term pixel deflection
- a wavelet-based denoising operation then softens the corruption
1.2. Related Work
1.2.1. Attack
- FGSM
- IGSM
- L-BFGS. minimize L2 distance between the image and adversarial example
- Jacobian-based Saliency Map Attack (JSMA). modify the pixels which are most salient, targeted attack
- DeepFool. untargeted; approximates the classifier as a linear decision boundary, then finds the smallest perturbation needed to cross that boundary
- Carlini & Wagner (C&W). formulated over Z_k, the logits of the model for a given class k (see the formulas after this list)
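For reference, the two best-known formulations can be written out explicitly (standard notation, not copied from the paper): FGSM takes a single step of size ε along the sign of the loss gradient, and the targeted C&W loss pushes the target logit Z_t above every other logit by a margin κ.

```latex
% FGSM: single-step perturbation along the sign of the loss gradient
x_{adv} = x + \epsilon \cdot \operatorname{sign}\!\big(\nabla_x J(\theta, x, y)\big)

% C&W (targeted variant): make the target class t win by margin \kappa
f(x') = \max\Big(\max_{k \neq t} Z_k(x') - Z_t(x'),\ -\kappa\Big)
```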
1.2.2. Defense
- ensemble
- distillation
- transformation
- image quilting + total variation minimization (TVM)
- foveation-based mechanism. crop the image around the object and then scale it back to the original size
- random crop + random pad
2. Pixel Deflection
- most deep classifiers are robust to the presence of natural noise, such as sensor noise
2.1. Algorithm
- randomly sample a pixel
- replace it with another randomly selected pixel from within a small square neighbourhood
- even changing as much as 1% of the original pixels does not alter the classification of a clean image (see the sketch below)
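A minimal NumPy sketch of this procedure; the parameter names (n_deflections, window) are illustrative rather than the paper's.

```python
import numpy as np

def pixel_deflection(img, n_deflections=500, window=10, rng=None):
    """Replace randomly chosen pixels with random pixels drawn from a small
    square neighbourhood around them.

    img: H x W x C array; a deflected copy is returned.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    h, w = img.shape[:2]
    for _ in range(n_deflections):
        # pixel to overwrite
        r, c = rng.integers(0, h), rng.integers(0, w)
        # random neighbour inside a (2*window+1)^2 square, clipped to the image
        rr = int(np.clip(r + rng.integers(-window, window + 1), 0, h - 1))
        cc = int(np.clip(c + rng.integers(-window, window + 1), 0, w - 1))
        out[r, c] = img[rr, cc]
    return out
```

On a 224x224 input, 500 deflections touch roughly 1% of the pixels, which matches the observation above that such a change leaves clean classifications intact.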
2.2. Distribution of Attacks
2.3. Targeted Pixel Deflection
- In natural images, many pixels do not correspond to a relevant semantic object and are therefore not salient to classification
2.3.1. Robust Activation Map
- an adversary which successfully changes the most likely class tends to leave the rest of the top-k classes unchanged
- 38% of the time, the predicted class of an adversarial image is the second-highest class of the model on the clean image (a sketch of the resulting map follows below)
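Since the lower-ranked predictions are mostly preserved, a map built from several top-k classes is more stable under attack than the map of the (possibly flipped) top-1 class. Below is a rough NumPy sketch of that idea using plain class activation maps combined with exponentially decaying weights; the helper names and the exact weighting are assumptions for illustration, not the paper's exact recipe.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """CAM for one class: classifier-weighted sum of the final conv features.

    features:   C x H x W activations of the last convolutional layer
    fc_weights: num_classes x C weight matrix of the classification layer
    """
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))  # H x W
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)

def robust_activation_map(features, fc_weights, topk_classes, decay=0.5):
    """Combine the maps of the top-k predicted classes, giving lower-ranked
    classes exponentially smaller weight (the decay value is an assumption)."""
    maps = np.stack([class_activation_map(features, fc_weights, k)
                     for k in topk_classes])
    weights = decay ** np.arange(len(topk_classes))
    rmap = np.tensordot(weights / weights.sum(), maps, axes=([0], [0]))
    return rmap / (rmap.max() + 1e-8)
```

For targeted pixel deflection, pixels with low values in this map (background) can then be deflected with higher probability than pixels in salient regions.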
3. Wavelet Denoising
3.1. Hard Thresholding
- all coefficients with magnitude below the threshold are set to zero
3.2. Soft Thresholding
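Soft thresholding keeps the same support as the hard rule but additionally shrinks the surviving coefficients toward zero, which tends to give smoother reconstructions. Written coefficient-wise for a threshold T:

```latex
\text{hard: } \hat{w} =
  \begin{cases} w, & |w| \ge T \\ 0, & |w| < T \end{cases}
\qquad
\text{soft: } \hat{w} = \operatorname{sign}(w)\,\max(|w| - T,\ 0)
```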
3.3. Adaptive Thresholding
- VisuShrink. a single universal threshold σ·sqrt(2 ln N) for an image with N pixels and noise level σ
- BayesShrink. models the wavelet coefficients of each subband as a Generalized Gaussian Distribution (GGD) and derives a separate threshold per subband (see the sketch below)
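A small NumPy sketch of the two threshold rules; σ is estimated from the finest-scale diagonal detail coefficients with the usual median estimator, and the function names are mine.

```python
import numpy as np

def estimate_noise_sigma(detail_hh):
    """Robust noise estimate from the finest diagonal subband: median(|d|)/0.6745."""
    return np.median(np.abs(detail_hh)) / 0.6745

def visushrink_threshold(sigma, n_pixels):
    """Universal threshold sigma * sqrt(2 ln N), shared by the whole image."""
    return sigma * np.sqrt(2.0 * np.log(n_pixels))

def bayesshrink_threshold(subband, sigma):
    """Subband-adaptive threshold sigma^2 / sigma_x, where sigma_x estimates
    the std of the noise-free signal in this subband."""
    sigma_x = np.sqrt(max(np.var(subband) - sigma ** 2, 1e-12))
    return sigma ** 2 / sigma_x
```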
4. Methods
- corrupt the image with pixel deflection
- soften the impact of pixel deflection
- convert the image to YCbCr, which works better for wavelet denoising than RGB
- project the image into the wavelet domain (db1 is used, but db2 and haar give similar results)
- soft-threshold the wavelet coefficients using BayesShrink
- convert the image back to RGB (the end-to-end pipeline is sketched below)
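Putting the steps together, a minimal sketch built on scikit-image: `denoise_wavelet` with `convert2ycbcr=True` performs the YCbCr round trip and BayesShrink soft thresholding internally, and the deflection loop mirrors the sketch from Section 2.1. Parameter values are illustrative, not the paper's tuned settings.

```python
import numpy as np
from skimage.restoration import denoise_wavelet

def pixel_deflection(img, n_deflections=500, window=10, rng=None):
    """Replace randomly chosen pixels with random neighbours (Sec. 2.1 sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    h, w = img.shape[:2]
    for _ in range(n_deflections):
        r, c = rng.integers(0, h), rng.integers(0, w)
        rr = int(np.clip(r + rng.integers(-window, window + 1), 0, h - 1))
        cc = int(np.clip(c + rng.integers(-window, window + 1), 0, w - 1))
        out[r, c] = img[rr, cc]
    return out

def defend(img_rgb):
    """Pixel deflection followed by wavelet denoising in YCbCr.

    img_rgb: H x W x 3 float array with values in [0, 1].
    """
    deflected = pixel_deflection(img_rgb)
    denoised = denoise_wavelet(
        deflected,
        wavelet="db1",          # db2 / haar behave similarly (see above)
        mode="soft",            # soft thresholding
        method="BayesShrink",   # subband-adaptive thresholds
        convert2ycbcr=True,     # denoise in YCbCr, convert back to RGB internally
        channel_axis=-1,        # colour channels last (older skimage: multichannel=True)
    )
    return np.clip(denoised, 0.0, 1.0)
```

The defended image is then passed to the unmodified classifier; the defence itself requires no retraining.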