
(ICLR 2018) Mitigating adversarial effects through randomization

Keyword [Ensemble]

Xie C, Wang J, Zhang Z, et al. Mitigating adversarial effects through randomization[J]. arXiv preprint arXiv:1711.01991, 2017.



1. Overview


This paper exploits randomization at inference time to mitigate adversarial effects:

  1. random resize
  2. random padding

  • can defend against both single-step and iterative attacks
  • no additional training or fine-tuning
  • very few additional computations
  • compatible with other adversarial defense methods

1.1. Related Work

  • FGSM
  • DeepFool
  • Carlini&Wagner (C&W)


  1. Z(x)_i. the logit output for class i
  2. k. controls the confidence gap between the adversarial class and the true class
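
For context, the standard forms of these attacks (reconstructed from their original papers, not from this note): the single-step FGSM perturbation, and the C&W objective in which Z(x)_i and k appear.

```latex
% FGSM: one signed-gradient step of size \epsilon
x^{adv} = x + \epsilon \cdot \operatorname{sign}\left(\nabla_x L(x, y_{true})\right)

% C&W (targeted at class t): margin between the target logit and the runner-up,
% with k enforcing the confidence gap
f(x) = \max\left(\max_{i \neq t} Z(x)_i - Z(x)_t,\ -k\right)
```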



2. Methods




  • resize. from 299x299 to MxM, M∈[299, 331)
  • pad. to 331x331; for each M there are (331-M+1)^2 possible padding placements
  • 12528 resize-and-pad patterns in total (see the sketch below)
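
A minimal sketch of the two-step randomization layer in PyTorch, assuming a (N, C, 299, 299) input; the function name and defaults are mine, not from the authors' code release:

```python
import random
import torch.nn.functional as F

def randomization_layer(x, low=299, high=331, target=331):
    """Random resize to rnd x rnd with rnd in [low, high), then random zero-padding to target x target."""
    rnd = random.randint(low, high - 1)                    # resized side length, 299..330
    x = F.interpolate(x, size=(rnd, rnd), mode="nearest")  # random resize
    pad = target - rnd                                     # per-axis padding budget
    left, top = random.randint(0, pad), random.randint(0, pad)
    return F.pad(x, (left, pad - left, top, pad - top), value=0)  # zero-pad to 331x331

# sanity check on the pattern count: (pad + 1)^2 placements for each rnd
assert sum((331 - rnd + 1) ** 2 for rnd in range(299, 331)) == 12528
```

Because a fresh pattern is drawn at every forward pass, an attacker who optimizes against one instantiation faces a different one at test time.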




3. Experiments


3.1. Attack Scenarios

  • vanilla attack. the attacker does not know about the randomization layers; the target model is the original network
  • single-pattern attack. the attacker knows about the randomization layers; the target model is the original network + randomization layers with a single predefined pattern
  • ensemble-pattern attack. the attacker knows about the randomization layers; the target model is the original network + randomization layers with an ensemble of predefined patterns (see the sketch below)
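
A hedged sketch of how an ensemble-pattern attacker could proceed, assuming the attacker averages the loss over a fixed set of (rnd, left, top) patterns and takes one FGSM step; apply_pattern, model, and the pattern list are illustrative, not the paper's code:

```python
import torch
import torch.nn.functional as F

def apply_pattern(x, rnd, left, top, target=331):
    """Deterministic resize-and-pad using one fixed (rnd, left, top) pattern."""
    x = F.interpolate(x, size=(rnd, rnd), mode="nearest")
    pad = target - rnd
    return F.pad(x, (left, pad - left, top, pad - top), value=0)

def ensemble_pattern_fgsm(model, x, y, patterns, eps):
    """One FGSM step against the average cross-entropy over the predefined patterns."""
    x = x.detach().clone().requires_grad_(True)
    loss = sum(F.cross_entropy(model(apply_pattern(x, *p)), y) for p in patterns)
    (loss / len(patterns)).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```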

3.2. Accuracy Drop



3.3. Vanilla Attack Scenario



  • the randomization layers mitigate both single-step and iterative attacks

3.4. Single-Pattern Attack Scenario



  • for single-step attacks, the randomization layers are less effective at mitigating adversarial effects when ε is larger
  • for iterative attacks, the randomization layers perform well

3.5. Ensemble-Pattern Attack Scenario



  • adversarial examples generated under the ensemble-pattern attack scenario are much stronger

3.6. One Pixel Padding



  • from 330x330 to 331x331
  • adversarial examples generated by single-step attacks have strong transferability, but still cannot defeat the defense model
  • adversarial examples generated by iterative attacks are much less transferable between different padding patterns (see the sketch below)
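
To make the transferability claim concrete, a hypothetical check (reusing apply_pattern from the sketch above; evaluate, model, and x_adv are placeholders): x_adv is crafted against one padding pattern and then evaluated under all (331-330+1)^2 = 4 one-pixel patterns.

```python
# all 4 one-pixel padding placements for a 330x330 input: (left, top) in {0, 1}^2
one_pixel_patterns = [(330, left, top) for left in (0, 1) for top in (0, 1)]

# x_adv was crafted against one_pixel_patterns[0]; measure accuracy under each pattern
for p in one_pixel_patterns:
    print(p, evaluate(model, apply_pattern(x_adv, *p)))  # hypothetical evaluate()
```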

3.7. One Pixel Resizing



  • from 331x331 to 330x330
  • resizing the images by only 1 pixel effectively destroys the transferability of adversarial examples generated by both single-step and iterative attacks