Youtube: https://www.youtube.com/watch?v=Pu2WQnU0GzA
Deep neural networks are vulnerable to adversarial examples, which raises security concerns about these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate for evaluating the robustness of deep learning models before they are deployed. However, most existing adversarial attacks can only fool a black-box model with a low success rate. To address this issue, we propose a broad class of momentum-based iterative algorithms to boost adversarial attacks. By integrating a momentum term into the iterative attack process, our methods stabilize the update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples. To further improve the success rates of black-box attacks, we apply the momentum iterative algorithms to an ensemble of models, and show that adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks. We hope that the proposed methods will serve as a benchmark for evaluating the robustness of various deep models and defense methods. With this method, we won first place in both the NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions.
Tianyu Pang is a first-year Ph.D. student in the TSAIL Group of the Department of Computer Science and Technology, Tsinghua University, advised by Prof. Jun Zhu. His research interests include machine learning, deep learning, and their applications in computer vision, especially the robustness of deep learning.
3. • Defense against Adversarial Attacks Using High-level Representation Guided Denoiser (CVPR 2018)
Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Jun Zhu, and Xiaolin Hu
• Boosting Adversarial Attacks with Momentum (CVPR 2018)
Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Xiaolin Hu, Jianguo Li, and Jun Zhu
• Max-Mahalanobis Linear Discriminant Analysis Networks (ICML 2018)
Tianyu Pang, Chao Du, and Jun Zhu
• Towards Robust Detection of Adversarial Examples (under review at NIPS 2018)
Tianyu Pang, Chao Du, Yinpeng Dong, and Jun Zhu
5. Boosting Adversarial Attacks with Momentum
Yinpeng Dong¹, Fangzhou Liao¹, Tianyu Pang¹, Hang Su¹, Jun Zhu¹, Xiaolin Hu¹, Jianguo Li²
¹ Tsinghua University, ² Intel Labs China
9. Limitations of Black-box Attacks (1)
• FGSM has poor white-box attack ability;
• Iterative FGSM has poor transferability;
• This trade-off between transferability and attack ability makes black-box attacks less effective.
[Figure: success rate (%) vs. number of iterations (1–10) of I-FGSM, evaluated on Inc-v3, Inc-v4, IncRes-v2, and Res-152]
• Attack Inception V3;
• Evaluate the success rates of attacks on Inception V3, Inception V4, Inception ResNet V2, and ResNet v2-152;
• 𝜖 = 16;
• 1000 images from ImageNet.
10. Limitations of Black-box Attacks (2)
• Train a substitute network (Papernot et al., 2017) to fully characterize the behavior of the black-box model:
  • Requires full prediction confidences;
  • Requires a tremendous number of queries;
  • Hard to deploy for models trained on large-scale datasets;
  • Impossible in cases where querying is not allowed.
• Our solution: alleviate the trade-off between transferability and attack ability.
11. Optimization with Momentum
• Constrained optimization formulation of adversarial attacks:

  argmax_{x*} L(x*, y)  s.t.  ‖x* − x‖∞ ≤ 𝜖

• Momentum can:
  • Accelerate gradient descent;
  • Escape from poor local minima and maxima;
  • Stabilize the update directions of stochastic gradient descent.
• Momentum can also be used for adversarial attacks:
  • The result is still a white-box attack method, but one with strong black-box attack ability (transferability).
12. Momentum Iterative FGSM
• I-FGSM:

  x*_0 = x,  x*_{t+1} = clip(x*_t + α · sign(∇_x L(x*_t, y)))

• MI-FGSM:

  x*_0 = x,  g_0 = 0
  g_{t+1} = μ · g_t + ∇_x L(x*_t, y) / ‖∇_x L(x*_t, y)‖₁   (momentum)
  x*_{t+1} = clip(x*_t + α · sign(g_{t+1}))

• μ is the decay factor;
• g_t accumulates the gradients w.r.t. the input over the first t iterations;
• The current gradient is normalized by its L1 norm.
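The MI-FGSM update can be sketched directly from these equations. Below is a minimal NumPy illustration; `loss_grad` is a hypothetical stand-in for the model's backward pass (it returns the gradient of the loss w.r.t. the input), and the pixel range [0, 255] is an assumption:

```python
import numpy as np

def mi_fgsm(x, y, loss_grad, eps=16.0, num_iters=10, mu=1.0):
    """Momentum Iterative FGSM (L-infinity version).

    loss_grad(x_adv, y) is a hypothetical callable returning the
    gradient of the loss w.r.t. the input (the model's backward pass).
    """
    alpha = eps / num_iters            # step size so the total budget is eps
    x_adv = x.astype(np.float64).copy()
    g = np.zeros_like(x_adv)           # accumulated gradient, g_0 = 0
    for _ in range(num_iters):
        grad = loss_grad(x_adv, y)
        # normalize the current gradient by its L1 norm, then accumulate
        g = mu * g + grad / max(np.abs(grad).sum(), 1e-12)
        x_adv = x_adv + alpha * np.sign(g)
        # project back into the eps-ball around x and the valid pixel range
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 255.0)
    return x_adv
```

Setting `mu = 0` recovers plain I-FGSM, which makes the role of the decay factor easy to probe.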
14. Ablation Study
[Figure: success rate (%) vs. number of iterations (1–10) of MI-FGSM and I-FGSM, evaluated on Inc-v3, Inc-v4, IncRes-v2, and Res-152]
◼ Attack Inception V3 with 𝜖 = 16
[Figure: success rate (%) vs. the size of perturbation 𝜖 (1–40) of MI-FGSM, I-FGSM, and FGSM, evaluated on Inc-v3 and Res-152]
◼ Attack Inception V3 with 𝛼 = 1
15. Attacking an Ensemble of Models
• If an adversarial example remains adversarial for multiple models, it is more likely to be misclassified by other black-box models.
• Ensemble in logits:

  l(x) = Σ_{i=1}^{K} w_i · l_i(x)

  The loss is defined as J(x, y) = −1_y · log(softmax(l(x)))
• Comparisons:
  • Ensemble in predictions: p(x) = Σ_{i=1}^{K} w_i · p_i(x)
  • Ensemble in loss: J(x, y) = Σ_{i=1}^{K} w_i · J_i(x, y)
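The three ensemble schemes differ only in where the averaging happens. A minimal NumPy sketch (the function name and interface are illustrative, not from the paper):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_losses(logits_list, y_onehot, w):
    """Cross-entropy loss under the three ensemble schemes.

    logits_list: list of K logit vectors l_i(x); w: weights summing to 1.
    Returns (ensemble-in-logits, ensemble-in-predictions, ensemble-in-loss).
    """
    # ensemble in logits: fuse l(x) = sum_i w_i * l_i(x), then one softmax
    l = sum(wi * li for wi, li in zip(w, logits_list))
    loss_logits = -np.sum(y_onehot * np.log(softmax(l)))
    # ensemble in predictions: average the probabilities p_i(x)
    p = sum(wi * softmax(li) for wi, li in zip(w, logits_list))
    loss_preds = -np.sum(y_onehot * np.log(p))
    # ensemble in loss: average the per-model cross-entropies J_i(x, y)
    loss_loss = sum(wi * -np.sum(y_onehot * np.log(softmax(li)))
                    for wi, li in zip(w, logits_list))
    return loss_logits, loss_preds, loss_loss
```

When all member models agree exactly, the three losses coincide; they diverge as the models disagree, which is where the choice of fusion point matters for attack transferability.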
24. Motivation two
• Paradigm of feed-forward deep nets:
  Input → non-linear transformation → linear classifier → Output
• The non-linear transformation is an active area of research (AlexNet, VGG nets, ResNets, GoogLeNets, DenseNets);
• The linear classifier is much less active (softmax regression).
26. Our goal
• Design a new network architecture for better performance in the adversarial setting.
• Substitute a new linear classifier for softmax regression (SR).
30. Inspiration one: LDA is more efficient than LR
• Efron (1975) showed that if the input is distributed as a mixture of Gaussians, then linear discriminant analysis (LDA) is more efficient than logistic regression (LR): LDA needs less training data than LR to reach a given error rate.
• However, in practice data points are hardly distributed as a mixture of Gaussians in the input space.
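Efron's efficiency result can be illustrated empirically. The sketch below fits both classifiers on a small sample from a synthetic two-Gaussian mixture using scikit-learn; the sample sizes, dimension, and separation are arbitrary choices for illustration, not from the paper:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_mixture(n, d=5, sep=2.0):
    """Two Gaussian components with means +-sep/2 along the first axis."""
    y = rng.integers(0, 2, size=n)
    mu = np.zeros((2, d))
    mu[0, 0], mu[1, 0] = -sep / 2, sep / 2
    x = mu[y] + rng.standard_normal((n, d))
    return x, y

# small training set (where efficiency matters), large test set
x_tr, y_tr = sample_mixture(50)
x_te, y_te = sample_mixture(20000)

acc_lda = LinearDiscriminantAnalysis().fit(x_tr, y_tr).score(x_te, y_te)
acc_lr = LogisticRegression().fit(x_tr, y_tr).score(x_te, y_te)
```

Because the mixture-of-Gaussians assumption holds exactly here, LDA typically matches or exceeds LR at small sample sizes; on real inputs the assumption fails, which is precisely the gap the MM-LDA construction addresses in feature space.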
33. Inspiration two: Neural networks are powerful
• Deep generative models (e.g., GANs) are successful:
  simple distribution (Gaussian / mixture of Gaussians) → DNN → complex distribution (data distribution)
• The reverse direction should also be feasible, and is what our method (MM-LDA networks) does:
  complex distribution (data distribution) → DNN → simple distribution (mixture of Gaussians)
35. The Solution
Our method:
• Models the feature distribution in DNNs as a mixture of Gaussians;
• Applies LDA on the features to make predictions.
39. How to treat the Gaussian parameters?
• Wan et al. (CVPR 2018) also model the feature distribution as a mixture of Gaussians. However, they treat the Gaussian parameters (𝜇𝑖 and Σ) as extra trainable variables.
• We instead treat them as hyperparameters calculated by our algorithm, which provides a theoretical guarantee on the robustness.
• The induced mixture-of-Gaussians model is named the Max-Mahalanobis Distribution (MMD).
40. Max-Mahalanobis Distribution (MMD)
• Make the minimal Mahalanobis distance between any two Gaussian components maximal.
[Figure: optimal means μ_i for L classes — L = 2: endpoints of a straight line; L = 3: vertices of an equilateral triangle; L = 4: vertices of a regular tetrahedron]
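The line / triangle / tetrahedron pattern generalizes to the vertices of a regular simplex. One way to construct such equidistant means is sketched below in NumPy; this is an illustrative construction under an identity covariance, not the paper's exact algorithm for generating the optimal means:

```python
import numpy as np

def simplex_means(L, d, C=10.0):
    """Means of L Gaussian components at the vertices of a regular
    simplex in R^d (d >= L), each scaled to a common norm C, so that
    all pairwise distances between means are equal.
    """
    assert d >= L, "need enough dimensions to embed the simplex"
    e = np.eye(L)                 # standard basis vectors in R^L
    v = e - e.mean(axis=0)        # center them: vertices of a regular simplex
    v = v / np.linalg.norm(v, axis=1, keepdims=True) * C  # common norm C
    mu = np.zeros((L, d))
    mu[:, :L] = v                 # embed into the first L coordinates of R^d
    return mu
```

Centering the basis vectors makes every pair of means the same distance apart, and the shared scaling preserves that equality while fixing the norm, which is the geometry the figure depicts for L = 2, 3, 4.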
42. Definition of Robustness
• The robustness at a point with label i (Moosavi-Dezfooli et al., CVPR 2016):

  min_{j≠i} d_{i,j},

  where d_{i,j} is the local minimal distance from a point with label i to an adversarial example with label j.
• We further define the robustness of the classifier as:

  RB = min_{i,j∈[L]} 𝔼(d_{i,j}).
45. Robustness w.r.t. Gaussian parameters
Theorem 1. The expectation of the distance 𝔼(d_{i,j}) is a function of the Mahalanobis distance Δ_{i,j}:

  𝔼(d_{i,j}) = √(2/π) · exp(−Δ_{i,j}² / 8) + (1/2) · Δ_{i,j} · [1 − 2Φ(−Δ_{i,j}/2)],

where Φ(·) is the normal cumulative distribution function. It follows that

  RB ≈ R̂B = (1/2) · min_{i,j∈[L]} Δ_{i,j}.

Distributing as an MMD can maximize R̂B.
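The formula in Theorem 1 is easy to evaluate with the standard library, using Φ(z) = (1 + erf(z/√2))/2. A minimal sketch (the function name is illustrative), which also shows where the approximation RB ≈ (1/2)·min Δ_{i,j} comes from: for large Δ the exponential term vanishes and Φ(−Δ/2) → 0, leaving 𝔼(d) ≈ Δ/2:

```python
import math

def expected_distance(delta):
    """E[d_ij] from Theorem 1 as a function of the Mahalanobis
    distance delta between two Gaussian components."""
    # standard normal CDF via the error function
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (math.sqrt(2.0 / math.pi) * math.exp(-delta ** 2 / 8.0)
            + 0.5 * delta * (1.0 - 2.0 * phi(-delta / 2.0)))
```

Since the expression is increasing in Δ_{i,j}, maximizing the minimal pairwise Mahalanobis distance (the MMD construction) maximizes the classifier robustness RB.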
56. Conclusion
• No extra computational cost;
• No loss of accuracy on normal examples;
• Quite easy to implement;
• Compatible with nearly all popular networks.