Youtube: https://www.youtube.com/watch?v=Pu2WQnU0GzA
Deep neural networks are vulnerable to adversarial examples, which raises security concerns about these algorithms due to the potentially severe consequences. Adversarial attacks serve as an important surrogate for evaluating the robustness of deep learning models before they are deployed. However, most existing adversarial attacks can only fool a black-box model with a low success rate. To address this issue, we propose a broad class of momentum-based iterative algorithms to boost adversarial attacks. By integrating a momentum term into the iterative attack process, our methods stabilize the update directions and escape from poor local maxima during the iterations, resulting in more transferable adversarial examples. To further improve the success rates of black-box attacks, we apply the momentum iterative algorithms to an ensemble of models, and show that adversarially trained models with a strong defense ability are also vulnerable to our black-box attacks. We hope that the proposed methods will serve as a benchmark for evaluating the robustness of various deep models and defense methods. With this method, we won first place in both the NIPS 2017 Non-targeted Adversarial Attack and Targeted Adversarial Attack competitions.
Tianyu Pang is a first-year Ph.D. student in the TSAIL Group of the Department of Computer Science and Technology, Tsinghua University, advised by Prof. Jun Zhu. His research interests include machine learning, deep learning, and their applications in computer vision, especially the robustness of deep learning.
3. • Defense against Adversarial Attacks Using High-level Representation Guided Denoiser (CVPR 2018)
Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Jun Zhu, and Xiaolin Hu
• Boosting Adversarial Attacks with Momentum (CVPR 2018)
Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Xiaolin Hu, Jianguo Li, and Jun Zhu
• Max-Mahalanobis Linear Discriminant Analysis Networks (ICML 2018)
Tianyu Pang, Chao Du, and Jun Zhu
• Towards Robust Detection of Adversarial Examples (under review at NIPS 2018)
Tianyu Pang, Chao Du, Yinpeng Dong, and Jun Zhu
5. Boosting Adversarial Attacks with Momentum
Yinpeng Dong¹, Fangzhou Liao¹, Tianyu Pang¹, Hang Su¹, Jun Zhu¹, Xiaolin Hu¹, Jianguo Li²
¹ Tsinghua University, ² Intel Labs China
9. Limitations of Black-box Attacks (1)
• FGSM has poor white-box attack ability;
• Iterative FGSM has poor transferability;
• This trade-off between transferability and attack ability makes black-box attacks less effective.
[Figure: success rate (%) vs. number of iterations (1–10) of I-FGSM, evaluated on Inc-v3, Inc-v4, IncRes-v2, and Res-152]
• Attack Inception V3;
• Evaluate the success rates of attacks on Inception V3, Inception V4, Inception ResNet V2, and ResNet v2-152;
• 𝜖 = 16;
• 1000 images from ImageNet.
10. Limitations of Black-box Attacks (2)
• Train a substitute network (Papernot et al., 2017) to fully characterize the behavior of the black-box model:
  • Requires full prediction confidences;
  • Requires a tremendous number of queries;
  • Hard to deploy for models trained on large-scale datasets;
  • Impossible in cases where querying is not allowed.
• Our solution: alleviate the trade-off between transferability and attack ability.
11. Optimization with Momentum
• Constrained optimization formulation of adversarial attacks:

  argmax_{x*} L(x*, y)  s.t.  ‖x* − x‖∞ ≤ 𝜖

• Momentum can:
  • Accelerate gradient descent;
  • Escape from poor local minima and maxima;
  • Stabilize the update directions of stochastic gradient descent.
• Momentum can also be used for adversarial attacks:
  • The result is still a white-box attack method, but one with strong black-box attack ability (transferability).
12. Momentum Iterative FGSM
• I-FGSM:

  x*_0 = x,  x*_{t+1} = clip(x*_t + α · sign(∇_x L(x*_t, y)))

• MI-FGSM:

  x*_0 = x,  g_0 = 0
  g_{t+1} = μ · g_t + ∇_x L(x*_t, y) / ‖∇_x L(x*_t, y)‖₁   (momentum)
  x*_{t+1} = clip(x*_t + α · sign(g_{t+1}))

• μ is the decay factor;
• g_t accumulates the gradients w.r.t. the input over the first t iterations;
• The current gradient is normalized by its L1 norm.
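The MI-FGSM update can be sketched directly from these equations. Below is a minimal NumPy illustration; `loss_grad` is a hypothetical stand-in for the model's backward pass (it returns the gradient of the loss w.r.t. the input), and the pixel range [0, 255] is an assumption:

```python
import numpy as np

def mi_fgsm(x, y, loss_grad, eps=16.0, num_iters=10, mu=1.0):
    """Momentum Iterative FGSM (L-infinity version).

    loss_grad(x_adv, y) is a hypothetical callable returning the
    gradient of the loss w.r.t. the input (the model's backward pass).
    """
    alpha = eps / num_iters            # step size so the total budget is eps
    x_adv = x.astype(np.float64).copy()
    g = np.zeros_like(x_adv)           # accumulated gradient, g_0 = 0
    for _ in range(num_iters):
        grad = loss_grad(x_adv, y)
        # normalize the current gradient by its L1 norm, then accumulate
        g = mu * g + grad / max(np.abs(grad).sum(), 1e-12)
        x_adv = x_adv + alpha * np.sign(g)
        # project back into the eps-ball around x and the valid pixel range
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 255.0)
    return x_adv
```

Setting `mu = 0` recovers plain I-FGSM, which makes the role of the decay factor easy to probe.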
14. Ablation Study
[Figure: success rate (%) vs. number of iterations (1–10) of MI-FGSM and I-FGSM, evaluated on Inc-v3, Inc-v4, IncRes-v2, and Res-152]
◼ Attack Inception V3 with 𝜖 = 16
[Figure: success rate (%) vs. the size of perturbation 𝜖 (1–40) of MI-FGSM, I-FGSM, and FGSM, evaluated on Inc-v3 and Res-152]
◼ Attack Inception V3 with 𝛼 = 1
15. Attacking an Ensemble of Models
• If an adversarial example remains adversarial for multiple models, it is more likely to be misclassified by other black-box models.
• Ensemble in logits:

  l(x) = Σ_{i=1}^{K} w_i · l_i(x)

  The loss is defined as J(x, y) = −1_y · log(softmax(l(x)))
• Comparisons:
  • Ensemble in predictions: p(x) = Σ_{i=1}^{K} w_i · p_i(x)
  • Ensemble in loss: J(x, y) = Σ_{i=1}^{K} w_i · J_i(x, y)
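The three ensemble schemes differ only in where the averaging happens. A minimal NumPy sketch (the function name and interface are illustrative, not from the paper):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_losses(logits_list, y_onehot, w):
    """Cross-entropy loss under the three ensemble schemes.

    logits_list: list of K logit vectors l_i(x); w: weights summing to 1.
    Returns (ensemble-in-logits, ensemble-in-predictions, ensemble-in-loss).
    """
    # ensemble in logits: fuse l(x) = sum_i w_i * l_i(x), then one softmax
    l = sum(wi * li for wi, li in zip(w, logits_list))
    loss_logits = -np.sum(y_onehot * np.log(softmax(l)))
    # ensemble in predictions: average the probabilities p_i(x)
    p = sum(wi * softmax(li) for wi, li in zip(w, logits_list))
    loss_preds = -np.sum(y_onehot * np.log(p))
    # ensemble in loss: average the per-model cross-entropies J_i(x, y)
    loss_loss = sum(wi * -np.sum(y_onehot * np.log(softmax(li)))
                    for wi, li in zip(w, logits_list))
    return loss_logits, loss_preds, loss_loss
```

When all member models agree exactly, the three losses coincide; they diverge as the models disagree, which is where the choice of fusion point matters for attack transferability.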
24. Motivation two
• Paradigm of feed-forward deep nets:
  Input → non-linear transformation → linear classifier → Output
• The non-linear transformation is an active area of research (AlexNet, VGG nets, ResNets, GoogLeNets, DenseNets);
• The linear classifier is much less active (softmax regression).
26. Our goal
• Design a new network architecture for better performance in the adversarial setting.
• Substitute a new linear classifier for softmax regression (SR).
30. Inspiration one: LDA is more efficient than LR
• Efron (1975) showed that if the input is distributed as a mixture of Gaussians, then linear discriminant analysis (LDA) is more efficient than logistic regression (LR): LDA needs less training data than LR to reach a given error rate.
• However, in practice data points are hardly distributed as a mixture of Gaussians in the input space.
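Efron's efficiency result can be illustrated empirically. The sketch below fits both classifiers on a small sample from a synthetic two-Gaussian mixture using scikit-learn; the sample sizes, dimension, and separation are arbitrary choices for illustration, not from the paper:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_mixture(n, d=5, sep=2.0):
    """Two Gaussian components with means +-sep/2 along the first axis."""
    y = rng.integers(0, 2, size=n)
    mu = np.zeros((2, d))
    mu[0, 0], mu[1, 0] = -sep / 2, sep / 2
    x = mu[y] + rng.standard_normal((n, d))
    return x, y

# small training set (where efficiency matters), large test set
x_tr, y_tr = sample_mixture(50)
x_te, y_te = sample_mixture(20000)

acc_lda = LinearDiscriminantAnalysis().fit(x_tr, y_tr).score(x_te, y_te)
acc_lr = LogisticRegression().fit(x_tr, y_tr).score(x_te, y_te)
```

Because the mixture-of-Gaussians assumption holds exactly here, LDA typically matches or exceeds LR at small sample sizes; on real inputs the assumption fails, which is precisely the gap the MM-LDA construction addresses in feature space.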
33. Inspiration two: Neural networks are powerful
• Deep generative models (e.g., GANs) are successful:
  simple distribution (Gaussian / mixture of Gaussians) → DNN → complex distribution (data distribution)
• The reverse direction should also be feasible, and is what our method (MM-LDA networks) does:
  complex distribution (data distribution) → DNN → simple distribution (mixture of Gaussians)
35. The Solution
Our method:
• Models the feature distribution in DNNs as a mixture of Gaussians;
• Applies LDA on the features to make predictions.
39. How to treat the Gaussian parameters?
• Wan et al. (CVPR 2018) also model the feature distribution as a mixture of Gaussians. However, they treat the Gaussian parameters (𝜇𝑖 and Σ) as extra trainable variables.
• We instead treat them as hyperparameters calculated by our algorithm, which provides a theoretical guarantee on the robustness.
• The induced mixture-of-Gaussians model is named the Max-Mahalanobis Distribution (MMD).
40. Max-Mahalanobis Distribution (MMD)
• Make the minimal Mahalanobis distance between any two Gaussian components maximal.
[Figure: optimal means μ_i for L classes — L = 2: endpoints of a straight line; L = 3: vertices of an equilateral triangle; L = 4: vertices of a regular tetrahedron]
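The line / triangle / tetrahedron pattern generalizes to the vertices of a regular simplex. One way to construct such equidistant means is sketched below in NumPy; this is an illustrative construction under an identity covariance, not the paper's exact algorithm for generating the optimal means:

```python
import numpy as np

def simplex_means(L, d, C=10.0):
    """Means of L Gaussian components at the vertices of a regular
    simplex in R^d (d >= L), each scaled to a common norm C, so that
    all pairwise distances between means are equal.
    """
    assert d >= L, "need enough dimensions to embed the simplex"
    e = np.eye(L)                 # standard basis vectors in R^L
    v = e - e.mean(axis=0)        # center them: vertices of a regular simplex
    v = v / np.linalg.norm(v, axis=1, keepdims=True) * C  # common norm C
    mu = np.zeros((L, d))
    mu[:, :L] = v                 # embed into the first L coordinates of R^d
    return mu
```

Centering the basis vectors makes every pair of means the same distance apart, and the shared scaling preserves that equality while fixing the norm, which is the geometry the figure depicts for L = 2, 3, 4.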
42. Definition of Robustness
• The robustness at a point with label i (Moosavi-Dezfooli et al., CVPR 2016):

  min_{j≠i} d_{i,j},

  where d_{i,j} is the local minimal distance from a point with label i to an adversarial example with label j.
• We further define the robustness of the classifier as:

  RB = min_{i,j∈[L]} 𝔼(d_{i,j}).
45. Robustness w.r.t. Gaussian parameters
Theorem 1. The expectation of the distance 𝔼(d_{i,j}) is a function of the Mahalanobis distance Δ_{i,j}:

  𝔼(d_{i,j}) = √(2/π) · exp(−Δ_{i,j}² / 8) + (1/2) · Δ_{i,j} · [1 − 2Φ(−Δ_{i,j}/2)],

where Φ(·) is the normal cumulative distribution function. It follows that

  RB ≈ R̂B = (1/2) · min_{i,j∈[L]} Δ_{i,j}.

Distributing as an MMD can maximize R̂B.
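The formula in Theorem 1 is easy to evaluate with the standard library, using Φ(z) = (1 + erf(z/√2))/2. A minimal sketch (the function name is illustrative), which also shows where the approximation RB ≈ (1/2)·min Δ_{i,j} comes from: for large Δ the exponential term vanishes and Φ(−Δ/2) → 0, leaving 𝔼(d) ≈ Δ/2:

```python
import math

def expected_distance(delta):
    """E[d_ij] from Theorem 1 as a function of the Mahalanobis
    distance delta between two Gaussian components."""
    # standard normal CDF via the error function
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (math.sqrt(2.0 / math.pi) * math.exp(-delta ** 2 / 8.0)
            + 0.5 * delta * (1.0 - 2.0 * phi(-delta / 2.0)))
```

Since the expression is increasing in Δ_{i,j}, maximizing the minimal pairwise Mahalanobis distance (the MMD construction) maximizes the classifier robustness RB.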
56. Conclusion
• No extra computational cost;
• No loss of accuracy on normal examples;
• Quite easy to implement;
• Compatible with nearly all popular networks.