10. Self-Attention Generative Adversarial Networks
• arXiv preprint by Zhang et al. (the SAGAN paper)
• Han Zhang (Rutgers University), Ian Goodfellow (Google Brain), Dimitris Metaxas (Rutgers University), Augustus Odena (Google Brain)
• (Submitted on 21 May 2018)
12. • In this paper, we proposed Self-Attention Generative Adversarial
Networks (SAGANs), which incorporate a self-attention
mechanism into the GAN framework.
• The self-attention module is effective in modeling long-range
dependencies.
• In addition, we show that spectral normalization applied to the
generator stabilizes GAN training and that TTUR speeds up
training of regularized discriminators.
• SAGAN achieves state-of-the-art performance on class-conditional image generation on ImageNet.
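The self-attention computation above can be sketched in numpy on a flattened feature map (a minimal illustration, not the paper's code; the function and weight names are mine, and the paper's 1x1 convolutions become plain matrix multiplies here):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv, gamma):
    """SAGAN-style self-attention sketch.

    x: (C, N) feature map flattened over its N spatial positions;
    Wq, Wk: (C//8, C) query/key projections; Wv: (C, C) value projection;
    gamma: learnable residual scale, initialized to 0 in the paper.
    """
    q, k, v = Wq @ x, Wk @ x, Wv @ x
    logits = q.T @ k                               # (N, N) pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)        # softmax over key positions
    out = v @ attn.T                               # each position attends to all others
    return gamma * out + x                         # residual connection
```

Because every output position is a weighted sum over all positions, the module can model long-range dependencies that a fixed convolutional receptive field cannot; with gamma = 0 the block starts out as the identity.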
32. Implementation notes
• The paper uses ImageNet, a dataset of over 14 million images with tags.
• This implementation instead uses the CelebFaces Attributes Dataset (CelebA): about 200K images, each annotated with 40 attributes.
• The paper applies SN on G/D + TTUR, updating the weights for 1M iterations, and reports an FID of 22.96.
• Following the paper, this implementation also applies SN on G/D + TTUR with 1M iterations of weight updates, and uses wgan-gp as the adversarial loss.
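The wgan-gp adversarial loss adds a gradient penalty that pushes the norm of the critic's input gradient toward 1 on points interpolated between real and fake samples. A numpy sketch for a linear critic D(x) = w @ x, where that gradient is just w and no autograd is needed (an illustration only; the names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def gradient_penalty(w, real, fake, lam=10.0):
    """WGAN-GP penalty for a linear critic D(x) = w @ x (sketch)."""
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake        # interpolate real/fake pairs
    grad = np.tile(w, (x_hat.shape[0], 1))         # for linear D, grad_x D(x) = w
    norms = np.linalg.norm(grad, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)       # penalize deviation from norm 1
```

In a real PyTorch critic the input gradient comes from autograd; the penalty is one way of controlling the critic's Lipschitz constant, which is also what spectral normalization does.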
33. • Meta overview
• This repository provides a PyTorch implementation of SAGAN. Both wgan-gp and wgan-hinge losses are ready, but note that wgan-gp is somehow not compatible with spectral normalization; remove all the spectral normalization from the model to use wgan-gp.
• Self-attention is applied to the last two layers of both the discriminator and the generator.
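The wgan-hinge option is the hinge version of the adversarial loss, which the SAGAN paper itself trains with; on raw critic scores it can be sketched as (a numpy sketch, not the repository's code):

```python
import numpy as np

def d_hinge_loss(d_real, d_fake):
    """Discriminator hinge loss: push real scores above +1, fake below -1."""
    return (np.mean(np.maximum(0.0, 1.0 - d_real))
            + np.mean(np.maximum(0.0, 1.0 + d_fake)))

def g_hinge_loss(d_fake):
    """Generator loss: raise the critic's score on generated samples."""
    return -np.mean(d_fake)
```

Scores already beyond the margins contribute zero discriminator loss, so the critic stops pushing on samples it has confidently separated.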
65. GANs Trained by a Two Time-Scale Update Rule Converge to a
Local Nash Equilibrium
https://arxiv.org/abs/1706.08500 (the TTUR paper)
This paper introduces two things: TTUR training, and FID as an evaluation method for GANs.
It also reports that TTUR training improves DCGANs and Improved Wasserstein GANs (WGAN-GP).
TTUR
66. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium
• Martin Heusel, Hubert Ramsauer, Thomas
Unterthiner, Bernhard Nessler, Sepp Hochreiter
• (Submitted on 26 Jun 2017 (v1), last revised 12 Jan 2018 (this
version, v6))
• In GAN training, if the generator's learning rate is set smaller than the discriminator's, convergence to a local Nash equilibrium can be shown using ODE theory. The paper also proposes FID, a measure of generated-image quality that improves on the Inception Score.
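FID compares two Gaussians fitted to Inception features of real and generated images: FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)). A numpy sketch restricted to diagonal covariances, where the matrix square root reduces to an element-wise sqrt (the general case needs a matrix square root such as scipy.linalg.sqrtm; the function name is mine):

```python
import numpy as np

def fid_diagonal(mu1, var1, mu2, var2):
    """Frechet distance between N(mu1, diag(var1)) and N(mu2, diag(var2))."""
    mean_term = np.sum((mu1 - mu2) ** 2)
    # for diagonal covariances, Tr(S1 + S2 - 2*sqrt(S1*S2)) simplifies to:
    cov_term = np.sum((np.sqrt(var1) - np.sqrt(var2)) ** 2)
    return mean_term + cov_term
```

Lower is better; identical statistics give 0, and unlike the Inception Score the measure reacts to mode dropping because missing modes distort the generated-sample statistics.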
74. Adam Follows an HBF ODE and Ensures
TTUR Convergence
• In our experiments, we aim at using Adam stochastic
approximation to avoid mode collapsing. GANs suffer from
“mode collapsing” where large masses of probability are
mapped onto a few modes that cover only small regions.
While these regions represent meaningful samples, the
variety of the real world data is lost and only a few prototype
samples are generated. Different methods have been
proposed to avoid mode collapsing [11, 43]. We obviate mode
collapsing by using Adam stochastic approximation [29].
Adam can be described as Heavy Ball with Friction (HBF)
(see below), since it averages over past gradients
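The "averages over past gradients" in the quote is Adam's first-moment estimate; a minimal scalar sketch of the update rule (following Kingma and Ba, not taken from the paper's code):

```python
import numpy as np

def adam_minimize(grad, x, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=1000):
    """Scalar Adam: m averages past gradients (the 'heavy ball' momentum),
    v averages squared gradients and rescales the step size."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = b1 * m + (1 - b1) * g          # exponential average of gradients
        v = b2 * v + (1 - b2) * g * g      # exponential average of g^2
        m_hat = m / (1 - b1 ** t)          # bias correction for the zero init
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x
```

For example, `adam_minimize(lambda x: 2 * x, 3.0)` runs Adam on f(x) = x^2; the momentum term m smooths the trajectory, which is the friction-damped heavy-ball behaviour the HBF view describes.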
75. Spectral Normalization
• https://arxiv.org/abs/1802.05957 (the Spectral Normalization paper)
• It proposes a new weight normalization technique called spectral normalization and claims it stabilizes the training of the discriminator. Note that it only constrains the discriminator during training.
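Spectral normalization divides each weight matrix by its largest singular value sigma(W), estimated cheaply by power iteration. A numpy sketch (the paper's method reuses the u, v vectors across training steps, which is omitted here; the function name is mine):

```python
import numpy as np

def spectral_normalize(W, n_iter=50):
    """Estimate sigma(W) by power iteration and return W / sigma(W)."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)             # right singular vector estimate
        u = W @ v
        u /= np.linalg.norm(u)             # left singular vector estimate
    sigma = u @ W @ v                      # largest singular value estimate
    return W / sigma, sigma
```

After the division the layer's spectral norm is about 1, which bounds the Lipschitz constant of the discriminator and is what stabilizes its training.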
77. • We used the Adam optimizer.
• The hyperparameters varied are (1) the number of updates of the discriminator per one update of the generator, and (2) the learning rate α and the first and second order momentum parameters (β1, β2) of Adam.
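The schedule above, with some number of discriminator updates per generator update and TTUR's two learning rates, can be sketched with toy scalar "models" (the quadratic losses are placeholders for illustration, not GAN objectives; the learning rates 1e-4 for G and 4e-4 for D are the ones reported in the SAGAN paper):

```python
import numpy as np

LR_G, LR_D = 1e-4, 4e-4    # TTUR: the discriminator learns on a faster time scale
N_DIS = 1                  # discriminator updates per generator update

def d_grad(g, d):
    # placeholder: gradient of a toy D loss (d - g)^2 / 2
    return d - g

def g_grad(g, d):
    # placeholder: gradient of a toy G loss (g - d)^2 / 2
    return g - d

theta_g, theta_d = 2.0, -1.0
for step in range(20000):
    for _ in range(N_DIS):                          # (1) inner discriminator updates
        theta_d -= LR_D * d_grad(theta_g, theta_d)
    theta_g -= LR_G * g_grad(theta_g, theta_d)      # (2) one generator update
```

Part of TTUR's appeal is that the faster discriminator can track the generator with N_DIS = 1, instead of the several inner discriminator steps that equal learning rates often require.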
78. pytorch-spectral-normalization-gan
• Main.py: uses the CIFAR10 dataset.
• get resnet model working with wasserstein and hinge losses
• Model.py: builds a DCGAN-like generator and discriminator.
• Model_resnet.py: builds a ResNet generator and discriminator.
• Spectral_normalization.py
• Spectral_normalization_nondiff.py
82. • The only boldface passage in this paper reads:
• "In no experiment did we see evidence of mode collapse for the WGAN algorithm."
• Indeed, WGAN does appear to avoid mode collapse.
85. GAN evaluation metrics (ICML 2018)
• Assessing Generative Models via Precision and Recall
• Recent advances in generative modeling have led to an increased interest in the study
of statistical divergences as means of model comparison. Commonly used evaluation
methods, such as Fréchet Inception Distance (FID), correlate well with the perceived
quality of samples and are sensitive to mode dropping. However, these metrics are
unable to distinguish between different failure cases since they yield one-dimensional
scores. We propose a novel definition of precision and recall for distributions which
disentangles the divergence into two separate dimensions. The proposed notion is
intuitive, retains desirable properties, and naturally leads to an efficient algorithm that
can be used to evaluate generative models. We relate this notion to total variation as
well as to recent evaluation metrics such as Inception Score and FID. To demonstrate
the practical utility of the proposed approach we perform an empirical study on
several variants of Generative Adversarial Networks and the Variational Autoencoder.
In an extensive set of experiments we show that the proposed metric is able to
disentangle the quality of generated samples from the coverage of the target
distribution.
Precision (quality) and Recall (diversity) for Distributions
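For discrete distributions over shared bins, a (precision, recall) pair can be computed directly from the paper's min-based definition: alpha(lmb) = sum_i min(lmb * q_i, p_i) and beta(lmb) = sum_i min(q_i, p_i / lmb). A numpy sketch (the variable names and the assignment of p to the model and q to the target are mine; consult the paper for its exact convention):

```python
import numpy as np

def prd_point(p, q, lmb):
    """One (precision, recall) pair for model distribution p and target q,
    both over the same discrete bins, at trade-off parameter lmb > 0."""
    alpha = np.sum(np.minimum(lmb * q, p))   # precision: quality of samples
    beta = np.sum(np.minimum(q, p / lmb))    # recall: coverage of the target
    return alpha, beta
```

Sweeping lmb over (0, inf) traces the whole curve: identical distributions reach (1, 1), while disjoint supports collapse to (0, 0) everywhere, which is the two-dimensional disentanglement of quality and coverage that a one-dimensional score like FID cannot provide.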
86. GAN evaluation metrics (ICML 2018)
• Geometry Score: A Method For Comparing Generative Adversarial Networks
Compares GANs using persistent homology.