Neural Network, GANs & Image Translation

Ⅰ. Neural Network
Ⅱ. Generative Adversarial Nets
Ⅲ. Image-to-Image Translation

Ⅰ. Neural Network
1. How does a neural network learn?
2. What do we have to decide?
3. Why is it hard to decide on a loss function?

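How does a neural network learn?
Preparing input and target pairs.
Each input is paired with a target label. The labels are mapped to integer indices and then one-hot encoded:
• Lion → 0 → (1, 0, 0)
• Cat → 1 → (0, 1, 0)
• Dog → 2 → (0, 0, 1)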
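The label preparation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the class order (Lion, Cat, Dog) and the helper name `one_hot` are assumptions made for the example.

```python
# Class names from the slide; their index order is an assumption.
labels = ["Lion", "Cat", "Dog"]
label_to_index = {name: i for i, name in enumerate(labels)}

def one_hot(name):
    """Return a list with a single 1 at the index of the given label."""
    vec = [0] * len(labels)
    vec[label_to_index[name]] = 1
    return vec

encoded = {name: one_hot(name) for name in labels}
```

Each target vector has exactly one non-zero entry, so it can be compared element by element with the network output.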

The weights of the network are initialized arbitrarily (for example: 0.6, 0.2, 0.3, 0.9, 0.1).

Feed Forward
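N21 receives four pairs of values that are multiplied pairwise: (0.2, 0.2), (0.1, 0.7), (0.6, 0.3), (0.3, 0.1).
$sum = 0.2 \times 0.2 + 0.1 \times 0.7 + 0.6 \times 0.3 + 0.3 \times 0.1 = 0.32$
Output of N21 $= f(0.32)$, where $f$ is the activation function of N21.
Output of N21 $= f(0.32) = 0.1024$ if $f(x) = x^2$.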
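The feed-forward arithmetic for N21 can be checked with a short Python sketch. Which of the two number columns holds the inputs and which holds the weights is an assumption here (the product is the same either way), and the function names are illustrative.

```python
# Numbers from the slide; which list is "inputs" and which is "weights"
# is assumed, since the product is symmetric.
inputs = [0.2, 0.1, 0.6, 0.3]
weights = [0.2, 0.7, 0.3, 0.1]

def square(x):
    """Example activation from the slide: f(x) = x^2."""
    return x * x

def neuron_output(inputs, weights, activation):
    """Weighted sum of the inputs followed by the activation function."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return weighted_sum, activation(weighted_sum)

weighted_sum, output = neuron_output(inputs, weights, square)
```

Up to floating-point rounding, `weighted_sum` is 0.32 and `output` is 0.1024, matching the slide.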

Calculate error
Sum of squares loss
Softmax loss
Cross entropy loss
Hinge loss

Sum of squares loss (example)
Output of ANN: (0.2, 0.8)
Target value: (0.0, 1.0)
$(0.2 - 0.0)^2 = 0.04$, $(0.8 - 1.0)^2 = 0.04$
Sum of squares loss $= 0.04 + 0.04 = 0.08$
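The sum of squares example can be reproduced with a small Python sketch. The function name is illustrative; the numbers are the ones on the slide.

```python
def sum_of_squares_loss(output, target):
    """Sum of squared differences between network output and target."""
    return sum((o - t) ** 2 for o, t in zip(output, target))

ann_output = [0.2, 0.8]   # output of the ANN from the slide
target = [0.0, 1.0]       # target value from the slide

loss = sum_of_squares_loss(ann_output, target)
```

Each element contributes 0.04, so `loss` is 0.08 up to floating-point rounding.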

Feedback

What do we have to decide?
Gradient Descent Optimization Algorithms
• Batch Gradient Descent
• Stochastic Gradient Descent (SGD)
• Momentum
• Nesterov Accelerated Gradient (NAG)
• Adagrad
• RMSProp
• AdaDelta
• Adam
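To make the difference between two entries in this list concrete, here is a minimal sketch of plain gradient descent and of the momentum update on a toy one-dimensional loss. The loss $f(w) = (w - 3)^2$, the learning rate, the momentum coefficient and the step count are all assumptions chosen for illustration; real training uses gradients computed from data.

```python
def gradient(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def run_gradient_descent(w, lr, steps):
    """Plain gradient descent: step against the gradient."""
    for _ in range(steps):
        w = w - lr * gradient(w)
    return w

def run_momentum(w, lr, beta, steps):
    """Momentum: keep a velocity that accumulates past gradients."""
    v = 0.0
    for _ in range(steps):
        v = beta * v - lr * gradient(w)
        w = w + v
    return w

w_gd = run_gradient_descent(0.0, lr=0.1, steps=300)
w_momentum = run_momentum(0.0, lr=0.1, beta=0.9, steps=300)
```

Both runs approach the minimum at $w = 3$. The other algorithms in the list differ mainly in how they scale or accumulate the gradient before applying it.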

What do we have to decide?
Neural network structure
• VGG-19
• GoogLeNet
Training techniques
• Dropout
• Sparsity
Loss function and cost function
• Cross entropy
• Sum of squares
Optimization algorithm
• Adam
• SGD

Why is it hard to decide on a loss function?
In classification.
Input → NN → Output of NN
(Output of NN, Target) → loss
1. Calculate NN output
2. Calculate loss
3. Update weights of NN using loss
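The loop on this slide (calculate the output, calculate the loss, update the weights) can be sketched with a one-weight "network". Everything specific here is an assumption for illustration: the model $y = w x$, the squared-error loss, the training pair, and the learning rate.

```python
# Toy training pair and hyperparameters (assumed for illustration).
x, target = 2.0, 4.0
w = 0.0          # arbitrary initial weight
lr = 0.1         # learning rate

losses = []
for _ in range(20):
    y = w * x                          # 1. calculate NN output
    loss = (y - target) ** 2           # 2. calculate loss
    losses.append(loss)
    grad = 2.0 * (y - target) * x      # d(loss)/dw
    w -= lr * grad                     # 3. update weight using the loss
```

The recorded losses shrink at every step and `w` approaches 2.0, the value for which the output equals the target.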

In classification.
Output of NN   Target
0.67           1.00
0.00           0.00
0.02           0.00
0.12           0.00
0.04           0.00
0.00           0.00
0.03           0.00
0.14           0.00
Loss: sum of L1 norm = 0.68, cross entropy = 2.45
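The L1 figure on this slide can be checked with a short Python sketch. Only the L1 sum is reproduced; the cross-entropy value depends on the logarithm base and normalization, which the slide does not state, so it is left out rather than guessed.

```python
# Output of the NN and the one-hot target, as listed on the slide.
nn_output = [0.67, 0.00, 0.02, 0.12, 0.04, 0.00, 0.03, 0.14]
target = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

def l1_sum(output, target):
    """Sum of absolute differences between output and target."""
    return sum(abs(o - t) for o, t in zip(output, target))

l1 = l1_sum(nn_output, target)
```

Up to floating-point rounding, `l1` is 0.68: 0.33 from the correct class and 0.35 from the remaining classes.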

When the output of the NN is an image.
Figure panels: Input, Ground truth, L1
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017

Difficulty of assessment (increasing):
• Multiple choice questions (if the output is a digit)
• Essay questions
• Art practical exam (if the output is an image)

Ⅱ. Generative Adversarial Nets
1. Generative Adversarial Networks
2. Training Tip

Leonardo DiCaprio: a counterfeiter
Tom Hanks: an FBI agent who discriminates counterfeit money

Counterfeiter (Generator): “I made counterfeit money! Can you tell whether it is counterfeit or not?”
[50,000 won bill]
FBI (Discriminator)

Counterfeiter (Generator)
[50,000 won bill]
FBI (Discriminator): “Oh, no! I can’t tell whether it is counterfeit or not. Maybe it’s counterfeit money with 55% probability!”

FBI (Discriminator)
[50,000 won bill]
Compare the target and the output of the generator.

Counterfeiter (Generator)
[50,000 won bill]
FBI (Discriminator): “It’s counterfeit money with 99.9% probability!”
Loss: computed from this verdict.

Counterfeiter (Generator): “I made counterfeit money again! Can you tell whether it is counterfeit or not?”
FBI (Discriminator)

Counterfeiter (Generator)
FBI (Discriminator): “It’s counterfeit money with 70.5% probability!”
Loss: computed from this verdict.

FBI (Discriminator)
Compare the target and the output of the generator.

Counterfeiter (Generator)
FBI (Discriminator): “Oh, no! I can’t tell whether it is counterfeit or not. Maybe it’s counterfeit money with 50% probability!”
Loss: computed from this verdict.

D tries to make D(G(z)) near 0, G tries to make D(G(z)) near 1
This image is captured from Ian J. Goodfellow, et al., “Generative Adversarial Nets”.

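Training Tip
Original minimax objective:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
In practice, G and D each maximize their own objective:
$$\max_G V(D, G) = \mathbb{E}_{z \sim p_z(z)}[\log D(G(z))]$$
$$\max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
Equivalently, written as losses to minimize:
$$\min_G V(D, G) = -\mathbb{E}_{z \sim p_z(z)}[\log D(G(z))]$$
$$\min_D V(D, G) = -\left(\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]\right)$$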
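The two minimization forms can be evaluated on a handful of discriminator outputs with a short Python sketch. This is only an illustration of the formulas: the expectations are replaced by batch averages, the function names are made up, and the probabilities are example values, not results from a trained model.

```python
import math

def mean(values):
    return sum(values) / len(values)

def discriminator_loss(d_real, d_fake):
    """-(E[log D(x)] + E[log(1 - D(G(z)))]), estimated by batch averages."""
    return -(mean([math.log(p) for p in d_real])
             + mean([math.log(1.0 - p) for p in d_fake]))

def generator_loss(d_fake):
    """-E[log D(G(z))], estimated by a batch average."""
    return -mean([math.log(p) for p in d_fake])

# D is confident: real samples score 0.9, generated samples score 0.1.
confident = (discriminator_loss([0.9], [0.1]), generator_loss([0.1]))
# D is fooled: everything scores 0.5.
fooled = (discriminator_loss([0.5], [0.5]), generator_loss([0.5]))
```

When D separates real from generated samples well, its own loss is small and the generator loss is large. When D outputs 0.5 for everything, the generator loss drops to $\log 2$ and the discriminator loss rises to $2 \log 2$.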

Ⅲ. Image-to-Image Translation
1. Introduction
2. Method
3. Experiments

Introduction
Conditional adversarial nets are a general-purpose solution
for image-to-image translation.
Code: https://github.com/phillipi/pix2pix
CVPR, 2017

Method
GAN
G: z → y
Conditional GAN
G: {x, z} → y

Method
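Objective function for cGAN:
$$\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x,z}[\log(1 - D(x, G(x, z)))]$$
Objective function for GAN:
$$\mathcal{L}_{GAN}(G, D) = \mathbb{E}_{y}[\log D(y)] + \mathbb{E}_{x,z}[\log(1 - D(G(x, z)))]$$
L1 loss:
$$\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}[\lVert y - G(x, z) \rVert_1]$$
Final objective function:
$$G^* = \arg\min_G \max_D \mathcal{L}_{cGAN}(G, D) + \lambda \mathcal{L}_{L1}(G)$$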
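The way the cGAN term and the L1 term combine can be sketched numerically in Python. This is an illustration under stated assumptions: expectations are replaced by batch averages, images are flattened to short lists, the discriminator outputs are made-up probabilities, and the weight `lam=100.0` is an assumed value because the slides do not give one.

```python
import math

def mean(values):
    return sum(values) / len(values)

def cgan_term(d_real, d_fake):
    """E[log D(x, y)] + E[log(1 - D(x, G(x, z)))] as batch averages."""
    return (mean([math.log(p) for p in d_real])
            + mean([math.log(1.0 - p) for p in d_fake]))

def l1_term(targets, generated):
    """E[||y - G(x, z)||_1]: per-sample L1 distance, averaged over the batch."""
    return mean([sum(abs(t - g) for t, g in zip(y, g_out))
                 for y, g_out in zip(targets, generated)])

def objective(d_real, d_fake, targets, generated, lam):
    """L_cGAN + lambda * L_L1 for one batch."""
    return cgan_term(d_real, d_fake) + lam * l1_term(targets, generated)

# One sample: D outputs 0.5 for both pairs; the 2-pixel "image" is off by 0.5 per pixel.
value = objective([0.5], [0.5], [[1.0, 0.0]], [[0.5, 0.5]], lam=100.0)
```

Here the L1 term is 1.0 and the cGAN term is $2 \log 0.5$, so with a large $\lambda$ the pixel-wise error dominates the value G is minimizing.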

Method
Network architectures
Generator
Discriminator – Markovian discriminator (PatchGAN)
This discriminator effectively models the image as a Markov random field.
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017
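The idea behind a patch-based discriminator (judge each N × N patch separately, then average the responses) can be sketched in plain Python. This is a toy illustration, not the pix2pix implementation: the patches here do not overlap, and `toy_score` is a hypothetical stand-in for the learned per-patch classifier.

```python
def patch_scores(image, n, score_patch):
    """Score every non-overlapping n x n patch of a 2D image (list of rows)."""
    height, width = len(image), len(image[0])
    scores = []
    for top in range(0, height - n + 1, n):
        for left in range(0, width - n + 1, n):
            patch = [row[left:left + n] for row in image[top:top + n]]
            scores.append(score_patch(patch))
    return scores

def patch_discriminator(image, n, score_patch):
    """Average the per-patch scores into one output for the whole image."""
    scores = patch_scores(image, n, score_patch)
    return sum(scores) / len(scores)

def toy_score(patch):
    """Hypothetical per-patch 'realness': here simply the mean pixel value."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)

image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0],
         [1.0, 1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0]]

scores = patch_scores(image, 2, toy_score)
output = patch_discriminator(image, 2, toy_score)
```

Because each score depends only on the pixels inside its own patch, pixels farther apart than a patch width are treated as independent, which is the Markov random field assumption mentioned on the slide.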

Method
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Nets”,
https://www.slideshare.net/xavigiro/imagetoimage-translation-with-conditional-adversarial-nets-upc-reading-group
This image is captured from http://ccvl.jhu.edu/datasets/

Experiments
Patch size variations.
These images are captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017

References
[1] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, “Generative Adversarial Nets”, NIPS 2014.
[2] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros, “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR 2017.
[3] Kwangil Kim, “Artificial Neural Networks”, Multimedia system lecture of KHU, 2017.
[4] DL4J, “A Beginner’s Guide to Recurrent Networks and LSTMs”, 2017, https://deeplearning4j.org/lstm.html. Accessed 2018-01-29.
[5] Phillip Isola, Jun-Yan Zhu, Tinghui, “Image-to-Image Translation with Conditional Adversarial Nets”, Nov 25, 2016, https://www.slideshare.net/xavigiro/imagetoimage-translation-with-conditional-adversarial-nets-upc-reading-group. Accessed 2018-01-29.
[6] CCVL, “Datasets: PASCAL Part Segmentation Challenge”, 2018, http://ccvl.jhu.edu/datasets/. Accessed 2018-01-29.
