1
Published by T. Kim, M. Cha, H. Kim, J. K. Lee, and J. Kim
Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017
Seongcheol Baek
Reading Circle Presentation @ Hikihara Lab
Department of Electrical Engineering, Kyoto University
2019/07/19
Learning to Discover Cross-Domain Relations
with Generative Adversarial Networks
Focus of this presentation
- Recently emerging issues around GAN
- Introduction of generative adversarial networks (GAN)
- What is DiscoGAN?
Problems of interest / model architecture / mode collapse problem /
experiments / summaries / comments
2
Recent generative technologies
3
Timeline figure, 2014 to 2019 (month labels on the axis: Jun., Oct., Nov., Oct., Mar., Aug., Sep., Oct., Sep., Mar., May):
- Ian J. Goodfellow invented the “generative adversarial network”
- Original AlphaGo beat a professional Go player
- Deep Convolutional GAN (DCGAN)
- Least Squares GAN (LSGAN)
- Semi-Supervised GAN
- StackGAN
- Auxiliary Classifier GAN (ACGAN)
- DiscoGAN
- CycleGAN
- BW clips into color (Source: Nvidia)
- GauGAN (Source: Nvidia)
- Samsung deepfake AI fabricates a video from a single profile pic
Recent issues around deepfakes – security, art, etc.
4
A viral video in which Obama appears to insult Donald Trump was fabricated with FakeApp (Photo: Youtube)
A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (Photo: Bill Poster UK)
- US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’
- At the individual level, deepfakes can be used for cyberbullying, defamation and blackmail
Edmond de Belamy: The first piece
of AI-generated art
(created by GAN in 2018)
What is GAN?
5
- Two neural networks contest with each other in a game. Given a training set, a GAN learns to generate new data with the same statistics as the training set.
- Minimax two-player game (generative model vs. discriminative model)
Minimax Problem of GAN
6
$$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
$$V(G, D) = \int_x p_{data}(x) \log D(x)\,dx + \int_z p_z(z) \log(1 - D(G(z)))\,dz$$
- Training of generator: $\min_G [1 - D(G(z))] = 0$
- Training of discriminator: $\max_D D(x) = 1$ (discriminant for real data), $\max_D [1 - D(G(z))] = 1$ (discriminant for generated data)
- $V(G, D)$ has a saddle point at $D(G(z)) = \frac{1}{2}$, where $D \in [0, 1]$ indicates whether data is fake (0) or real (1)
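The saddle point can be checked numerically. The following is a minimal sketch, not the presenter's code: it assumes discrete distributions, rewrites the second integral over the generator's induced distribution (called `p_g` here, a name not used on the slide), and uses the known pointwise maximiser D*(x) = p_data(x) / (p_data(x) + p_g(x)) from the original GAN paper. All numbers are made up for illustration.

```python
import math

def value_function(p_data, p_g, d):
    """V(G, D) on a discrete space: sum_x p_data(x) log D(x) + p_g(x) log(1 - D(x))."""
    return sum(
        pd * math.log(dx) + pg * math.log(1.0 - dx)
        for pd, pg, dx in zip(p_data, p_g, d)
    )

def optimal_discriminator(p_data, p_g):
    """Pointwise maximiser of V for a fixed generator."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

# Hypothetical three-point distributions (illustrative values only).
p_data = [0.2, 0.5, 0.3]
p_g_bad = [0.6, 0.3, 0.1]   # generator that does not match the data

d_star_bad = optimal_discriminator(p_data, p_g_bad)
d_star_match = optimal_discriminator(p_data, p_data)  # generator matches the data

v_bad = value_function(p_data, p_g_bad, d_star_bad)
v_match = value_function(p_data, p_data, d_star_match)

print("D* when p_g = p_data:", d_star_match)
print("V at mismatch:", v_bad)
print("V at match:", v_match, "vs -log 4 =", -math.log(4))
```

When the generator matches the data, the best discriminator can only answer 1/2 everywhere, and the value drops to -log 4; any mismatch leaves the discriminator room to do better.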
Discover Cross-Domain Relations with GAN
7
Training of 2 different data sets
without explicitly paired labelling
Results of domain transfer
- Earlier models could also transfer data from one domain to another, preserving key attributes
- Previous training methods (~2016) require paired data, which is costly and hard to collect
- DiscoGAN trains on 2 different data sets without any paired data, and its results show better performance and robustness to the mode collapse problem
Figure labels: (Domain A), (Domain B), $G_{AB}$, $G_{BA}$
Network Models – DiscoGAN & Previous GANs
8
Standard GAN with GAN loss
GAN with a reconstruction loss & GAN loss
DiscoGAN
- Each generator consists of an encoder-decoder pair (input and output are images)
- The GAN loss (and the reconstruction loss) is minimized during training
- In DiscoGAN, 2 coupled GANs map each domain to its counterpart domain (bijective)
Problem Formulation (1)
9
- Reconstruction loss measures how well the original input is reconstructed after a sequence of two generations: $L_{CONST_A} = d(x_{ABA}, x_A)$, where $x_{AB} = G_{AB}(x_A)$, $x_{ABA} = G_{BA}(x_{AB})$, and $d$ is a distance such as $L_1$, $L_2$, or Huber loss
- GAN loss measures how realistic the generated image is in domain B: $L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(x_{AB})]$
- Relaxed constraints are considered to guarantee bijection and domain transition
- Bijection: ideally $G_{AB}^{-1} = G_{BA}$ → $\min_{G_{AB}}(L_{CONST_A})$, $\min_{G_{BA}}(L_{CONST_B})$
- Domain transition: ideally $x_{AB} \in \mathbb{R}_B$, $x_{BA} \in \mathbb{R}_A$ → $\min_{D_B}(L_{D_B})$, $\min_{D_A}(L_{D_A})$
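The reconstruction loss can be sketched in a few lines. This is an illustrative example under stated assumptions: images are flat lists of floats, the generators are hypothetical toy functions (not trained networks), and the three distances are averaged per element, which the slide does not specify.

```python
def l1_distance(x, y):
    """Mean absolute difference."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def l2_distance(x, y):
    """Mean squared difference."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def huber_distance(x, y, delta=1.0):
    """Quadratic for small residuals, linear beyond delta."""
    total = 0.0
    for a, b in zip(x, y):
        r = abs(a - b)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(x)

def reconstruction_loss(x_a, g_ab, g_ba, d=l2_distance):
    """L_CONST_A = d(G_BA(G_AB(x_A)), x_A)."""
    x_ab = g_ab(x_a)       # translate A -> B
    x_aba = g_ba(x_ab)     # translate back B -> A
    return d(x_aba, x_a)

# Hypothetical toy generators: an invertible map, its inverse, and a collapsed map.
g_ab = lambda x: [2.0 * v + 1.0 for v in x]
g_ba_inverse = lambda x: [(v - 1.0) / 2.0 for v in x]
g_ba_collapsed = lambda x: [0.0 for _ in x]

x_a = [0.0, 1.0, -2.0]
loss_inverse = reconstruction_loss(x_a, g_ab, g_ba_inverse)
loss_collapsed = reconstruction_loss(x_a, g_ab, g_ba_collapsed)
loss_collapsed_l1 = reconstruction_loss(x_a, g_ab, g_ba_collapsed, d=l1_distance)
loss_collapsed_huber = reconstruction_loss(x_a, g_ab, g_ba_collapsed, d=huber_distance)
print(loss_inverse, loss_collapsed, loss_collapsed_l1, loss_collapsed_huber)
```

The loss is zero only when the second generator undoes the first, which is the relaxed form of the bijection constraint.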
Problem Formulation (2)
10
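Losses and constraints level for each model (training of generator shown for $G_{AB}$, training of discriminator shown for $D_B$):
(a) Standard GAN with GAN loss
- Training of generator: $L_{G_{AB}} = L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))]$
- Training of discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints level: –
(b) GAN with a reconstruction loss & GAN loss
- Training of generator: $L_{G_{AB}} = L_{GAN_B} + L_{CONST_A} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A)$
- Training of discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints level: doubled DOF from (a), weaker than (a)
(c) DiscoGAN
- Training of generator: $L_G = L_{G_{AB}} + L_{G_{BA}} = L_{GAN_B} + L_{CONST_A} + L_{GAN_A} + L_{CONST_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A) - \mathbb{E}_{x_B \sim P_B}[\log D_A(G_{BA}(x_B))] + d(x_{BAB}, x_B)$
- Training of discriminator: $L_D = L_{D_A} + L_{D_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_A(x_A)] - \mathbb{E}_{x_B \sim P_B}[\log(1 - D_A(G_{BA}(x_B)))] - \mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
- Constraints level: doubled DOF from (b), weaker than (b)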
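How the DiscoGAN loss terms combine can be sketched as follows. This is a hedged illustration, not the official implementation: expectations are replaced by batch means, the discriminator outputs are given directly as hypothetical probabilities, and the reconstruction losses are passed in as plain numbers.

```python
import math

def gan_loss(d_on_fake):
    """L_GAN = -E[log D(G(x))], estimated as a batch mean."""
    return -sum(math.log(p) for p in d_on_fake) / len(d_on_fake)

def discriminator_loss(d_on_real, d_on_fake):
    """L_D = -E[log D(x_real)] - E[log(1 - D(G(x)))]."""
    real_term = -sum(math.log(p) for p in d_on_real) / len(d_on_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_on_fake) / len(d_on_fake)
    return real_term + fake_term

def discogan_losses(d_b_real, d_b_fake, d_a_real, d_a_fake, const_a, const_b):
    """Total generator and discriminator losses of the coupled model.

    d_b_fake holds D_B(G_AB(x_A)), d_a_fake holds D_A(G_BA(x_B)).
    """
    l_g = gan_loss(d_b_fake) + const_a + gan_loss(d_a_fake) + const_b
    l_d = discriminator_loss(d_a_real, d_a_fake) + discriminator_loss(d_b_real, d_b_fake)
    return l_g, l_d

# Hypothetical batch: every discriminator output is 0.5, reconstructions are perfect.
half = [0.5, 0.5, 0.5, 0.5]
l_g, l_d = discogan_losses(half, half, half, half, const_a=0.0, const_b=0.0)
print(l_g, l_d)
```

In a training loop the generator parameters would be updated to reduce `l_g` and the discriminator parameters to reduce `l_d`, alternating between the two.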
Architecture of Generator
11
- Each generator takes an image and feeds it through an encoder-decoder pair
- Number of layers ranges from 4 to 5 depending on the domain
Domain A (resp. B) → Encoder (convolution layer) → Decoder (deconvolution layer) → Domain B (resp. A)
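The shape bookkeeping of such an encoder-decoder can be sketched without any deep learning library. The kernel size 4, stride 2, padding 1 and the 64-pixel input below are assumptions chosen for illustration; the slide only states that 4 to 5 layers are used.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after one strided convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel=4, stride=2, padding=1):
    """Spatial size after one transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def encoder_decoder_sizes(size, num_layers):
    """Track the image side length through the encoder, then the decoder."""
    sizes = [size]
    for _ in range(num_layers):
        sizes.append(conv_out(sizes[-1]))
    for _ in range(num_layers):
        sizes.append(deconv_out(sizes[-1]))
    return sizes

sizes = encoder_decoder_sizes(64, 4)
print(sizes)  # [64, 32, 16, 8, 4, 8, 16, 32, 64]
```

With these settings each encoder layer halves the side length and each decoder layer doubles it, so the output image has the same size as the input.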
Architecture of Discriminator
12
- Each discriminator feeds an image through convolution layers
- The discriminator outputs a scalar through a sigmoid, indicating how real the fed image is
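The final sigmoid step can be sketched as below. This is an illustrative helper, assuming the last convolution layer produces a single real-valued score (called a logit here, a term not used on the slide); it is written to avoid overflow for large negative scores.

```python
import math

def sigmoid(logit):
    """Map a real-valued score to (0, 1) without overflowing math.exp."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)

# Hypothetical scores: strongly fake, undecided, strongly real.
for score in (-5.0, 0.0, 5.0):
    print(score, sigmoid(score))
```

An output near 1 means the discriminator judges the image to be real, an output near 0 means generated, and 0.5 means it cannot tell.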
Toy Experiment – Domain Transition Test
13
- In DiscoGAN, discriminator B is perfectly fooled by translated samples from domain A
- DiscoGAN prevents mode collapse by translating into distinct, well-bounded regions that do not overlap
Initial state / Standard GAN / GAN with $L_{CONST}$ / DiscoGAN
Colored points: samples in domain A
Black x’s: target modes in domain B
Mode Collapse Problem
14
The gradients are biased towards the mode from which a higher number of samples are drawn to form the real training data
- The generator outputs unintended images belonging to a different mode; this occurs prevalently in GANs
- Usually, GANs try to remedy this problem with additional losses; however, it has not been resolved perfectly
- Other examples: communication systems, cryptography, automata, etc.
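One simple way to quantify mode collapse on toy data is to count how many target modes receive at least one generated sample. The sketch below assumes 2-D points and made-up mode locations and samples; it is not a metric taken from the paper.

```python
def nearest_mode(point, modes):
    """Index of the mode closest to a 2-D point (squared Euclidean distance)."""
    return min(
        range(len(modes)),
        key=lambda i: (point[0] - modes[i][0]) ** 2 + (point[1] - modes[i][1]) ** 2,
    )

def mode_coverage(samples, modes):
    """Fraction of modes hit by at least one sample, plus per-mode counts."""
    counts = [0] * len(modes)
    for s in samples:
        counts[nearest_mode(s, modes)] += 1
    covered = sum(1 for c in counts if c > 0)
    return covered / len(modes), counts

# Hypothetical target modes and generator outputs.
modes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
collapsed = [(0.1, -0.2), (0.3, 0.1), (-0.1, 0.2), (0.2, 0.0)]
diverse = [(0.1, 0.1), (3.9, 0.2), (0.2, 4.1), (4.1, 3.8)]

collapsed_cov, collapsed_counts = mode_coverage(collapsed, modes)
diverse_cov, diverse_counts = mode_coverage(diverse, modes)
print(collapsed_cov, collapsed_counts)
print(diverse_cov, diverse_counts)
```

A collapsed generator scores a low coverage because all of its outputs land near one mode, while a healthy generator spreads its outputs over every mode.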
Why is DiscoGAN robust to mode collapse?
15
- In DiscoGAN, two coupled models are trained together simultaneously. The two $G_{AB}$'s and the two $G_{BA}$'s share parameters
- The constraints imposed by the coupled reconstruction losses push the mapping toward a strict bijection
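Why a reconstruction constraint discourages collapse can be seen on a tiny discrete example. The sketch below is an illustration under assumptions: domains are three-element sets with hypothetical labels, generators are lookup tables, and the reconstruction error simply counts elements that do not map back to themselves.

```python
from itertools import product

def reconstruction_errors(g_ab, g_ba, domain_a):
    """Number of elements a with G_BA(G_AB(a)) != a."""
    return sum(1 for a in domain_a if g_ba[g_ab[a]] != a)

def best_reconstruction(g_ab, domain_a, domain_b):
    """Smallest achievable error over every possible G_BA (brute force)."""
    best = len(domain_a)
    for images in product(domain_a, repeat=len(domain_b)):
        g_ba = dict(zip(domain_b, images))
        best = min(best, reconstruction_errors(g_ab, g_ba, domain_a))
    return best

domain_a = ["a1", "a2", "a3"]
domain_b = ["b1", "b2", "b3"]

bijective = {"a1": "b1", "a2": "b2", "a3": "b3"}
partly_collapsed = {"a1": "b1", "a2": "b1", "a3": "b2"}
collapsed = {"a1": "b1", "a2": "b1", "a3": "b1"}

err_bijective = best_reconstruction(bijective, domain_a, domain_b)
err_partial = best_reconstruction(partly_collapsed, domain_a, domain_b)
err_collapsed = best_reconstruction(collapsed, domain_a, domain_b)
print(err_bijective, err_partial, err_collapsed)
```

No choice of the reverse generator can recover inputs that the forward generator merged, so only a one-to-one forward mapping reaches zero reconstruction error.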
Real Domain Experiment – Car to Car, Face to Face
16
Input data / Standard GAN / GAN with $L_{CONST}$ / DiscoGAN
Car to car / Face to face
- Reconstruction tests
- Results in DiscoGAN show higher correlations (robust to mode collapse)
Real Domain Experiment – Face Conversion
17
Translation of gender
Blond to black hair, black to blond hair
Glasses to non-glasses, non-glasses to glasses
- DiscoGAN translates a specific feature while preserving other facial features
Cross-Domain Experiment (1)
18
Chair to car Car to face
- Note that training is performed without any paired data
- The main attribute (azimuth) is preserved
Cross-Domain Experiment (2)
19
- 1-to-N problem
Handbag to sketches
Sketches to shoes
Sketches to handbags
Cross-Domain Experiment (3)
20
- Same style is discovered
Handbag to shoes
Shoes to handbag
Summaries
21
- DiscoGAN is proposed as a learning method to discover cross-domain relations without any paired labels
- Results showed better performance with robustness to mode collapse. The symmetry granted by coupling 2 GANs is considered to be a key factor for the dynamical robustness
Comments
- The strategy to couple two GAN models reminded me of the symmetry of dynamics. Some
correlations could be drawn to handle the stability problem…?
- This paper is giving me many ideas. It is very pleasant.
22
Thank you!
- Source code for simulations
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks" (Github) ... https://github.com/SKTBrain/DiscoGAN
- This presentation is also available on:
https://www.slideshare.net/SeongcheolBaek/introduction-of-discogan
References
23
- Crux of Presentation
T. Kim, et al., Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (arXiv) ... https://arxiv.org/abs/1703.05192
- Recent generative technologies
Apple announces Animoji (The Verge) … https://www.theverge.com/2017/9/12/16290210/new-iphone-emoji-animated-animoji-apple-ios-11-update
AI Can Convert Black and White Clips into Color (Nvidia Developer) ... https://news.developer.nvidia.com/ai-can-convert-black-and-white-clips-into-color/
Nvidia’s latest AI software turns rough doodles into realistic landscapes (The Verge) ... https://www.theverge.com/2019/3/19/18272602/ai-art-generation-gan-nvidia-doodle-landscapes
Deepfakes are getting easier than ever to make (The Verge) … https://www.theverge.com/2019/5/23/18637373/deepfakes-samsung-ai-research-results-single-photo-algorithm
- Recent issues around deepfakes – security, art, etc.
A viral video that appeared to show Obama calling Trump a 'dips---' shows a disturbing new trend called 'deepfakes' (Business Insider) … https://www.businessinsider.com/obama-deepfake-video-insulting-trump-2018-4
New deepfake tech turns a single photo and audio file into a singing video portrait (The Verge) ... https://www.theverge.com/2019/6/20/18692671/deepfake-technology-singing-talking-video-portrait-from-a-single-image-imperial-college-samsung
US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’ (The Verge) … https://www.theverge.com/2018/9/14/17859188/ai-deepfakes-national-security-threat-lawmakers-letter-intelligence-community
Deepfakes: A Threat to Individuals and National Security (Lionbridge) … https://lionbridge.ai/articles/deepfakes-a-threat-to-individuals-and-national-security/
A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (iNews) … https://inews.co.uk/news/technology/a-deepfake-clip-of-mark-zuckerberg-is-being-allowed-to-remain-on-instagram/
Portrait of Edmond Belamy created by GAN (Wikipedia) … https://en.wikipedia.org/wiki/Edmond_de_Belamy
- Generative Adversarial Network
I. J. Goodfellow, Generative Adversarial Nets (arXiv) … https://arxiv.org/abs/1406.2661
Tutorial on Generative Adversarial Networks … https://www.slideshare.net/ckmarkohchang/generative-adversarial-networks
- Mode Collapse Problem
A. Ghosh, et al., Multi-Agent Diverse Generative Adversarial Networks (Research Gate) … https://www.researchgate.net/publication/315882247_Multi-Agent_Diverse_Generative_Adversarial_Networks

More Related Content

What's hot

All pairs shortest path algorithm
All pairs shortest path algorithmAll pairs shortest path algorithm
All pairs shortest path algorithmSrikrishnan Suresh
 
Stable Diffusion path
Stable Diffusion pathStable Diffusion path
Stable Diffusion pathVitaly Bondar
 
IMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSIS
IMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSISIMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSIS
IMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSISShivam Porwal
 
Unsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANUnsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANShyam Krishna Khadka
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networksDing Li
 
Evolution of the StyleGAN family
Evolution of the StyleGAN familyEvolution of the StyleGAN family
Evolution of the StyleGAN familyVitaly Bondar
 
9. chapter 8 np hard and np complete problems
9. chapter 8   np hard and np complete problems9. chapter 8   np hard and np complete problems
9. chapter 8 np hard and np complete problemsJyotsna Suryadevara
 
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)Amol Patil
 
4 informed-search
4 informed-search4 informed-search
4 informed-searchMhd Sb
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSangwoo Mo
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Sujit Pal
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer VisionSungjoon Choi
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptxMAHMOUD729246
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial NetworksMark Chang
 
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Universitat Politècnica de Catalunya
 
Genetic Algorithm
Genetic AlgorithmGenetic Algorithm
Genetic AlgorithmSHIMI S L
 
Steganography final report
Steganography final reportSteganography final report
Steganography final reportABHIJEET KHIRE
 

What's hot (20)

Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
Deep Learning for Computer Vision: Data Augmentation (UPC 2016)
 
All pairs shortest path algorithm
All pairs shortest path algorithmAll pairs shortest path algorithm
All pairs shortest path algorithm
 
Stable Diffusion path
Stable Diffusion pathStable Diffusion path
Stable Diffusion path
 
IMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSIS
IMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSISIMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSIS
IMAGE STEGANOGRAPHY JAVA PROJECT SYNOPSIS
 
Unsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGANUnsupervised learning represenation with DCGAN
Unsupervised learning represenation with DCGAN
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
Evolution of the StyleGAN family
Evolution of the StyleGAN familyEvolution of the StyleGAN family
Evolution of the StyleGAN family
 
9. chapter 8 np hard and np complete problems
9. chapter 8   np hard and np complete problems9. chapter 8   np hard and np complete problems
9. chapter 8 np hard and np complete problems
 
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs)
 
4 informed-search
4 informed-search4 informed-search
4 informed-search
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture Note
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
 
Deep Learning in Computer Vision
Deep Learning in Computer VisionDeep Learning in Computer Vision
Deep Learning in Computer Vision
 
Generative adversarial text to image synthesis
Generative adversarial text to image synthesisGenerative adversarial text to image synthesis
Generative adversarial text to image synthesis
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptx
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
 
U-Net (1).pptx
U-Net (1).pptxU-Net (1).pptx
U-Net (1).pptx
 
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
Image-to-Image Translation with Conditional Adversarial Nets (UPC Reading Group)
 
Genetic Algorithm
Genetic AlgorithmGenetic Algorithm
Genetic Algorithm
 
Steganography final report
Steganography final reportSteganography final report
Steganography final report
 

Similar to Introduction of DiscoGAN

Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Databricks
 
Tips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software EngineeringTips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software Engineeringjtdudley
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compilerGrokking VN
 
Class[3][5th jun] [three js]
Class[3][5th jun] [three js]Class[3][5th jun] [three js]
Class[3][5th jun] [three js]Saajid Akram
 
2 Years of Real World FP at REA
2 Years of Real World FP at REA2 Years of Real World FP at REA
2 Years of Real World FP at REAkenbot
 
VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]Pramati Technologies
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptDivyaGugulothu
 
Test-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptxTest-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptxVictor Rentea
 
Engine Terminology
Engine TerminologyEngine Terminology
Engine Terminologykamkill
 
Introduction to Deep Learning and Tensorflow
Introduction to Deep Learning and TensorflowIntroduction to Deep Learning and Tensorflow
Introduction to Deep Learning and TensorflowOswald Campesato
 
Andriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tipsAndriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tipsOWASP Kyiv
 
Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Prabindh Sundareson
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdfnyomans1
 
Regex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language InsteadRegex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language InsteadAll Things Open
 
Performance #5 cpu and battery
Performance #5  cpu and batteryPerformance #5  cpu and battery
Performance #5 cpu and batteryVitali Pekelis
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...Taeksoo Kim
 

Similar to Introduction of DiscoGAN (20)

Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 
Tips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software EngineeringTips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software Engineering
 
DiscoGAN
DiscoGANDiscoGAN
DiscoGAN
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
 
Class[3][5th jun] [three js]
Class[3][5th jun] [three js]Class[3][5th jun] [three js]
Class[3][5th jun] [three js]
 
2 Years of Real World FP at REA
2 Years of Real World FP at REA2 Years of Real World FP at REA
2 Years of Real World FP at REA
 
VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]VitaFlow | Mageswaran Dhandapani [Pramati]
VitaFlow | Mageswaran Dhandapani [Pramati]
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.ppt
 
Test-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptxTest-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptx
 
Generative AI for Reengineering Variants into Software Product Lines: An Expe...
Generative AI for Reengineering Variants into Software Product Lines: An Expe...Generative AI for Reengineering Variants into Software Product Lines: An Expe...
Generative AI for Reengineering Variants into Software Product Lines: An Expe...
 
Engine Terminology
Engine TerminologyEngine Terminology
Engine Terminology
 
Introduction to Deep Learning and Tensorflow
Introduction to Deep Learning and TensorflowIntroduction to Deep Learning and Tensorflow
Introduction to Deep Learning and Tensorflow
 
Andriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tipsAndriy Shalaenko - GO security tips
Andriy Shalaenko - GO security tips
 
ELAVARASAN.pdf
ELAVARASAN.pdfELAVARASAN.pdf
ELAVARASAN.pdf
 
Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011Advanced Graphics Workshop - GFX2011
Advanced Graphics Workshop - GFX2011
 
Evolution of Spark APIs
Evolution of Spark APIsEvolution of Spark APIs
Evolution of Spark APIs
 
nlp dl 1.pdf
nlp dl 1.pdfnlp dl 1.pdf
nlp dl 1.pdf
 
Regex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language InsteadRegex Considered Harmful: Use Rosie Pattern Language Instead
Regex Considered Harmful: Use Rosie Pattern Language Instead
 
Performance #5 cpu and battery
Performance #5  cpu and batteryPerformance #5  cpu and battery
Performance #5 cpu and battery
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
 

Recently uploaded

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 

Recently uploaded (20)

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 

Introduction of DiscoGAN

  • 1. 1 Published by T. Kim, M. Cha, H. Kim, J. K. Lee, and, J. Kim Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017 Seongcheol Baek Reading Circle Presentation @ Hikihara Lab Department of Electrical Engineering, Kyoto University 2019/07/19 Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
  • 2. Focus of this presentation
      - Recently emerging issues around GAN
      - Introduction of generative adversarial networks (GAN)
      - What is DiscoGAN? Problems of interest / model architecture / mode collapse problem / experiments / summaries / comments
  • 3. Recent generative technologies
      Events marked on the timeline figure (2014 to 2019):
      - Ian J. Goodfellow invented the “generative adversarial network”
      - Original AlphaGo beat a professional Go player
      - Deep Convolutional GAN (DCGAN)
      - Least Squares GAN (LSGAN)
      - Semi-Supervised GAN
      - StackGAN, Auxiliary Classifier GAN (ACGAN)
      - DiscoGAN
      - CycleGAN
      - BW clips into color (Source: Nvidia)
      - GauGAN (Source: Nvidia)
      - Samsung deepfake AI fabricates a video from a single profile pic
  • 4. Recent issues around deepfakes – security, art, etc.
      - A viral video in which Obama appears to insult Donald Trump was fabricated with FakeApp (Photo: YouTube)
      - A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (Photo: Bill Poster UK)
      - Edmond de Belamy: a GAN-generated portrait (2018), widely reported as the first AI-generated artwork sold at a major auction
      - US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’
      - At the individual level, deepfakes can be used for cyberbullying, defamation and blackmail
  • 5. What is GAN?
      - Two neural networks contest with each other in a game. Given a training set, a GAN learns to generate new data with the same statistics as the training set.
      - Minimax two-player game (generative model vs. discriminative model)
  • 6. Minimax Problem of GAN
      $$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
      $$V(G, D) = \int_x p_{\mathrm{data}}(x) \log D(x)\, dx + \int_z p_z(z) \log(1 - D(G(z)))\, dz$$
      - Training of Generator: $\min_G [1 - D(G(z))] = 0$
      - Training of Discriminator: $\max_D D(x) = 1$ (discriminant for real data) and $\max_D [1 - D(G(z))] = 1$ (discriminant for generated data)
      - $D \in [0, 1]$ scores data from fake (0) to real (1)
      - $V(G, D)$ has a saddle point at $D(G(z)) = \frac{1}{2}$
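As a rough numerical illustration of the minimax value function on this slide, the sketch below estimates V(G, D) from a handful of discriminator outputs. The function name `gan_value` and the sample values are made up for illustration; they are not taken from the slides or the paper. It only shows that a discriminator stuck at 1/2 gives a value of -log 4, while a confident discriminator pushes the value towards 0.

```python
import math


def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(G, D).

    d_real: discriminator outputs D(x) on real samples.
    d_fake: discriminator outputs D(G(z)) on generated samples.
    Both are hypothetical inputs; all values must lie strictly in (0, 1).
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term


# Saddle point: the discriminator cannot tell real from fake, D = 1/2 everywhere.
saddle_value = gan_value([0.5] * 4, [0.5] * 4)

# A discriminator that separates real from fake well drives V towards 0.
confident_value = gan_value([0.99] * 4, [0.01] * 4)

print(saddle_value, confident_value)
```

The discriminator tries to raise this value and the generator tries to lower it, which is why the equilibrium sits at the saddle point rather than at an extremum of either player alone.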
  • 7. Discover Cross-Domain Relations with GAN
      Figures: training of 2 different data sets without explicitly paired labelling (Domain A, Domain B, mappings $G_{AB}$ and $G_{BA}$); results of domain transfer
      - Previous AI could also transfer data from one domain to another, preserving key attributes
      - Previous training methods (~2016) require paired data, which is costly and hard to collect
      - DiscoGAN trains on 2 different data sets without any paired data, and its results show better performance with robustness to the mode collapse problem
  • 8. Network Models – DiscoGAN & Previous GANs
      Figure panels: standard GAN with GAN loss; GAN with a reconstruction loss & GAN loss; DiscoGAN
      - Each generator consists of an encoder-decoder pair (input and output are images)
      - The GAN loss (and the reconstruction loss) is minimized during training
      - In DiscoGAN, 2 coupled GANs map each domain to its counterpart domain (bijective)
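  • 9. Problem Formulation (1)
      Notation: $x_{AB} = G_{AB}(x_A)$ and $x_{ABA} = G_{BA}(x_{AB})$; $x_{BA}$ and $x_{BAB}$ are defined symmetrically.
      - Reconstruction loss measures how well the original input is reconstructed after a sequence of two generations: $L_{CONST_A} = d(x_{ABA}, x_A)$, where $d$ is a distance such as $L_1$, $L_2$, or Huber loss
      - GAN loss measures how realistic the generated image is in domain B: $L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(x_{AB})]$
      - Relaxed constraints are considered to guarantee bijection and domain transition
      - Bijection: ideally $G_{AB}^{-1} = G_{BA}$ → $\min_{G_{AB}} (L_{CONST_A})$, $\min_{G_{BA}} (L_{CONST_B})$
      - Domain transition: ideally $x_{AB} \in \mathbb{R}_B$, $x_{BA} \in \mathbb{R}_A$ → $\min_{D_B} (L_{D_B})$, $\min_{D_A} (L_{D_A})$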
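  • 10. Problem Formulation (2)
      Generator training is shown for $G_{AB}$ and discriminator training for $D_B$; $x_{ABA} = G_{BA}(G_{AB}(x_A))$ and $x_{BAB} = G_{AB}(G_{BA}(x_B))$.
      (a) Standard GAN with GAN loss
          - Training of Generator: $L_{G_{AB}} = L_{GAN_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))]$
          - Training of Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          - Constraints level: –
      (b) GAN with a reconstruction loss & GAN loss
          - Training of Generator: $L_{G_{AB}} = L_{GAN_B} + L_{CONST_A} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A)$
          - Training of Discriminator: $L_{D_B} = -\mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          - Constraints level: doubled DOF from (a), weaker than (a)
      (c) DiscoGAN
          - Training of Generator: $L_G = L_{G_{AB}} + L_{G_{BA}} = L_{GAN_B} + L_{CONST_A} + L_{GAN_A} + L_{CONST_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_B(G_{AB}(x_A))] + d(x_{ABA}, x_A) - \mathbb{E}_{x_B \sim P_B}[\log D_A(G_{BA}(x_B))] + d(x_{BAB}, x_B)$
          - Training of Discriminator: $L_D = L_{D_A} + L_{D_B} = -\mathbb{E}_{x_A \sim P_A}[\log D_A(x_A)] - \mathbb{E}_{x_B \sim P_B}[\log(1 - D_A(G_{BA}(x_B)))] - \mathbb{E}_{x_B \sim P_B}[\log D_B(x_B)] - \mathbb{E}_{x_A \sim P_A}[\log(1 - D_B(G_{AB}(x_A)))]$
          - Constraints level: doubled DOF from (b), weaker than (b)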
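The DiscoGAN losses on this slide can be sketched in plain Python, assuming the discriminator outputs and reconstruction distances have already been computed per sample. The helper names (`mean`, `generator_loss`, `discriminator_loss`) and the toy numbers are hypothetical and are not taken from the official implementation; the sketch only mirrors how the four generator terms and the two discriminator terms are summed.

```python
import math


def mean(values):
    return sum(values) / len(values)


def generator_loss(d_b_on_fake, d_a_on_fake, recon_a, recon_b):
    """L_G = L_GAN_B + L_CONST_A + L_GAN_A + L_CONST_B.

    d_b_on_fake: D_B(G_AB(x_A)) per sample.
    d_a_on_fake: D_A(G_BA(x_B)) per sample.
    recon_a, recon_b: distances d(x_ABA, x_A) and d(x_BAB, x_B) per sample.
    """
    l_gan_b = -mean([math.log(p) for p in d_b_on_fake])
    l_gan_a = -mean([math.log(p) for p in d_a_on_fake])
    return l_gan_b + mean(recon_a) + l_gan_a + mean(recon_b)


def discriminator_loss(d_on_real, d_on_fake):
    """Loss of one discriminator: -E[log D(real)] - E[log(1 - D(fake))]."""
    real_term = -mean([math.log(p) for p in d_on_real])
    fake_term = -mean([math.log(1.0 - p) for p in d_on_fake])
    return real_term + fake_term


# Toy values: both discriminators output 1/2 and reconstruction is perfect.
l_g = generator_loss([0.5, 0.5], [0.5, 0.5], [0.0, 0.0], [0.0, 0.0])
l_d = discriminator_loss([0.5, 0.5], [0.5, 0.5]) + discriminator_loss([0.5, 0.5], [0.5, 0.5])

print(l_g, l_d)
```

Setting the reconstruction distances to zero and dropping the second GAN term recovers case (a), and keeping only one reconstruction term recovers case (b), which makes the three rows easy to compare.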
  • 11. Architecture of Generator
      Figure: encoder (convolution layers) followed by decoder (deconvolution layers), mapping Domain A (resp. B) to Domain B (resp. A)
      - Each generator takes an image and feeds it through an encoder-decoder pair
      - Number of layers ranges from 4 to 5 depending on the domain
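To make the encoder-decoder shape concrete, the sketch below traces the spatial size of a feature map through 4 strided convolutions and 4 matching deconvolutions. The 64x64 input, 4x4 kernels, stride 2, and padding 1 are assumptions chosen for illustration (the slide only states the layer count), and the function names are hypothetical. The formulas are the standard output-size rules for convolution and transposed convolution.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output width of a strided convolution (assumed hyperparameters)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel=4, stride=2, padding=1):
    """Output width of a transposed convolution (assumed hyperparameters)."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_sizes(size, n_layers):
    """Spatial sizes through n_layers of encoding, then n_layers of decoding."""
    sizes = [size]
    for _ in range(n_layers):
        size = conv_out(size)
        sizes.append(size)
    for _ in range(n_layers):
        size = deconv_out(size)
        sizes.append(size)
    return sizes


# Assumed 64x64 input image and 4 layers on each side.
sizes = trace_sizes(64, 4)
print(sizes)
```

With these assumed settings each encoder layer halves the width and each decoder layer doubles it, so the output image has the same size as the input, which is what an image-to-image generator needs.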
  • 12. Architecture of Discriminator
      - Each discriminator feeds an image through convolution layers
      - The discriminator outputs a sigmoid-activated scalar indicating how real the input image is
  • 13. Toy Experiment – Domain Transition Test
      Figure panels: initial state; standard GAN; GAN with $L_{CONST}$; DiscoGAN. Colored points: samples in domain A. Black x’s: target modes in domain B.
      - In DiscoGAN, discriminator B is perfectly fooled by translated samples from domain A
      - DiscoGAN prevents mode collapse by translating into distinct well-bounded regions that do not overlap
  • 14. Mode Collapse Problem
      Figure caption: the gradients are biased towards the mode from which a higher number of samples are drawn to form the real training data
      - The generator outputs unintended images from a different mode; this occurs prevalently in GANs
      - GANs usually remedy this problem with additional loss terms, but it has not been fully resolved
      - Other examples: communication system, cryptography, automaton, etc.
  • 15. Why is DiscoGAN robust to mode collapse?
      - In DiscoGAN, two coupled models are trained together simultaneously. The two copies of $G_{AB}$ share parameters, as do the two copies of $G_{BA}$
      - Constraints of coupled reconstruction losses lead to a strict bijection
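A toy illustration of why the reconstruction constraint discourages mode collapse: on two-element domains, a collapsed mapping sends both inputs to the same output, so no inverse mapping can reconstruct both inputs. The discrete domains and the function `best_reconstruction_error` are hypothetical and only illustrate the bijection argument; they do not reproduce the paper's experiments.

```python
from itertools import product

# Hypothetical two-element domains standing in for domains A and B.
DOMAIN_A = (0, 1)
DOMAIN_B = (0, 1)


def best_reconstruction_error(g_ab):
    """Fewest inputs x_A with G_BA(G_AB(x_A)) != x_A over every possible G_BA."""
    best = len(DOMAIN_A)
    for outputs in product(DOMAIN_A, repeat=len(DOMAIN_B)):
        g_ba = dict(zip(DOMAIN_B, outputs))
        errors = sum(1 for x in DOMAIN_A if g_ba[g_ab[x]] != x)
        best = min(best, errors)
    return best


collapsed = {0: 0, 1: 0}   # both inputs land on the same mode of domain B
bijective = {0: 1, 1: 0}   # each input lands on its own mode

collapsed_error = best_reconstruction_error(collapsed)
bijective_error = best_reconstruction_error(bijective)

print(collapsed_error, bijective_error)
```

Because the collapsed mapping always leaves at least one input unreconstructed, minimizing the reconstruction loss in both directions pushes the generators towards one-to-one mappings.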
  • 16. Real Domain Experiment – Car to Car, Face to Face
      Figure columns: input data; standard GAN; GAN with $L_{CONST}$; DiscoGAN. Figure rows: car to car; face to face.
      - Reconstruction tests
      - Results in DiscoGAN show higher correlations (robust to mode collapse)
  • 17. Real Domain Experiment – Face Conversion
      Figure panels: translation of gender; blond to black and black to blond hair; glasses to non-glasses and non-glasses to glasses
      - DiscoGAN translates a specific feature while preserving other facial features
  • 18. Cross-Domain Experiment (1)
      Figure panels: chair to car; car to face
      - Note that training is implemented without any paired data
      - The main attribute (azimuth) is preserved
  • 19. Cross-Domain Experiment (2)
      Figure panels: handbag to sketches; sketches to shoes; sketches to handbags
      - 1-to-N problem
  • 20. Cross-Domain Experiment (3)
      Figure panels: handbag to shoes; shoes to handbag
      - Same style is discovered
  • 21. Summaries
      - DiscoGAN is proposed as a learning method to discover cross-domain relations without any paired labels
      - Results showed better performance with robustness to mode collapse. The symmetry granted by coupling 2 GANs is considered to be a key factor for the dynamical robustness
      Comments
      - The strategy of coupling two GAN models reminded me of the symmetry of dynamics. Some correlations could be drawn to handle the stability problem…?
      - This paper is giving me many ideas. It is very pleasant.
  • 22. Thank you!
      - Source code for simulations: official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks" (GitHub): https://github.com/SKTBrain/DiscoGAN
      - This presentation is also available on: https://www.slideshare.net/SeongcheolBaek/introduction-of-discogan
  • 23. References
      Crux of Presentation
      - T. Kim, et al., Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (arXiv): https://arxiv.org/abs/1703.05192
      Recent generative technologies
      - Apple announces Animoji (The Verge): https://www.theverge.com/2017/9/12/16290210/new-iphone-emoji-animated-animoji-apple-ios-11-update
      - AI Can Convert Black and White Clips into Color (Nvidia Developer): https://news.developer.nvidia.com/ai-can-convert-black-and-white-clips-into-color/
      - Nvidia’s latest AI software turns rough doodles into realistic landscapes (The Verge): https://www.theverge.com/2019/3/19/18272602/ai-art-generation-gan-nvidia-doodle-landscapes
      - Deepfakes are getting easier than ever to make (The Verge): https://www.theverge.com/2019/5/23/18637373/deepfakes-samsung-ai-research-results-single-photo-algorithm
      Recent issues around deepfakes – security, art, etc.
      - A viral video that appeared to show Obama calling Trump a 'dips---' shows a disturbing new trend called 'deepfakes' (Business Insider): https://www.businessinsider.com/obama-deepfake-video-insulting-trump-2018-4
      - New deepfake tech turns a single photo and audio file into a singing video portrait (The Verge): https://www.theverge.com/2019/6/20/18692671/deepfake-technology-singing-talking-video-portrait-from-a-single-image-imperial-college-samsung
      - US lawmakers say AI deepfakes ‘have the potential to disrupt every facet of our society’ (The Verge): https://www.theverge.com/2018/9/14/17859188/ai-deepfakes-national-security-threat-lawmakers-letter-intelligence-community
      - Deepfakes: A Threat to Individuals and National Security (Lionbridge): https://lionbridge.ai/articles/deepfakes-a-threat-to-individuals-and-national-security/
      - A deepfake clip of Mark Zuckerberg is being allowed to remain on Instagram (iNews): https://inews.co.uk/news/technology/a-deepfake-clip-of-mark-zuckerberg-is-being-allowed-to-remain-on-instagram/
      - Portrait of Edmond Belamy created by GAN (Wikipedia): https://en.wikipedia.org/wiki/Edmond_de_Belamy
      Generative Adversarial Network
      - I. J. Goodfellow, Generative Adversarial Nets (arXiv): https://arxiv.org/abs/1406.2661
      - Tutorial on Generative Adversarial Networks: https://www.slideshare.net/ckmarkohchang/generative-adversarial-networks
      Mode Collapse Problem
      - A. Ghosh, et al., Multi-Agent Diverse Generative Adversarial Networks (ResearchGate): https://www.researchgate.net/publication/315882247_Multi-Agent_Diverse_Generative_Adversarial_Networks