Modeling documents with Generative
Adversarial Networks
John Glover
Overview
Learning representations of natural language documents
A brief introduction to Generative Adversarial Networks
Energy-based Generative Adversarial Networks
An adversarial document model
Future work & conclusion
Representation learning
The ability to learn robust, reusable feature representations
from unlabelled data has potential applications in a wide
variety of machine learning tasks, such as data retrieval
and classification.
One way to create such representations is to train deep
generative models that can learn to capture the complex
distributions of real-world data.
Representation learning
Document representations: LDA
The traditional approach to doing this is to use something
like LDA.
In LDA documents consist of a mixture of topics, with each
topic defining a probability distribution over the words in
the vocabulary.
Documents represented by a vector of mixture weights
over associated topics.
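To make this concrete, the following is a minimal sketch, assuming gensim is available; the toy corpus, number of topics, and parameter values are purely illustrative. It fits a small LDA model and reads off each document's topic-mixture vector as its representation.

```python
# Minimal illustrative sketch: LDA topic mixtures as document representations.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [["hockey", "game", "season", "team"],
        ["model", "learning", "data", "training"],
        ["team", "season", "score", "game"]]

dictionary = Dictionary(docs)
bows = [dictionary.doc2bow(doc) for doc in docs]

lda = LdaModel(bows, num_topics=2, id2word=dictionary, random_state=0)

# Each document's representation is its vector of topic mixture weights.
for bow in bows:
    print(lda.get_document_topics(bow, minimum_probability=0.0))
```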
Document representations: LDA
[LDA plate diagram: variables α, β, θ, z, w, with the inner plate N over the words of a document and the outer plate M over documents]
α is the parameter of the Dirichlet prior on the
per-document topic distributions, β is the parameter of the
Dirichlet prior on the per-topic word distribution, θ_m is the
topic distribution for document m, z_mn is the topic for the
nth word in document m, and w_mn is the specific word.
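The generative process behind this plate diagram can be written out directly. Below is a minimal numpy sketch; the vocabulary size, document lengths, and hyperparameter values are illustrative.

```python
# Minimal sketch of the LDA generative process with illustrative sizes.
import numpy as np

rng = np.random.default_rng(0)
K, V, M, N = 3, 10, 4, 8      # topics, vocabulary size, documents, words per document
alpha, beta = 0.1, 0.01       # Dirichlet hyperparameters

# Per-topic word distributions drawn from Dirichlet(beta).
phi = rng.dirichlet(beta * np.ones(V), size=K)

for m in range(M):
    theta_m = rng.dirichlet(alpha * np.ones(K))   # topic distribution for document m
    for n in range(N):
        z_mn = rng.choice(K, p=theta_m)           # topic for the n-th word of document m
        w_mn = rng.choice(V, p=phi[z_mn])         # the observed word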
Document representations: beyond LDA
Replicated softmax (Salakhutdinov and Hinton, 2009).
DocNADE (Larochelle and Lauly, 2012).
Generative models: recent trends
Variational inference: Neural variational inference (Miao,
Yu, Blunsom, 2016).
Generative Adversarial Networks: ?
Generative Adversarial Networks
Generative Adversarial Networks (GANs) involve a
min-max adversarial game between a generative model G
and a discriminative model D.
G(z) is a neural network that is trained to map samples z
from a prior noise distribution p(z) to the data space.
D(x) is another neural network that takes a data sample x
as input and outputs a single scalar value representing the
probability that x came from the data distribution instead of
G(z).
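A minimal PyTorch sketch of the two networks; the layer sizes and activations are purely illustrative, not the architecture of any particular paper.

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 64, 784

# G maps samples z from the prior noise distribution to the data space.
G = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Sigmoid())

# D maps a data sample x to the probability that it came from the data
# distribution rather than from G.
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

z = torch.randn(16, noise_dim)   # a batch of samples from p(z)
fake = G(z)                      # generated samples in the data space
p_real = D(fake)                 # one scalar probability per sample
```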
Generative Adversarial Networks
source: https://ishmaelbelghazi.github.io/ALI
Generative Adversarial Networks
D is trained to maximise the probability of assigning the
correct label to the input x.
G is trained to maximally confuse D, using the gradient of
D(x) with respect to x to update its parameters.
min_G max_D E_{x∼p_data}[log D(x)] + E_{z∼p(z)}[log(1 − D(G(z)))]
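The sketch below shows one alternating update corresponding to this objective; the networks and hyperparameters are illustrative and are redefined here so the snippet is self-contained.

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 64, 784
G = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Sigmoid())
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)

x = torch.rand(16, data_dim)     # placeholder batch of real data
z = torch.randn(16, noise_dim)   # samples from the prior p(z)
eps = 1e-8                       # numerical safety for the logs

# D step: maximise E[log D(x)] + E[log(1 - D(G(z)))] by minimising its negative.
d_loss = -(torch.log(D(x) + eps) + torch.log(1 - D(G(z).detach()) + eps)).mean()
opt_D.zero_grad()
d_loss.backward()
opt_D.step()

# G step: minimise E[log(1 - D(G(z)))], i.e. try to make D call the samples real.
g_loss = torch.log(1 - D(G(z)) + eps).mean()
opt_G.zero_grad()
g_loss.backward()
opt_G.step()
```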
GAN samples
Source: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
https://arxiv.org/abs/1511.06434v2
GAN samples
Source: "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network"
https://arxiv.org/abs/1609.04802
Energy-based Generative Adversarial Networks
Source: Yann Lecun’s slides on energy-based GANs, NIPS 2016.
Energy function: outputs low values on the data manifold,
higher values everywhere else.
Energy-based Generative Adversarial Networks
Source: Yann Lecun’s slides on energy-based GANs, NIPS 2016.
Easy to push down energy of observed data via SGD.
How to choose where to push energy up?
Energy-based Generative Adversarial Networks
Source: Yann Lecun’s slides on energy-based GANs, NIPS 2016.
Generator learns to pick points where the energy should
be increased.
Can view D as a learned objective function.
Energy-based Generative Adversarial Networks
The energy function is trained to push down on the energy
of real samples x, and to push up on the energy of
generated samples x̂ (f_D is the value to be minimised at
each iteration and m is a margin between positive and
negative energies):
f_D(x, z) = D(x) + max(0, m − D(G(z)))
At each iteration, the generator G is trained adversarially
against D to minimise f_G:
f_G(z) = D(G(z))
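A minimal sketch of these two losses, assuming D is a module that returns one scalar energy per sample (for example, the reconstruction error of the autoencoder discriminator described later); the margin value is illustrative.

```python
import torch

def f_D(D, G, x, z, m=1.0):
    # Push down the energy of real samples and, up to the margin m,
    # push up the energy of generated samples (generator detached).
    return (D(x) + torch.clamp(m - D(G(z).detach()), min=0.0)).mean()

def f_G(D, G, z):
    # The generator is trained to produce samples that D assigns low energy.
    return D(G(z)).mean()
```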
Energy-based Generative Adversarial Networks
In practice, the energy-based GAN formulation seems to
be easier to train.
Empirical results in "Energy-based Generative Adversarial
Network" (https://arxiv.org/abs/1609.03126), based on more than
6500 experiments.
An adversarial document model
Can we use the GAN formulation to learn representations
of natural language documents?
Questions:
1. How to represent documents? GANs require everything to
be differentiable, but we need to deal with discrete text.
2. How to get a representation? No explicit mapping back to
latent (z) space.
An adversarial document model
[Model diagram: the generator G maps noise z to a generated document; the discriminator D is a denoising autoencoder in which the input passes through a corruption process C, an encoder Enc producing the representation h, and a decoder Dec trained with an MSE reconstruction loss]
Using an Energy-Based GAN to learn document representations. G is the generator, Enc and Dec are DAE encoder
and decoder networks, C is a corruption process (bypassed at test time) and D is the discriminator.
Input to the discriminator is the binary bag-of-words
representation of a document: x ∈ {0, 1}^V.
Energy-based GAN with Denoising Autoencoder
discriminator.
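A minimal sketch (not the exact architecture from the paper) of such a denoising-autoencoder discriminator over binary bag-of-words vectors: corrupt the input, encode it to the representation h, decode, and use the MSE reconstruction error as the energy D(x). The sizes and the corruption rate are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

V, H = 2000, 128          # vocabulary size and representation size
enc = nn.Linear(V, H)     # Enc
dec = nn.Linear(H, V)     # Dec

def energy(x, corruption=0.3, training=True):
    # Corruption process C (bypassed at test time): randomly mask input words.
    x_c = x * (torch.rand_like(x) > corruption).float() if training else x
    h = torch.relu(enc(x_c))                    # document representation h
    recon = torch.sigmoid(dec(h))               # Dec reconstructs the clean input
    return F.mse_loss(recon, x, reduction="none").mean(dim=1)   # per-document energy

docs = (torch.rand(8, V) < 0.01).float()        # toy binary bag-of-words batch
print(energy(docs))
```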
Document retrieval evaluation
[Precision-recall plot: precision (0.0 to 0.6) against recall (0.0001 to 1.0) for ADM, ADM (AE), DocNADE, and DAE]
Precision-recall curves for the document retrieval task on the 20 Newsgroups dataset. DocNADE is described in
(Larochelle and Lauly, 2012), ADM is the adversarial document model, ADM (AE) is the adversarial document
model with a standard Autoencoder as the discriminator (and so is similar to the Energy-Based GAN), and DAE is a
Denoising Autoencoder.
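As a simplified stand-in for the full precision-recall evaluation, the sketch below ranks held-out documents by cosine similarity of their representations and measures the fraction of the top-k neighbours that share the query's newsgroup label; reps and labels are random placeholders for learned representations and topic labels.

```python
import numpy as np

def precision_at_k(reps, labels, k=10):
    # Cosine similarity between all pairs of document representations.
    reps = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    sims = reps @ reps.T
    np.fill_diagonal(sims, -np.inf)            # exclude the query document itself
    topk = np.argsort(-sims, axis=1)[:, :k]    # indices of the k nearest neighbours
    hits = labels[topk] == labels[:, None]     # neighbours with the same label
    return hits.mean()

rng = np.random.default_rng(0)
reps = rng.normal(size=(100, 128))             # placeholder document representations
labels = rng.integers(0, 20, size=100)         # placeholder newsgroup labels
print(precision_at_k(reps, labels, k=10))
```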
Qualitative evaluation: t-SNE plot
t-SNE visualizations of the document representations learned by the adversarial document model on the held-out
test dataset of 20 Newsgroups. The documents belong to 20 different topics, which correspond to different coloured
points in the figure.
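A minimal sketch, assuming scikit-learn and matplotlib are available, of producing such a plot from learned representations; reps and labels are random placeholders here.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
reps = rng.normal(size=(200, 128))      # placeholder for learned document representations
labels = rng.integers(0, 20, size=200)  # placeholder for the 20 newsgroup topics

coords = TSNE(n_components=2, random_state=0).fit_transform(reps)
plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab20", s=8)
plt.title("t-SNE of document representations")
plt.show()
```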
Future work
Understanding why the DAE in the GAN discriminator
appears to produce significantly better representations
than a standalone DAE.
Exploring the impact of applying additional constraints to
the representation layer.
Conclusion
Showed that a variation on the recently proposed
Energy-Based GAN can be used to learn document
representations in an unsupervised setting.
The current formulation is still short of state-of-the-art, but
it is very early days for this line of research, so it is likely that
we can push this a lot further.
Suggested some interesting areas for future research.
More information
Introduction to GANs: http://blog.aylien.com/introduction-
generative-adversarial-networks-code-tensorflow
Paper:
https://sites.google.com/site/nips2016adversarial/home/accepted-
papers
