A Note on TopicRNN
Tomonari MASADA @ Nagasaki University
July 13, 2017
1 Model
TopicRNN is a generative model proposed by [1]. Its generative story for a particular document $x_{1:T}$ is given below.
1. Draw a topic vector $\theta \sim \mathcal{N}(0, I)$.
2. Given the words $y_{1:t-1}$, for the $t$th word $y_t$ in the document,
(a) Compute the hidden state $h_t = f_W(x_t, h_{t-1})$, where we let $x_t \triangleq y_{t-1}$.
(b) Draw the stop word indicator $l_t \sim \mathrm{Bernoulli}(\sigma(\Gamma^\top h_t))$, with $\sigma$ the sigmoid function.
(c) Draw the word $y_t \sim p(y_t \mid h_t, \theta, l_t, B)$, where
$$p(y_t = i \mid h_t, \theta, l_t, B) \propto \exp\bigl(v_i^\top h_t + (1 - l_t)\, b_i^\top \theta\bigr).$$
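The generative story above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the note leaves $f_W$ unspecified, so a plain tanh recurrent cell is assumed here, and the embedding matrix `E`, the weights `U` and `W_h`, the start token id `0`, and the function name `sample_document` are all hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sample_document(T, E, U, W_h, V, B, Gamma, rng):
    """Sample y_{1:T} and l_{1:T} following the generative story.

    Assumed shapes: E (C, D) input embeddings, U (H, D), W_h (H, H),
    V (C, H) with rows v_i, B (C, K) with rows b_i, Gamma (H,).
    """
    C, K = B.shape
    H = W_h.shape[0]
    theta = rng.standard_normal(K)              # step 1: theta ~ N(0, I)
    h = np.zeros(H)
    y_prev = 0                                  # assumed start-of-document token id
    ys, ls = [], []
    for _ in range(T):
        x = E[y_prev]                           # x_t is the (embedded) previous word
        h = np.tanh(U @ x + W_h @ h)            # step 2(a): assumed tanh cell for f_W
        l = rng.binomial(1, sigmoid(Gamma @ h)) # step 2(b): stop word indicator
        logits = V @ h + (1 - l) * (B @ theta)  # step 2(c): unnormalized log-probabilities
        p = np.exp(logits - logits.max())
        p /= p.sum()
        y = int(rng.choice(C, p=p))
        ys.append(y)
        ls.append(int(l))
        y_prev = y
    return theta, ys, ls
```

When $l_t = 1$ the topic term is switched off, so stop words are generated from the recurrent state alone.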
2 Lower bound
The log marginal likelihood of the word sequence $y_{1:T}$ and the stop word indicators $l_{1:T}$ is
$$\log p(y_{1:T}, l_{1:T} \mid h_{1:T}) = \log \int p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)\, d\theta \tag{1}$$
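A lower bound can be obtained as follows, where the inequality is Jensen's inequality applied to the concave logarithm:
$$\begin{aligned}
\log p(y_{1:T}, l_{1:T} \mid h_{1:T})
&= \log \int p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)\, d\theta \\
&= \log \int q(\theta)\, \frac{p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)}{q(\theta)}\, d\theta \\
&\geq \int q(\theta) \log \frac{p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)}{q(\theta)}\, d\theta \\
&= \int q(\theta) \log p(\theta)\, d\theta + \sum_{t=1}^{T} \int q(\theta) \log p(y_t \mid h_t, l_t, \theta; W)\, d\theta \\
&\quad + \sum_{t=1}^{T} \int q(\theta) \log p(l_t \mid h_t; \Gamma)\, d\theta - \int q(\theta) \log q(\theta)\, d\theta \\
&\triangleq \mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta)
\end{aligned} \tag{2}$$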
3 Approximate posterior
The form of $q(\theta)$ is chosen to be an inference network using a feed-forward neural network. Each expectation in Eq. (2) is approximated with samples from $q(\theta \mid X_c)$, where $X_c$ denotes the term-frequency representation of $y_{1:T}$ excluding stop words. The density of the approximate posterior $q(\theta \mid X_c)$ is specified as follows:
$$q(\theta \mid X_c) = \mathcal{N}\bigl(\theta;\, \mu(X_c),\, \mathrm{diag}(\sigma^2(X_c))\bigr), \tag{3}$$
$$\mu(X_c) = W_1\, g(X_c) + a_1, \tag{4}$$
$$\log \sigma(X_c) = W_2\, g(X_c) + a_2, \tag{5}$$
where $g(\cdot)$ denotes the feed-forward neural network. Eq. (3) gives the reparameterization of $\theta_k$ as $\theta_k = \mu_k(X_c) + \epsilon_k \sigma_k(X_c)$ for $k = 1, \ldots, K$, where $\epsilon_k$ is a sample from the standard normal distribution $\mathcal{N}(0, 1)$.
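A minimal sketch of Eqs. (3)-(5) and the reparameterization is given below. The note does not fix the architecture of $g(\cdot)$, so a single ReLU layer with hypothetical parameters `W0` and `a0` is assumed; the function names `encode` and `sample_theta` are likewise illustrative.

```python
import numpy as np

def encode(x_c, W0, a0, W1, a1, W2, a2):
    """Return mu(X_c) and sigma(X_c) as in Eqs. (4)-(5)."""
    g = np.maximum(0.0, W0 @ x_c + a0)   # assumed one-layer ReLU network for g(.)
    mu = W1 @ g + a1                     # Eq. (4)
    log_sigma = W2 @ g + a2              # Eq. (5)
    return mu, np.exp(log_sigma)

def sample_theta(mu, sigma, S, rng):
    """Draw S reparameterized samples theta = mu + eps * sigma, eps ~ N(0, I)."""
    eps = rng.standard_normal((S, mu.shape[0]))
    return mu + eps * sigma, eps
```

Because the network outputs $\log \sigma(X_c)$, exponentiating keeps every $\sigma_k(X_c)$ strictly positive without any constraint on the weights.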
4 Monte Carlo integration
We can now rewrite each term of the lower bound $\mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta)$ in Eq. (2) as below, where the $\theta^{(s)}$ denote the samples drawn from the approximate posterior $q(\theta \mid X_c)$.
The first term:
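$$\begin{aligned}
\int q(\theta) \log p(\theta)\, d\theta
&\approx \frac{1}{S} \sum_{s=1}^{S} \log p(\theta^{(s)})
= \frac{1}{S} \sum_{s=1}^{S} \sum_{k=1}^{K} \log \left\{ \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{\bigl(\theta_k^{(s)}\bigr)^2}{2} \right) \right\} \\
&= -\frac{K \log(2\pi)}{2} - \frac{1}{2} \sum_{k=1}^{K} \frac{\sum_{s} \bigl(\theta_k^{(s)}\bigr)^2}{S}
\end{aligned} \tag{6}$$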
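As a sanity check on Eq. (6), the Monte Carlo estimate can be compared with the closed form of the same expectation. The closed form uses the standard identity $\mathbb{E}_q[\theta_k^2] = \mu_k^2 + \sigma_k^2$ for a Gaussian, which is not part of the note and appears here only for verification; the function names are hypothetical.

```python
import numpy as np

def log_prior_term_mc(theta):
    """Eq. (6): Monte Carlo estimate; theta has shape (S, K)."""
    S, K = theta.shape
    return -0.5 * K * np.log(2 * np.pi) - 0.5 * np.sum(theta ** 2) / S

def log_prior_term_exact(mu, sigma):
    """Closed form of E_q[log p(theta)] for a diagonal Gaussian q."""
    K = mu.shape[0]
    return -0.5 * K * np.log(2 * np.pi) - 0.5 * np.sum(mu ** 2 + sigma ** 2)
```

With a large number of samples $S$ the two values should agree closely; in practice the estimate is used with small $S$ because it stays differentiable in $\mu$ and $\sigma$ through the reparameterization.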
Each addend of the second term:
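$$\begin{aligned}
\int q(\theta) \log p(y_t \mid h_t, l_t, \theta; W)\, d\theta
&\approx \frac{1}{S} \sum_{s=1}^{S} \log \frac{\exp\bigl(v_{y_t}^\top h_t + (1 - l_t)\, b_{y_t}^\top \theta^{(s)}\bigr)}{\sum_{j=1}^{C} \exp\bigl(v_j^\top h_t + (1 - l_t)\, b_j^\top \theta^{(s)}\bigr)} \\
&= v_{y_t}^\top h_t + (1 - l_t)\, b_{y_t}^\top \frac{\sum_{s=1}^{S} \theta^{(s)}}{S}
 - \frac{1}{S} \sum_{s=1}^{S} \log \sum_{j=1}^{C} \exp\bigl(v_j^\top h_t + (1 - l_t)\, b_j^\top \theta^{(s)}\bigr)
\end{aligned} \tag{7}$$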
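Eq. (7) for a single position $t$ might be computed as sketched below. The max-shifted log-sum-exp is a standard numerical-stability device and an implementation choice that the note does not discuss; the function name and argument layout are assumptions.

```python
import numpy as np

def log_word_term_mc(y_t, l_t, h_t, V, B, theta):
    """Eq. (7) for one position t.

    V (C, H) has rows v_j, B (C, K) has rows b_j, theta has shape (S, K).
    """
    # logits[s, j] = v_j^T h_t + (1 - l_t) * b_j^T theta^(s)
    logits = V @ h_t + (1 - l_t) * (theta @ B.T)                   # (S, C)
    m = logits.max(axis=1, keepdims=True)
    log_norm = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))   # log-sum-exp over j
    return np.mean(logits[:, y_t] - log_norm)
```

For a stop word ($l_t = 1$) the result does not depend on the samples $\theta^{(s)}$ at all, which matches the factor $(1 - l_t)$ in the equation.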
Each addend of the third term:
$$\int q(\theta) \log p(l_t \mid h_t; \Gamma)\, d\theta = l_t \log \sigma(\Gamma^\top h_t) + (1 - l_t) \log\bigl(1 - \sigma(\Gamma^\top h_t)\bigr) \tag{8}$$
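Since Eq. (8) does not involve $\theta$, it needs no sampling. One way to evaluate it is sketched below, assuming $\Gamma$ is a vector of the same length as $h_t$. Writing both logarithms through `np.logaddexp` is an implementation choice made here to avoid overflow for large activations, not something prescribed by the note.

```python
import numpy as np

def stop_word_log_lik(l_t, h_t, Gamma):
    """Eq. (8): log-likelihood of the stop word indicator l_t in {0, 1}."""
    a = Gamma @ h_t
    # log sigmoid(a) = -log(1 + exp(-a)); log(1 - sigmoid(a)) = -log(1 + exp(a))
    return -l_t * np.logaddexp(0.0, -a) - (1 - l_t) * np.logaddexp(0.0, a)
```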
The fourth term:
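$$\begin{aligned}
\int q(\theta) \log q(\theta)\, d\theta
&\approx \frac{1}{S} \sum_{s=1}^{S} \sum_{k=1}^{K} \log \left\{ \frac{1}{\sqrt{2\pi \sigma_k^2(X_c)}} \exp\left( -\frac{\bigl(\theta_k^{(s)} - \mu_k(X_c)\bigr)^2}{2 \sigma_k^2(X_c)} \right) \right\} \\
&= -\frac{K \log(2\pi)}{2} - \sum_{k=1}^{K} \log \sigma_k(X_c) - \frac{1}{S} \sum_{s=1}^{S} \sum_{k=1}^{K} \frac{\bigl(\theta_k^{(s)} - \mu_k(X_c)\bigr)^2}{2 \sigma_k^2(X_c)}
\end{aligned} \tag{9}$$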
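The estimate in Eq. (9) can be checked numerically in the same spirit. Under the reparameterization, each quadratic term equals $(\epsilon_k^{(s)})^2 / 2$, whose expectation is $1/2$, so the exact value is $-\tfrac{K}{2}\log(2\pi) - \sum_k \log \sigma_k(X_c) - \tfrac{K}{2}$. This closed form is a standard Gaussian result added here only for comparison, and the function name is hypothetical.

```python
import numpy as np

def log_q_term_mc(theta, mu, sigma):
    """Eq. (9): Monte Carlo estimate of E_q[log q(theta)]; theta has shape (S, K)."""
    S, K = theta.shape
    quad = np.sum((theta - mu) ** 2 / (2 * sigma ** 2)) / S
    return -0.5 * K * np.log(2 * np.pi) - np.sum(np.log(sigma)) - quad
```

The quadratic term depends only on the noise $\epsilon^{(s)}$ and not on $\mu$ or $\sigma$, so it contributes nothing to the gradient of the lower bound.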
5 Objective to be maximized
Each of the $S$ samples (i.e., $\theta^{(s)}$ for $s = 1, \ldots, S$) is obtained as $\theta^{(s)} = \mu(X_c) + \epsilon^{(s)} \circ \sigma(X_c)$ via the reparameterization, where the $\epsilon_k^{(s)}$ are drawn from the standard normal, and $\circ$ is the element-wise multiplication.
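Consequently, the lower bound $\mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta)$ to be maximized is obtained as follows:
$$\begin{aligned}
\mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta)
&= -\frac{1}{2} \sum_{k=1}^{K} \frac{\sum_{s} \bigl(\mu_k(X_c) + \epsilon_k^{(s)} \sigma_k(X_c)\bigr)^2}{S} \\
&\quad + \sum_{t=1}^{T} v_{y_t}^\top h_t + \frac{1}{S} \sum_{s=1}^{S} \sum_{t=1}^{T} (1 - l_t)\, b_{y_t}^\top \bigl(\mu(X_c) + \epsilon^{(s)} \circ \sigma(X_c)\bigr) \\
&\quad - \sum_{t=1}^{T} \frac{1}{S} \sum_{s=1}^{S} \log \sum_{j=1}^{C} \exp\Bigl(v_j^\top h_t + (1 - l_t)\, b_j^\top \bigl(\mu(X_c) + \epsilon^{(s)} \circ \sigma(X_c)\bigr)\Bigr) \\
&\quad + \sum_{t=1}^{T} \Bigl[ l_t \log \sigma(\Gamma^\top h_t) + (1 - l_t) \log\bigl(1 - \sigma(\Gamma^\top h_t)\bigr) \Bigr] \\
&\quad + \sum_{k=1}^{K} \log \sigma_k(X_c) + \mathrm{const.}
\end{aligned} \tag{10}$$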
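Putting the pieces together, Eq. (10) might be evaluated as in the following vectorized sketch. The hidden states `hs` are taken as given (computing them requires the recurrent network $f_W$, which is outside this sketch), the additive constant is dropped, and the function name `elbo` and all array layouts are assumptions rather than the authors' implementation.

```python
import numpy as np

def elbo(ys, ls, hs, V, B, Gamma, mu, sigma, eps):
    """Eq. (10) up to the additive constant.

    ys (T,) word ids, ls (T,) stop word indicators, hs (T, H) hidden states,
    V (C, H), B (C, K), Gamma (H,), mu and sigma (K,), eps (S, K).
    """
    theta = mu + eps * sigma                               # (S, K) reparameterized samples
    prior = -0.5 * np.mean(np.sum(theta ** 2, axis=1))     # first line of Eq. (10)
    base = hs @ V.T                                        # (T, C): v_j^T h_t
    topic = theta @ B.T                                    # (S, C): b_j^T theta^(s)
    keep = (1 - ls)[None, :, None]                         # (1, T, 1): the factor 1 - l_t
    logits = base[None, :, :] + keep * topic[:, None, :]   # (S, T, C)
    m = logits.max(axis=2, keepdims=True)
    log_norm = m[..., 0] + np.log(np.exp(logits - m).sum(axis=2))  # (S, T)
    T = ys.shape[0]
    picked = logits[:, np.arange(T), ys]                   # (S, T): logit of the observed word
    words = np.mean(np.sum(picked - log_norm, axis=1))     # word terms, averaged over samples
    a = hs @ Gamma
    stops = np.sum(-ls * np.logaddexp(0.0, -a) - (1 - ls) * np.logaddexp(0.0, a))
    entropy = np.sum(np.log(sigma))                        # last line of Eq. (10)
    return prior + words + stops + entropy
```

In training, this quantity would be maximized with respect to the model and inference network parameters using automatic differentiation; NumPy is used here only to make the arithmetic explicit.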
References
[1] Adji Bousso Dieng, Chong Wang, Jianfeng Gao, and John Paisley. TopicRNN: A Recurrent Neural
Network with Long-Range Semantic Dependency. ICLR, 2017.
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
在线办理WLU毕业证罗瑞尔大学毕业证成绩单留信学历认证
在线办理WLU毕业证罗瑞尔大学毕业证成绩单留信学历认证在线办理WLU毕业证罗瑞尔大学毕业证成绩单留信学历认证
在线办理WLU毕业证罗瑞尔大学毕业证成绩单留信学历认证
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
办理(UC毕业证书)堪培拉大学毕业证成绩单原版一比一
办理(UC毕业证书)堪培拉大学毕业证成绩单原版一比一办理(UC毕业证书)堪培拉大学毕业证成绩单原版一比一
办理(UC毕业证书)堪培拉大学毕业证成绩单原版一比一
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 

TopicRNN: A Recurrent Model for Documents

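  • 1. A Note on TopicRNN
Tomonari MASADA @ Nagasaki University
July 13, 2017

1 Model

TopicRNN is a generative model proposed by [1], whose generative story for a particular document $x_{1:T}$ is given below.

1. Draw a topic vector $\theta \sim \mathcal{N}(0, I)$.
2. Given the words $y_{1:t-1}$, for the $t$-th word $y_t$ in the document,
  (a) Compute the hidden state $h_t = f_W(x_t, h_{t-1})$, where we let $x_t \triangleq y_{t-1}$.
  (b) Draw the stop word indicator $l_t \sim \mathrm{Bernoulli}(\sigma(\Gamma^\top h_t))$, with $\sigma$ the sigmoid function.
  (c) Draw the word $y_t \sim p(y_t \mid h_t, \theta, l_t, B)$, where
$$p(y_t = i \mid h_t, \theta, l_t, B) \propto \exp\bigl(v_i^\top h_t + (1 - l_t)\, b_i^\top \theta\bigr).$$

2 Lower bound

The log marginal likelihood of the word sequence $y_{1:T}$ and the stop word indicators $l_{1:T}$ is
$$\log p(y_{1:T}, l_{1:T} \mid h_{1:T}) = \log \int p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)\, d\theta. \tag{1}$$

A lower bound can be obtained as follows:
$$\begin{aligned}
\log p(y_{1:T}, l_{1:T} \mid h_{1:T})
&= \log \int p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)\, d\theta \\
&= \log \int q(\theta)\, \frac{p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)}{q(\theta)}\, d\theta \\
&\ge \int q(\theta) \log \frac{p(\theta) \prod_{t=1}^{T} p(y_t \mid h_t, l_t, \theta; W)\, p(l_t \mid h_t; \Gamma)}{q(\theta)}\, d\theta \\
&= \int q(\theta) \log p(\theta)\, d\theta + \sum_{t=1}^{T} \int q(\theta) \log p(y_t \mid h_t, l_t, \theta; W)\, d\theta \\
&\quad + \sum_{t=1}^{T} \int q(\theta) \log p(l_t \mid h_t; \Gamma)\, d\theta - \int q(\theta) \log q(\theta)\, d\theta \\
&\triangleq \mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta).
\end{aligned} \tag{2}$$

3 Approximate posterior

The form of $q(\theta)$ is chosen to be an inference network using a feed-forward neural network. Each expectation in Eq. (2) is approximated with samples from $q(\theta \mid X_c)$, where $X_c$ denotes the term-frequency representation of $y_{1:T}$ excluding stop words. The density of the approximate posterior $q(\theta \mid X_c)$ is specified as follows:
$$q(\theta \mid X_c) = \mathcal{N}\bigl(\theta;\, \mu(X_c),\, \mathrm{diag}(\sigma^2(X_c))\bigr), \tag{3}$$
$$\mu(X_c) = W_1\, g(X_c) + a_1, \tag{4}$$
$$\log \sigma(X_c) = W_2\, g(X_c) + a_2, \tag{5}$$
where $g(\cdot)$ denotes the feed-forward neural network. Eq. (3) gives the reparameterization of $\theta_k$ as $\theta_k = \mu_k(X_c) + \epsilon_k\, \sigma_k(X_c)$ for $k = 1, \ldots, K$, where $\epsilon_k$ is a sample from the standard normal distribution $\mathcal{N}(0, 1)$.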
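The reparameterization in Eqs. (3)-(5) can be sketched in NumPy as below. This is only an illustration under stated assumptions: the note does not specify $g(\cdot)$ beyond "feed-forward", so a single tanh layer is assumed here, and the names `inference_network`, `sample_theta`, `W0`, `a0` and all sizes are hypothetical.

```python
import numpy as np

def inference_network(x_c, params):
    """Return (mu, sigma) of q(theta | X_c) as in Eqs. (3)-(5)."""
    # Assumption: g(.) is one tanh layer; the note only says "feed-forward".
    g = np.tanh(params["W0"] @ x_c + params["a0"])
    mu = params["W1"] @ g + params["a1"]           # Eq. (4)
    log_sigma = params["W2"] @ g + params["a2"]    # Eq. (5)
    return mu, np.exp(log_sigma)

def sample_theta(mu, sigma, num_samples, rng):
    """Draw theta^(s) = mu + eps^(s) * sigma with eps^(s) ~ N(0, I)."""
    eps = rng.standard_normal((num_samples, mu.shape[0]))
    return mu + eps * sigma, eps

# Toy sizes (hypothetical): non-stop-word vocabulary, hidden units, topics.
rng = np.random.default_rng(0)
vocab_size, hidden_size, num_topics = 6, 4, 3
params = {
    "W0": rng.normal(size=(hidden_size, vocab_size)), "a0": np.zeros(hidden_size),
    "W1": rng.normal(size=(num_topics, hidden_size)), "a1": np.zeros(num_topics),
    "W2": rng.normal(size=(num_topics, hidden_size)), "a2": np.zeros(num_topics),
}
x_c = np.array([2.0, 0.0, 1.0, 0.0, 0.0, 3.0])  # term frequencies X_c
mu, sigma = inference_network(x_c, params)
theta, eps = sample_theta(mu, sigma, num_samples=5, rng=rng)
print(theta.shape)  # (5, 3)
```

Parameterizing $\log \sigma$ rather than $\sigma$ keeps the standard deviation positive without any constraint on $W_2$ and $a_2$, and returning `eps` alongside `theta` makes explicit that the randomness is independent of the network parameters.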
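  • 2. 4 Monte Carlo integration

We can now rewrite each term of the lower bound $\mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta)$ in Eq. (2) as below, where the $\theta^{(s)}$, $s = 1, \ldots, S$, denote the samples drawn from the approximate posterior $q(\theta \mid X_c)$.

The first term:
$$\int q(\theta) \log p(\theta)\, d\theta \approx \frac{1}{S} \sum_{s=1}^{S} \log p(\theta^{(s)}) = \frac{1}{S} \sum_{s=1}^{S} \sum_{k=1}^{K} \log \left\{ \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(\theta_k^{(s)})^2}{2} \right) \right\} = -\frac{K \log(2\pi)}{2} - \frac{1}{2} \sum_{k=1}^{K} \frac{\sum_{s} (\theta_k^{(s)})^2}{S} \tag{6}$$

Each addend of the second term:
$$\begin{aligned}
\int q(\theta) \log p(y_t \mid h_t, l_t, \theta; W)\, d\theta
&\approx \frac{1}{S} \sum_{s=1}^{S} \log \frac{\exp\bigl(v_{y_t}^\top h_t + (1 - l_t)\, b_{y_t}^\top \theta^{(s)}\bigr)}{\sum_{j=1}^{C} \exp\bigl(v_j^\top h_t + (1 - l_t)\, b_j^\top \theta^{(s)}\bigr)} \\
&= v_{y_t}^\top h_t + (1 - l_t)\, b_{y_t}^\top \frac{\sum_{s=1}^{S} \theta^{(s)}}{S} - \frac{1}{S} \sum_{s=1}^{S} \log \sum_{j=1}^{C} \exp\bigl(v_j^\top h_t + (1 - l_t)\, b_j^\top \theta^{(s)}\bigr)
\end{aligned} \tag{7}$$

Each addend of the third term, which does not depend on $\theta$:
$$\int q(\theta) \log p(l_t \mid h_t; \Gamma)\, d\theta = l_t \log \sigma(\Gamma^\top h_t) + (1 - l_t) \log\bigl(1 - \sigma(\Gamma^\top h_t)\bigr) \tag{8}$$

The fourth term:
$$\begin{aligned}
\int q(\theta) \log q(\theta)\, d\theta
&\approx \frac{1}{S} \sum_{s=1}^{S} \sum_{k=1}^{K} \log \left\{ \frac{1}{\sqrt{2\pi \sigma_k^2(X_c)}} \exp\left( -\frac{\bigl(\theta_k^{(s)} - \mu_k(X_c)\bigr)^2}{2 \sigma_k^2(X_c)} \right) \right\} \\
&= -\frac{K \log(2\pi)}{2} - \sum_{k=1}^{K} \log \sigma_k(X_c) - \frac{1}{S} \sum_{s=1}^{S} \sum_{k=1}^{K} \frac{\bigl(\theta_k^{(s)} - \mu_k(X_c)\bigr)^2}{2 \sigma_k^2(X_c)}
\end{aligned} \tag{9}$$

5 Objective to be maximized

Each of the $S$ samples (i.e., $\theta^{(s)}$ for $s = 1, \ldots, S$) is obtained as $\theta^{(s)} = \mu(X_c) + \epsilon^{(s)} \circ \sigma(X_c)$ via the reparameterization, where the $\epsilon_k^{(s)}$ are drawn from the standard normal distribution and $\circ$ is element-wise multiplication. Under this substitution the $-K \log(2\pi)/2$ terms of Eqs. (6) and (9) cancel, and the last term of Eq. (9) becomes $\frac{1}{2S} \sum_{s} \sum_{k} (\epsilon_k^{(s)})^2$, which does not depend on any parameter and is therefore collected into the constant. Consequently, the lower bound $\mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta)$ to be maximized is obtained as follows:
$$\begin{aligned}
\mathcal{L}(y_{1:T}, l_{1:T} \mid q(\theta), \Theta)
&= -\frac{1}{2} \sum_{k=1}^{K} \frac{\sum_{s} \bigl(\mu_k(X_c) + \epsilon_k^{(s)} \sigma_k(X_c)\bigr)^2}{S} \\
&\quad + \sum_{t=1}^{T} v_{y_t}^\top h_t + \frac{1}{S} \sum_{s=1}^{S} \sum_{t=1}^{T} (1 - l_t)\, b_{y_t}^\top \bigl(\mu(X_c) + \epsilon^{(s)} \circ \sigma(X_c)\bigr) \\
&\quad - \sum_{t=1}^{T} \frac{1}{S} \sum_{s=1}^{S} \log \sum_{j=1}^{C} \exp\Bigl(v_j^\top h_t + (1 - l_t)\, b_j^\top \bigl(\mu(X_c) + \epsilon^{(s)} \circ \sigma(X_c)\bigr)\Bigr) \\
&\quad + \sum_{t=1}^{T} \Bigl[ l_t \log \sigma(\Gamma^\top h_t) + (1 - l_t) \log\bigl(1 - \sigma(\Gamma^\top h_t)\bigr) \Bigr] \\
&\quad + \sum_{k=1}^{K} \log \sigma_k(X_c) + \text{const.}
\end{aligned} \tag{10}$$

References

[1] Adji Bousso Dieng, Chong Wang, Jianfeng Gao, and John Paisley. TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency. ICLR, 2017.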
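As a rough numerical companion to Eqs. (6)-(9), the sketch below assembles a Monte Carlo estimate of the lower bound in Eq. (2) with NumPy. It is an illustration under assumptions, not the author's implementation: the hidden states $h_t$ are taken as precomputed inputs, the function name `lower_bound`, the array shapes, and the toy data are all hypothetical, and the value returned keeps the constant that Eq. (10) drops.

```python
import numpy as np

def log_sigmoid(z):
    # log(sigma(z)) computed stably as -log(1 + exp(-z)).
    return -np.logaddexp(0.0, -z)

def lower_bound(h, y, l, V, B, Gamma, mu, sigma, eps):
    """Monte Carlo estimate of L in Eq. (2), assembled from Eqs. (6)-(9).

    Assumed shapes: h (T, H), y (T,) int, l (T,) in {0, 1}, V (C, H),
    B (C, K), Gamma (H,), mu (K,), sigma (K,), eps (S, K).
    """
    T, K = h.shape[0], mu.shape[0]
    theta = mu + eps * sigma                      # reparameterization, (S, K)

    # Eq. (6): expectation of log p(theta) under q.
    prior = -0.5 * K * np.log(2 * np.pi) - 0.5 * np.mean(np.sum(theta**2, axis=1))

    # Eq. (7): logits[s, t, j] = v_j^T h_t + (1 - l_t) b_j^T theta^(s).
    logits = (h @ V.T)[None, :, :] + (1.0 - l)[None, :, None] * (theta @ B.T)[:, None, :]
    peak = logits.max(axis=-1, keepdims=True)     # subtract the max for stability
    log_norm = peak[..., 0] + np.log(np.exp(logits - peak).sum(axis=-1))  # (S, T)
    picked = logits[:, np.arange(T), y]                                   # (S, T)
    words = np.mean(np.sum(picked - log_norm, axis=1))

    # Eq. (8): Bernoulli log-likelihood of the stop word indicators.
    z = h @ Gamma
    stops = np.sum(l * log_sigmoid(z) + (1.0 - l) * log_sigmoid(-z))

    # Eq. (9): expectation of log q(theta) under q; (theta - mu) / sigma = eps.
    log_q = (-0.5 * K * np.log(2 * np.pi) - np.sum(np.log(sigma))
             - 0.5 * np.mean(np.sum(eps**2, axis=1)))

    return prior + words + stops - log_q

# Toy call with random, hypothetical inputs.
rng = np.random.default_rng(0)
T, H, C, K, S = 5, 4, 7, 3, 10
value = lower_bound(
    h=rng.normal(size=(T, H)), y=rng.integers(0, C, size=T),
    l=rng.integers(0, 2, size=T).astype(float),
    V=rng.normal(size=(C, H)), B=rng.normal(size=(C, K)), Gamma=rng.normal(size=H),
    mu=rng.normal(size=K), sigma=np.exp(rng.normal(size=K)),
    eps=rng.standard_normal((S, K)),
)
print(np.isfinite(value))
```

In a training loop one would maximize this quantity with automatic differentiation; the sketch only evaluates it, which is enough to check an implementation of Eq. (10) term by term.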