word2vec (Google, 2013)
• Use documents to train a neural network model maximizing the conditional probability of the context given the word
• Apply the trained model to each word to get its corresponding vector
• Calculate the vector of each sentence by averaging the vectors of its words
• Construct the similarity matrix between sentences
• Use PageRank to score the sentences in the graph
1. Use documents to train a neural network model maximizing the conditional probability of context given the word
The goal is to optimize the parameters (Θ) to maximize the conditional probability of the context (c) given the word (w), where D is the set of all (w, c) pairs.
For example, "I ate a ???? at McDonald's last night" is more likely given "Big Mac".
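The objective described above can be written compactly; a standard form of the skip-gram objective, using the symbols defined on this slide (Θ for the parameters, D for the set of (w, c) pairs):

```latex
\Theta^{*} = \arg\max_{\Theta} \prod_{(w,c)\in D} p(c \mid w; \Theta)
           = \arg\max_{\Theta} \sum_{(w,c)\in D} \log p(c \mid w; \Theta)
```

Taking the logarithm turns the product over all (w, c) pairs into a sum, which is the form actually optimized during training.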
2. Apply the model to each word to get its corresponding vector

word       vector
The        (0.12, 0.23, 0.56)
Cardinals  (0.24, 0.65, 0.72)
will       (0.38, 0.42, 0.12)
win        (0.57, 0.01, 0.02)
the        (0.53, 0.68, 0.91)
world      (0.11, 0.27, 0.45)
series     (0.01, 0.05, 0.62)
3. Calculate the vector of each sentence by averaging the vectors of its words

word       vector
The        (0.12, 0.23, 0.56)
Cardinals  (0.24, 0.65, 0.72)
will       (0.38, 0.42, 0.12)
win        (0.57, 0.01, 0.02)
the        (0.53, 0.68, 0.91)
world      (0.11, 0.27, 0.45)
series     (0.01, 0.05, 0.62)

sentence vector: (0.28, 0.33, 0.49)
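The averaging on this slide can be reproduced directly; a minimal sketch using the word vectors from the table above:

```python
import numpy as np

# Word vectors for "The Cardinals will win the world series",
# taken from the table on this slide.
word_vectors = np.array([
    [0.12, 0.23, 0.56],  # The
    [0.24, 0.65, 0.72],  # Cardinals
    [0.38, 0.42, 0.12],  # will
    [0.57, 0.01, 0.02],  # win
    [0.53, 0.68, 0.91],  # the
    [0.11, 0.27, 0.45],  # world
    [0.01, 0.05, 0.62],  # series
])

# The sentence vector is the element-wise mean of its word vectors.
sentence_vector = word_vectors.mean(axis=0)
print(sentence_vector.round(2))  # → [0.28 0.33 0.49]
```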
5. Use PageRank to score the sentences in the graph
• Rank the sentences under the assumption that "summary sentences" are similar to most other sentences
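Steps 4 and 5 can be sketched together; a minimal sketch assuming cosine similarity between sentence vectors and a plain power-iteration PageRank (the damping factor 0.85 and the toy sentence vectors are assumptions for illustration, not given in the slides):

```python
import numpy as np

def cosine_similarity_matrix(sent_vecs):
    """Step 4: pairwise cosine similarity between sentence vectors."""
    unit = sent_vecs / np.linalg.norm(sent_vecs, axis=1, keepdims=True)
    return unit @ unit.T

def pagerank_scores(sim, damping=0.85, iterations=100):
    """Step 5: power iteration on the row-normalized similarity graph."""
    n = len(sim)
    transition = sim / sim.sum(axis=1, keepdims=True)
    scores = np.full(n, 1.0 / n)
    for _ in range(iterations):
        scores = (1 - damping) / n + damping * transition.T @ scores
    return scores

# Toy sentence vectors (hypothetical): the first two are near-duplicates,
# the third is an outlier.
sent_vecs = np.array([
    [0.28, 0.33, 0.49],
    [0.30, 0.30, 0.50],
    [0.90, 0.10, 0.05],
])
scores = pagerank_scores(cosine_similarity_matrix(sent_vecs))
```

The two mutually similar sentences reinforce each other and receive higher scores than the outlier, matching the assumption that summary sentences are similar to most other sentences.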