A word is worth a
thousand vectors
(word2vec, lda, and introducing lda2vec)
Christopher Moody
@ Stitch Fix
About
@chrisemoody
Caltech Physics PhD in astrostats & supercomputing
sklearn t-SNE contributor
Data Labs at Stitch Fix
github.com/cemoody
Gaussian Processes, t-SNE, chainer, deep learning, Tensor Decomposition
Credit
Large swathes of this talk are from
previous presentations by:
‱ Tomas Mikolov
‱ David Blei
‱ Christopher Olah
‱ Radim Rehurek
‱ Omer Levy & Yoav Goldberg
‱ Richard Socher
‱ Xin Rong
‱ Tim Hopper
1. word2vec
2. lda
3. lda2vec
1. king - man + woman = queen
2. Huge splash in NLP world
3. Learns from raw text
4. Pretty simple algorithm
5. Comes pretrained
word2vec
word2vec
1. Set up an objective function
2. Randomly initialize vectors
3. Do gradient descent
word2vec
word2vec: learn word vector vin from its surrounding context
vin
word2vec
“The fox jumped over the lazy dog”
Maximize the likelihood of seeing the words given the word over.
P(the|over)
P(fox|over)
P(jumped|over)
P(the|over)
P(lazy|over)
P(dog|over)

instead of maximizing the likelihood of co-occurrence counts.
word2vec
P(fox|over)
What should this be?
word2vec
P(vfox|vover)
Should depend on the word vectors.
P(fox|over)
word2vec
“The fox jumped over the lazy dog”
Twist: we have two vectors for every word.
Should depend on whether it’s the input or the output.
Also a context window around every input word.
vIN
vOUT
P(vOUT|vIN)
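To make the window concrete, here is a minimal sketch (mine, not from the talk) of how skip-gram (IN, OUT) training pairs could be generated with a +/- 2-word context window:

def skipgram_pairs(tokens, window=2):
    """Yield (w_in, w_out) pairs: each word predicts its neighbors."""
    for i, w_in in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield w_in, tokens[j]

sentence = "the fox jumped over the lazy dog".split()
pairs = list(skipgram_pairs(sentence))
# e.g. ('over', 'fox'), ('over', 'jumped'), ('over', 'the'), ('over', 'lazy'), ...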
objective
Measure loss between vIN and vOUT?
How should we define P(vOUT|vIN)?
Start with the dot product: vin · vout
word2vec
objective
vin · vout ~ 1 (vin and vout point the same way)
word2vec
objective
vin · vout ~ 0 (vin and vout orthogonal)
word2vec
objective
vin · vout ~ -1 (vin and vout point in opposite directions)
word2vec
objective
vin · vout ∈ [-1, 1]
word2vec
objective
But we’d like to measure a probability.
softmax(vin · vout) ∈ [0, 1]
Probability of choosing 1 of N discrete items.
Mapping from vector space to a multinomial over words.
softmax ~ exp(vin · vout)
softmax = exp(vin · vout) / ÎŁ_{k ∈ V} exp(vin · vk) = P(vout|vin)
The denominator is a normalization term over all words in the vocabulary.
word2vec
objective
Learn by gradient descent on the softmax probability.
For every example we see, update vin and vout to increase P(vout|vin):
vin := vin + ∇vin log P(vout|vin)
vout := vout + ∇vout log P(vout|vin)
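As a rough sketch (my own numpy, not the talk’s code), the softmax probability and one update step might look like this; V_in and V_out are assumed (vocab × dim) embedding matrices and lr a learning rate:

import numpy as np

def softmax_prob(v_in, V_out):
    """P(each vocab word | v_in) = exp(v_in . v_k) / sum_k exp(v_in . v_k)."""
    scores = V_out @ v_in
    scores -= scores.max()                 # for numerical stability
    p = np.exp(scores)
    return p / p.sum()

def sgd_step(V_in, V_out, i, o, lr=0.05):
    """One gradient-ascent step on log P(v_out | v_in) for word pair (i, o)."""
    p = softmax_prob(V_in[i], V_out)
    err = -p
    err[o] += 1.0                          # gradient of log-softmax w.r.t. scores
    grad_in = V_out.T @ err                # d log P / d v_in (before V_out changes)
    V_out += lr * np.outer(err, V_in[i])   # d log P / d V_out
    V_in[i] += lr * grad_in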
word2vec
ITEM_3469 + ‘Pregnant’
= ITEM_701333
= ITEM_901004
= ITEM_800456
What about LDA?
LDA
on Client Item Descriptions
LDA
on Item Descriptions
(with Jay)
Latent style vectors from text
Pairwise gamma correlation from style ratings
Diversity from ratings vs. diversity from text
lda vs word2vec
word2vec is local:
one word predicts a nearby word
“I love finding new designer brands for jeans”
But text is usually organized.
In LDA, documents globally predict words.
doc 7681
typical word2vec vector: [ -0.75, -1.25, -0.55, -0.12, +2.2 ] (all real values)
typical LDA document vector: [ 0%, 9%, 78%, 11% ] (all sum to 100%)
5D word2vec vector: [ -0.75, -1.25, -0.55, -0.12, +2.2 ]
Dense. All real values. Dimensions relative.
5D LDA document vector: [ 0%, 9%, 78%, 11% ]
Sparse. All sum to 100%. Dimensions are absolute.
100D word2vec vector: [ -0.75, -1.25, -0.55, -0.27, -0.94, 0.44, 0.05, 0.31 
 -0.12, +2.2 ]
Dense. All real values. Dimensions relative.
100D LDA document vector: [ 0%, 0%, 0%, 0%, 0% 
 0%, 9%, 78%, 11% ]
Sparse. All sum to 100%. Dimensions are absolute.
dense vs. sparse
100D word2vec vector: similar in 100D ways (very flexible)
100D LDA document vector: similar in fewer ways (more interpretable)
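A quick numerical illustration (mine, not a slide) of the contrast: a dense real-valued vector versus a Dirichlet-drawn simplex vector whose entries are non-negative and sum to 100%:

import numpy as np

dense = np.random.randn(100)                       # word2vec-style: any real values
simplex = np.random.dirichlet(np.full(100, 0.1))   # LDA-style: non-negative, sums to 1
print(dense.sum())                                 # some arbitrary number
print(simplex.sum())                               # 1.0
print((simplex > 0.01).sum())                      # only a handful of topics carry weight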
+mixture
+sparse
Can we do both? lda2vec
The goal:
Use all of this context to learn
interpretable topics.
word2vec: P(vOUT|vIN)
@chrisemoody
word2vec
LDA: P(vOUT|vDOC)
The goal:
Use all of this context to learn
interpretable topics.
this document is
80% high fashion
this document is
60% style
@chrisemoody
word2vec
LDA
The goal:
Use all of this context to learn
interpretable topics.
this zip code is
80% hot climate
this zip code is
60% outdoors wear
@chrisemoody
word2vec
LDA
The goal:
Use all of this context to learn
interpretable topics.
this client is
80% sporty
this client is
60% casual wear
@chrisemoody
lda2vec
word2vec predicts locally:
one word predicts a nearby word
P(vOUT|vIN)
vIN vOUT
“PS! Thank you for such an awesome top”
lda2vec
LDA predicts a word from a global context
doc_id=1846
P(vOUT|vDOC)
vOUT
vDOC
“PS! Thank you for such an awesome top”
lda2vec
“PS! Thank you for such an awesome top”  doc_id=1846
vIN vOUT
vDOC
Can we predict a word both locally and globally?
P(vOUT|vIN + vDOC)
lda2vec
P(vOUT|vIN + vDOC)
*very similar to Paragraph Vectors / doc2vec
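A minimal sketch of that sum (assuming the same softmax machinery as before): the local word vector and the global document vector simply add into one context vector before predicting vOUT:

import numpy as np

def p_out(v_in, v_doc, V_out):
    """P(each word | word context + document context), lda2vec-style."""
    v_context = v_in + v_doc               # local + global context
    scores = V_out @ v_context
    scores -= scores.max()                 # numerical stability
    p = np.exp(scores)
    return p / p.sum()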
lda2vec
This works! 😀 But vDOC isn’t as interpretable as the LDA topic vectors. 😔
We’re missing mixtures & sparsity.
lda2vec
Let’s make vDOC into a mixture:
vDOC = a · vtopic1 + b · vtopic2 + 
 (up to k topics)
Closest words to topic 1: Trinitarian, baptismal, Pentecostals, Bede, schismatics, excommunication
→ topic 1 = “religion”
Closest words to topic 2: Milosevic, absentee, Indonesia, Lebanese, Israelis, Karadzic
→ topic 2 = “politics”
vDOC = 10% religion + 89% politics + 

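As a sketch (with hypothetical shapes and weights), computing that mixture is just a weighted sum of topic vectors:

import numpy as np

n_topics, dim = 2, 300
topic_vectors = np.random.randn(n_topics, dim)   # stand-ins for v_religion, v_politics
weights = np.array([0.10, 0.89])                 # 10% religion + 89% politics

v_doc = weights @ topic_vectors                  # vDOC = a*v_topic1 + b*v_topic2 + ...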
lda2vec
Let’s make vDOC sparse:
vDOC = a · vreligion + b · vpolitics + 

{a, b, c, 
} ~ Dirichlet(alpha)
A small alpha pushes most of the weight onto just a few topics.
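To see the effect of alpha (a quick illustration, not from the slides):

import numpy as np

sparse = np.random.dirichlet(np.full(20, 0.1))   # small alpha -> a few big weights
dense  = np.random.dirichlet(np.full(20, 10.0))  # large alpha -> spread nearly evenly
print(np.round(sparse, 2))
print(np.round(dense, 2))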
word2vec
LDA
lda2vec: P(vOUT|vIN + vDOC)
The goal:
Use all of this context to learn
interpretable topics.
@chrisemoody
this document is
80% high fashion
this document is
60% style
word2vec
LDA
lda2vec: P(vOUT|vIN + vDOC + vZIP)
The goal:
Use all of this context to learn
interpretable topics.
this zip code is
80% hot climate
this zip code is
60% outdoors wear
@chrisemoody
word2vec
LDA
lda2vec: P(vOUT|vIN + vDOC + vZIP + vCLIENTS)
The goal:
Use all of this context to learn
interpretable topics.
this client is
80% sporty
this client is
60% casual wear
@chrisemoody
word2vec
LDA
P(vOUT|vIN + vDOC + vZIP + vCLIENTS)
P(sold | vCLIENTS)
lda2vec
The goal:
Use all of this context to learn
interpretable topics.
@chrisemoody
Can also make the topics
supervised so that they predict
an outcome.
github.com/cemoody/lda2vec
uses pyldavis
API Ref docs (no narrative docs)
GPU
Decent test coverage
@chrisemoody
lda2lstm
“PS! Thank you for such an awesome idea”  doc_id=1846
Can we model topics to sentences?
Can we represent the internal LSTM states as a Dirichlet mixture?
@chrisemoody
lda2ae
“PS! Thank you for such an awesome idea”  doc_id=1846
Can we model topics to images?
TJ Torres
@chrisemoody
?
@chrisemoody
Multithreaded, Stitch Fix
Bonus slides
Crazy Approaches
Paragraph Vectors (just extend the context window)
Content dependency (change the window grammatically)
Social word2vec (deepwalk) (sentence is a walk on the graph)
Spotify (sentence is a playlist of song_ids)
Stitch Fix (sentence is a shipment of five items)
CBOW
“The fox jumped over the lazy dog”
Guess the word
given the context
~20x faster.
(this is the alternative.)
vOUT
vIN vIN vIN vIN vIN vIN
SkipGram
“The fox jumped over the lazy dog”
vOUT vOUT
vIN
vOUT vOUT vOUT vOUT
Guess the context
given the word
Better at syntax.
(this is the one we went over)
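For contrast, a tiny sketch (mine) of the CBOW input side: the context vectors are averaged into one vIN that guesses the center word, which is part of why it trains so much faster:

import numpy as np

def cbow_input(V_in, context_ids):
    """CBOW: average the context word vectors into a single input vector."""
    return V_in[context_ids].mean(axis=0)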
LDA Results (context + history)
Topics: Great Stylist, Perfect
“I loved every choice in this fix!! Great job!”
LDA Results (context + history)
Topic: Body Fit
“My measurements are 36-28-32. If that helps.
I like wearing some clothing that is fitted.
Very hard for me to find pants that fit right.”
LDA Results (context + history)
Topics: Sizing, Excited for next
“Really enjoyed the experience and the pieces, sizing for tops was too big.
Looking forward to my next box!”
LDA Results (context + history)
Topics: Almost Bought, Perfect
“It was a great fix. Loved the two items I kept and the three I sent back were close!”
What I didn’t mention
You need a lot of text (but only if you have a specialized vocabulary)
Cleaning the text
Memory & performance
Traditional databases aren’t well-suited
False positives
and now for something completely crazy
All of the following ideas will change what
‘words’ and ‘context’ represent.
paragraph vector
What about summarizing documents?
On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that
The framework nuclear agreement he reached with Iran on Thursday did not provide the definitive answer to whether Mr. Obama’s audacious gamble will pay off. The fist Iran has shaken at the so-called Great Satan since 1979 has not completely relaxed.
paragraph vector
Normal skipgram extends C words before, and C words after.
IN
OUT OUT
A document vector simply extends the context to the whole document.
IN
OUT OUT
OUT OUT doc_1347
from gensim.models import Doc2Vec

fn = "item_document_vectors"
model = Doc2Vec.load(fn)
matches = model.most_similar('pregnant')
matches = list(filter(lambda x: 'SENT_' in x[0], matches))
# ['...I am currently 23 weeks pregnant...',
#  "...I'm now 10 weeks pregnant...",
#  '...not showing too much yet...',
#  '...15 weeks now. Baby bump...',
#  '...6 weeks post partum!...',
#  '...12 weeks postpartum and am nursing...',
#  '...I have my baby shower that...',
#  '...am still breastfeeding...',
#  '...I would love an outfit for a baby shower...']
sentence search
translation (using just a rotation matrix)
Mikolov 2013
English → Spanish: embeddings related by a matrix rotation
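A sketch of learning such a map (my own, assuming X and Y are row-aligned English/Spanish embedding matrices for a seed dictionary): least squares gives the Mikolov-style linear map, and an SVD gives the orthogonal-Procrustes pure rotation:

import numpy as np

def fit_linear_map(X, Y):
    """Least-squares W with X @ W ~ Y (Mikolov 2013-style)."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def fit_rotation(X, Y):
    """Orthogonal Procrustes: constrain the map to a rotation."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt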
context dependent
Levy & Goldberg 2014
“Australian scientist discovers star with telescope”
context = +/- 2 words, or the words linked by grammatical dependencies
BoW vs. DEPS contexts: topically-similar vs. ‘functionally’ similar neighbors
They also show that SGNS is implicitly factorizing:
w · c = PMI(w, c) - log k
This is completely amazing!
Intuition: positive associations (canada, snow) are stronger in humans than negative associations (what is the opposite of Canada?)
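A sketch of that matrix (mine, assuming counts is a word × context co-occurrence count matrix and k the number of negative samples):

import numpy as np

def shifted_pmi(counts, k=5):
    """M[w, c] = PMI(w, c) - log k, the matrix SGNS implicitly factorizes."""
    total = counts.sum()
    pw = counts.sum(axis=1, keepdims=True) / total
    pc = counts.sum(axis=0, keepdims=True) / total
    with np.errstate(divide='ignore'):
        pmi = np.log((counts / total) / (pw * pc))
    return pmi - np.log(k)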
deepwalk
Perozzi et al. 2014
learn word vectors from sentences
“The fox jumped over the lazy dog”
vOUT vOUT vOUT vOUT vOUT vOUT
‘words’ are graph vertices
‘sentences’ are random walks on the graph
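A minimal sketch (assuming graph is a dict from each vertex to its neighbor list): generate walks and feed them to word2vec as if they were sentences:

import random

def random_walk(graph, start, length=10):
    """One deepwalk 'sentence': a random walk over graph vertices."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return walk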
word2vec
Playlists at Spotify
context: sequence learning
‘words’ are songs
‘sentences’ are playlists
Erik Bernhardsson
Great performance on ‘related artists’
Fixes at Stitch Fix
sequence learning
Let’s try:
‘words’ are styles
‘sentences’ are fixes
Learn similarity between styles because they co-occur
Learn ‘coherent’ styles
Got lots of structure!
Nearby regions are consistent ‘closets’
A specific lda2vec model
Our text blob is a comment that comes from a region_id and a style_id