2. Recommend me some music!
Discover groups of similar songs…
[Figure: "My Music Collection" — songs plotted by tempo, e.g. Only my railgun (bpm 125), Bach Sonata #1 (bpm 60), and others at bpm 90 and bpm 120]
3. Recommend me some music!
Discover groups of similar songs…
[Figure: the same "My Music Collection", now with the songs gathered into groups around bpm 120 and bpm 60]
22. p(X) = ∑_{k=1}^{K} π_k N(X | µ_k, Σ_k)
Example: K = 2
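The mixture density above is easy to evaluate directly. Below is a minimal NumPy sketch; the helper names (`gaussian_pdf`, `gmm_density`) and the example parameter values are my own, chosen to echo the bpm example, not from the slides.

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Multivariate normal density N(x | mu, cov)."""
    d = len(mu)
    diff = x - mu
    return (np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)
            / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov)))

def gmm_density(x, pis, mus, covs):
    """p(x) = sum over k of pi_k * N(x | mu_k, Sigma_k)."""
    return sum(pi * gaussian_pdf(x, mu, cov)
               for pi, mu, cov in zip(pis, mus, covs))

# Example with K = 2 one-dimensional Gaussians (illustrative values)
pis = [0.5, 0.5]
mus = [np.array([60.0]), np.array([125.0])]   # e.g. two bpm clusters
covs = [25.0 * np.eye(1), 25.0 * np.eye(1)]
p = gmm_density(np.array([62.0]), pis, mus, covs)
```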
23. K-means is a classifier. A Mixture of Gaussians is a probability model; we can USE it as a "soft" classifier.
25. K-means is a classifier. A Mixture of Gaussians is a probability model; we can USE it as a "soft" classifier.
K-means — parameter to fit to data:
• Mean µk
Mixture of Gaussians — parameters to fit to data:
• Mean µk
• Covariance Σk
• Mixing coefficient πk
27. 1. Initialize means µk
2. E Step: Assign each point to a cluster (hard 1/0 assignment)
3. M Step: Given clusters, refine mean µk of each cluster k
4. Stop when change in means is small
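The four K-means steps above can be sketched in a few lines of NumPy. This is a minimal illustration (function name and initialization scheme are my own); it does not guard against empty clusters.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-means: hard E step (assign), M step (refine means)."""
    rng = np.random.default_rng(seed)
    # 1. Initialize means by picking k distinct data points
    means = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # 2. E step: assign each point to its nearest mean (hard 1/0)
        labels = np.argmin(
            ((X[:, None, :] - means[None, :, :]) ** 2).sum(-1), axis=1)
        # 3. M step: refine each cluster's mean
        new_means = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 4. Stop when change in means is small
        if np.allclose(new_means, means, atol=1e-6):
            break
        means = new_means
    return means, labels
```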
28. 1. Initialize Gaussian* parameters: means µk, covariances Σk and mixing coefficients πk
2. E Step: Assign each point Xn an assignment score γ(znk) for each cluster k (soft, e.g. 0.5 / 0.5)
3. M Step: Given scores, adjust µk, Σk, πk for each cluster k
4. Evaluate likelihood. If likelihood or parameters converge, stop.
*There are K Gaussians
29. 1. Initialize µk, Σk, πk, one for each Gaussian k
Tip! Use the K-means result to initialize:
µk ← µk (from K-means)
Σk ← cov(cluster(k))
πk ← (Number of points in k) / (Total number of points)
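The initialization tip above translates directly into code. A sketch, assuming a K-means result is available as cluster means plus per-point labels (the function name and the small diagonal "ridge" term are my own additions; the ridge just keeps each Σk invertible):

```python
import numpy as np

def init_from_kmeans(X, kmeans_means, labels):
    """Initialize GMM parameters from a K-means result:
       mu_k    <- the K-means mean of cluster k
       Sigma_k <- cov(cluster k)
       pi_k    <- (# points in cluster k) / (total # points)"""
    K = len(kmeans_means)
    mus = np.asarray(kmeans_means, dtype=float).copy()
    covs = [np.cov(X[labels == k], rowvar=False) + 1e-6 * np.eye(X.shape[1])
            for k in range(K)]   # small ridge keeps each Sigma_k invertible
    pis = np.array([np.mean(labels == k) for k in range(K)])
    return pis, mus, covs
```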
30. 2. E Step: For each point Xn, determine its assignment score to each Gaussian k:
γ(znk) = πk N(Xn | µk, Σk) / Σj πj N(Xn | µj, Σj)
znk is the latent variable. γ(znk) is called a "responsibility": how much is this Gaussian k responsible for this point Xn? (e.g. 0.7 / 0.3)
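The E step above can be sketched as follows; each row of the responsibility matrix sums to 1, so every point distributes one unit of "responsibility" across the K Gaussians (helper names are my own):

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Multivariate normal density N(x | mu, cov)."""
    d = len(mu)
    diff = x - mu
    return (np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)
            / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov)))

def e_step(X, pis, mus, covs):
    """Responsibilities gamma(z_nk): how much Gaussian k explains point x_n."""
    K = len(pis)
    gamma = np.array([[pis[k] * gaussian_pdf(x, mus[k], covs[k])
                       for k in range(K)] for x in X])
    gamma /= gamma.sum(axis=1, keepdims=True)  # normalize: each row sums to 1
    return gamma
```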
31. 3. M Step: For each Gaussian k, update parameters using the new γ(znk).
Mean of Gaussian k:
µk^new = (1/Nk) Σn γ(znk) Xn,  where Nk = Σn γ(znk)
γ(znk) is the responsibility for this Xn. Find the mean that "fits" the assignment scores best.
32. 3. M Step: For each Gaussian k, update parameters using the new γ(znk).
Covariance matrix of Gaussian k:
Σk^new = (1/Nk) Σn γ(znk) (Xn − µk^new)(Xn − µk^new)ᵀ
It uses µk^new — just calculated this!
33. 3. M Step: For each Gaussian k, update parameters using the new γ(znk).
Mixing coefficient for Gaussian k:
πk^new = Nk / (Total # of points), e.g. 105.6/200
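The three M-step updates (mean, covariance, mixing coefficient) share the same effective counts Nk, so they fit naturally in one function. A minimal sketch (the function name is my own; `gamma` is the N×K responsibility matrix from the E step):

```python
import numpy as np

def m_step(X, gamma):
    """Update mu_k, Sigma_k, pi_k from responsibilities gamma (N x K)."""
    N, K = gamma.shape
    Nk = gamma.sum(axis=0)                 # effective # of points per Gaussian
    mus = (gamma.T @ X) / Nk[:, None]      # responsibility-weighted means
    covs = []
    for k in range(K):
        diff = X - mus[k]
        # responsibility-weighted covariance of Gaussian k
        covs.append((gamma[:, k, None] * diff).T @ diff / Nk[k])
    pis = Nk / N                           # e.g. 105.6 / 200
    return pis, mus, covs
```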
34. 4. Evaluate the log likelihood. If the likelihood or the parameters converge, stop. Else go to Step 2 (E step).
Likelihood is the probability that the data X was generated by the parameters you found, i.e. correctness!
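The convergence test in Step 4 needs the log likelihood ln p(X) = Σn ln Σk πk N(Xn | µk, Σk). A self-contained sketch (helper names are my own; the commented loop assumes `e_step`/`m_step` functions like those shown earlier):

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """Multivariate normal density N(x | mu, cov)."""
    d = len(mu)
    diff = x - mu
    return (np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)
            / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov)))

def log_likelihood(X, pis, mus, covs):
    """ln p(X) = sum_n ln sum_k pi_k N(x_n | mu_k, Sigma_k)."""
    return float(sum(
        np.log(sum(pis[k] * gaussian_pdf(x, mus[k], covs[k])
                   for k in range(len(pis)))) for x in X))

# EM driver sketch using this convergence check:
# prev = -np.inf
# while True:
#     gamma = e_step(X, pis, mus, covs)          # Step 2
#     pis, mus, covs = m_step(X, gamma)          # Step 3
#     ll = log_likelihood(X, pis, mus, covs)     # Step 4
#     if ll - prev < 1e-6:                       # likelihood converged
#         break
#     prev = ll
```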
36. 1. Initialize parameters θ^old
2. E Step: Evaluate p(Z | X, θ^old)   (Z: hidden variables; X: observed variables)
3. M Step: Evaluate θ^new = argmax_θ Q(θ, θ^old),
where Q(θ, θ^old) = Σ_Z p(Z | X, θ^old) ln p(X, Z | θ)   (the expected complete-data log likelihood)
4. Evaluate the log likelihood. If likelihood or parameters converge, stop. Else θ^old ← θ^new and go to the E Step.
37. K-means can be formulated as EM
EM for Gaussian Mixtures
EM for Bernoulli Mixtures
EM for Bayesian Linear Regression
38. "Expectation"
Calculate the fixed, data-dependent parameters of the function Q.
"Maximization"
Once the parameters of Q are known, it is fully determined, so now we can maximize Q.
39. We learned how to cluster data in an unsupervised manner.
Gaussian Mixture Models are useful for modeling data with "soft" cluster assignments (e.g. 0.5 / 0.5).
Expectation Maximization is a method used when we have a model with latent variables (values we don't know, but estimate with each step).