Optimal interval clustering: Application to Bregman clustering and statistical mixture learning

Frank Nielsen^1, Richard Nock^2
^1 Sony Computer Science Laboratories / Ecole Polytechnique
^2 UAG-CEREGMIA / NICTA

May 2014
© 2014 Frank Nielsen, Sony Computer Science Laboratories, Inc.
Hard clustering: Partitioning the data set
Partition $X = \{x_1, \dots, x_n\} \subset \mathcal{X}$ into $k$ clusters $C_1 \subset X, \dots, C_k \subset X$:
$$X = \biguplus_{i=1}^{k} C_i$$
Center-based (prototype) hard clustering: k-means [2], k-medians, k-center, r-center [10], etc.
Model-based hard clustering: statistical mixtures maximizing the complete likelihood (prototype = model parameter).
k-means: NP-hard when d > 1 and k > 1 [11, 7, 1].
k-medians and k-centers: NP-hard [12] (1984).
In 1D, k-means is polynomial [3, 15]: $O(n^2 k)$.
Euclidean 1D k-means
1D k-means [8] has a contiguous optimal partition.
Solved by enumerating all $\binom{n-1}{k-1}$ contiguous partitions in 1D (1958).
Better than the Stirling numbers of the second kind $S(n, k)$ that count all partitions.
Polynomial time $O(n^2 k)$ using Dynamic Programming (DP) [3] (sketched in 1973 in two pages).
R package Ckmeans.1d.dp [15] (2011).
Interval clustering: Structure
Sort $X \subset \mathcal{X}$ with respect to the total order $<$ on $\mathcal{X}$ in $O(n \log n)$.
Output represented by:
$k$ intervals $I_i = [x_{l_i}, x_{r_i}]$ such that $C_i = I_i \cap X$,
or better, $k - 1$ delimiters $l_i$ ($i \in \{2, \dots, k\}$) since $r_i = l_{i+1} - 1$ (for $i < k$, with $r_k = n$) and $l_1 = 1$:
$$\underbrace{[x_1 \dots x_{l_2-1}]}_{C_1}\ \underbrace{[x_{l_2} \dots x_{l_3-1}]}_{C_2}\ \cdots\ \underbrace{[x_{l_k} \dots x_n]}_{C_k}$$
Objective function for interval clustering
Scalars $x_1 < \dots < x_n$ are partitioned contiguously into $k$ clusters: $C_1 < \dots < C_k$.
Clustering objective function:
$$\min\ e_k(X) = \bigoplus_{j=1}^{k} e_1(C_j)$$
$e_1(\cdot)$: intra-cluster cost/energy
$\oplus$: inter-cluster cost/energy (commutative, associative)
Caveat: $n = kp + 1$ equally spaced 1D points → $k$ different optimal clustering partitions (the optimum need not be unique).
Examples of objective functions
In arbitrary dimension $\mathcal{X} = \mathbb{R}^d$:
r-clustering ($r \ge 1$), $\oplus = +$:
$$e_1(C_j) = \min_{p \in \mathcal{X}} \sum_{x \in C_j} d(x, p)^r$$
(the argmin, i.e. the prototype $p_j$, is the same whether or not we take the power $\tfrac{1}{r}$ of the sum)
Euclidean r-clustering: $r = 1$ medians, $r = 2$ means.
k-center ($\lim_{r \to \infty}$), $\oplus = \max$:
$$e_1(C_j) = \min_{p \in \mathcal{X}} \max_{x \in C_j} d(x, p)$$
Discrete clustering: the search space of the min is $C_j$ instead of $\mathcal{X}$.
Note that in 1D the $\ell_s$-norm distance is always $d(p, q) = |p - q|$, independent of $s \ge 1$.
Optimal interval clustering by Dynamic Programming
$X_{j,i} = \{x_j, \dots, x_i\}$ ($j \le i$), $X_i = X_{1,i} = \{x_1, \dots, x_i\}$
$E = [e_{i,m}]$: $n \times k$ cost matrix, $O(n \times k)$ memory, with $e_{i,m} = e_m(X_i)$.
Optimality equation:
$$e_{i,m} = \min_{m \le j \le i} \left\{ e_{j-1,m-1} \oplus e_1(X_{j,i}) \right\}$$
Associative/commutative operator $\oplus$ ($+$ or $\max$).
Initialize with $e_{i,1} = e_1(X_i)$.
$E$ is computed column by column from left to right, each column from bottom to top.
The best clustering solution cost is at $e_{n,k}$.
Time: $n \times k \times O(n) \times T_1(n) = O(n^2 k\, T_1(n))$, where $T_1(n)$ is the cost of evaluating $e_1$ on an interval; $O(nk)$ memory.
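A minimal Python sketch of this DP fill (illustrative, not from the original slides), assuming a caller-supplied oracle cost(j, i) that returns $e_1(X_{j,i})$ in $T_1(n)$ time and taking $\oplus = +$; indices are 0-based:

```python
import numpy as np

def interval_clustering_dp(x, k, cost):
    """Optimal interval clustering of the sorted scalars x into k clusters.
    cost(j, i) returns e1({x[j], ..., x[i]}), 0-based and inclusive."""
    n = len(x)
    E = np.full((n, k), np.inf)      # E[i, m]: best cost of x[0..i] with m+1 clusters
    S = np.zeros((n, k), dtype=int)  # S[i, m]: left index of the last cluster
    for i in range(n):               # initialization: a single cluster
        E[i, 0] = cost(0, i)
    for m in range(1, k):            # fill column by column, left to right
        for i in range(m, n):
            for j in range(m, i + 1):        # left index of the last cluster
                c = E[j - 1, m - 1] + cost(j, i)
                if c < E[i, m]:
                    E[i, m], S[i, m] = c, j
    return E, S                      # optimal cost at E[n-1, k-1]
```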
Retrieving the solution: Backtracking
Use an auxiliary matrix $S = [s_{i,j}]$ for storing the argmin.
Backtrack in $O(k)$ time.
The left index $l_k$ of $C_k$ is stored at $s_{n,k}$: $l_k = s_{n,k}$.
Iteratively retrieve the previous left interval indexes: $l_{j-1} = s_{l_j - 1,\, j-1}$ for $j = k, \dots, 2$.
Note that $l_j - 1 = n - \sum_{l=j}^{k} n_l$ and $l_j - 1 = \sum_{l=1}^{j-1} n_l$.
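A matching backtracking sketch, reading the delimiters off the argmin table S filled above (again 0-based, illustrative names):

```python
def backtrack(S, k):
    """Recover the k left delimiters from the argmin table S
    produced by interval_clustering_dp."""
    n = S.shape[0]
    left = [0] * k
    left[k - 1] = S[n - 1, k - 1]        # left index of the last cluster C_k
    for m in range(k - 2, -1, -1):       # then C_{k-1}, ..., C_1
        left[m] = S[left[m + 1] - 1, m]  # C_{m+1} ends at left[m+1] - 1
    return left                          # left[0] == 0 always

# Cluster C_m is then x[left[m] : left[m + 1]], and x[left[k - 1] :] for the last one.
```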
Optimizing time with a Look Up Table (LUT)
Save time when computing $e_1(X_{j,i})$, since we perform $n \times k \times O(n)$ such computations.
Look-Up Table (LUT): add an extra $n \times n$ matrix $E_1$ with $E_1[j][i] = e_1(X_{j,i})$.
Built in $O(n^2 T_1(n))$ time...
Then the DP itself runs in $O(n^2 k)$, for an overall $O(n^2 (T_1(n) + k))$.
→ quadratic amount of memory (prohibitive for $n > 10000$...)
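One possible LUT construction (illustrative): precompute every interval cost once, so the DP's inner cost lookup becomes an O(1) array access.

```python
import numpy as np

def build_lut(n, cost):
    """Precompute E1[j, i] = e1(X_{j,i}) for all j <= i:
    O(n^2 T1(n)) time, O(n^2) memory."""
    E1 = np.full((n, n), np.inf)
    for j in range(n):
        for i in range(j, n):
            E1[j, i] = cost(j, i)
    return E1

# Usage: E, S = interval_clustering_dp(x, k, cost=lambda j, i: E1[j, i])
```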
DP solver with cluster size constraints
$n_i^-$ and $n_i^+$: lower/upper bound constraints on $n_i = |C_i|$, with
$\sum_{l=1}^{k} n_l^- \le n$ and $\sum_{l=1}^{k} n_l^+ \ge n$.
When there are no constraints: add the dummy constraints $n_i^- = 1$ and $n_i^+ = n - k + 1$.
$n_m = |C_m| = i - j + 1$ must satisfy $n_m^- \le n_m \le n_m^+$
→ $j \le i + 1 - n_m^-$ and $j \ge i + 1 - n_m^+$.
$$e_{i,m} = \min_{\max\left\{1 + \sum_{l=1}^{m-1} n_l^-,\; i + 1 - n_m^+\right\} \le j \le i + 1 - n_m^-} \left\{ e_{j-1,m-1} \oplus e_1(X_{j,i}) \right\}$$
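A small helper (hypothetical naming, following the slide's 1-based notation) computing the constrained range of the left index j in the recurrence above:

```python
def j_range(i, m, n_min, n_max):
    """Valid left indexes j of the last cluster C_m when clustering
    x_1..x_i into m clusters, under n_min[l] <= |C_l| <= n_max[l]
    (1-based slide indices; the Python lists are 0-indexed)."""
    lo = max(1 + sum(n_min[:m - 1]), i + 1 - n_max[m - 1])
    hi = i + 1 - n_min[m - 1]
    return lo, hi   # an empty range (lo > hi) means infeasible
```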
Model selection from the DP table
$m(k) = e_k(X)$: $e_k(X)$ decreases with $k$ and reaches its minimum when $k = n$.
Model selection: a trade-off, choose the best model among all the models with $k \in [1, n]$.
Regularized objective function: $\tilde{e}_k(X) = e_k(X) + f(k)$, with $f(k)$ related to the model complexity.
Computing the DP table once for $k = n$ gives all $k = 1, \dots, n$ and avoids redundant computations.
Then compute the criterion on the last row (indexed by $n$) and choose the argmin of $\tilde{e}_k$.
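With the last DP row in hand, the selection is a one-pass argmin; a sketch assuming a user-chosen penalty f(k):

```python
import numpy as np

def select_k(last_row, f):
    """Pick k minimizing the regularized cost e_k(X) + f(k), where
    last_row[m] = E[n-1, m] is the DP cost of x_1..x_n with m+1 clusters."""
    scores = [last_row[m] + f(m + 1) for m in range(len(last_row))]
    return int(np.argmin(scores)) + 1   # best number of clusters
```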
A Voronoi cell condition for DP optimality
elements → interval clusters → prototypes
interval clusters ← prototypes
Voronoi diagram: partition of $\mathcal{X}$ wrt. $P = \{p_1, \dots, p_k\}$.
Voronoi cell:
$$V(p_j) = \{x \in \mathcal{X} : d^r(x, p_j) \le d^r(x, p_l)\ \forall l \in \{1, \dots, k\}\}$$
Since $x \mapsto x^r$ is monotonically increasing on $\mathbb{R}^+$, this is equivalent to
$$V(p_j) = \{x \in \mathcal{X} : d(x, p_j) \le d(x, p_l)\ \forall l\}$$
DP guarantees an optimal clustering when, for all $P$, each $V(p_j)$ is an interval.
2-clustering exhibits the Voronoi bisector.
1-mean (centroid): O(n) time
$$\min_p \sum_{i=1}^{n} (x_i - p)^2$$
$D(x, p) = (x - p)^2$, $D'(x, p) = -2(x - p)$, $D''(x, p) = 2$
Convex optimization (existence and uniqueness of the solution):
$$\sum_{i=1}^{n} D'(x_i, p) = 0 \Rightarrow \sum_{i=1}^{n} x_i - np = 0$$
Center of mass $p = \frac{1}{n} \sum_{i=1}^{n} x_i$ (barycenter).
Extends to Bregman divergences:
$$D_F(x, p) = F(x) - F(p) - (x - p) F'(p)$$
2-means: O(n log n) time
Find $x_{l_2}$ ($n - 1$ potential locations for $l_2$: from $x_2$ to $x_n$):
$$\min_{x_{l_2}} \{ e_1(C_1) + e_1(C_2) \}$$
Browse the split from left to right, $l_2 = 2, \dots, n$.
Update the cost $E_2(l + 1)$ from $E_2(l)$ in constant time (SATs are also $O(1)$):
$$E_2(l) = e_2(x_1 \dots x_{l-1} \mid x_l \dots x_n)$$
$$\mu_1(l + 1) = \frac{(l - 1)\mu_1(l) + x_l}{l}, \qquad \mu_2(l + 1) = \frac{(n - l + 1)\mu_2(l) - x_l}{n - l}$$
$$v_1(l + 1) = \sum_{i=1}^{l} (x_i - \mu_1(l + 1))^2 = \sum_{i=1}^{l} x_i^2 - l\,\mu_1^2(l + 1)$$
$$\Delta E_2(l) = \frac{l - 1}{l} \left( \mu_1(l) - x_l \right)^2 - \frac{n - l + 1}{n - l} \left( \mu_2(l) - x_l \right)^2$$
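A runnable sketch of this incremental scan (0-based indexing, so the coefficients differ superficially from the 1-based slide formulas); the mean/SSE updates are the standard identities for adding a point to, or removing one from, a cluster:

```python
def two_means(x):
    """1D 2-means on sorted x: scan the split left to right,
    updating both means and SSEs in O(1) per step."""
    n = len(x)
    mu1, sse1 = x[0], 0.0                       # C1 = {x[0]}
    mu2 = sum(x[1:]) / (n - 1)                  # C2 = x[1:]
    sse2 = sum((v - mu2) ** 2 for v in x[1:])
    best = (sse1 + sse2, 1)                     # (cost, split l): C1 = x[:l], C2 = x[l:]
    for l in range(1, n - 1):                   # move x[l] from C2 to C1
        v = x[l]
        sse1 += l / (l + 1) * (v - mu1) ** 2    # grow C1 from size l to l+1
        mu1 += (v - mu1) / (l + 1)
        m2 = n - l                              # shrink C2 from size m2 to m2-1
        sse2 -= m2 / (m2 - 1) * (v - mu2) ** 2
        mu2 += (mu2 - v) / (m2 - 1)
        if sse1 + sse2 < best[0]:
            best = (sse1 + sse2, l + 1)
    return best
```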
2-means: Experiments
Timings in seconds (Intel Win7, i7-4800):

n          Brute force    SAT      Incremental
300000     155.022        0.010    0.0091
1000000    1814.44        0.018    0.015

Do we really need sorting, and hence Ω(n log n) time? (k = 1 is linear time)
Note that MaxGap does not yield the optimal separator (because the centroid is the minimizer of the sum of squared distances).
Optimal 1D Bregman k-means
Bregman information [2] $e_1$ (generalizes the cluster variance):
$$e_1(C_j) = \min_{p_j} \sum_{x_l \in C_j} w_l B_F(x_l : p_j). \qquad (1)$$
Expressed as [14]:
$$e_1(C_j) = \sum_{x_l \in C_j} w_l \left( p_j F'(p_j) - F(p_j) \right) + \sum_{x_l \in C_j} w_l F(x_l) - F'(p_j) \sum_{x_l \in C_j} w_l x_l$$
Preprocess using Summed Area Tables [6] (SATs): $S_1(j) = \sum_{l=1}^{j} w_l$, $S_2(j) = \sum_{l=1}^{j} w_l x_l$, and $S_3(j) = \sum_{l=1}^{j} w_l F(x_l)$, computed in $O(n)$ time at the preprocessing stage.
Then evaluate the Bregman information $e_1(X_{j,i})$ in constant time $O(1)$.
For example, $\sum_{l=j}^{i} w_l F(x_l) = S_3(i) - S_3(j - 1)$, with $S_3(0) = 0$.
Bregman Voronoi diagrams have connected cells [4], thus DP yields the optimal interval clustering.
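A sketch of the O(1) interval-cost oracle via prefix sums, using the equivalent closed form $e_1 = \sum w_l F(x_l) - W F(\mu)$, which follows from the expression above because the minimizing prototype $p_j$ is the weighted mean $\mu$ (here F must accept numpy arrays):

```python
import numpy as np

def make_bregman_cost(x, w, F):
    """O(1) Bregman information e1(X_{j,i}) after O(n) prefix sums (SATs)."""
    x, w = np.asarray(x, float), np.asarray(w, float)
    S1 = np.concatenate(([0.0], np.cumsum(w)))          # sums of weights
    S2 = np.concatenate(([0.0], np.cumsum(w * x)))      # weighted sums
    S3 = np.concatenate(([0.0], np.cumsum(w * F(x))))   # weighted sums of F
    def cost(j, i):                                     # 0-based, inclusive
        W = S1[i + 1] - S1[j]
        mu = (S2[i + 1] - S2[j]) / W
        return (S3[i + 1] - S3[j]) - W * F(mu)
    return cost

# F = lambda t: t * t recovers weighted 1D k-means; plugging the oracle into
# interval_clustering_dp gives optimal 1D Bregman k-means in O(n^2 k) time.
```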
Exponential families in statistics
Family of probability distributions:
$$\mathcal{F} = \{ p_F(x; \theta) : \theta \in \Theta \}$$
Exponential families [13]:
$$p_F(x \mid \theta) = \exp\left( t(x)\theta - F(\theta) + k(x) \right)$$
For example, the univariate Rayleigh distribution $R(\sigma)$: $t(x) = x^2$, $k(x) = \log x$, $\theta = -\frac{1}{2\sigma^2}$, $\eta = -\frac{1}{\theta}$, $F(\theta) = \log\left( -\frac{1}{2\theta} \right)$, and $F^*(\eta) = -1 + \log\frac{2}{\eta}$.
Uni-order exponential families: MLE
Maximum Likelihood Estimator (MLE) [13]:
$$e_1(X_{j,i}) = \hat{l}(x_j, \dots, x_i) = F^*(\hat{\eta}_{j,i}) + \frac{1}{i - j + 1} \sum_{l=j}^{i} k(x_l),$$
with $\hat{\eta}_{j,i} = \frac{1}{i - j + 1} \sum_{l=j}^{i} t(x_l)$.
By making the change of variable $y_l = t(x_l)$, and discarding the $k(x_l)$ terms that are constant for any clustering, we get
$$e_1(X_{j,i}) \equiv F^*\left( \frac{1}{i - j + 1} \sum_{l=j}^{i} y_l \right)$$
Hard clustering for learning statistical mixtures
Expectation-Maximization learns monotonically from an initialization by maximizing the incomplete log-likelihood.
Mixture maximizing the complete log-likelihood:
$$l_c(X; L, \Omega) = \sum_{i=1}^{n} \log\left( \alpha_{l_i} p(x_i; \theta_{l_i}) \right),$$
$L = \{l_i\}_i$: hidden labels.
$$\max l_c \equiv \min_{\theta_1, \dots, \theta_k} \sum_{i=1}^{n} \min_{j=1}^{k} \left( -\log p(x_i; \theta_j) - \log \alpha_j \right).$$
For fixed $\alpha$, $-\log p_F(x; \theta)$ amounts to a dual Bregman divergence [2].
Running Bregman k-means via the DP yields the optimal partition, since additively-weighted Bregman Voronoi diagrams have interval cells [4].
Hard clustering for learning statistical mixtures
Location families:
$$\mathcal{F} = \left\{ f(x; \mu) = \frac{1}{\sigma} f_0\!\left( \frac{x - \mu}{\sigma} \right),\ \mu \in \mathbb{R} \right\}$$
$f_0$ a standard density, $\sigma > 0$ fixed. Cauchy or Laplacian families have density graphs intersecting pairwise in exactly one point
→ singly-connected Maximum Likelihood Voronoi cells.
Model selection: Akaike Information Criterion [5] (AIC):
$$\mathrm{AIC}(x_1, \dots, x_n) = -2 l(x_1, \dots, x_n) + 2k + \frac{2k(k + 1)}{n - k - 1}$$
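The criterion translates directly to code (the last term is the small-sample AICc correction [5]); a one-line sketch:

```python
def aic(log_likelihood, k, n):
    """Corrected AIC of the slide: -2 l + 2k + 2k(k+1)/(n - k - 1)."""
    return -2.0 * log_likelihood + 2 * k + 2 * k * (k + 1) / (n - k - 1)
```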
Experiments with Gaussian Mixture Models (GMMs)
gmm1 score = −3.0754314021966658 (Euclidean k-means); gmm2 score = −3.038795325884112 (Bregman k-means, better).
Conclusion
Generic DP for solving interval clustering:
$O(n^2 k\, T_1(n))$ time using $O(nk)$ memory,
$O(n^2 T_1(n))$ time using $O(n^2)$ memory.
Refined DP by adding minimum/maximum cluster size constraints.
Model selection from the DP table.
Two applications:
1D Bregman r-clustering; 1D Bregman k-means in $O(n^2 k)$ time using $O(nk)$ memory via Summed Area Tables (SATs).
Mixture learning maximizing the complete likelihood:
for uni-order exponential families, it amounts to a dual Bregman k-means on $Y = \{y_i = t(x_i)\}_i$;
for location families with density graphs intersecting pairwise in one point (Cauchy, Laplacian: $\notin$ exponential families).
Perspectives
Ω(n log n) lower bound induced by sorting.
Hierarchical center-based clustering with single-linkage: clustering tree.
Best k-partition pruning using DP [?]:
optimal for $\alpha = (2 + \sqrt{3})$-perturbation-resilient instances,
time $O(nk^2 + n T_1(n))$.
Question: how to maintain dynamically an optimal contiguous clustering? (core-set approximation in the streaming model [9])
Bibliography I
Daniel Aloise, Amit Deshpande, Pierre Hansen, and Preyas Popat.
NP-hardness of Euclidean sum-of-squares clustering.
Machine Learning, 75(2):245–248, 2009.
Arindam Banerjee, Srujana Merugu, Inderjit S. Dhillon, and Joydeep Ghosh.
Clustering with Bregman divergences.
Journal of Machine Learning Research, 6:1705–1749, 2005.
Richard Bellman.
A note on cluster analysis and dynamic programming.
Mathematical Biosciences, 18(3-4):311–312, 1973.
Jean-Daniel Boissonnat, Frank Nielsen, and Richard Nock.
Bregman Voronoi diagrams.
Discrete & Computational Geometry, 44(2):281–307, September 2010.
J. Cavanaugh.
Unifying the derivations for the Akaike and corrected Akaike information criteria.
Statistics & Probability Letters, 33(2):201–208, April 1997.
Franklin C. Crow.
Summed-area tables for texture mapping.
In Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques,
SIGGRAPH ’84, pages 207–212, New York, NY, USA, 1984. ACM.
Sanjoy Dasgupta.
The hardness of k-means clustering.
Technical Report CS2008-0916, University of California, San Diego, 2008.
Bibliography II
Walter D Fisher.
On grouping for maximum homogeneity.
Journal of the American Statistical Association, 53(284):789–798, 1958.
Sariel Har-Peled and Akash Kushal.
Smaller coresets for k-median and k-means clustering.
In Proceedings of the Twenty-first Annual Symposium on Computational Geometry, SCG ’05, pages
126–134, New York, NY, USA, 2005. ACM.
Meizhu Liu, Baba C. Vemuri, Shun-ichi Amari, and Frank Nielsen.
Shape retrieval using hierarchical total Bregman soft clustering.
IEEE Trans. Pattern Anal. Mach. Intell., 34(12):2407–2419, 2012.
Meena Mahajan, Prajakta Nimbhorkar, and Kasturi R. Varadarajan.
The planar k-means problem is NP-hard.
Theoretical Computer Science, 442:13–21, 2012.
Nimrod Megiddo and Kenneth J Supowit.
On the complexity of some common geometric location problems.
SIAM journal on computing, 13(1):182–196, 1984.
Frank Nielsen.
k-MLE: A fast algorithm for learning statistical mixture models.
CoRR, abs/1203.5181, 2012.
Frank Nielsen and Richard Nock.
Sided and symmetrized Bregman centroids.
IEEE Transactions on Information Theory, 55(6):2882–2904, 2009.
Bibliography III
Haizhou Wang and Mingzhou Song.
Ckmeans.1d.dp: Optimal k-means clustering in one dimension by dynamic programming.
R Journal, 3(2), 2011.
