Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
On Approximating the Smallest Enclosing Bregman Balls
1. On Approximating the Smallest Enclosing
Bregman Balls
Frank Nielsen1
Richard Nock2
1 Sony
Computer Science Laboratories, Inc.
Fundamental Research Laboratory
Frank.Nielsen@acm.org
2 University
of Antilles-Guyanne
DSI-GRIMAAG
Richard.Nock@martinique.univ-ag.fr
March 2006
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
2. Smallest Enclosing Balls
Problem
Given S = {s1 , ..., sn }, compute a simplified description, called
the center, that fits well S (i.e., summarizes S).
Two optimization criteria:
M IN AVG Find a center c∗ which minimizes the average
distortion w.r.t S: c∗ = argminc i d(c, si ).
M IN M AX Find a center c∗ which minimizes the maximal
distortion w.r.t S: c∗ = argminc maxi d(c, si ).
Investigated in Applied Mathematics:
Computational geometry (1-center problem),
Computational statistics (1-point estimator),
Machine learning (1-class classification),
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
3. Smallest Enclosing Balls in Computational Geometry
Distortion measure d(·, ·) is the geometric distance:
Euclidean distance L2 .
c∗ is the circumcenter of S for M IN M AX,
Squared Euclidean distance L2 .
2
c∗ is the centroid of S for M IN AVG (→ k -means),
Euclidean distance L2 .
c∗ is the Fermat-Weber point for M IN AVG.
Centroid
M IN AVG L2
2
Circumcenter
M IN M AX L2
F. Nielsen and R. Nock
Fermat-Weber
M IN AVG L2
On Approximating the Smallest Enclosing Bregman Balls
4. Core-sets for M IN M AX Ball
˘
→ Introduced by Badoiu and Clarkson [BC’02]
Approximating M IN M AX [BC’02]: ||c − c∗ || ≤ r ∗
A -approximation for the M IN M AX ball can be found in O( dn )
2
time using algorithm BC. [for point/ball sets]
Algorithm BC(S, T )
Input: S = {s1 , s2 , ..., sn }
Output: Circumcenter c such that ||c − c∗ || ≤
Choose at random c ∈ S
for t = 1, 2, ..., T do
Find furthest point
s ← arg maxs ∈S c − s
Update circumcenter
t
1
c ← t+1 c + t+1 s
r∗
√
T
2
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
5. Demo Algorithm BC: Initialization
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
6. Demo Algorithm BC: Iteration 1
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
7. Demo Algorithm BC: Iteration 2
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
8. Demo Algorithm BC: Iteration 3
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
9. Demo Algorithm BC: Iteration 4
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
10. Demo Algorithm BC: Iteration 5
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
11. Demo Algorithm BC: Iteration 6
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
12. Demo Algorithm BC: Iteration 7
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
13. Demo Algorithm BC: Iteration 8
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
14. Demo Algorithm BC: Iteration 9
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
15. Demo Algorithm BC: Summary
After 10 iterations, we visualize the ball traces and the core-set.
Ball traces
Core-set
Core-set’s size is independent of the dimension d.
(depends only on 1 )
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
16. Distortions: Bregman Divergences
Definition
Bregman divergences are parameterized (F ) families of
distortions.
Let F : X −→ R, such that F is strictly convex and differentiable
on int(X ), for a convex domain X ⊆ Rd .
Bregman divergence DF :
DF (x, y) = F (x) − F (y) − x − y,
F
·, ·
:
:
F (y)
.
gradient operator of F
Inner product (dot product)
(→ DF is the tail of a Taylor expansion of F )
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
17. Visualizing F and DF
F (·)
DF (x, y)
x − y,
y
F (y)
x
DF (x, y) = F (x) − F (y) − x − y,
F (y)
.
(→ DF is the tail of a Taylor expansion of F )
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
18. Bregman Balls
Two Bregman balls:
Bc,r = {x ∈ X : DF ( c , x) ≤ r }, Bc,r = {x ∈ X : DF (x, c ) ≤ r }
Euclidean Ball: Bc,r = {x ∈ X : x − c
(r : squared radius.
L2 :
2
2
2
≤ r } = Bc,r
Bregman divergence F (x) =
d
i=1
xi2 )
Lemma [NN’05]
The smallest enclosing Bregman ball Bc∗ ,r ∗ of S is unique.
Theorem [BMDG’04]
The M IN AVG Ball for Bregman divergences is the centroid .
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
19. Applying BC for divergences yields poor result
−→ design a tailored algorithm for divergences.
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
20. Bregman BC Algorithm
BBC generalizes BC to Bregman divergences
(analysis in [NN’05]).
Algorithm BBC(S, T )
Choose at random c ∈ S
for t = 1, 2, ..., T do
Furthest point w.r.t. DF
s ← arg maxs ∈S DF (c, s )
Circumcenter update
t
1
c ← −1 t+1 F (c) + t+1
F
F (s)
Observations
BBC(L2 ) is BC.
2
DF (c, x) is convex in c but not necessarily the ball’s
boundary ∂Bc,r
(depends on x given c; see Itakura-Saito ball).
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
21. Demo BBC (Itakura-Saito): Initialization
d
DF (p, q) =
(
i=1
pi
p
− log i − 1), [F (x) = −
qi
qi
F. Nielsen and R. Nock
d
log xi ]
i=1
On Approximating the Smallest Enclosing Bregman Balls
22. Demo BBC (Itakura-Saito): Iteration 1
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
23. Demo BBC (Itakura-Saito): Iteration 2
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
24. Demo BBC (Itakura-Saito): Iteration 3
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
25. Demo BBC (Itakura-Saito): Iteration 4
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
26. Demo BBC (Itakura-Saito): Iteration 5
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
27. Demo BBC (Itakura-Saito): Iteration 6
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
28. Demo BBC (Itakura-Saito): Iteration 7
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
29. Demo BBC (Itakura-Saito): Iteration 8
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
30. Demo BBC (Itakura-Saito): Iteration 9
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
31. Demo BBC(Itakura-Saito): Summary
n = 100 points (d = 2)
Sampling Ball
All iterations
F. Nielsen and R. Nock
Core-set
On Approximating the Smallest Enclosing Bregman Balls
32. Demo BBC (Kullbach-Leibler): Initialization
d
DF (p, q) =
i=1
p
(pi log i − pi + qi ), [F (x) = −
qi
F. Nielsen and R. Nock
d
xi log xi ]
i=1
On Approximating the Smallest Enclosing Bregman Balls
33. Demo BBC (Kullbach-Leibler): Iteration 1
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
34. Demo BBC (Kullbach-Leibler): Iteration 2
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
35. Demo BBC (Kullbach-Leibler): Iteration 3
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
36. Demo BBC (Kullbach-Leibler): Iteration 4
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
37. Demo BBC (Kullbach-Leibler): Iteration 5
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
38. Demo BBC (Kullbach-Leibler): Iteration 6
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
39. Demo BBC (Kullbach-Leibler): Iteration 7
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
40. Demo BBC (Kullbach-Leibler): Iteration 8
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
41. Demo BBC (Kullbach-Leibler): Iteration 9
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
42. Demo BBC(Kullbach-Leibler): Summary
n = 100, d = 2.
Sampling Ball
All iterations
F. Nielsen and R. Nock
Core-set
On Approximating the Smallest Enclosing Bregman Balls
43. Fitting Bregman Balls
For a same dataset (drawn from a 2D Gaussian distribution)
Euclidean ball (L2 ),
2
Itakura-Saito ball,
Kullbach-Leibler ball (as known as Information ball).
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
44. BBC: Iteration 1
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
45. BBC: Iteration 2
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
46. BBC: Iteration 3
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
47. BBC: Iteration 4
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
48. BBC: Iteration 5
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
49. BBC: Iteration 6
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
50. BBC: Iteration 7
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
51. BBC: Iteration 8
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
52. BBC: Iteration 9
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
53. BBC: After 10 iterations
Squared Euclidean
Itakura-Saito
F. Nielsen and R. Nock
Kullbach-Leibler
On Approximating the Smallest Enclosing Bregman Balls
54. Experiments with BBC
Kullbach-Leibler ball, 100 runs
(n = 1000, T = 200 on the plane: d = 2)
0.5
0.45
0.4
0.35
0.3
0.25
0.2
0.15
0.1
0.05
0
0
20
40
60
∗
80 100 120 140 160 180 200
∗
Plain curves (BBC): DF (c ,c)+DF (c,c ) .
2
Dashed curves: Upperbound ||c − c∗ || from [BC’02]:
F. Nielsen and R. Nock
1
T
for L2 .
2
On Approximating the Smallest Enclosing Bregman Balls
55. Bregman divergences ↔ Functional averages
Bijection (core-sets) [NN’05]
F (s)
DF (c, s)
cj (1 ≤ j ≤ d)
Rd
I
Pd
L2 norm
2
Pd
2
j=1 (cj − sj )
Arithmetic mean
Pm
i=1 αi si,j
(I +,∗ )d
R
/ d-simplex
Pd
Information/Kullbach-Leibler divergence
Pd
j=1 cj log(cj /sj ) − cj + sj
Geometric mean
Qm αi
i=1 si,j
(I +,∗ )d
R
−
Rd
I
sT Cov−1 s
p ∈ N {0, 1}
I
Itakura-Saito divergence
Pd
j=1 (cj /sj ) − log(cj /sj ) − 1
Mahalanobis divergence
(c − s)T Cov−1 (c − s)
Harmonic mean
P
1/ m (αi /si,j )
i=1
Arithmetic mean
Pm
i=1 αi si,j
Weighted power mean
Domain
R d /I +
I
R
2
j=1 sj
d
j=1 sj
Pd
j=1
(1/p)
log sj − sj
log sj
Pd
p
j=1 sj
Pd
p
c
j
j=1 p
(p−1)s
+
p
p
j
p−1
− cj sj
Pm
i=1
p−1 1/(p−1)
αi si,j
Bregman divergences ↔ Family of exponential distributions
[BMDG’04].
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
56. References
˘
[BC’02] M. Badoiu and K. L. Clarkson, Core-sets for balls,
Manuscript, 2002.
(M IN M AX core-set and (1 + )-approximation)
[NN’04] F. Nielsen and R. Nock, Approximating Smallest
Enclosing Balls, ICCSA 2004.
(Survey on M IN M AX)
[BMDG’04] A. Banerjee, S. Merugu, I. S. Dhillon, J. Ghosh,
Clustering with Bregman Divergences, SDM 2004.
(Bregman divergences and k-means)
[NN’05] R. Nock and F. Nielsen, Fitting the Smallest
Enclosing Bregman Ball, ECML 2005.
(Bregman Balls)
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls
57. Linux R and Windows R codes
Available from our Web pages:
http://www.csl.sony.co.jp/person/nielsen/
BregmanBall/
http://www.univ-ag.fr/∼rnock/BregmanBall/
Command line executable: cluster
-h
-B
display this help
choose Bregman divergence
(0: Itakura-Saito, 1: Euclidean, 2: Kullbach-Leibler (simplex), 3: Kullbach-Leibler (general), 4: Csiszar)
-d
-n
-k
-g
dimension
number of points
number of theoretical clusters
type of clusters
(0: Gaussian, 1: Ring Gaussian, 2: Uniform, 3: Gaussian with small σ, 4: Uniform Bregman)
-S
...
number of runs
...
F. Nielsen and R. Nock
On Approximating the Smallest Enclosing Bregman Balls