1. Online Dictionary Learning for Sparse Coding
Julien Mairal¹ Francis Bach¹ Jean Ponce² Guillermo Sapiro³
¹ INRIA–Willow project  ² École Normale Supérieure  ³ University of Minnesota
ICML, Montréal, June 2009
2. What this talk is about
Efficiently learning dictionaries (basis sets) for sparse coding.
Solving a large-scale matrix factorization problem.
Making some large-scale image processing problems
tractable.
Proposing an algorithm that extends to NMF, sparse PCA, ...
3. 1 The Dictionary Learning Problem
2 Online Dictionary Learning
3 Extensions
6. The Dictionary Learning Problem
[Elad & Aharon (’06)]
Solving the denoising problem
Extract all overlapping 8 × 8 patches x_i.
Solve a matrix factorization problem:
\min_{\alpha_i,\, D \in C} \sum_{i=1}^{n} \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1
(reconstruction term + sparsity term), with n > 100,000.
Average the reconstructions of each patch.
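The patch pipeline above (extract every overlapping 8 × 8 patch, reconstruct each one, average the overlaps back into the image) can be sketched in NumPy. This is a minimal sketch: `extract_patches` and `average_patches` are hypothetical helper names, and the per-patch sparse reconstruction itself is left out.

```python
import numpy as np

def extract_patches(img, p=8):
    """Collect all overlapping p x p patches as columns of a matrix X."""
    h, w = img.shape
    patches = [img[i:i + p, j:j + p].ravel()
               for i in range(h - p + 1)
               for j in range(w - p + 1)]
    return np.stack(patches, axis=1)  # shape (p*p, n); n easily exceeds 100,000

def average_patches(patches, shape, p=8):
    """Put each (reconstructed) patch back in place and average the overlaps."""
    h, w = shape
    out = np.zeros(shape)
    counts = np.zeros(shape)
    k = 0
    for i in range(h - p + 1):
        for j in range(w - p + 1):
            out[i:i + p, j:j + p] += patches[:, k].reshape(p, p)
            counts[i:i + p, j:j + p] += 1
            k += 1
    return out / counts  # every pixel is covered by at least one patch
```

Extracting and re-averaging without any denoising in between is an exact round trip, which is a convenient sanity check for the bookkeeping.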
7. The Dictionary Learning Problem
[Mairal, Bach, Ponce, Sapiro & Zisserman (’09)]
Denoising result
10. The Dictionary Learning Problem
\min_{\alpha \in \mathbb{R}^{k \times n},\, D \in C} \sum_{i=1}^{n} \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1
C = \{D \in \mathbb{R}^{m \times k} \ \text{s.t.} \ \forall j = 1, \dots, k, \ \|d_j\|_2 \le 1\}.
Classical optimization alternates between D and α.
Good results, but very slow!
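The classical alternating scheme can be sketched as follows. This is a minimal sketch under stated assumptions: ISTA stands in as the lasso solver, projected gradient handles the constrained dictionary step, and the iteration counts and λ are illustrative, none of which is fixed by the slides.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code(X, D, lam, n_iter=100):
    """Lasso step via ISTA (a simple stand-in for a dedicated solver)."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ A - X)
        A = soft_threshold(A - grad / L, lam / L)
    return A

def update_dictionary(X, A, D, n_iter=50):
    """Projected gradient on D, keeping each column in the unit ball (the set C)."""
    L = np.linalg.norm(A, 2) ** 2 + 1e-8
    for _ in range(n_iter):
        D = D - (D @ A - X) @ A.T / L
        norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
        D = D / norms  # project onto ||d_j||_2 <= 1
    return D

def batch_dictionary_learning(X, k, lam=0.1, n_outer=20, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], k))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_outer):          # alternate between alpha and D
        A = sparse_code(X, D, lam)
        D = update_dictionary(X, A, D)
    return D, A
```

Each outer iteration re-solves a full lasso over all n samples, which is exactly why this batch approach becomes very slow at large scale.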
11. 1 The Dictionary Learning Problem
2 Online Dictionary Learning
3 Extensions
12. Online Dictionary Learning
Classical formulation of dictionary learning
\min_{D \in C} f_n(D) = \min_{D \in C} \frac{1}{n} \sum_{i=1}^{n} l(x_i, D),
where
l(x, D) = \min_{\alpha \in \mathbb{R}^k} \frac{1}{2}\|x - D\alpha\|_2^2 + \lambda\|\alpha\|_1.
13. Online Dictionary Learning
Which formulation are we interested in?
\min_{D \in C} f(D) = \mathbb{E}_x[l(x, D)] \approx \lim_{n \to +\infty} \frac{1}{n} \sum_{i=1}^{n} l(x_i, D)
14. Online Dictionary Learning
Online learning can
handle potentially infinite datasets,
adapt to dynamic training sets,
be dramatically faster than batch
algorithms [Bottou & Bousquet (’08)].
15. Online Dictionary Learning
Proposed approach
1: for t = 1, \dots, T do
2:   Draw x_t
3:   Sparse coding:
     \alpha_t \leftarrow \arg\min_{\alpha \in \mathbb{R}^k} \frac{1}{2}\|x_t - D_{t-1}\alpha\|_2^2 + \lambda\|\alpha\|_1
4:   Dictionary update:
     D_t \leftarrow \arg\min_{D \in C} \frac{1}{t} \sum_{i=1}^{t} \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \lambda\|\alpha_i\|_1
5: end for
16. Online Dictionary Learning
Proposed approach
Implementation details
Use LARS for the sparse coding step,
Use a block-coordinate approach for the
dictionary update, with warm restart,
Use a mini-batch.
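Putting the loop and the implementation details together, a minimal NumPy sketch of the online scheme might look as follows. ISTA stands in for LARS, the batch size is one for simplicity, and the dictionary step is a column-wise (block-coordinate) update warm-started from the previous D, driven by the running sums A = Σ α_tα_tᵀ and B = Σ x_tα_tᵀ; the function names and hyperparameters are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(x, D, lam, n_iter=100):
    """Sparse-coding step (ISTA here; the talk uses LARS)."""
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(a - D.T @ (D @ a - x) / L, lam / L)
    return a

def online_dictionary_learning(stream, m, k, lam=0.1, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((m, k))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((k, k))   # running sum of alpha_t alpha_t^T
    B = np.zeros((m, k))   # running sum of x_t alpha_t^T
    for x in stream:                        # draw x_t
        a = lasso_ista(x, D, lam)           # sparse coding w.r.t. D_{t-1}
        A += np.outer(a, a)
        B += np.outer(x, a)
        # dictionary update: one pass of block-coordinate descent over the
        # columns of D, warm-started from the previous dictionary
        for j in range(k):
            if A[j, j] > 1e-10:
                u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
                D[:, j] = u / max(np.linalg.norm(u), 1.0)  # project onto C
    return D
```

The per-step cost depends only on k and m, not on how many samples have been seen, which is what makes the approach viable for very large or streaming datasets.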
17. Online Dictionary Learning
Proposed approach
Which guarantees do we have?
Under a few reasonable assumptions, we build a surrogate function \hat{f}_t of the expected cost f verifying
\lim_{t \to +\infty} \hat{f}_t(D_t) - f(D_t) = 0,
and D_t is asymptotically close to a stationary point.
34. Conclusion
Take-home message
Online techniques are well suited to the dictionary learning problem.
Our method makes some large-scale image processing tasks tractable, and extends to various matrix factorization problems.