Online Dictionary Learning for Sparse Coding

        Julien Mairal¹  Francis Bach¹  Jean Ponce²  Guillermo Sapiro³
        ¹ INRIA–Willow project   ² École Normale Supérieure   ³ University of Minnesota

        ICML, Montréal, June 2009
What this talk is about
    Efficiently learning dictionaries (basis sets) for sparse coding.
    Solving a large-scale matrix factorization problem.
    Making some large-scale image processing problems tractable.
    Proposing an algorithm that extends to NMF, sparse PCA, ...
1   The Dictionary Learning Problem


2   Online Dictionary Learning


3   Extensions
The Dictionary Learning Problem




              y          =        x_orig          +       w
         measurements         original image            noise
The Dictionary Learning Problem
[Elad & Aharon (’06)]


   Solving the denoising problem
          Extract all overlapping 8 × 8 patches x_i.
          Solve a matrix factorization problem:

              min_{α_i, D ∈ C}  Σ_{i=1}^n  (1/2)||x_i − D α_i||_2² + λ||α_i||_1
                                            reconstruction          sparsity

          with n > 100,000.
          Average the reconstructions of each patch.
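The extract/average bookends of this pipeline can be sketched in a few lines of numpy. This is an illustration of the patch bookkeeping only, not the authors' code; the sparse-coding reconstruction of each patch is omitted, so round-tripping unmodified patches recovers the image exactly.

```python
import numpy as np

def extract_patches(img, p=8):
    """Collect all overlapping p x p patches of img as columns of a matrix X."""
    H, W = img.shape
    return np.stack([img[i:i+p, j:j+p].ravel()
                     for i in range(H - p + 1)
                     for j in range(W - p + 1)], axis=1)

def average_patches(X, shape, p=8):
    """Place each (reconstructed) patch back and average the overlapping pixels."""
    H, W = shape
    out = np.zeros(shape)
    counts = np.zeros(shape)
    col = 0
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            out[i:i+p, j:j+p] += X[:, col].reshape(p, p)
            counts[i:i+p, j:j+p] += 1
            col += 1
    return out / counts

img = np.random.rand(16, 16)
X = extract_patches(img, p=8)            # 64 x 81 matrix for a 16 x 16 image
roundtrip = average_patches(X, img.shape, p=8)
print(np.allclose(roundtrip, img))       # averaging unmodified patches recovers the image
```

In the denoising application, each column of X would be replaced by D α_i before averaging.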
The Dictionary Learning Problem
[Mairal, Bach, Ponce, Sapiro & Zisserman (’09)]




                       Denoising result
The Dictionary Learning Problem
[Mairal, Sapiro & Elad (’08)]




                 Image completion example
The Dictionary Learning Problem
What does D look like?
The Dictionary Learning Problem


       min_{α ∈ R^{k×n}, D ∈ C}  Σ_{i=1}^n  (1/2)||x_i − D α_i||_2² + λ||α_i||_1

   C = {D ∈ R^{m×k} s.t. ∀j = 1, ..., k, ||d_j||_2 ≤ 1}.


       Classical optimization alternates between D and α.
       Good results, but very slow!
Online Dictionary Learning



  Classical formulation of dictionary learning
       min_{D ∈ C} f_n(D) = min_{D ∈ C}  (1/n) Σ_{i=1}^n l(x_i, D),

  where

       l(x, D) = min_{α ∈ R^k}  (1/2)||x − D α||_2² + λ||α||_1.
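Evaluating l(x, D) means solving a Lasso problem. The talk's implementation uses LARS; as a minimal self-contained illustration, the sketch below substitutes ISTA (proximal gradient with soft-thresholding), which is a simplification and not the authors' solver.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, D, lam, n_iter=500):
    """ISTA for l(x, D) = min_alpha 0.5*||x - D alpha||_2^2 + lam*||alpha||_1."""
    L = np.linalg.norm(D.T @ D, 2)        # Lipschitz constant of the smooth part
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ alpha - x)
        alpha = soft_threshold(alpha - grad / L, lam / L)
    return alpha

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)            # columns normalized, so D is in C
x = D[:, :3] @ np.array([1.0, -2.0, 0.5])  # a signal built from 3 atoms
alpha = sparse_code(x, D, lam=0.1)
print(np.count_nonzero(alpha))            # the code is sparse
```

ISTA is monotone, so starting from α = 0 the objective never exceeds (1/2)||x||_2².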
Online Dictionary Learning




  Which formulation are we interested in?

     min_{D ∈ C} f(D) = E_x[l(x, D)] ≈ lim_{n→+∞}  (1/n) Σ_{i=1}^n l(x_i, D)
Online Dictionary Learning



  Online learning can
       handle potentially infinite datasets,
       adapt to dynamic training sets,
       be dramatically faster than batch
       algorithms [Bottou & Bousquet (’08)].
Online Dictionary Learning
Proposed approach


     1: for t = 1, ..., T do
     2:    Draw x_t
     3:    Sparse coding:
               α_t ← arg min_{α ∈ R^k}  (1/2)||x_t − D_{t−1} α||_2² + λ||α||_1
     4:    Dictionary update:
               D_t ← arg min_{D ∈ C}  (1/t) Σ_{i=1}^t  (1/2)||x_i − D α_i||_2² + λ||α_i||_1
     5: end for
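A minimal numpy sketch of this loop, under two simplifications: ISTA stands in for LARS in step 3, and the drawn "patches" are random placeholders. Step 4 follows the paper's block-coordinate update, which only needs the sufficient statistics A_t = Σ α_i α_iᵀ and B_t = Σ x_i α_iᵀ rather than the past data itself.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, D, lam, n_iter=200):
    # ISTA stand-in for the LARS step of the talk
    L = np.linalg.norm(D.T @ D, 2)
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        alpha = soft_threshold(alpha - D.T @ (D @ alpha - x) / L, lam / L)
    return alpha

def dictionary_update(D, A, B):
    # Block-coordinate descent on the columns of D, projected onto C
    for j in range(D.shape[1]):
        if A[j, j] < 1e-12:
            continue                      # atom never used yet
        u = (B[:, j] - D @ A[:, j]) / A[j, j] + D[:, j]
        D[:, j] = u / max(1.0, np.linalg.norm(u))
    return D

rng = np.random.default_rng(0)
m, k, lam = 16, 32, 0.1
D = rng.standard_normal((m, k))
D /= np.linalg.norm(D, axis=0)            # warm-started at each step thereafter
A = np.zeros((k, k))                      # running sum of alpha alpha^T
B = np.zeros((m, k))                      # running sum of x alpha^T
for t in range(300):
    x = rng.standard_normal(m)            # placeholder for a drawn patch x_t
    alpha = sparse_code(x, D, lam)
    A += np.outer(alpha, alpha)
    B += np.outer(x, alpha)
    D = dictionary_update(D, A, B)

print(np.max(np.linalg.norm(D, axis=0)))  # columns stay in C: norms <= 1
```

Because the update reuses D_{t−1} as its starting point, it is the warm restart mentioned on the next slide.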
Online Dictionary Learning
Proposed approach




   Implementation details
          Use LARS for the sparse coding step.
          Use a block-coordinate approach for the
          dictionary update, with warm restarts.
          Use mini-batches.
Online Dictionary Learning
Proposed approach


   Which guarantees do we have?
   Under a few reasonable assumptions,
         we build a surrogate function f̂_t of the
         expected cost f verifying

              lim_{t→+∞} f̂_t(D_t) − f(D_t) = 0,

         D_t is asymptotically close to a stationary
         point.
Online Dictionary Learning
Experimental results, batch vs online




                       m = 8 × 8, k = 256
Online Dictionary Learning
Experimental results, batch vs online




                   m = 12 × 12 × 3, k = 512
Online Dictionary Learning
Experimental results, batch vs online




                     m = 16 × 16, k = 1024
Online Dictionary Learning
Experimental results, ODL vs SGD




                     m = 8 × 8, k = 256
Online Dictionary Learning
Experimental results, ODL vs SGD




                 m = 12 × 12 × 3, k = 512
Online Dictionary Learning
Experimental results, ODL vs SGD




                   m = 16 × 16, k = 1024
Online Dictionary Learning
Inpainting a 12-Mpixel photograph
Online Dictionary Learning
Inpainting a 12-Mpixel photograph
Online Dictionary Learning
Inpainting a 12-Mpixel photograph
Online Dictionary Learning
Inpainting a 12-Mpixel photograph
Extension to NMF and sparse PCA

  NMF extension

       min_{α ∈ R^{k×n}, D ∈ C}  Σ_{i=1}^n  (1/2)||x_i − D α_i||_2²   s.t.  α_i ≥ 0,  D ≥ 0.

  SPCA extension

       min_{α ∈ R^{k×n}, D ∈ C}  Σ_{i=1}^n  (1/2)||x_i − D α_i||_2² + λ||α_i||_1

   C = {D ∈ R^{m×k} s.t. ∀j, ||d_j||_2² + γ||d_j||_1 ≤ 1}.
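The NMF extension fits the same alternating scheme with nonnegativity constraints. As a hedged illustration, the batch sketch below alternates projected gradient steps on the codes and the dictionary (step sizes from the Lipschitz constants); it is a simplification for clarity, not the authors' online solver.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 50))                  # 20-dim nonnegative signals, 50 samples
k = 5
D = np.abs(rng.standard_normal((20, k)))
A = np.abs(rng.standard_normal((k, 50)))  # codes alpha_i as columns

def project_columns(D):
    # Projection onto C for NMF: nonnegative columns with l2 norm at most 1
    D = np.maximum(D, 0)
    norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D / norms

for _ in range(200):
    # projected gradient step on the codes (alpha_i >= 0) ...
    step_a = 1.0 / (np.linalg.norm(D.T @ D, 2) + 1e-12)
    A = np.maximum(A - step_a * (D.T @ (D @ A - X)), 0)
    # ... then on the dictionary (D >= 0, columns in the unit ball)
    step_d = 1.0 / (np.linalg.norm(A @ A.T, 2) + 1e-12)
    D = project_columns(D - step_d * ((D @ A - X) @ A.T))

err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
print(err)                                # relative reconstruction error
```

Only the constraint sets changed relative to the dictionary learning problem; the quadratic data-fit term, and hence the whole algorithmic machinery, is unchanged.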
Extension to NMF and sparse PCA
Faces: Extended Yale Database B




         (a) PCA           (b) NNMF   (c) DL
Extension to NMF and sparse PCA
Faces: Extended Yale Database B




   (d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%
Extension to NMF and sparse PCA
Natural Patches




         (a) PCA   (b) NNMF       (c) DL
Extension to NMF and sparse PCA
Natural Patches




   (d) SPCA, τ = 70% (e) SPCA, τ = 30% (f) SPCA, τ = 10%
Conclusion


  Take-home message
      Online techniques are well suited to the
      dictionary learning problem.
      Our method makes some large-scale
      image processing tasks tractable ...
      ... and extends to various matrix
      factorization problems.
