Adaptive Signal
and Image Processing
    Gabriel Peyré
 www.numerical-tours.com
Natural Image Priors

[Figures: Fourier decomposition vs. wavelet decomposition of a natural image.]
Overview


•Sparsity for Compression and Denoising

•Geometric Representations

•L0 vs. L1 Sparsity

•Dictionary Learning
Sparse Approximation in a Basis
Image and Texture Models

Uniformly smooth C^α image.
    Fourier, wavelets: ||f − f_M||² = O(M^−α).

Discontinuous image with bounded variation.
    Wavelets: ||f − f_M||² = O(M^−1).

C²-geometrically regular image.
    Curvelets: ||f − f_M||² = O(log³(M) M^−2).

[Figures: an example image and its gradient magnitude |∇f| for each model.]

More complex images: needs adaptivity.
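These decay rates can be checked numerically. Below is a minimal sketch (not from the slides; it assumes NumPy and PyWavelets are available) that computes the nonlinear M-term wavelet approximation f_M by keeping the M largest-magnitude coefficients:

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def wavelet_M_term(f, M, wavelet="db4"):
    """Nonlinear approximation: keep the M largest-magnitude wavelet coefficients."""
    coeffs = pywt.wavedec2(f, wavelet)
    arr, slices = pywt.coeffs_to_array(coeffs)
    thresh = np.sort(np.abs(arr).ravel())[-M]        # magnitude of the M-th largest coeff.
    arr_M = arr * (np.abs(arr) >= thresh)
    coeffs_M = pywt.array_to_coeffs(arr_M, slices, output_format="wavedec2")
    fM = pywt.waverec2(coeffs_M, wavelet)
    return fM[:f.shape[0], :f.shape[1]]              # crop possible reconstruction padding

# Measuring err(M) = ||f - f_M||^2 for growing M exhibits the decay rates above, e.g.
# errs = [np.sum((f - wavelet_M_term(f, M)) ** 2) for M in (64, 256, 1024, 4096)]
```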
Compression by Transform-coding

Forward transform:      a[m] = ⟨f, ψ_m⟩ ∈ R

Quantization (bin T):   q[m] = sign(a[m]) ⌊|a[m]| / T⌋ ∈ Z

Entropic coding: use statistical redundancy (many 0's).

Decoding, then dequantization:   ã[m] = sign(q[m]) (|q[m]| + 1/2) T

Backward transform:     f_R = Σ_{m ∈ I_T} ã[m] ψ_m

“Theorem:”   ||f − f_M||² = O(M^−α)  ⟹  ||f − f_R||² = O(log^α(R) R^−α)

[Figures: image f, zoom on f, quantized q[m], reconstruction f_R at R = 0.2 bit/pixel.]
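To make the quantization step concrete, here is a small sketch (not part of the slides; plain NumPy, with the transform itself and the entropy coder left out) of the quantizer and its mid-bin dequantizer:

```python
import numpy as np

def quantize(a, T):
    # q[m] = sign(a[m]) * floor(|a[m]| / T): coefficients with |a[m]| < T are set to 0
    return np.sign(a) * np.floor(np.abs(a) / T)

def dequantize(q, T):
    # a~[m] = sign(q[m]) * (|q[m]| + 1/2) * T: middle of the quantization bin
    return np.sign(q) * (np.abs(q) + 0.5) * T

rng = np.random.default_rng(0)
a = rng.laplace(scale=1.0, size=1000)   # toy "compressible" coefficients a[m]
T = 0.5
q = quantize(a, T)
a_tilde = dequantize(q, T)
print("non-zero q[m]:", np.count_nonzero(q), " distortion:", np.sum((a - a_tilde) ** 2))
```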
JPEG-2000 vs. JPEG

JPEG2k: exploit the statistical redundancy of coefficients.
    + embedded coder.
    → chunks of large coefficients.
    → neighboring coefficients are not independent.

[Figures: image f; JPEG compression at R = 0.19 bit/pixel; JPEG2k (EZW-type) compression at R = 0.15 bit/pixel.]
Denoising (Donoho/Johnstone)

        N−1
    f = Σ   ⟨f, ψ_m⟩ ψ_m        thresholding:   f̃ =     Σ      ⟨f, ψ_m⟩ ψ_m
        m=0                                         |⟨f,ψ_m⟩|>T

Theorem: if ||f₀ − f₀,M||² = O(M^−α), then
    E(||f̃ − f₀||²) = O(σ^(2α/(α+1)))   for   T = σ √(2 log N).        In practice: T ≈ 3σ.
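A minimal sketch of this estimator (not from the slides; it assumes PyWavelets, and the wavelet choice and toy image are illustrative): hard thresholding of the detail coefficients at the universal threshold T = σ√(2 log N).

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def denoise_hard(f_noisy, sigma, wavelet="db4"):
    coeffs = pywt.wavedec2(f_noisy, wavelet)
    T = sigma * np.sqrt(2.0 * np.log(f_noisy.size))          # universal threshold
    new_coeffs = [coeffs[0]]                                  # keep the coarse approximation
    for (cH, cV, cD) in coeffs[1:]:                           # hard-threshold the details
        new_coeffs.append(tuple(c * (np.abs(c) > T) for c in (cH, cV, cD)))
    return pywt.waverec2(new_coeffs, wavelet)

rng = np.random.default_rng(0)
f0 = np.zeros((256, 256)); f0[64:192, 64:192] = 1.0          # toy cartoon image
sigma = 0.1
f = f0 + sigma * rng.standard_normal(f0.shape)
f_denoised = denoise_hard(f, sigma)
```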
Overview


•Sparsity for Compression and Denoising

•Geometric Representations

•L0 vs. L1 Sparsity

•Dictionary Learning
Piecewise Regular Functions in 1D

Theorem: If f is C^α outside a finite set of discontinuities:

                     O(M^−1)    (Fourier),
    ||f − f_M||²  =
                     O(M^−2α)   (wavelets).

For Fourier, linear ≈ non-linear, sub-optimal.
For wavelets, linear ≪ non-linear, optimal.
Piecewise Regular Functions in 2D

Theorem: If f is C^α outside a set of finite-length edge curves:

                     O(M^−1/2)   (Fourier),
    ||f − f_M||²  =
                     O(M^−1)     (wavelets).

Fourier ≪ Wavelets.
Wavelets: same result for BV functions (optimal).
Regular C² edges: sub-optimal (requires anisotropy).
Geometrically Regular Images

Geometric image model: f is C^α outside a set of C^α edge curves.
    BV image: level sets have finite lengths.
    Geometric image: level sets are regular.

[Figures: |∇f|; geometry = cartoon image; sharp edges; smoothed edges.]
Curvelets for Cartoon Images

Curvelets: [Candès, Donoho] [Candès, Demanet, Ying, Donoho]

Redundant tight frame (redundancy 5): not efficient for compression.
Denoising by curvelet thresholding: recovers edges and geometric textures.
Approximation with Triangulation

                        Vertices V = {v_i}_{i=1}^M.
Triangulation (V, F):
                        Faces F ⊂ {1, ..., M}³.

Piecewise linear approximation:   f_M = Σ_{m=1}^M λ_m φ_m,   λ = argmin_μ ||f − Σ_m μ_m φ_m||,
    where φ_m(v_i) = δ_i^m and φ_m is affine on each face of F.

Theorem: there exists (V, F) such that
    ||f − f_M||² ≤ C_f M^−2.
        Regular areas: M/2 equilateral triangles (width ∼ M^−1/2).
        Singular areas: M/2 anisotropic triangles.

Optimal (V, F): NP-hard.
Provably good greedy schemes: [Mirebeau, Cohen, 2009]
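The piecewise linear reconstruction over a given triangulation can be sketched with SciPy. This is only a toy illustration under stated assumptions: the vertices are drawn at random and λ_m is set by interpolating f at the vertices rather than by the least-squares fit, so it is not the optimal or greedy (V, F) of the slides.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

def triangulation_approx(f, M, rng=np.random.default_rng(0)):
    n0, n1 = f.shape
    corners = np.array([[0, 0], [0, n1 - 1], [n0 - 1, 0], [n0 - 1, n1 - 1]])
    rand = np.column_stack([rng.integers(0, n0, M), rng.integers(0, n1, M)])
    V = np.unique(np.vstack([corners, rand]), axis=0).astype(float)    # vertices V
    tri = Delaunay(V)                                                  # faces F
    vals = f[V[:, 0].astype(int), V[:, 1].astype(int)]                 # lambda_m = f(v_m)
    interp = LinearNDInterpolator(tri, vals)                           # affine on each face
    ii, jj = np.meshgrid(np.arange(n0), np.arange(n1), indexing="ij")
    fM = interp(np.column_stack([ii.ravel(), jj.ravel()])).reshape(n0, n1)
    return fM, np.sum((f - fM) ** 2)                                   # ||f - f_M||^2
```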
Greedy Triangulation Optimization

[Bougleux, Peyré, Cohen, ICCV'09]

[Figures: anisotropic triangulations with M = 200 and M = 600 vertices vs. JPEG2000.]
Geometric Multiscale Processing

Image f ∈ R^N
    → wavelet transform →
Wavelet coefficients   f_j[n] = ⟨f, ψ_{j,n}⟩      (scale 2^j)
    → geometric transform →
Geometric coefficients   b[j, ℓ, n] = ⟨f_j, b_{ℓ,n}⟩      (scale 2^ℓ)

C^α cartoon image:
    ||f − f_M||² = O(M^−α),   with M = M_band + M_geom.
Overview


•Sparsity for Compression and Denoising

•Geometric Representations

•L0 vs. L1 Sparsity

•Dictionary Learning
Image Representation

Dictionary D = {d_m}_{m=0}^{Q−1} of atoms d_m ∈ R^N.

Image decomposition:    f = Σ_{m=0}^{Q−1} x_m d_m = D x
Image approximation:    f ≈ D x

Orthogonal dictionary: N = Q,
    x_m = ⟨f, d_m⟩.

Redundant dictionary: N ≤ Q.
    Examples: TI wavelets, curvelets, ...
    → x is not unique.

[Figure: f = D x, with atoms d_m as columns of D and coefficients x_m.]
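A small NumPy sketch (illustrative sizes, not from the slides) of the two regimes: in an orthogonal dictionary the coefficients are inner products, while in a redundant one the representation is not unique.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
f = rng.standard_normal(N)

# Orthogonal dictionary (N = Q): coefficients are the inner products x_m = <f, d_m>.
D_ortho, _ = np.linalg.qr(rng.standard_normal((N, N)))
x = D_ortho.T @ f
assert np.allclose(D_ortho @ x, f)

# Redundant dictionary (Q > N): f = D x has infinitely many solutions; the
# minimum-norm one (pseudo-inverse) is only one of them, and is not sparse.
Q = 2 * N
D_red = rng.standard_normal((N, Q))
x_min_norm = np.linalg.pinv(D_red) @ f
assert np.allclose(D_red @ x_min_norm, f)
```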
Sparsity

Decomposition:   f = Σ_{m=0}^{Q−1} x_m d_m = D x

Sparsity: most x_m are small.
    Example: wavelet transform.

Ideal sparsity: most x_m are zero.
    J₀(x) = # {m ; x_m ≠ 0}

Approximate sparsity: compressibility.
    ||f − D x|| is small with J₀(x) ≤ M.

[Figures: image f and its wavelet coefficients x.]
Sparse Coding

Redundant dictionary D = {d_m}_{m=0}^{Q−1}, Q ≥ N.
    → non-unique representation f = D x.

Sparsest decomposition:     min_{f = D x} J₀(x)

Sparsest approximation:     min_x (1/2) ||f − D x||² + λ J₀(x)

Equivalence (for suitable λ, M, ε):
    min_{J₀(x) ≤ M} ||f − D x||        ⟷        min_{||f − D x|| ≤ ε} J₀(x)

Ortho-basis D:
    x_m = ⟨f, d_m⟩ if |⟨f, d_m⟩| > √(2λ),       ⟷   pick the M largest coefficients
    x_m = 0 otherwise.                               in {⟨f, d_m⟩}_m

General redundant dictionary: NP-hard.
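In the ortho-basis case this equivalence is easy to check numerically; a hedged sketch (toy sizes, plain NumPy, not from the slides) comparing hard thresholding at √(2λ) with keeping the M largest coefficients:

```python
import numpy as np

def sparse_approx_ortho(f, D, lam):
    a = D.T @ f                                    # coefficients <f, d_m>
    return a * (np.abs(a) > np.sqrt(2.0 * lam))    # hard thresholding at sqrt(2*lam)

def best_M_terms(f, D, M):
    a = D.T @ f
    idx = np.argsort(np.abs(a))[::-1][:M]          # indices of the M largest |<f, d_m>|
    x = np.zeros_like(a)
    x[idx] = a[idx]
    return x

rng = np.random.default_rng(1)
D, _ = np.linalg.qr(rng.standard_normal((128, 128)))       # random orthonormal basis
f = D[:, :5] @ rng.standard_normal(5) + 0.01 * rng.standard_normal(128)
x_hard = sparse_approx_ortho(f, D, lam=0.005)
x_M = best_M_terms(f, D, M=int(np.count_nonzero(x_hard)))
print(np.count_nonzero(x_hard), np.allclose(x_hard, x_M))  # same support, same solution
```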
Convex Relaxation: L1 Prior

    J₀(x) = # {m ; x_m ≠ 0}

Image with 2 pixels (axes d₀, d₁):
    J₀(x) = 0   →  null image.
    J₀(x) = 1   →  sparse image.
    J₀(x) = 2   →  non-sparse image.

ℓ^q priors:    J_q(x) = Σ_m |x_m|^q       (convex for q ≥ 1)
    [Figure: unit balls of J_q for q = 0, 1/2, 1, 3/2, 2.]

Sparse ℓ¹ prior:    J₁(x) = ||x||₁ = Σ_m |x_m|
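A tiny two-pixel example (illustrative, not from the slides) of why the ℓ¹ ball of the figure promotes sparsity: along a single linear constraint, the minimum-ℓ² solution is dense while the minimum-ℓ¹ solution sits on an axis.

```python
import numpy as np

d = np.array([2.0, 1.0])            # a single constraint <d, x> = f with two unknowns
f = 1.0

# minimum-L2 solution (pseudo-inverse): both coordinates are non-zero
x_l2 = d * f / np.dot(d, d)

# minimum-L1 solution: all the mass goes to the coordinate with the largest |d_m|
m = np.argmax(np.abs(d))
x_l1 = np.zeros(2)
x_l1[m] = f / d[m]

print("L2:", x_l2, "J0 =", np.count_nonzero(x_l2))   # dense
print("L1:", x_l1, "J0 =", np.count_nonzero(x_l1))   # sparse
```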
Inverse Problems

Recover f₀ from noisy measurements y = Φ f₀ + w.

Denoising/approximation: Φ = Id.
Examples: inpainting, super-resolution, compressed sensing.
Regularized Inversion

Denoising/compression: y = f₀ + w ∈ R^N.
    Sparse approximation: f = D x where
        x ∈ argmin_x (1/2) ||y − D x||² + λ ||x||₁
                        (fidelity)

Inverse problems: y = Φ f₀ + w ∈ R^P.   →  replace D by Φ D:
        x ∈ argmin_x (1/2) ||y − Φ D x||² + λ ||x||₁

Numerical solvers: proximal splitting schemes.
    www.numerical-tours.com
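One possible proximal splitting solver is the forward-backward (ISTA) iteration; a minimal sketch on a toy problem (plain NumPy; step size, iteration count, and the inpainting-style mask are illustrative assumptions, not from the slides):

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    tau = 1.0 / np.linalg.norm(A, ord=2) ** 2          # step size <= 1 / ||A||^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                        # gradient of the fidelity term
        x = soft_threshold(x - tau * grad, tau * lam)   # prox of lam * ||.||_1
    return x

# toy inpainting-like problem: Phi masks half of the pixels, D is a random dictionary
rng = np.random.default_rng(0)
N, Q = 100, 200
D = rng.standard_normal((N, Q)) / np.sqrt(N)
Phi = np.diag((rng.random(N) > 0.5).astype(float))
x0 = np.zeros(Q)
x0[rng.choice(Q, 5, replace=False)] = rng.standard_normal(5)
y = Phi @ D @ x0 + 0.01 * rng.standard_normal(N)
x_hat = ista(Phi @ D, y, lam=0.02)
```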
Inpainting Results
Overview


•Sparsity for Compression and Denoising

•Geometric Representations

•L0 vs. L1 Sparsity

•Dictionary Learning
Dictionary Learning: MAP Energy

Set of (noisy) exemplars {y_k}_k.

Sparse approximation (dictionary learning):
    min_{D ∈ C}  min_{(x_k)_k}  Σ_k (1/2) ||y_k − D x_k||² + λ ||x_k||₁

Constraint:   C = { D = (d_m)_m ;  ∀ m, ||d_m|| ≤ 1 }
    Otherwise: D → +∞, X → 0.

Matrix formulation:
    min_{X ∈ R^{Q×K}, D ∈ C ⊂ R^{N×Q}}   f(X, D) = (1/2) ||Y − D X||² + λ ||X||₁

    Convex with respect to X.
    Convex with respect to D.
    Non-convex with respect to (X, D).   →  Local minima.
Dictionary Learning: Algorithm

Step 1: ∀ k, minimization on x_k:
    min_{x_k} (1/2) ||y_k − D x_k||² + λ ||x_k||₁
    → Convex sparse coding.

Step 2: minimization on D:
    min_{D ∈ C} ||Y − D X||²
    → Convex constrained minimization.
    Projected gradient descent:
        D^(ℓ+1) = Proj_C( D^(ℓ) − τ (D^(ℓ) X − Y) X^⊤ )

Convergence: toward a stationary point of f(X, D).

[Figures: dictionary D at initialization and at convergence.]
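A compact sketch of this alternating scheme (plain NumPy; ISTA is used for the sparse coding step and projected gradient for the dictionary step; all sizes, step sizes, and iteration counts are illustrative assumptions, not the slides' exact settings):

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(Y, D, lam, n_iter=50):
    # Step 1: ISTA iterations on every column of Y (convex sparse coding)
    tau = 1.0 / np.linalg.norm(D, ord=2) ** 2
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        X = soft_threshold(X - tau * D.T @ (D @ X - Y), tau * lam)
    return X

def dict_update(Y, D, X, n_iter=20):
    # Step 2: projected gradient descent on D with C = {D : ||d_m|| <= 1}
    tau = 1.0 / (np.linalg.norm(X, ord=2) ** 2 + 1e-10)
    for _ in range(n_iter):
        D = D - tau * (D @ X - Y) @ X.T
        D = D / np.maximum(np.linalg.norm(D, axis=0), 1.0)   # Proj_C: rescale long atoms
    return D

def learn_dictionary(Y, Q, lam=0.1, n_outer=30, rng=np.random.default_rng(0)):
    D = rng.standard_normal((Y.shape[0], Q))
    D = D / np.linalg.norm(D, axis=0)
    for _ in range(n_outer):
        X = sparse_code(Y, D, lam)     # Step 1
        D = dict_update(Y, D, X)       # Step 2
    return D, X
```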
Patch-based Learning

Exemplar patches y_k   →  learning D  →   Dictionary D      [Olshausen, Fields 1997]

State-of-the-art denoising [Elad et al. 2006]
Sparse texture synthesis, inpainting [Peyré 2008]
Comparison with PCA

PCA dimensionality reduction:
    ∀ k,   min_D ||Y − D^(k) X||,      D^(k) = (d_m)_{m=0}^{k−1}

Linear (PCA): Fourier-like atoms.
Sparse (learning): Gabor-like atoms.

[Figures: DCT and PCA (KLT) atoms vs. Gabor and learned atoms; reproduced from
Rubinstein et al., "Dictionaries for Sparse Representation": a few 12×12 DCT atoms and
the first 40 KLT atoms trained on 12×12 image patches from Lena.]
Patch-based Denoising

Noisy image: f = f₀ + w.

Step 1: extract patches.        y_k(·) = f(z_k + ·)
Step 2: dictionary learning.    min_{D, (x_k)_k}  Σ_k (1/2) ||y_k − D x_k||² + λ ||x_k||₁
Step 3: patch averaging.        ỹ_k = D x_k,      f̃(·) ∝ Σ_k ỹ_k(· − z_k)

[Aharon & Elad 2006]
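A hedged sketch of the three steps on overlapping patches (plain NumPy; the dictionary D is assumed given, e.g. pre-learned with the scheme above rather than re-learned on the noisy patches, and an ISTA-based sparse coder is re-declared so the snippet is self-contained):

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(Y, D, lam, n_iter=50):
    tau = 1.0 / np.linalg.norm(D, ord=2) ** 2
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        X = soft_threshold(X - tau * D.T @ (D @ X - Y), tau * lam)
    return X

def extract_patches(f, p, step):
    # Step 1: overlapping p x p patches y_k(.) = f(z_k + .)
    n0, n1 = f.shape
    pos = [(i, j) for i in range(0, n0 - p + 1, step) for j in range(0, n1 - p + 1, step)]
    Y = np.stack([f[i:i + p, j:j + p].ravel() for (i, j) in pos], axis=1)
    return Y, pos

def denoise_patches(f, D, lam, p=8, step=4):
    Y, pos = extract_patches(f, p, step)
    X = sparse_code(Y, D, lam)                        # Step 2 with a fixed, pre-learned D
    Y_hat = D @ X                                     # denoised patches y~_k = D x_k
    acc = np.zeros_like(f)
    weight = np.zeros_like(f)
    for k, (i, j) in enumerate(pos):                  # Step 3: average overlapping patches
        acc[i:i + p, j:j + p] += Y_hat[:, k].reshape(p, p)
        weight[i:i + p, j:j + p] += 1.0
    return acc / np.maximum(weight, 1.0)
```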
Learning with Missing Data

Inverse problem: y = Φ f₀ + w.

    min_{f, (x_k)_k, D ∈ C}   (1/2) ||y − Φ f||² + λ Σ_k ( (1/2) ||p_k(f) − D x_k||² + μ ||x_k||₁ )

Patch extractor:   p_k(f) = f(z_k + ·)

Step 1: ∀ k, minimization on x_k.   → Convex sparse coding.
Step 2: minimization on D.          → Quadratic, constrained.
Step 3: minimization on f.          → Quadratic.

[Figures: original image f₀ and damaged observations y.]
Inpainting Example

[Figures: image f₀; observations y = Φ f₀ + w; regularized reconstruction. From Mairal
et al.: inpainting with a learned adaptive dictionary, 75% of the data removed, initial
PSNR 6.13dB, restored PSNR up to 33.97dB.]

[Mairal et al. 2008]
Adaptive Inpainting and Separation

[Figures: inpainting and separation results with wavelets, local DCT, and a learned dictionary.]

[Peyré, Fadili, Starck 2010]
OISING ALGORITHM WITH 256 ATOMS OF SIZE 7 7 3 FOR
        TS ARE THOSE GIVEN BY MCAULEY AND AL [28] WITH THEIR “3
         D DICTIONARY. THE BOTTOM-RIGHT ARE THE IMPROVEMENTS OBTAINED WITH THE ADAPTIVE APPRO




                                                                                                                                                                             SENTATION FOR COLOR IMAGE RESTORATION
         EST RESULTS FOR EACH GROUP. AS CAN BE SEEN, OUR PROPOSED TECHNIQUE CONSISTENTLY PRODUC

         K-SVD ALGORITHM [2] ON EACH CHANNEL SEPARATELY WITH 8 8 ATOMS. THE BOTTOM-LEFT AR
 s are reduced with our proposed technique (
                                                                                                              Higher Dimensional Learning
mples of color artifacts while reconstructing a damaged version of the image (a) without the improvement here proposed (             in the new metric).
                                                         in our proposed new metric). Both images have been denoised with the same global dictionary.
bserves a bias effect in the color from the castle and in some part of the water. What is more, the color of the sky is piecewise constant when MAIRAL
 rs), which is another artifact our approach corrected. (a) Original. (b) Original algorithm,
                  dB.
                                                                                                                            dB. (c) Proposed algorithm,
                                                                                                                                                                                                                     et al.: SPARSE REPRESENTATION FOR COLOR IMAGE RESTORATION

naries with 256 atoms learned on a generic database of natural images, with two different sizes of patches. Note the large number of color-less atoms.
 can have negative values, the vectors are presented scaled and shifted to the [0,255] range per channel: (a) 5 5 3 patches; (b) 8 8 3 patches.


OLOR IMAGE RESTORATION                                                                                                                                                                                                                                                      61


                                                                                                              Fig. 7. Data set used for evaluating denoising experiments.
                                                                                                                                                                            Learning D

 (a) Training Image; (b) resulting dictionary; (b) is the dictionary learned in the image in (a). The dictionary is more colored than the global one.
                                                                                                    TABLE I




    Fig. 7. Data set used for evaluating denoising experiments.


 les of color artifacts while reconstructing a damaged version of the image (a) without the improvement here proposedatoms learned on new metric).
                                                                                             Fig. 2. Dictionaries with 256 (       in the a generic database of natural images, with two d
are reduced with our proposed technique ( TABLE I our proposed new metric). Both images have been denoised negative values, the vectors are presented scaled and shifted to the [0,2
                                                     in
                                                                                             Since the atoms can have
                                                                                                                      with the same global dictionary.
ervesITH 256 ATOMS OF SIZE castle and in3 FOR of the water. What is more, the color of the sky is . EACH constant IS DIVIDED IN FOUR
  W a bias effect in the color from the 7 7 some part                      AND 6        6 3 FOR                       piecewise CASE when
EN BY is another artifact our approach corrected. (a)HEIR “3(b) Original algorithm, HE TOP-RIGHT RESULTS ARE THOSE OBTAINED BY
), which
          MCAULEY AND AL [28] WITH T Original.                          3 MODEL.” T                                       dB. (c) Proposed algorithm,
                                                                   3 MODEL.” THE TOP-RIGHT RESUL




                 dB.
                                                                     AND 6




M [2] ON EACH CHANNEL SEPARATELY WITH 8    8 ATOMS. THE BOTTOM-LEFT ARE OUR RESULTS OBTAINED
E BOTTOM-RIGHT ARE THE IMPROVEMENTS OBTAINED WITH THE ADAPTIVE APPROACH WITH 20 ITERATIONS.
EACH GROUP. AS CAN BE SEEN, OUR PROPOSED TECHNIQUE CONSISTENTLY PRODUCES THE BEST RESULTS
                                                                            6 3 FOR         . EAC
OISING ALGORITHM WITH 256 ATOMS OF SIZE 7 7 3 FOR
        TS ARE THOSE GIVEN BY MCAULEY AND AL [28] WITH THEIR “3
         D DICTIONARY. THE BOTTOM-RIGHT ARE THE IMPROVEMENTS OBTAINED WITH THE ADAPTIVE APPRO




                                                                                                                                                                             SENTATION FOR COLOR IMAGE RESTORATION
         EST RESULTS FOR EACH GROUP. AS CAN BE SEEN, OUR PROPOSED TECHNIQUE CONSISTENTLY PRODUC

         K-SVD ALGORITHM [2] ON EACH CHANNEL SEPARATELY WITH 8 8 ATOMS. THE BOTTOM-LEFT AR
                                                                                                              Higher Dimensional Learning
                                                                                               O NLINE L EARNING FOR M ATRIX FACTORIZATION AND FACTORIZATION AND S PARSE C ODING
                                                                                                                  O NLINE L EARNING FOR M ATRIX S PARSE C ODING
mples of color artifacts while reconstructing a damaged version of the image (a) without the improvement here proposed (
 s are reduced with our proposed technique (
                                                                                                                                     in the new metric).
[Mairal et al.: Sparse Representation for Color Image Restoration, excerpted figure and table captions]

Examples of color artifacts while reconstructing a damaged version of the image: (a) without the improvement proposed here (... in the proposed new metric); the artifacts are reduced with the proposed technique (... in the proposed new metric). Both images have been denoised with the same global dictionary. One observes a bias effect in the color of the castle and in some parts of the water. What is more, the color of the sky is piecewise constant without the correction, which is another artifact the approach corrects. (a) Original. (b) Original algorithm, ... dB. (c) Proposed algorithm, ... dB.

Fig. 2. Dictionaries with 256 atoms learned on a generic database of natural images, with two different sizes of patches. Note the large number of color-less atoms. Since the atoms can have negative values, the vectors are presented scaled and shifted to the [0, 255] range per channel: (a) 5 × 5 × 3 patches; (b) 8 × 8 × 3 patches.

Fig. 7. Data set used for evaluating denoising experiments.

Learning D: (a) training image; (b) resulting dictionary, learned on the image in (a). The dictionary is more colored than the global one.

Table I. Denoising results with 256 atoms of size 7 × 7 × 3 and 6 × 6 × 3 for ... . Each case is divided in four: the top-left results are those given by McAuley et al. [28] with their "3 × 3 model"; the top-right results are those obtained by [2] on each channel separately with 8 × 8 atoms; the bottom-left are our results; the bottom-right are the improvements obtained with the adaptive approach with 20 iterations. As can be seen, the proposed technique consistently produces the best results.

Inpainting
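The excerpt above compares denoising results obtained by sparse-coding every image patch in a learned global dictionary. As a rough illustration of that pipeline (a minimal sketch, not the authors' color-aware algorithm: it works on a grayscale float image, uses a plain greedy OMP, and names such as denoise_patches are illustrative):

import numpy as np

def omp(D, x, n_atoms):
    # Greedy orthogonal matching pursuit: approximate x with n_atoms columns of D
    # (the columns of D are assumed to have unit norm).
    residual, support = x.copy(), []
    for _ in range(n_atoms):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    alpha = np.zeros(D.shape[1])
    alpha[support] = coeffs
    return alpha

def denoise_patches(noisy, D, patch=8, n_atoms=4):
    # Sparse-code every overlapping patch in the dictionary D and average the
    # overlapping reconstructions (noisy is a 2-D float array).
    H, W = noisy.shape
    out, weight = np.zeros((H, W)), np.zeros((H, W))
    for i in range(H - patch + 1):
        for j in range(W - patch + 1):
            p = noisy[i:i+patch, j:j+patch].ravel()
            mean = p.mean()                          # code the zero-mean patch
            rec = D @ omp(D, p - mean, n_atoms) + mean
            out[i:i+patch, j:j+patch] += rec.reshape(patch, patch)
            weight[i:i+patch, j:j+patch] += 1.0
    return out / weight

With a dictionary learned offline (globally, or on the image itself as in the "Learning D" slide), denoise_patches(noisy, D) gives the kind of global-dictionary baseline that the adapted, more colored dictionary is compared against.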
Movie Inpainting
Facial Image Compression                                        [Elad et al. 2009]

Image registration.

(Excerpt from O. Bryt, M. Elad, J. Vis. Commun. Image R. 19 (2008) 270–282.)

... show recognizable faces. We use a database containing around 6000 such facial images, some of which are used for training and tuning the algorithm, and the others for testing it, similar to the approach taken in [17].

In our work we propose a novel compression algorithm, related to the one presented in [17], improving over it.

Our algorithm relies strongly on recent advancements made in using sparse and redundant representations of signals [18–26], and learning their sparsifying dictionaries [27–29]. We use the K-SVD algorithm for learning the dictionaries for representing small image patches in a locally adaptive way, and use these to sparse-code the patches' content. This is a relatively simple and straightforward algorithm with hardly any entropy coding stage. Yet, it is shown to be superior to several competing algorithms: (i) JPEG2000, (ii) the VQ-based algorithm presented in [17], and (iii) a Principal Component Analysis (PCA) approach.

In the next section we provide some background material for this work: we start by presenting the details of the compression algorithm developed in [17], as their scheme is the one we embark from in the development of ours. We also describe the topic of sparse and redundant representations and the K-SVD, which are the foundations for our algorithm. In Section 3 we turn to present the proposed algorithm in detail, showing its various steps and discussing its computational/memory complexity. Section 4 presents results of our method, demonstrating the claimed superiority. We conclude in Section 5 with a list of future activities that can further improve over the proposed scheme.

2. Background material

2.1. VQ-based image compression

Among the thousands of papers that study still image compression algorithms, there are relatively few that consider the treatment of facial images [2–17]. Among those, the most recent and the best performing algorithm is the one reported in [17]. That paper also provides a thorough literature survey that compares the various methods and discusses similarities and differences between them. Therefore, rather than repeating such a survey here, we refer the interested reader to [17]. In this sub-section we concentrate on the description of the algorithm in [17], as our method resembles it to some extent.

Fig. 1. (Left) Piece-wise affine warping of the image by triangulation. (Right) A uniform slicing to disjoint square patches for coding purposes.

(... a VQ, i.e., K-Means) per each patch separately, using patches taken from the same location from 5000 training images. This way, each VQ is adapted to the expected local content, and thus the high performance presented by this algorithm. The number of code-words in the VQ is a function of the bit-allocation for the patches. As we argue in the next section, VQ coding is limited by the available number of examples and the desired rate, forcing relatively small patch sizes. This, in turn, leads to a loss of some redundancy between adjacent patches, and thus loss of potential compression.

Another ingredient in this algorithm that partly compensates for the above-described shortcoming is a multi-scale coding scheme. The image is scaled down and VQ-coded using patches of size 8 × 8. Then it is interpolated back to the original resolution, and the residual is coded using VQ on 8 × 8 pixel patches once again. This method can be applied on a Laplacian pyramid of the original (warped) image with several scales [33].

As already mentioned above, the results shown in [17] surpass those obtained by JPEG2000, both visually and in Peak-Signal-to-Noise Ratio (PSNR) quantitative comparisons. In our work we propose to replace the coding stage from VQ to sparse and redundant representations; this leads us to the next subsection, where we describe the principles behind this coding strategy.

2.2. Sparse and redundant representations
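Section 2.1 above describes the VQ scheme of [17]: a separate K-Means codebook is trained for each patch location from co-located patches of a few thousand aligned training faces, and each disjoint patch is then coded by the index of its nearest codeword. The following sketch illustrates that idea under simplifying assumptions (aligned grayscale images stacked in a numpy array, image dimensions that are multiples of the patch size, no bit-allocation or multi-scale residual coding); the function names are illustrative, not the authors' code.

import numpy as np
from sklearn.cluster import KMeans

def train_per_location_codebooks(train_images, patch=8, n_codewords=64):
    # One K-Means codebook per patch location, trained on co-located patches
    # taken from the (geometrically aligned) training images.
    n, H, W = train_images.shape
    codebooks = {}
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            X = train_images[:, i:i+patch, j:j+patch].reshape(n, -1)
            codebooks[(i, j)] = KMeans(n_clusters=n_codewords, n_init=4).fit(X)
    return codebooks

def vq_encode(image, codebooks, patch=8):
    # Replace each disjoint patch by the index of its nearest codeword.
    return {(i, j): int(km.predict(image[i:i+patch, j:j+patch].reshape(1, -1))[0])
            for (i, j), km in codebooks.items()}

def vq_decode(indices, codebooks, shape, patch=8):
    # Paste the selected codewords back at their locations.
    out = np.zeros(shape)
    for (i, j), idx in indices.items():
        out[i:i+patch, j:j+patch] = codebooks[(i, j)].cluster_centers_[idx].reshape(patch, patch)
    return out

Each coded patch then costs about log2(n_codewords) bits before any entropy coding, which is why the target rate dictates the codebook size; the multi-scale residual refinement on a Laplacian pyramid mentioned above is left out of this sketch.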
Facial Image Compression                                        [Elad et al. 2009]

Image registration.
Non-overlapping patches (fk)k.

(The slide repeats the Bryt & Elad excerpt reproduced above.)
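The slide adds the annotation "Non-overlapping patches (fk)k": after registration, the face image is sliced into disjoint square patches that are coded independently. A minimal sketch of that slicing (assuming a numpy image whose dimensions are multiples of the patch size; the 15-pixel default mirrors the atom size quoted below, and the names are illustrative):

import numpy as np

def split_into_patches(image, patch=15):
    # Slice a registered image into the non-overlapping patches f_k, in raster order.
    H, W = image.shape
    return [image[i:i+patch, j:j+patch]
            for i in range(0, H, patch)
            for j in range(0, W, patch)]

def assemble_patches(patches, shape, patch=15):
    # Inverse operation: put the patches back at their locations.
    out = np.zeros(shape)
    k = 0
    for i in range(0, shape[0], patch):
        for j in range(0, shape[1], patch):
            out[i:i+patch, j:j+patch] = patches[k]
            k += 1
    return out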
Facial Image Compression                                        [Elad et al. 2009]

Image registration.
Non-overlapping patches (fk)k.
Dictionary learning (Dk)k.

(Results excerpt from the same paper, O. Bryt, M. Elad, J. Vis. Commun. Image R. 19 (2008) 270–282.)

Before turning to present the results we should add the following: while all the results shown here refer to the specific database we operate on, the overall scheme proposed is general and should apply to other face image databases just as well. Naturally, some changes in the parameters might be necessary, and among those, the patch size is the most important to consider. We also note that as one shifts from one source of images to another, the relative size of the background in the photos may vary, and this necessarily leads to changes in performance. More specifically, when the background regions are larger (e.g., the images we use here have relatively small such regions), the compression performance is expected to improve.

4.1. K-SVD dictionaries

The primary stopping condition for the training process was set to be a limitation on the maximal number of K-SVD iterations (being 100). A secondary stopping condition was a limitation on the minimal representation error. In the image compression stage we added a limitation on the maximal number of atoms per patch. These conditions were used to allow us to better control the rates of the resulting images and the overall simulation time. Every obtained dictionary contains 512 patches of size 15 × 15 pixels as atoms. In Fig. 6 we can see the dictionary that was trained for patch number 80 (the left eye) ... and similarly, in Fig. 7 we can see the dictionary that was trained for patch number 87 (the right ...). It can be seen that both dictionaries contain images similar in nature to the image patches they were trained for; a similar behavior was observed ...

4.2. Reconstructed images

Our coding strategy allows us to learn which parts of the image are more difficult than others to code, by assigning the same representation error threshold to all patches and observing how many atoms are required for the representation of each patch on average. Patches that require a small number of allocated atoms are simpler than others. We would expect that the representation of areas of the image such as the background, ... and maybe parts of the clothes will be simpler than the representation of areas containing high-frequency elements such as the hair or the eyes. Fig. 8 shows maps of atom allocation ... and representation error (RMSE, the square root of the mean squared error) per patch for the images in ... at different bit-rates. It can be seen that more atoms were allocated to patches containing the facial details (...).

Fig. 6. The dictionary obtained by K-SVD for patch No. 80 (the left eye) using the OMP method with L = 4.
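Section 4.1 above trains, for each patch location, a 512-atom K-SVD dictionary (at most 100 iterations, with a secondary stop on the representation error) and sparse-codes the patches with OMP using L = 4 atoms. The sketch below is a simplified version of that training loop, not the authors' implementation: the training patches are assumed to be vectorized as the columns of a matrix (with at least as many examples as atoms), and scikit-learn's orthogonal_mp is used for the coding step.

import numpy as np
from sklearn.linear_model import orthogonal_mp

def ksvd(patches, n_atoms=512, sparsity=4, n_iter=100, tol=1e-3, seed=0):
    # Simplified K-SVD: alternate OMP sparse coding (sparsity nonzeros per patch)
    # with a rank-1 SVD update of each atom. patches has shape (n_pixels, n_examples).
    rng = np.random.default_rng(seed)
    # Initialize the dictionary with random training patches, normalized to unit norm.
    D = patches[:, rng.choice(patches.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_iter):
        A = orthogonal_mp(D, patches, n_nonzero_coefs=sparsity)   # sparse codes
        for k in range(n_atoms):
            used = np.nonzero(A[k])[0]
            if used.size == 0:
                continue
            # Residual of the patches that use atom k, with atom k's contribution removed.
            E = patches[:, used] - D @ A[:, used] + np.outer(D[:, k], A[k, used])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            A[k, used] = s[0] * Vt[0]
        if np.linalg.norm(patches - D @ A) < tol * np.linalg.norm(patches):
            break                                  # secondary stop: small residual error
    return D, A

In the compression scheme described above, one such dictionary D_k is learned per patch location from co-located training patches, and each test patch is then stored as the indices and quantized values of its (at most L) nonzero coefficients.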
Clinical Pharmacy Introduction to Clinical Pharmacy, Concept of clinical pptx
 
M-2- General Reactions of amino acids.pptx
M-2- General Reactions of amino acids.pptxM-2- General Reactions of amino acids.pptx
M-2- General Reactions of amino acids.pptx
 
Presentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a ParagraphPresentation on the Basics of Writing. Writing a Paragraph
Presentation on the Basics of Writing. Writing a Paragraph
 
How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17How to Add a many2many Relational Field in Odoo 17
How to Add a many2many Relational Field in Odoo 17
 
CapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptxCapTechU Doctoral Presentation -March 2024 slides.pptx
CapTechU Doctoral Presentation -March 2024 slides.pptx
 
Patterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptxPatterns of Written Texts Across Disciplines.pptx
Patterns of Written Texts Across Disciplines.pptx
 
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfP4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
 
Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdfMaximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
Maximizing Impact_ Nonprofit Website Planning, Budgeting, and Design.pdf
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17
 
How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17
 

Adaptive Signal and Image Processing

  • 1. Adaptive Signal and Image Processing Gabriel Peyré www.numerical-tours.com
  • 2. Natural Image Priors Fourier decomposition Wavelet decomposition
  • 3. Natural Image Priors Fourier decomposition Wavelet decomposition
  • 4. Overview •Sparsity for Compression and Denoising •Geometric Representations •L0 vs. L1 Sparsity •Dictionary Learning
  • 8-12. Image and Texture Models. Uniformly smooth $C^\alpha$ image: Fourier and wavelets give $\|f - f_M\|^2 = O(M^{-\alpha})$. Discontinuous image with bounded variation: wavelets give $\|f - f_M\|^2 = O(M^{-1})$. $C^2$-geometrically regular image: curvelets give $\|f - f_M\|^2 = O(\log^3(M)\, M^{-2})$. More complex images need adaptivity.
  • 13-19. Compression by Transform-coding. Forward transform: $a[m] = \langle f, \psi_m \rangle \in \mathbb{R}$. Quantization with bin size $T$: $q[m] = \mathrm{sign}(a[m]) \lfloor |a[m]| / T \rfloor \in \mathbb{Z}$. Entropic coding and decoding: exploit statistical redundancy (many 0's). Dequantization: $\tilde a[m] = \mathrm{sign}(q[m]) \big(|q[m]| + \tfrac12\big) T$. Backward transform: $f_R = \sum_{m \in I_T} \tilde a[m]\, \psi_m$. "Theorem": $\|f - f_M\|^2 = O(M^{-\alpha}) \Rightarrow \|f - f_R\|^2 = O(\log^\alpha(R)\, R^{-\alpha})$. Figures: image $f$, zoom on $f$, quantized $q[m]$, $f_R$ at $R = 0.2$ bit/pixel.
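As a concrete illustration of the quantization and dequantization formulas above, here is a minimal numpy sketch; the function names and the convention that $q[m] = 0$ is reconstructed as $0$ are my own, not from the slides.

    import numpy as np

    def quantize(a, T):
        # q[m] = sign(a[m]) * floor(|a[m]| / T): uniform bins of width T with a dead zone around 0
        return np.sign(a) * np.floor(np.abs(a) / T)

    def dequantize(q, T):
        # a~[m] = sign(q[m]) * (|q[m]| + 1/2) * T: mid-bin reconstruction (q = 0 stays 0)
        return np.sign(q) * (np.abs(q) + 0.5) * T

Coefficients with $|a[m]| < T$ are mapped to zero, which is what makes the entropic coding of $q$ effective.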
  • 20. Wavelets: JPEG-2000 vs. JPEG. JPEG2k exploits the statistical redundancy of wavelet coefficients: embedded coder, chunks of large coefficients, neighboring coefficients are not independent. Figures: image $f$; JPEG at $R = 0.19$ bit/pixel; JPEG2k at $R = 0.15$ bit/pixel.
  • 22-23. Denoising (Donoho/Johnstone). Decomposition $f = \sum_{m=0}^{N-1} \langle f, \psi_m \rangle\, \psi_m$; thresholding estimator $\tilde f = \sum_{|\langle f, \psi_m \rangle| > T} \langle f, \psi_m \rangle\, \psi_m$. Theorem: if $\|f_0 - f_{0,M}\|^2 = O(M^{-\alpha})$, then $E(\|\tilde f - f_0\|^2) = O\big(\sigma^{\frac{2\alpha}{\alpha+1}}\big)$ for $T = \sigma \sqrt{2 \log(N)}$. In practice: $T \approx 3\sigma$.
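A possible implementation of this thresholding estimator with the universal threshold $T = \sigma\sqrt{2 \log N}$, assuming the PyWavelets package (pywt) provides the orthogonal wavelet transform; the function name and the choice of the db4 wavelet are mine.

    import numpy as np
    import pywt  # PyWavelets, assumed available

    def denoise_hard(f_noisy, sigma, wavelet="db4"):
        # Universal threshold T = sigma * sqrt(2 log N)  (Donoho-Johnstone)
        T = sigma * np.sqrt(2.0 * np.log(f_noisy.size))
        coeffs = pywt.wavedec2(f_noisy, wavelet)
        # Keep the coarse approximation, hard-threshold every detail coefficient
        kept = [coeffs[0]] + [tuple(c * (np.abs(c) > T) for c in level) for level in coeffs[1:]]
        rec = pywt.waverec2(kept, wavelet)
        return rec[:f_noisy.shape[0], :f_noisy.shape[1]]

As the slide notes, the smaller threshold $T \approx 3\sigma$ often behaves better in practice than the universal one.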
  • 24. Overview •Sparsity for Compression and Denoising •Geometric Representations •L0 vs. L1 Sparsity •Dictionary Learning
  • 25. Piecewise Regular Functions in 1D. Theorem: if $f$ is $C^\alpha$ outside a finite set of discontinuities, then $\|f - f_M\|^2 = O(M^{-1})$ (Fourier) and $\|f - f_M\|^2 = O(M^{-2\alpha})$ (wavelets). For Fourier, linear and non-linear approximations behave the same: sub-optimal. For wavelets, non-linear approximation beats linear: optimal.
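The 1D behavior can be checked numerically. The sketch below compares M-term Fourier and wavelet approximations of a piecewise-smooth signal; the test signal, the db4 wavelet, and the helper names are my choices, and the measured errors only roughly follow the asymptotic rates.

    import numpy as np
    import pywt  # PyWavelets, assumed available

    N = 4096
    t = np.linspace(0.0, 1.0, N, endpoint=False)
    f = np.sin(4 * np.pi * t) + (t > 0.37)          # smooth part plus one discontinuity

    def err_fourier(f, M):
        # Keep the M largest Fourier coefficients
        F = np.fft.fft(f)
        FM = np.zeros_like(F)
        idx = np.argsort(np.abs(F))[::-1][:M]
        FM[idx] = F[idx]
        return np.sum(np.abs(f - np.fft.ifft(FM).real) ** 2)

    def err_wavelet(f, M, wav="db4"):
        # Keep the M largest wavelet coefficients
        coeffs = pywt.wavedec(f, wav)
        T = np.sort(np.abs(np.concatenate(coeffs)))[-M]
        kept = [c * (np.abs(c) >= T) for c in coeffs]
        return np.sum((f - pywt.waverec(kept, wav)[:len(f)]) ** 2)

    for M in [32, 64, 128, 256]:
        print(M, err_fourier(f, M), err_wavelet(f, M))   # the wavelet error decays much faster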
  • 26-27. Piecewise Regular Functions in 2D. Theorem: if $f$ is $C^\alpha$ outside a set of finite-length edge curves, then $\|f - f_M\|^2 = O(M^{-1/2})$ (Fourier) and $\|f - f_M\|^2 = O(M^{-1})$ (wavelets). Wavelets: same result for BV functions (optimal). Regular $C^\alpha$ edges: sub-optimal (exploiting them requires anisotropy). Figures: Fourier vs. wavelet approximations.
  • 28. Geometrically Regular Images. Geometric image model: $f$ is $C^\alpha$ outside a set of $C^\alpha$ edge curves. BV image: level sets have finite length. Geometric image: level sets are regular curves. Figures: cartoon-image geometry, sharp edges, smoothed edges.
  • 29-30. Curvelets for Cartoon Images. Curvelets: [Candes, Donoho], [Candes, Demanet, Ying, Donoho]. Redundant tight frame (redundancy 5): not efficient for compression. Denoising by curvelet thresholding recovers edges and geometric textures.
  • 31-34. Approximation with Triangulation. Triangulation $(V, F)$: vertices $V = \{v_i\}_{i=1}^M$, faces $F \subset \{1, \ldots, M\}^3$. Piecewise linear approximation: $f_M = \sum_{m=1}^M \lambda_m \varphi_m$ with $\lambda = \mathrm{argmin}_\mu \|f - \sum_m \mu_m \varphi_m\|$, where $\varphi_m(v_i) = \delta_{m,i}$ and $\varphi_m$ is affine on each face of $F$. Theorem: there exists $(V, F)$ such that $\|f - f_M\|^2 \leq C_f\, M^{-2}$, using roughly $M/2$ equilateral triangles of width $\sim M^{-1/2}$ on regular areas and $M/2$ thin anisotropic triangles along the singularities. Finding the optimal $(V, F)$ is NP-hard; provably good greedy schemes exist [Mirebeau, Cohen, 2009].
  • 35. Greedy Triangulation Optimization [Bougleux, Peyré, Cohen, ICCV'09]. Figures: anisotropic triangulations with M = 200 and M = 600 vertices vs. JPEG2000.
  • 36-39. Geometric Multiscale Processing. Image $f \in \mathbb{R}^N$ $\to$ wavelet transform $\to$ wavelet coefficients $f_j[n] = \langle f, \psi_{j,n} \rangle$ $\to$ geometric transform (driven by the local geometry) $\to$ geometric coefficients $b[j, \ell, n] = \langle f_j, b_{\ell, n} \rangle$. For a $C^\alpha$ cartoon image: $\|f - f_M\|^2 = O(M^{-\alpha})$, where the budget $M = M_{\text{band}} + M_{\text{geom}}$ counts both the transform coefficients and the geometry.
  • 40. Overview •Sparsity for Compression and Denoising •Geometric Representations •L0 vs. L1 Sparsity •Dictionary Learning
  • 41-44. Image Representation. Dictionary $D = \{d_m\}_{m=0}^{Q-1}$ of atoms $d_m \in \mathbb{R}^N$. Image decomposition: $f = \sum_{m=0}^{Q-1} x_m d_m = Dx$. Image approximation: $f \approx Dx$. Orthogonal dictionary: $N = Q$, so $x_m = \langle f, d_m \rangle$. Redundant dictionary: $Q > N$ (examples: translation-invariant wavelets, curvelets, ...); the representation $x$ is not unique.
  • 45-47. Sparsity. Decomposition: $f = \sum_{m=0}^{Q-1} x_m d_m = Dx$. Sparsity: most $x_m$ are small (example: wavelet transform). Ideal sparsity: most $x_m$ are zero, measured by $J_0(x) = |\{m : x_m \neq 0\}|$. Approximate sparsity (compressibility): $\|f - Dx\|$ is small with $J_0(x) \leq M$. Figures: image $f$ and its coefficients $x$.
  • 48-51. Sparse Coding. Redundant dictionary $D = \{d_m\}_{m=0}^{Q-1}$, $Q > N$: non-unique representation $f = Dx$. Sparsest decomposition: $\min_{f = Dx} J_0(x)$. Sparsest approximation: $\min_x \frac12 \|f - Dx\|^2 + \lambda J_0(x)$, equivalently $\min_{J_0(x) \leq M} \|f - Dx\|$ or $\min_{\|f - Dx\| \leq \varepsilon} J_0(x)$. Ortho-basis $D$: $x_m = \langle f, d_m \rangle$ if $|\langle f, d_m \rangle| > \sqrt{2\lambda}$ and $x_m = 0$ otherwise, i.e. pick the largest coefficients in $\{\langle f, d_m \rangle\}_m$ (hard thresholding). General redundant dictionary: NP-hard.
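For an orthogonal dictionary the sparsest approximation is thus available in closed form by keeping the largest inner products. A small numpy sketch (the function name is mine; D is assumed to be an orthogonal N x N matrix with atoms as columns):

    import numpy as np

    def best_m_term(f, D, M):
        # x_m = <f, d_m>; keep the M largest magnitudes, zero the rest
        x = D.T @ f
        idx = np.argsort(np.abs(x))[::-1][:M]
        x_M = np.zeros_like(x)
        x_M[idx] = x[idx]
        return D @ x_M, x_M            # approximation f_M and its sparse coefficients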
  • 52-54. Convex Relaxation: L1 Prior. $J_0(x) = |\{m : x_m \neq 0\}|$. Image with 2 pixels: $J_0(x) = 0$ null image, $J_0(x) = 1$ sparse image, $J_0(x) = 2$ non-sparse image. $\ell^q$ priors: $J_q(x) = \sum_m |x_m|^q$ (convex for $q \geq 1$); figures show the unit balls for $q = 0, 1/2, 1, 3/2, 2$. Sparse $\ell^1$ prior: $J_1(x) = \|x\|_1 = \sum_m |x_m|$.
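Part of what makes the $\ell^1$ prior a useful relaxation is that its proximal operator, soft thresholding, sets small coefficients exactly to zero. A minimal sketch (the example vector and threshold are arbitrary):

    import numpy as np

    def soft_threshold(x, t):
        # Proximal operator of t * ||.||_1: shrink every entry toward 0 by t
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    x = np.array([3.0, -0.2, 0.05, -1.5, 0.4])
    print(soft_threshold(x, 0.5))      # entries smaller than 0.5 in magnitude become exactly 0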
  • 56. Inverse Problems. Observations $y = \Phi f_0 + w$; denoising/approximation corresponds to $\Phi = \mathrm{Id}$. Examples: inpainting, super-resolution, compressed sensing.
  • 57-59. Regularized Inversion. Denoising/compression: $y = f_0 + w \in \mathbb{R}^N$; sparse approximation $f^\star = Dx^\star$ where $x^\star \in \mathrm{argmin}_x \frac12 \|y - Dx\|^2 + \lambda \|x\|_1$ (fidelity + regularization). Inverse problems: $y = \Phi f_0 + w \in \mathbb{R}^P$; replace $D$ by $\Phi D$, i.e. $x^\star \in \mathrm{argmin}_x \frac12 \|y - \Phi D x\|^2 + \lambda \|x\|_1$. Numerical solvers: proximal splitting schemes (www.numerical-tours.com).
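A basic proximal splitting scheme for this $\ell^1$-regularized problem is iterative soft thresholding (ISTA): alternate a gradient step on the quadratic fidelity with the soft-thresholding prox of $\lambda \|\cdot\|_1$. A sketch, assuming $A = \Phi D$ is available as an explicit matrix; the function names and iteration count are mine.

    import numpy as np

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def ista(y, A, lam, n_iter=200):
        # min_x 0.5 * ||y - A x||^2 + lam * ||x||_1, step size 1/L with L = ||A||_2^2
        L = np.linalg.norm(A, 2) ** 2
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            grad = A.T @ (A @ x - y)
            x = soft_threshold(x - grad / L, lam / L)
        return x

The recovered image is then $f = D x$.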
  • 61. Overview •Sparsity for Compression and Denoising •Geometric Representations •L0 vs. L1 Sparsity •Dictionary Learning
  • 62-66. Dictionary Learning: MAP Energy. Set of (noisy) exemplars $\{y_k\}_k$. Sparse approximation: $\min_{x_k} \frac12 \|y_k - D x_k\|^2 + \lambda \|x_k\|_1$. Dictionary learning: $\min_{D \in \mathcal{C}} \sum_k \min_{x_k} \frac12 \|y_k - D x_k\|^2 + \lambda \|x_k\|_1$. Constraint: $\mathcal{C} = \{D = (d_m)_m : \forall m,\ \|d_m\| \leq 1\}$; otherwise $D \to +\infty$ while $X \to 0$. Matrix formulation: $\min_{X \in \mathbb{R}^{Q \times K},\ D \in \mathcal{C} \subset \mathbb{R}^{N \times Q}} f(X, D) = \frac12 \|Y - DX\|^2 + \lambda \|X\|_1$. The energy is convex with respect to $X$ and convex with respect to $D$, but non-convex with respect to $(X, D)$: local minima.
  • 67-70. Dictionary Learning: Algorithm. Step 1: for each $k$, minimize over $x_k$: $\min_{x_k} \frac12 \|y_k - D x_k\|^2 + \lambda \|x_k\|_1$ (convex sparse coding). Step 2: minimize over $D$: $\min_{D \in \mathcal{C}} \|Y - DX\|^2$ (convex constrained minimization), by projected gradient descent $D^{(\ell+1)} = \mathrm{Proj}_{\mathcal{C}}\big(D^{(\ell)} - \tau (D^{(\ell)} X - Y) X^*\big)$. Convergence: toward a stationary point of $f(X, D)$. Figures: dictionary $D$ at initialization and at convergence.
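Putting the two steps together gives the following alternating-minimization sketch; ISTA is used for the sparse-coding step and projected gradient descent for the dictionary update, as on the slides, but the initialization, step sizes, and iteration counts are my own choices.

    import numpy as np

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def learn_dictionary(Y, Q, lam, n_outer=30, n_ista=50, seed=0):
        # min_{X, D in C} 0.5 * ||Y - D X||^2 + lam * ||X||_1,  C = {columns with ||d_m|| <= 1}
        N, K = Y.shape
        rng = np.random.default_rng(seed)
        D = rng.standard_normal((N, Q))
        D /= np.maximum(np.linalg.norm(D, axis=0), 1.0)
        X = np.zeros((Q, K))
        for _ in range(n_outer):
            # Step 1: sparse coding of all exemplars (ISTA on X, D fixed)
            L = np.linalg.norm(D, 2) ** 2 + 1e-10
            for _ in range(n_ista):
                X = soft_threshold(X - D.T @ (D @ X - Y) / L, lam / L)
            # Step 2: projected gradient descent on D (X fixed)
            tau = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-10)
            D = D - tau * (D @ X - Y) @ X.T
            D /= np.maximum(np.linalg.norm(D, axis=0), 1.0)   # project each column onto the unit ball
        return D, X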
  • 71-72. Patch-based Learning. Learning $D$ from exemplar patches $y_k$ [Olshausen, Fields 1997]. State-of-the-art denoising [Elad et al. 2006]. Sparse texture synthesis and inpainting [Peyré 2008].
  • 73-74. Comparison with PCA. PCA dimensionality reduction: $\min_{D} \|Y - D^{(k)} X\|$ with $D^{(k)} = (d_m)_{m=0}^{k-1}$. Linear (PCA): Fourier-like atoms. Sparse (learning): Gabor-like atoms. Figures: DCT atoms and the first KLT/PCA atoms trained on 12 x 12 image patches from Lena [Rubinstein et al., Dictionaries for Sparse Representation], versus Gabor and learned atoms.
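For comparison, the PCA "dictionary" of the same exemplars is just the top left singular vectors of the centered data matrix. A minimal sketch (the function name is mine):

    import numpy as np

    def pca_dictionary(Y, k):
        # Columns of Y are exemplar patches; return the k principal atoms
        Yc = Y - Y.mean(axis=1, keepdims=True)
        U, _, _ = np.linalg.svd(Yc, full_matrices=False)
        return U[:, :k]

These atoms tend to be global, Fourier-like oscillations, whereas the sparsity-driven learning above produces localized, Gabor-like atoms.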
  • 75-77. Patch-based Denoising [Aharon & Elad 2006]. Noisy image: $f = f_0 + w$. Step 1: extract patches $y_k(\cdot) = f(z_k + \cdot)$. Step 2: dictionary learning $\min_{D, (x_k)_k} \sum_k \frac12 \|y_k - D x_k\|^2 + \lambda \|x_k\|_1$. Step 3: patch averaging: $\tilde y_k = D x_k$ and $\tilde f(\cdot) \propto \sum_k \tilde y_k(\cdot - z_k)$.
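Steps 1 and 3 of this pipeline (patch extraction and patch averaging) can be sketched as follows; the stride, the handling of borders, and the function names are my own choices.

    import numpy as np

    def extract_patches(f, w, stride=1):
        # Step 1: collect w x w patches y_k(.) = f(z_k + .) on a regular grid of positions z_k
        H, W = f.shape
        patches, positions = [], []
        for i in range(0, H - w + 1, stride):
            for j in range(0, W - w + 1, stride):
                patches.append(f[i:i + w, j:j + w].ravel())
                positions.append((i, j))
        return np.array(patches).T, positions        # columns are the y_k

    def average_patches(patches, positions, shape, w):
        # Step 3: paste the processed patches y~_k back and average the overlaps
        acc = np.zeros(shape)
        cnt = np.zeros(shape)
        for y, (i, j) in zip(patches.T, positions):
            acc[i:i + w, j:j + w] += y.reshape(w, w)
            cnt[i:i + w, j:j + w] += 1.0
        return acc / np.maximum(cnt, 1.0)

Step 2 is the dictionary learning sketched earlier, applied to the extracted patches, followed by $\tilde y_k = D x_k$.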
  • 78-81. Learning with Missing Data. Inverse problem: $y = \Phi f_0 + w$. Joint energy: $\min_{f, (x_k)_k, D \in \mathcal{C}} \frac12 \|y - \Phi f\|^2 + \frac{\lambda}{2} \sum_k \|p_k(f) - D x_k\|^2 + \mu \sum_k \|x_k\|_1$, where the patch extractor is $p_k(f) = f(z_k + \cdot)$. Step 1: for each $k$, minimization on $x_k$ (convex sparse coding). Step 2: minimization on $D$ (constrained quadratic). Step 3: minimization on $f$ (quadratic). Figures: original $f_0$, damaged observations $y$.
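When pixels are missing, only the observed entries of each patch enter the data-fidelity term. A sketch of the sparse-coding step with a binary mask (the masking convention and the function name are mine):

    import numpy as np

    def soft_threshold(x, t):
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def inpaint_patch(y_patch, mask, D, lam, n_iter=200):
        # min_x 0.5 * || M (y - D x) ||^2 + lam * ||x||_1  with M = diag(mask), mask in {0, 1}
        A = D * mask[:, None]              # rows of D restricted to the observed pixels
        b = y_patch * mask
        L = np.linalg.norm(A, 2) ** 2 + 1e-10
        x = np.zeros(D.shape[1])
        for _ in range(n_iter):
            x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
        return D @ x                       # full patch estimate: missing pixels are filled in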
  • 82. Inpainting Example [Mairal et al. 2008]. Figures: original image $f_0$, damaged observations $y = \Phi f_0 + w$ (75% of the pixels removed, initial PSNR 6.13 dB), and regularized reconstructions $f$ (PSNR 33.97 dB with $N = 2$ scales and $16 \times 16$ patches, 31.75 dB with $N = 1$ and $8 \times 8$ patches; 100 learning iterations on 50% of the patches).
  • 83. Adaptive Inpainting and Separation [Peyré, Fadili, Starck 2010]. Figures: inpainting and separation results with wavelets, local DCT, and a learned dictionary.
  • 84-85. Higher Dimensional Learning [Mairal et al., Sparse Representation for Color Image Restoration]. Learning $D$ directly on color patches: the learned atoms are colored, unlike generic gray-level ones. Figures: dictionaries of 256 atoms learned on a generic database of natural images ($5 \times 5 \times 3$ and $8 \times 8 \times 3$ patches), color denoising results (per-channel K-SVD vs. the adaptive color approach, which removes color artifacts), and color inpainting.
  • 87-89. Facial Image Compression [Elad et al. 2009, after Bryt & Elad, J. Vis. Commun. Image R. 19 (2008) 270-282]. Image registration (piecewise affine warping by triangulation), uniform slicing into non-overlapping patches $(f_k)_k$, and dictionary learning $(D_k)_k$: one K-SVD dictionary is trained per patch location from a database of training faces, and each patch is sparse-coded (OMP) in its own dictionary. Figures: warped face image, patch grid, and the dictionary learned for one patch location.