Generalized Principal Component Analysis
                 Tutorial @ CVPR 2008
              Yi Ma                        René Vidal
        ECE Department               Center for Imaging Science
       University of Illinois   Institute for Computational Medicine
       Urbana Champaign               Johns Hopkins University
Data segmentation and clustering
 • Given a set of points, separate them into multiple groups




 • Discriminative methods: learn boundary
 • Generative methods: learn mixture model, using, e.g.
   Expectation Maximization
Dimensionality reduction and clustering
 • In many problems data is high-dimensional: can reduce
   dimensionality using, e.g. Principal Component Analysis




 • Image compression
 • Recognition
    – Faces (Eigenfaces)
 • Image segmentation
    – Intensity (black-white)
    – Texture
Segmentation problems in dynamic vision
 • Segmentation of video and dynamic textures




 • Segmentation of rigid-body motions
Segmentation problems in dynamic vision
 • Segmentation of rigid-body motions from dynamic textures
Clustering data on non-Euclidean spaces
 • Clustering data on non-Euclidean spaces
    – Mixtures of linear spaces
    – Mixtures of algebraic varieties
    – Mixtures of Lie groups




 • “Chicken-and-egg” problems
    – Given segmentation, estimate models
    – Given models, segment the data
    – Initialization?

 • Need to combine
    – Algebra/geometry, dynamics and statistics
Outline of the tutorial
 • Introduction (8.00-8.15)
 • Part I: Theory (8.15-9.45)
    – Basic GPCA theory and algorithms (8.15-9.00)
    – Advanced statistical methods for GPCA (9.00-9.45)

 • Questions (9.45-10.00)
 • Break (10.00-10.30)
 • Part II: Applications (10.30-12.00)
    – Applications to motion and video segmentation (10.30-11.15)
    – Applications to image representation & segmentation (11.15-12.00)

 • Questions (12.00-12.15)
Part I: Theory
 • Introduction to GPCA (8.00-8.15)

 • Basic GPCA theory and algorithms (8.15-9.00)
    –   Review of PCA and extensions
    –   Introductory cases: line, plane and hyperplane segmentation
    –   Segmentation of a known number of subspaces
    –   Segmentation of an unknown number of subspaces


 • Advanced statistical methods for GPCA (9.00-9.45)
    – Lossy coding of samples from a subspace
    – Minimum coding length principle for data segmentation
    – Agglomerative lossy coding for subspace clustering
Part II: Applications in computer vision
 • Applications to motion & video segmentation (10.30-11.15)
    – 2-D and 3-D motion segmentation
    – Temporal video segmentation
    – Dynamic texture segmentation




 • Applications to image representation and segmentation
   (11.15-12.00)
    – Multi-scale hybrid linear models for sparse
      image representation
    – Hybrid linear models for image segmentation
References: Springer-Verlag 2008
Slides, MATLAB code, papers
 Slides: http://www.vision.jhu.edu/gpca/cvpr08-tutorial-gpca.htm
             Code: http://perception.csl.uiuc.edu/gpca
Part I
Generalized Principal Component Analysis
                       René Vidal
                 Center for Imaging Science
            Institute for Computational Medicine
                  Johns Hopkins University
Principal Component Analysis (PCA)
 • Given a set of points x1, x2, …, xN
     – Geometric PCA: find a subspace S passing through them
     – Statistical PCA: find projection directions that maximize the variance




 • Solution (Beltrami’1873, Jordan’1874, Hotelling’33, Eckart-Householder-Young’36)


                     Basis for S

 • Applications: data compression, regression, computer
   vision (eigenfaces), pattern recognition, genomics
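
 A minimal numpy sketch of this solution, added for illustration (the name
 pca_basis and the one-point-per-column convention are assumptions, not the
 tutorial's MATLAB code):

import numpy as np

def pca_basis(X, d):
    """Basis for the d-dimensional subspace S through the columns of X."""
    Xc = X - X.mean(axis=1, keepdims=True)            # center the data
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)  # geometric PCA via SVD
    return U[:, :d]                                   # top-d singular vectors span S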
Extensions of PCA
 •   Higher order SVD (Tucker’66, Davis’02)

 •   Independent Component Analysis (Comon ‘94)

 •   Probabilistic PCA (Tipping-Bishop ’99)
      – Identify subspace from noisy data
      – Gaussian noise: standard PCA
      – Noise in exponential family (Collins et al.’01)

 •   Nonlinear dimensionality reduction
      – Multidimensional scaling (Torgerson’58)
      – Locally linear embedding (Roweis-Saul ’00)
      – Isomap (Tenenbaum ’00)

 •   Nonlinear PCA (Scholkopf-Smola-Muller ’98)
      – Identify nonlinear manifold by applying PCA to
        data embedded in high-dimensional space

 •   Principal Curves and Principal Geodesic Analysis
     (Hastie-Stuetzle’89, Tibshirani ‘92, Fletcher ‘04)
Generalized Principal Component Analysis
 • Given a set of points lying in multiple subspaces, identify
    – The number of subspaces and their dimensions
    – A basis for each subspace
    – The segmentation of the data points


 • “Chicken-and-egg” problem
    – Given segmentation, estimate subspaces
    – Given subspaces, segment the data
Prior work on subspace clustering
 • Iterative algorithms:
    – K-subspace (Ho et al. ’03),
    – RANSAC, subspace selection and growing (Leonardis et al. ’02)

 • Probabilistic approaches: learn the parameters of a mixture
   model using e.g. EM
    – Mixtures of PPCA (Tipping-Bishop ‘99)
    – Multi-Stage Learning (Kanatani’04)

 • Initialization
    – Geometric approaches: 2 planes in R3 (Shizawa-Mase ’91)
    – Factorization approaches: independent subspaces of equal
      dimension (Boult-Brown ‘91, Costeira-Kanade ‘98, Kanatani ’01)
    – Spectral clustering based approaches: (Yan-Pollefeys’06)
Basic ideas behind GPCA
•   Towards an analytic solution to subspace clustering
    – Can we estimate ALL models simultaneously using ALL data?
    – When can we do so analytically? In closed form?
    – Is there a formula for the number of models?


•   Will consider the most general case
    – Subspaces of unknown and possibly different dimensions
    – Subspaces may intersect arbitrarily (not only at the origin)


•   GPCA is an algebraic geometric approach to data segmentation
    – Number of subspaces         = degree of a polynomial
    – Subspace basis              = derivatives of a polynomial
    – Subspace clustering is algebraically equivalent to
         • Polynomial fitting
         • Polynomial differentiation
Applications of GPCA in computer vision
 •   Geometry
      – Vanishing points
 •   Image compression
 •   Segmentation
      –   Intensity (black-white)
      –   Texture
      –   Motion (2-D, 3-D)
      –   Video (host-guest)
 •   Recognition
      – Faces (Eigenfaces)
           • Man - Woman
      – Human Gaits
      – Dynamic Textures
           • Water-bird

 •   Biomedical imaging
 •   Hybrid systems identification
Introductory example: algebraic clustering in 1D




  •   Number of groups?
Introductory example: algebraic clustering in 1D
                            •   How to compute n, c, b’s?
                                – Number of clusters



                                – Cluster centers



                                – Solution is unique if



                                – Solution is closed form if
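
 A hedged numpy sketch of this 1-D scheme (the name centers_1d and the sample
 data are illustrative): every sample near the n centers satisfies
 p(x) = (x - b1)···(x - bn) ≈ 0, so the coefficient vector is the
 least-squares null vector of a Vandermonde matrix, and the cluster centers
 are the roots of p:

import numpy as np

def centers_1d(x, n):
    """x: (N,) samples near n cluster centers; returns estimated centers."""
    V = np.vander(x, N=n + 1)                 # rows [x^n, x^(n-1), ..., x, 1]
    _, _, Vt = np.linalg.svd(V, full_matrices=False)
    c = Vt[-1]                                # least-squares null vector of V
    return np.sort(np.roots(c).real)          # roots of p(x) = cluster centers

x = np.concatenate([b + 0.05 * np.random.randn(50) for b in (-1.0, 0.5, 2.0)])
print(centers_1d(x, n=3))                     # approximately [-1.0, 0.5, 2.0]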
Introductory example: algebraic clustering in 2D
 • What about dimension 2?




 • What about higher dimensions?
    – Complex numbers in higher dimensions?
    – How to find roots of a polynomial of quaternions?

 • Instead
    – Project data onto one or two dimensional space
    – Apply same algorithm to projected data
Representing one subspace
 • One plane


 • One line




 • One subspace can be represented with
    – Set of linear equations
    – Set of polynomials of degree 1
Representing n subspaces
 • Two planes




 • One plane and one line
    – Plane:
    – Line:


                      De Morgan’s rule



 • A union of n subspaces can be represented with a set of
   homogeneous polynomials of degree n
Fitting polynomials to data points
 •   Polynomials can be written linearly in terms of the vector of coefficients
     by using polynomial embedding



                                    Veronese map




 •   Coefficients of the polynomials can be computed from nullspace of
     embedded data
      – Solve using least squares
      – N = #data points
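
 A hedged sketch of this step in plain Python/numpy (veronese,
 vanishing_coeffs, and the tolerance are illustrative names, not the
 tutorial's MATLAB code):

from itertools import combinations_with_replacement
import numpy as np

def veronese(X, n):
    """Degree-n Veronese embedding of each column of X (shape (K, N))."""
    K, N = X.shape
    monos = list(combinations_with_replacement(range(K), n))
    return np.array([[X[list(m), i].prod() for i in range(N)] for m in monos])

def vanishing_coeffs(X, n, tol=1e-6):
    """Coefficient vectors of degree-n polynomials vanishing on the data."""
    Ln = veronese(X, n)                        # columns are nu_n(x_i)
    _, S, Vt = np.linalg.svd(Ln.T)             # least-squares null space of Ln^T
    S = np.concatenate([S, np.zeros(Vt.shape[0] - S.size)])
    return Vt[S < tol * S.max()].T             # one column per polynomial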
Finding a basis for each subspace
 •   Case of hyperplanes:
     – Only one polynomial
     – Number of subspaces
      – Bases are the normal vectors

                                Polynomial Factorization (GPCA-PFA) [CVPR 2003]
                                •   Find roots of a polynomial of degree n in one variable
                                •   Solve linear systems in n variables
                                •   Solution obtained in closed form for n ≤ 4


 •   Problems
      – Computing roots may be sensitive to noise
      – The estimated polynomial may not perfectly factor with noisy data
     – Cannot be applied to subspaces of different dimensions
         • Polynomials are estimated up to change of basis, hence they may not factor,
           even with perfect data
Finding a basis for each subspace
                       Polynomial Differentiation (GPCA-PDA) [CVPR’04]




 • To learn a mixture of subspaces we just need one positive
   example per class
Choosing one point per subspace
 • With noise and outliers
    – Polynomials may not be a perfect union of subspaces




    – Normals can be estimated correctly by choosing points optimally
 • Distance to closest subspace without knowing
   segmentation?
GPCA for hyperplane segmentation
 • Coefficients of the polynomial can be computed from null
   space of embedded data matrix
    – Solve using least squares
    – N = #data points

 • Number of subspaces can be computed from the rank of
   embedded data matrix


 • Normal to the subspaces                  can be computed
   from the derivatives of the polynomial
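
 A hedged continuation of the sketch above for the hyperplane case:
 differentiate the fitted polynomial p(x) = c^T nu_n(x) analytically at each
 data point, and group points whose normals agree up to sign (the 0.99
 threshold is an assumption of this sketch):

import numpy as np
from itertools import combinations_with_replacement

def poly_grad(c, monos, x):
    """Gradient of p(x) = sum_m c[m] * prod_j x[monos[m][j]] at point x."""
    g = np.zeros(x.size)
    for cm, m in zip(c, monos):
        for k in set(m):
            rest = list(m)
            rest.remove(k)                 # d/dx_k drops one factor of x_k
            g[k] += cm * m.count(k) * np.prod(x[rest])
    return g

def hyperplane_labels(X, c, n, tol=0.99):
    """Assign points to hyperplanes via their (normalized) normals."""
    monos = list(combinations_with_replacement(range(X.shape[0]), n))
    reps, labels = [], np.zeros(X.shape[1], dtype=int)
    for i in range(X.shape[1]):
        v = poly_grad(c, monos, X[:, i])
        v /= np.linalg.norm(v) + 1e-12
        for j, r in enumerate(reps):
            if abs(v @ r) > tol:           # same normal direction, same plane
                labels[i] = j
                break
        else:
            labels[i] = len(reps)          # a new hyperplane normal
            reps.append(v)
    return labels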
GPCA for subspaces of different dimensions
 •   There are multiple polynomials
     fitting the data




 •   The derivative of each
     polynomial gives a different
     normal vector




 •   Can obtain a basis for the
     subspace by applying PCA to
     normal vectors
GPCA for subspaces of different dimensions
 • Apply polynomial embedding to projected data


 • Obtain multiple subspace model by polynomial fitting


    – Solve          to obtain
    – Need to know number of subspaces
 • Obtain bases & dimensions by polynomial differentiation



 • Optimally choose one point per subspace using distance
An example
• Given data lying in the union
  of the two subspaces




• We can write the union as




• Therefore, the union can be represented with the two
  polynomials
An example
• Can compute polynomials from




• Can compute normals from
Dealing with high-dimensional data
 •   Minimum number of points
      – K = dimension of ambient space
      – n = number of subspaces

 •   In practice the dimension of each subspace ki is much smaller than K
      – Number and dimension of the subspaces are preserved by a linear
        projection onto a subspace of dimension
      – Can remove outliers by robustly fitting the subspace

 •   Open problem: how to choose the projection?
      – PCA?
GPCA with spectral clustering
 • Spectral clustering
    – Build a similarity matrix between pairs of points
    – Use eigenvectors to cluster data


 • How to define a similarity for subspaces?
    – Want points in the same subspace to be close
    – Want points in different subspace to be far


 • Use GPCA to get basis




 • Distance: subspace angles
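
 A hedged sketch of this pipeline, assuming scipy and scikit-learn are
 available (scipy.linalg.subspace_angles and sklearn's KMeans are standard;
 the Gaussian weighting of the angles is an assumption of the sketch):

import numpy as np
from scipy.linalg import subspace_angles      # principal angles between spans
from sklearn.cluster import KMeans

def spectral_from_bases(bases, n_groups, sigma=1.0):
    """bases: list of (D, d_i) orthonormal bases, one per data point."""
    N = len(bases)
    S = np.zeros((N, N))
    for i in range(N):
        for j in range(i, N):
            theta = subspace_angles(bases[i], bases[j])
            S[i, j] = S[j, i] = np.exp(-np.sum(theta ** 2) / sigma ** 2)
    d = S.sum(axis=1)
    W = S / np.sqrt(np.outer(d, d))           # normalized affinity
    _, vecs = np.linalg.eigh(W)
    Y = vecs[:, -n_groups:]                   # top eigenvectors as embedding
    Y /= np.linalg.norm(Y, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=n_groups, n_init=10).fit_predict(Y)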
Comparison of PFA, PDA, K-sub, EM
 [Plot: error in the normals (degrees) vs. noise level (0-5%), comparing
 PFA, K-sub, PDA, EM, PDA+K-sub, PDA+EM, and PDA+K-sub+EM.]
Dealing with outliers
 • GPCA with perfect data




 • GPCA with outliers




 • GPCA fails because PCA fails ⇒ seek a robust estimate of Null(Ln),
   where Ln = [νn(x1), . . . , νn(xN)].
Three approaches to tackle outliers
 •   Probability-based: small-probability samples
      –   Probability plots: [Healy 1968, Cox 1968]
      –   PCs: [Rao 1964, Gnanadesikan & Kettenring 1972]
      –   M-estimators: [Huber 1981, Campbell 1980]
      –   Multivariate-trimming (MVT):
          [Gnanadesikan & Kettenring 1972]

 •   Influence-based: large influence on model parameters
      – Parameter difference with and without a sample:
        [Hampel et al. 1986, Critchley 1985]

 •   Consensus-based: samples not consistent with models of high consensus
      – Hough: [Ballard 1981, Lowe 1999]
      – RANSAC: [Fischler & Bolles 1981, Torr 1997]
      – Least Median Estimate (LME):
        [Rousseeuw 1984, Stewart 1999]
Robust GPCA
 Simulation on Robust GPCA (parameters fixed at 0.3 rad and 0.4)
 • RGPCA – Influence: [segmentation results at 12%, 32%, and 48% outliers]

 • RGPCA – MVT: [segmentation results at 12%, 32%, and 48% outliers]
Robust GPCA
Comparison with RANSAC
• Accuracy




    [Accuracy for arrangements (2,2,1) in R3, (4,2,2,1) in R5, and (5,5,5) in R6]


•   Speed
        Table: Average time of RANSAC and RGPCA with 24% outliers.

          Arrangement   (2,2,1) in R3   (4,2,2,1) in R5   (5,5,5) in R6
          RANSAC            44s             5.1min            3.4min
          MVT               46s             23min             8min
          Influence         3min            58min             146min
Summary
• GPCA: algorithm for clustering subspaces
   – Deals with unknown and possibly different dimensions
   – Deals with arbitrary intersections among the subspaces


• Our approach is based on
   – Projecting data onto a low-dimensional subspace
   – Fitting polynomials to projected subspaces
   – Differentiating polynomials to obtain a basis


• Applications in image processing and computer vision
   – Image segmentation: intensity and texture
   – Image compression
   – Face recognition under varying illumination
For more information,

         Vision, Dynamics and Learning Lab
                        @
              Johns Hopkins University




            Thank You!
Generalized Principal Component Analysis
   via Lossy Coding and Compression
                        Yi Ma
      Image Formation & Processing Group, Beckman
      Decision & Control Group, Coordinated Science
                            Lab.
       Electrical & Computer Engineering Department

        University of Illinois at Urbana-Champaign
OUTLINE



MOTIVATION


PROBLEM FORMULATION AND EXISTING APPROACHES


SEGMENTATION VIA LOSSY DATA COMPRESSION


SIMULATIONS (AND EXPERIMENTS)


CONCLUSIONS AND FUTURE DIRECTIONS
MOTIVATION – Motion Segmentation in Computer Vision

 Goal: Given a sequence of images of multiple moving objects, determine:
       1. the number and types of motions (rigid-body, affine, linear, etc.)
       2. the features that belong to the same motion.








     The “chicken-and-egg” difficulty:
         – Knowing the segmentation, estimating the motions is easy;
         – Knowing the motions, segmenting the features is easy.


 A Unified Algebraic Approach to 2D and 3D Motion Segmentation, [Vidal-Ma, ECCV’04]
MOTIVATION – Image Segmentation
Goal: segment an image into multiple regions with homogeneous texture.


                                     features




                   Computer                         Human




      Difficulty: A mixture of models of different dimensions or
      complexities.
Multiscale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
MOTIVATION – Video Segmentation
   Goal: segment a video sequence into segments with “stationary” dynamics.
   Model: different segments as outputs from different (linear) dynamical
   systems:




Identification of Hybrid Linear Systems via Subspace Segmentation, [Huang-Wagner-Ma, CDC’04]
MOTIVATION – Massive Multivariate Mixed Data








Face database      Hyperspectral images      Articulated motions

Handwritten digits      Microarrays
SUBSPACE SEGMENTATION – Problem Formulation

Assumption: the data                     are noisy samples from an
arrangement of linear subspaces:




     noise-free samples    noisy samples      samples with outliers


 Difficulties:
     – the dimensions of the subspaces can be different
     – the data can be corrupted by noise or contaminated by outliers
     – the number and dimensions of subspaces may be unknown
SUBSPACE SEGMENTATION – Statistical Approaches

 Assume that the data
 are i.i.d. samples from a mixture of
 probabilistic distributions:



 Solutions:
 •    Expectation Maximization (EM) for the maximum-likelihood estimate
      [Dempster et al. ’77], e.g., Probabilistic PCA [Tipping-Bishop’99]:



 •    K-Means for a minimax-like estimate [Forgy’65, Jancey’66, MacQueen’67],
      e.g., K-Subspaces [Ho and Kriegman’03]:




     Essentially iterate between data segmentation and model estimation.
SUBSPACE SEGMENTATION – An Algebro-Geometric Approach

Idea: a union of linear subspaces is an
algebraic set -- the zero set of a set of
(homogeneous) polynomials:




Solution:
• Identify the set of polynomials of degree n that vanish on



•    Gradients of the vanishing polynomials are normals to the
     subspaces




    Complexity exponential in the dimension and number of subspaces.


Generalized Principal Component Analysis, [Vidal-Ma-Sastry, IEEE Trans. PAMI’05]
SUBSPACE SEGMENTATION – An Information-Theoretic Approach
Problem: If the number/dimension of subspaces is not given, and the data are
corrupted by noise and outliers, how do we determine the optimal subspaces
that fit the data?

Solutions: Model Selection Criteria?
    – Minimum message length (MML) [Wallace-Boulton’68]
    – Minimum description length (MDL) [Rissanen’78]
    – Bayesian information criterion (BIC)
    – Akaike information criterion (AIC) [Akaike’77]
    – Geometric AIC [Kanatani’03], Robust AIC [Torr’98]

 Key idea (MDL):
 • a good balance between model complexity and data fidelity.
 • minimize the length of codes that describe the model and the
 data:


  with a quantization error optimal for the model.
LOSSY DATA COMPRESSION
Questions:

   – What is the “gain” or “loss” of segmenting or merging data?

   – How does tolerance of error affect segmentation results?



 Basic idea: is the number of bits required to store the whole more than
 the sum of its parts?
LOSSY DATA COMPRESSION – Problem Formulation

– A coding scheme maps a set of vectors
  to a sequence of bits, from which we can decode
  The coding length is denoted as:




– Given a set of real-valued mixed data, the optimal segmentation
  minimizes the overall coding length:



 where
LOSSY DATA COMPRESSION – Coding Length for Multivariate Data



Theorem.
Given W = [w1, . . . , wm] in R^n and a distortion ε,

       L(W) = ((m + n)/2) log2 det( I + n/(ε^2 m) W W^T )

is the number of bits needed to encode the data s.t. the mean squared
error is at most ε^2.


       A nearly optimal bound for even a small number
       of vectors drawn from a subspace or a Gaussian
       source.



        Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’07]
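
 For concreteness, a small Python transcription of this coding-length
 function (the closed form is the one stated in the Ma-Derksen-Hong-Wright
 PAMI'07 paper; the slide itself showed the formula as a graphic):

import numpy as np

def coding_length(W, eps):
    """Bits needed to encode the columns of W (n x m) up to distortion eps."""
    n, m = W.shape
    _, logdet = np.linalg.slogdet(
        np.eye(n) + (n / (eps ** 2 * m)) * (W @ W.T))
    return 0.5 * (m + n) * logdet / np.log(2)   # (m+n)/2 * log2 det(...)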
LOSSY DATA COMPRESSION – Two Coding Schemes
Goal: code the data s.t. a given mean squared error

 Linear subspace          Gaussian source
LOSSY DATA COMPRESSION – Properties of the Coding Length




1. Commutative Property:
  For high-dimensional data, computing the coding length only needs
  the kernel matrix:


2. Asymptotic Property:
   At high SNR, this is the optimal rate distortion for a Gaussian source.


3. Invariant Property:
  Harmonic analysis is useful for data compression only when the data are
  non-Gaussian or nonlinear… and so is segmentation!
LOSSY DATA COMPRESSION – Why Segment?



                       partitioning:




                         sifting:
LOSSY DATA COMPRESSION – Probabilistic Segmentation?

Assign the ith point to the jth group with probability




Theorem. The expected coding length of the segmented data



is a concave function in Π over the domain of a convex polytope.




   Minima are reached at the vertices of the
   polytope -- no probabilistic
   segmentation!



          Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’07]
LOSSY DATA COMPRESSION – Segmentation & Channel Capacity

A MIMO additive white Gaussian noise (AWGN) channel



has the capacity:

If we allow probabilistic grouping of transmitters, the expected capacity



is a concave function in Π over a convex polytope. Maximizing such a
capacity is a convex problem.




       On Coding and Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’07]
LOSSY DATA COMPRESSION – A Greedy (Agglomerative) Algorithm
Objective: minimizing the overall coding length




Input: the data X and distortion ε                        “Bottom-up” merge

while true do
   choose two sets Si, Sj such that
   L(Si ∪ Sj) − L(Si) − L(Sj) is minimal
   if this change in coding length is negative
   then merge Si and Sj
   else break
   endif
end
Output: the current segmentation
   Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
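
 A hedged Python sketch of the merging loop above, reusing coding_length
 from the earlier slide; the per-group membership cost used here is one
 common choice and an assumption of this sketch:

import numpy as np
# reuses coding_length(W, eps) from the earlier coding-length sketch

def greedy_segmentation(X, eps):
    """X: (n, m) data matrix; returns a list of index groups."""
    m = X.shape[1]
    groups = [[i] for i in range(m)]              # every sample starts alone

    def cost(g):                                  # data bits + membership bits
        return coding_length(X[:, g], eps) + len(g) * np.log2(m / len(g))

    while len(groups) > 1:
        best_gain, pair = 0.0, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                gain = (cost(groups[a]) + cost(groups[b])
                        - cost(groups[a] + groups[b]))
                if gain > best_gain:
                    best_gain, pair = gain, (a, b)
        if pair is None:                          # no merge shortens the code
            break
        a, b = pair
        groups[a] += groups.pop(b)                # merge the best pair
    return groups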
SIMULATIONS – Mixture of Almost Degenerate Gaussians
 Noisy samples from two lines and one plane in R3
                           Given Data                              Segmentation Results




ε0 = 0.01




ε0 = 0.08




    Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
SIMULATIONS – “Phase Transition”
 [Plots: number of groups vs. distortion, and rate vs. distortion, at
 ε0 = 0.08; the segmentation regimes resemble phases of matter (steam,
 water, ice cubes).]

 Stability: the same segmentation for ε across 3 magnitudes!
  Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
SIMULATIONS – Comparison with EM


100 × d uniformly distributed random samples from each subspace, corrupted
with 4% noise. Classification rate averaged over 25 trials for each case.




   Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
SIMULATIONS – Comparison with EM

Segmenting three degenerate or non-degenerate Gaussian clusters for 50 trials




   Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
SIMULATIONS – Robustness with Outliers
 [Classification rates with outliers: 71.5% at 35.8% outliers; 73.6% at
 45.6% outliers.]




  Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
SIMULATIONS – Affine Subspaces with Outliers

 [Classification rates with outliers: 66.2% at 35.8% outliers; 69.1% at
 45.6% outliers.]




  Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
SIMULATIONS – Piecewise-Linear Approximation of Manifolds



 Swiss roll     Mobius strip      Torus        Klein bottle
SIMULATIONS – Summary


– The minimum coding length objective automatically addresses the model
  selection issue: the optimal solution is very stable and robust.

– The segmentation/merging is physically meaningful (measured in bits).
  The results resemble phase transitions in statistical physics.

– The greedy algorithm is scalable (polynomial in both K and N) and
  converges well when ε is not too small w.r.t. the sample density.
Clustering from a Classification Perspective

Assumption: The training data are drawn from a distribution


Goal: Construct a classifier such that the misclassification error

reaches its minimum.

Solution: Knowing the two distributions, the optimal classifier is the
maximum a posteriori (MAP) classifier:



Difficulties: How to learn the two distributions from samples?
(parametric, non-parametric, model selection, high-dimension, outliers…)
MINIMUM INCREMENTAL CODING LENGTH – Problem Formulation

Idea: Use the lossy coding length



as a surrogate for the Shannon lossless coding length w.r.t. the true
distributions.

The additional bits needed to encode the test
sample with the jth training set are




Classification Criterion: Minimum Incremental Coding Length
(MICL)
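
 A hedged sketch of the MICL rule, again reusing coding_length from Part I
 (micl_classify and the label-cost term are illustrative):

import numpy as np
# reuses coding_length(W, eps) from the earlier coding-length sketch

def micl_classify(x, class_data, eps):
    """class_data: list of (n, m_j) training matrices, one per class."""
    total = sum(Wj.shape[1] for Wj in class_data)

    def delta_L(Wj):
        # incremental bits for x, plus the class-label cost -log2(m_j/total)
        grown = np.column_stack([Wj, x])
        return (coding_length(grown, eps) - coding_length(Wj, eps)
                - np.log2(Wj.shape[1] / total))

    return int(np.argmin([delta_L(Wj) for Wj in class_data]))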
MICL (“Michael”) – Asymptotic Properties


Theorem: As the number of samples goes to infinity, the MICL
criterion converges with probability one to the following criterion:




where

       is the “number of effective parameters” of the j-th model (class).




Theorem: The MICL classifier converges to the above asymptotic form
at the rate of            for some constant.




          Minimum Incremental Coding Length (MICL), [Wright et al., NIPS’07]
SIMULATIONS – Interpolation and Extrapolation via MICL




    MICL




    SVM




    k-NN




          Minimum Incremental Coding Length (MICL), [Wright et al., NIPS’07]
SIMULATIONS – Improvement over MAP and RDA [Friedman1989]


Two Gaussians in R2:
   isotropic (left), anisotropic (right)
   (500 trials)




Three Gaussians in Rn:
   dim = n, dim = n/2, dim = 1
   (500 trials)


               Minimum Incremental Coding Length (MICL), [Wright et al., NIPS’07]
SIMULATIONS – Local and Kernel MICL
Local MICL (LMICL): Apply MICL locally to the k-nearest
neighbors of the test sample (frequentist + Bayesian).

Kernel MICL (KMICL): Incorporate MICL with a nonlinear kernel
naturally through the identity (“kernelized” RDA):



     [Results: LMICL, k-NN, KMICL-RBF, SVM-RBF]




          Minimum Incremental Coding Length (MICL), [Wright et al., NIPS’07]
CONCLUSIONS

  Assumptions: Data are in a high-dimensional space but have
  low-dimensional structures (subspaces or submanifolds).

  Compression => Clustering & Classification:
   – Minimum (incremental) coding length subject to distortion.
   – Asymptotically optimal clustering and classification.
   – Greedy clustering algorithm (bottom-up, agglomerative).
   – MICL corroborates MAP, RDA, k-NN, and kernel methods.

  Applications (Next Lectures):
   – Video segmentation, motion segmentation (Vidal)
   – Image representation & segmentation (Ma)
   – Others: microarray clustering, recognition of faces and
     handwritten digits (Ma)
FUTURE DIRECTIONS

  Theory
   – More complex structures: manifolds, systems, random
     fields…
   – Regularization (ridge, lasso, banding etc.)
   – Sparse representation and subspace arrangements

  Computation
   – Global optimality (random techniques, convex
     optimization…)
   – Scalability: random sampling, approximation…

  Future Application Domains
   – Image/video/audio classification, indexing, and retrieval
   – Hyper-spectral images and videos
   – Biomedical images, microarrays
   – Autonomous navigation, surveillance, and 3D mapping
   – Identification of hybrid linear/nonlinear systems
REFERENCES & ACKNOWLEDGMENT
 References:
 –   Segmentation of Multivariate Mixed Data via Lossy Data
     Compression, Yi Ma, Harm Derksen, Wei Hong, John Wright,
     PAMI, 2007.
 –   Classification via Minimum Incremental Coding Length (MICL),
     John Wright et. al., NIPS, 2007.
 –   Website: http://perception.csl.uiuc.edu/coding/home.htm

 People:
 –  John Wright, PhD Student, ECE Department, University of Illinois
 –  Prof. Harm Derksen, Mathematics Department, University of
    Michigan
 – Allen Yang (UC Berkeley) and Wei Hong (Texas Instruments R&D)
 –  Zhouchen Lin and Harry Shum, Microsoft Research Asia, China

 Funding:
 –  ONR YIP N00014-05-1-0633
 –  NSF CAREER IIS-0347456, CCF-TF-0514955, CRS-EHS-0509151




   “The whole is more than the sum of its parts.”
                                      -- Aristotle
             Questions, please?




Yi Ma, CVPR 2008
Part II
Applications of GPCA in Computer Vision
                      René Vidal
                Center for Imaging Science
           Institute for Computational Medicine
                 Johns Hopkins University
Part II: Applications in computer vision
 • Applications to motion & video segmentation (10.30-11.15)
    – 2-D and 3-D motion segmentation
    – Temporal video segmentation
    – Dynamic texture segmentation




 • Applications to image representation and segmentation
   (11.15-12.00)
    – Multi-scale hybrid linear models for sparse
      image representation
    – Hybrid linear models for image segmentation
Applications to motion and video
            segmentation
                     René Vidal
               Center for Imaging Science
          Institute for Computational Medicine
                Johns Hopkins University
3-D motion segmentation problem
 •   Given a set of point correspondences in multiple views, determine
     – Number of motion models
     – Motion model: affine, homography, fundamental matrix, trifocal tensor
     – Segmentation: model to which each pixel belongs




 •   Mathematics of the problem depends on
     –   Number of frames (2, 3, multiple)
     –   Projection model (affine, perspective)
     –   Motion model (affine, translational, homography, fundamental matrix, etc.)
     –   3-D structure (planar or not)
Taxonomy of problems
•   2-D Layered representation
     –   Probabilistic approaches: Jepson-Black’93, Ayer-Sawhney’95, Darrel-Pentland’95,
         Weiss-Adelson’96, Weiss’97, Torr-Szeliski-Anandan’99
     –   Variational approaches: Cremers-Soatto ICCV’03
     –   Initialization: Wang-Adelson’94, Irani-Peleg’92, Shi-Malik‘98, Vidal-Singaraju’05-’06

•   Multiple rigid motions in two perspective views
     –   Probabilistic approaches: Feng-Perona’98, Torr’98
     –   Particular cases: Izawa-Mase’92, Shashua-Levin’01, Sturm’02,
     –   Multibody fundamental matrix: Wolf-Shashua CVPR’01, Vidal et al. ECCV’02, CVPR’03, IJCV’06
     –   Motions of different types: Vidal-Ma-ECCV’04, Rao-Ma-ICCV’05

•   Multiple rigid motions in three perspective views
     –   Multibody trifocal tensor: Hartley-Vidal-CVPR’04

•   Multiple rigid motions in multiple affine views
     –   Factorization-based: Costeira-Kanade’98, Gear’98, Wu et al.’01, Kanatani’ et al.’01-02-04
     –   Algebraic: Yan-Pollefeys-ECCV’06, Vidal-Hartley-CVPR’04

•   Multiple rigid motions in multiple perspective views
     –   Schindler et al. ECCV’06, Li et al. CVPR’07
A unified approach to motion segmentation
 • Estimation of multiple motion models equivalent to
   estimation of one multibody motion model

                                                     chicken-and-egg


    – Eliminate feature clustering: multiplication



    – Estimate a single multibody motion model: polynomial fitting



    – Segment multibody motion model: polynomial differentiation
A unified approach to motion segmentation
 • Applies to most motion models in computer vision




 • All motion models can be segmented algebraically by
    – Fitting multibody model: real or complex polynomial to all data
    – Fitting individual model: differentiate polynomial at a data point
Segmentation of 3-D translational motions
 •   Multiple epipoles (translation)



 •   Epipolar constraint: plane in
      – Plane normal = epipoles
      – Data = epipolar lines




 •   Multibody epipolar constraint
 •   Epipoles are derivatives of the constraint at the epipolar lines
Segmentation of 3-D translational motions
Single-body factorization
                                                           Structure = 3D surface
 • Affine camera model



    – p = point
    – f = frame

                                     Motion = camera position and orientation

 • Motion of one rigid-body lives in a 4-D subspace
   (Boult and Brown ’91,
   Tomasi and Kanade ‘92)


    – P = #points
    – F = #frames
Multi-body factorization
 •   Given n rigid motions




 •   Motion segmentation is obtained from
     – Leading singular vector of   (Boult and Brown ’91)
     – Shape interaction matrix     (Costeira & Kanade ’95, Gear ’94)




     – Number of motions (if fully-dimensional)

 •   Motion subspaces need to be independent (Kanatani ’01)
Multi-body factorization
 •   Sensitive to noise


      – Kanatani (ICCV ’01): use model selection to scale Q
      – Wu et al. (CVPR’01): project data onto subspaces and iterate
 •   Fails with partially dependent motions
      – Zelnik-Manor and Irani (CVPR’03)
          • Build similarity matrix from normalized Q
          • Apply spectral clustering to similarity matrix
      – Yan and Pollefeys (ECCV’06)
          • Local subspace estimation + spectral clustering
      – Kanatani (ECCV’04)
          • Assume degeneracy is known: pure translation in the image
          • Segment data by multi-stage optimization (multiple EM problems)

 •   Cannot handle missing data
      – Gruber and Weiss (CVPR’04)
          • Expectation Maximization
PowerFactorization+GPCA
• A motion segmentation algorithm that
   – Is provably correct with perfect data
   – Handles both independent and degenerate motions
   – Handles both complete and incomplete data


• Project trajectories onto a 5-D subspace of
   – Complete data: PCA or SVD
   – Incomplete data: PowerFactorization


• Cluster projected subspaces using GPCA
   – Handles both independent and degenerate motions
   – Non-iterative: can be used to initialize EM
Projection onto a 5-D subspace
 •   Motion of one rigid-body lives in
     4-D subspace of                               Motion 1

 •   By projecting onto a 5-D
     subspace of                                                   Motion 2
      – Number and dimensions of
        subspaces are preserved
      – Motion segmentation is
        equivalent to clustering
        subspaces of dimension
        2, 3 or 4 in
      – Minimum #frames = 3
        (CK needs a minimum of 2n
        frames for n motions)
      •   What projection to use?
           – PCA: 5 principal components
           – RPCA: with outliers
      •   Can remove outliers by robustly fitting the 5-D subspace using
          Robust SVD (De la Torre-Black)
Projection onto a 5-D subspace
 PowerFactorization algorithm: Given a matrix, factor it as A B^T

 •   Complete data
      – Given A, solve for B (a linear problem)
      – Orthonormalize B
      – Given B, solve for A
      – Iterate
      – Converges to the rank-r approximation

 •   Incomplete data
      – Solve the same linear problems using only the observed entries
      – It diverges in some cases
      – Works well with up to 30% of missing data
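
 A hedged numpy sketch of the complete-data iteration (power_factorization
 is an illustrative name; the incomplete-data variant solves the same
 least-squares steps restricted to the observed entries):

import numpy as np

def power_factorization(M, r, iters=50, seed=0):
    """Rank-r factorization M ~= A @ B.T by alternating least squares."""
    A = np.random.default_rng(seed).standard_normal((M.shape[0], r))
    for _ in range(iters):
        B = np.linalg.lstsq(A, M, rcond=None)[0].T    # given A, solve for B
        B, _ = np.linalg.qr(B)                        # orthonormalize B
        A = np.linalg.lstsq(B, M.T, rcond=None)[0].T  # given B, solve for A
    return A, B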
Motion segmentation using GPCA
• Apply polynomial embedding to 5-D points
                         Veronese map
Hopkins 155 motion segmentation database
 • Collected 155 sequences
    – 120 with 2 motions
    – 35 with 3 motions

 • Types of sequences
    – Checkerboard sequences: mostly full
      dimensional and independent motions
    – Traffic sequences: mostly degenerate (linear,
      planar) and partially dependent motions
    – Articulated sequences: mostly full dimensional
      and partially dependent motions

 • Point correspondences
    – In few cases, provided by Kanatani & Pollefeys
    – In most cases, extracted semi-automatically
      with OpenCV
Experimental results: Hopkins 155 database
 • 2 motions, 120 sequences, 266 points, 30 frames
Experimental results: Hopkins 155 database
 • 3 motions, 35 sequences, 398 points, 29 frames
Experimental results: missing data sequences




 •   There is no clear correlation between amount of missing data and
     percentage of misclassification
 •   This could be because convergence of PF depends more on “where”
     missing data is located than on “how much” missing data there is
Conclusions
 • For two motions
    – Algebraic methods (GPCA and LSA) are more accurate than
      statistical methods (RANSAC and MSL)
    – LSA performs better on full and independent sequences, while
      GPCA performs better on degenerate and partially dependent
    – LSA is sensitive to dimension of projection: d=4n better than d=5
    – MSL is very slow, RANSAC and GPCA are fast


 • For three motions
    – GPCA is not very accurate, but is very fast
    – MSL is the most accurate, but it is very slow
    – LSA is almost as accurate as MSL and almost as fast as GPCA
Segmentation of Dynamic Textures
                   René Vidal
             Center for Imaging Science
        Institute for Computational Medicine
              Johns Hopkins University
Modeling a dynamic texture: fixed boundary
 • Examples of dynamic textures:




 • Model temporal evolution as the output of a linear
   dynamical system (LDS): Soatto et al. ‘01
        zt+1 = A zt + vt    (dynamics)
          yt = C zt + wt    (images; C = appearance)
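
 A minimal numpy sketch of the standard suboptimal identification of such an
 LDS from data (names illustrative: C comes from PCA of the vectorized
 frames, A from least squares on the state sequence):

import numpy as np

def fit_dynamic_texture(Y, r):
    """Y: (pixels, frames) matrix of vectorized images; r: state dimension."""
    U, S, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :r]                         # appearance: spatial basis (PCA)
    Z = np.diag(S[:r]) @ Vt[:r]          # state trajectory z_1, ..., z_T
    A = np.linalg.lstsq(Z[:, :-1].T, Z[:, 1:].T, rcond=None)[0].T
    return A, C                          # z_{t+1} ~ A z_t,  y_t ~ C z_t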
Segmenting non-moving dynamic textures
 • One dynamic texture lives in the observability subspace
   zt+1 = Azt + vt
     yt = Czt + wt

 • Multiple textures live in multiple subspaces

                      water


                      steam


 • Cluster the data using GPCA
Segmenting moving dynamic textures
Segmenting moving dynamic textures




                Ocean-bird
Level-set intensity-based segmentation
 • Chan-Vese energy functional


 • Implicit methods
     – Represent C as the zero level set
       of an implicit function φ, i.e.
       C = {(x, y) : φ(x, y) = 0}



 • Solution
    – The solution to the gradient descent algorithm for   is given by



    – c1 and c2 are the mean intensities inside and outside the contour C.
Dynamics & intensity-based energy
 • We represent the intensities of the pixels in the images as
   the output of a mixture of AR models of order p


 • We propose the following spatial-temporal extension of the
   Chan-Vese energy functional




   where
Variational segmentation of dynamic textures
 • Given the ARX parameters, we can solve for the implicit
   function φ by solving the PDE




 • Given the implicit function φ, we can solve for the ARX
   parameters of the jth region by solving the linear system
Variational segmentation of dynamic textures
 • Fixed boundary segmentation results and comparison




    Ocean-smoke       Ocean-dynamics      Ocean-appearance
Variational segmentation of dynamic textures
 • Moving boundary segmentation results and comparison




                        Ocean-fire
Variational segmentation of dynamic textures
 • Results on a real sequence




                        Raccoon on River
Temporal video segmentation
 •   Segmenting N=30 frames of a
     sequence containing n=3
     scenes
     – Host
     – Guest
     – Both

                                   •   Image intensities are the output of a
                                       linear system
                                              xt+1 = A xt + vt   (dynamics)
                                                yt = C xt + wt   (appearance)
                                   •   Apply GPCA to fit n=3
                                       observability subspaces
Temporal video segmentation
 •   Segmenting N=60 frames of a
     sequence containing n=3
     scenes
     – Burning wheel
     – Burnt car with people
     – Burning car




                                   •   Image intensities are the output of a
                                       linear system
                                              xt+1 = A xt + vt   (dynamics)
                                                yt = C xt + wt   (appearance)
                                   •   Apply GPCA to fit n=3
                                       observability subspaces
Conclusions
 •   Many problems in computer vision can be posed as subspace
     clustering problems
     –   Temporal video segmentation
     –   2-D and 3-D motion segmentation
     –   Dynamic texture segmentation
     –   Nonrigid motion segmentation

 •   These problems can be solved using GPCA: an algorithm for clustering
     subspaces
     – Deals with unknown and possibly different dimensions
     – Deals with arbitrary intersections among the subspaces

 •   GPCA is based on
     – Projecting data onto a low-dimensional subspace
     – Recursively fitting polynomials to projected subspaces
     – Differentiating polynomials to obtain a basis
For more information,

         Vision, Dynamics and Learning Lab
                        @
              Johns Hopkins University




            Thank You!
Generalized Principal Component Analysis
for Image Representation & Segmentation

                        Yi Ma
    Control & Decision, Coordinated Science Laboratory
      Image Formation & Processing Group, Beckman
     Department of Electrical & Computer Engineering

        University of Illinois at Urbana-Champaign
INTRODUCTION


GPCA FOR LOSSY IMAGE REPRESENTATION


IMAGE SEGMENTATION VIA LOSSY COMPRESSION


OTHER APPLICATIONS


CONCLUSIONS AND FUTURE DIRECTIONS
Introduction – Image Representation via Linear Transformations



                                                      better
                                                 representations?

                  pixel-based representation:
                  three matrices of RGB values


                  a linear transformation gives a more compact representation
Introduction
Fixed Orthogonal Bases (representation, approximation, compression)
- Discrete Fourier transform (DFT) or discrete cosine transform (DCT)
  (Ahmed ’74): JPEG.
- Wavelets (multi-resolution) (Daubechies’88, Mallat’92): JPEG-2000.
- Curvelets and contourlets (Candes & Donoho’99, Do & Vetterli’00)




                    Discrete Fourier transform (DFT)
                                                       6.25% coefficients.




                         Wavelet transform

Non-orthogonal Bases (for redundant representations)
- Extended lapped transforms, frames, sparse representations (Lp
geometry)…
Introduction

Adaptive Bases (optimal if imagery data are uni-modal)
- Karhunen-Loeve transform (KLT), also known as PCA (Pearson’1901,
  Hotelling’33, Jolliffe’86)



                                             stack




                       adaptive bases
Introduction – Principal Component Analysis (PCA)
Dimensionality Reduction

Find a low-dimensional representation (model) for high-dimensional data.
Principal Component Analysis (Pearson’1901, Hotelling’1933, Eckart &
Young’1936) or Karhunen-Loeve transform (KLT).




                           Basis for S   SVD
Variations of PCA
    – Nonlinear Kernel PCA (Scholkopf-Smola-Muller’98)
    – Probabilistic PCA (Tipping-Bishop’99, Collins et.al’01)
    – Higher-Order SVD (HOSVD) (Tucker’66, Davis’02)
    – Independent Component Analysis (Hyvarinen-Karhunen-Oja’01)
Hybrid Linear Models – Multi-Modal Characteristics

Distribution of the first three principal components of
the Baboon image: A clear multi-modal distribution
Hybrid Linear Models – Multi-Modal Characteristics

Vector Quantization (VQ)
   - multiple 0-dimensional affine subspaces (i.e. cluster means)
   - existing clustering algorithms are iterative (EM, K-means)
Hybrid Linear Models – Versus Linear Models
A single linear model
                                        Linear
                             stack




Hybrid linear models

                                        Hybrid linear
                              stack
Hybrid Linear Models – Characteristics of Natural Images
                 Multivariate     Hybrid         Hierarchical   High-dimensional
                  1D    2D     (multi-modal)    (multi-scale)   (vector-valued)
   Fourier
    (DCT)         X     X

   Wavelets       X                                  X

   Curvelets            X

 Random fields          X          X                 X

   PCA/KLT        X     X                                              X

      VQ          X     X          X                                   X

 Hybrid linear    X     X          X                 X                 X

 We need a new & simple paradigm to effectively account for all
 these characteristics simultaneously.
Hybrid Linear Models – Subspace Estimation and Segmentation

Hybrid Linear Models (or Subspace
Arrangements)
   – the number of subspaces is
   unknown
   – the dimensions of the
   subspaces are unknown
   – the bases of the subspaces are
   unknown
   – the segmentation of the data
   points is unknown


         “Chicken-and-Egg” Coupling
             – Given segmentation, estimate subspaces
             – Given subspaces, segment the data
Hybrid Linear Models – Recursive GPCA (an Example)
Hybrid Linear Models – Effective Dimension
Model Selection (for Noisy Data)
   Model complexity;
   Data fidelity;

                     Number of
                     subspaces




                Total         Dimension        Number of
              number of        of each        points in each
                points        subspace          subspace


   Model selection criterion: minimizing effective dimension
   subject to a given error tolerance (or PSNR)
Hybrid Linear Models – Simulation Results (5% Noise)




ED=3




ED=2.0067




ED=1.6717
Hybrid Linear Models – Subspaces of the Barbara Image
Hybrid Linear Models – Lossy Image Representation (Baboon)




                                   GPCA




                     Original                     PCA (8x8)




                                DCT (JPEG)




                     Haar
                     Wavelet                      GPCA (8x8)
Multi-Scale Implementation – Algorithm Diagram
Diagram for a level-3 implementation of hybrid linear models
for image representation




Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Multi-Scale Implementation – The Baboon Image


 The Baboon image




                                 downsample
                                 by two twice




                                 segmentation of
                                 2 by 2 blocks

Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Multi-Scale Implementation – Comparison with Other Methods



   The Baboon image




Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Multi-Scale Implementation – Image Approximation

Comparison with level-3 wavelet (7.5% coefficients)




       Level-3 bior-4.4 wavelets              Level-3 hybrid linear model
             PSNR=23.94                              PSNR=24.64
Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Multi-Scale Implementation – Block Size Effect


   The Baboon image




 Some problems with the multi-scale hybrid linear model:
 1. it has a minor block effect;
 2. it is computationally more costly (than Fourier, wavelets, PCA);
 3. it does not fully exploit spatial smoothness as wavelets do.

Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Multi-Scale Implementation – The Wavelet Domain




  [The Baboon image decomposed into wavelet subbands (HL, LH, HH), with
  segmentation at each scale.]

Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Multi-Scale Implementation – Wavelets vs. Hybrid Linear Wavelets

  The Baboon image




 Advantages of the hybrid linear model in wavelet domain:
 1. eliminates block effect;
 2. is computationally less costly (than in the spatial domain);
 3. achieves higher PSNR.

Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Multi-Scale Implementation – Visual Comparison
 Comparison among several models (7.5% coefficients):

    Original image
    Wavelets: PSNR=23.94
    Hybrid model in spatial domain: PSNR=24.64
    Hybrid model in wavelet domain: PSNR=24.88



 Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
Image Segmentation – via Lossy Data Compression


                                          stack




APPLICATIONS – Texture-Based Image Segmentation
         Naïve approach:
         – Take a 7x7 Gaussian window around every
         pixel.
         – Stack these windows as vectors.
         – Cluster the vectors using our algorithm (see the sketch below).
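
 A hedged numpy sketch of this feature construction (window_features, the
 Gaussian width, and skipping border pixels are assumptions of the sketch):

import numpy as np

def window_features(img, w=7, sigma=2.0):
    """Stack a Gaussian-weighted w x w window around each interior pixel."""
    ax = np.arange(w) - w // 2
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    h = w // 2
    H, W = img.shape
    feats = [(img[i - h:i + h + 1, j - h:j + h + 1] * g).ravel()
             for i in range(h, H - h) for j in range(h, W - h)]
    return np.array(feats)               # one 49-dim row per pixel; cluster these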

A few results:




    Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
APPLICATIONS – Distribution of Texture Features
   Question: why does such a simple algorithm work at all?

   Answer: Compression (MDL/MCL) is well suited to mid-level texture
   segmentation.

   Using a single representation (e.g. windows, filterbank responses) for
   textures of different complexity ⇒ redundancy and degeneracy, which can
   be exploited for clustering / compression.




                                  Above: singular values of feature vectors from two different
                                  segments of the image at left.
APPLICATIONS – Compression-based Texture Merging (CTM)

Problem with the naïve approach:
strong edges and segment boundaries.




Solution:
Low-level, edge-preserving over-segmentation into small homogeneous
regions.
Simple features: stacked Gaussian windows (7x7 in our experiments).
Merge adjacent regions to minimize the coding length (“compress” the features).
APPLICATIONS – Hierarchical Image Segmentation via CTM




                 ε = 0.1            ε = 0.2              ε = 0.4
Lossy coding with varying distortion ε => hierarchy of
segmentations
APPLICATIONS – CTM: Qualitative Results
APPLICATIONS – CTM: Quantitative Evaluation and Comparison
Berkeley Image Segmentation Database




 PRI:   Probabilistic Rand Index [Pantofaru 2005]
 VoI:    Variation of Information [Meila 2005]
 GCE:   Global Consistency Error [Martin 2001]
 BDE:   Boundary Displacement Error [Freixenet 2002]

Unsupervised Segmentation of Natural Images via Lossy Data Compression, CVIU, 2008
Other Applications: Multiple Motion Segmentation (on Hopkins155)






Two Motions: MSL 4.14%, LSA 3.45%, ALC 2.40%, and work with up to 25% outliers.
Three Motions: MSL 8.32%, LSA 9.73%, ALC 6.26%.

                                Shankar Rao, Roberto Tron, Rene Vidal, and Yi Ma, to appear in CVPR’08
Other Applications – Clustering of Microarray Data




        Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’07]
Other Applications – Clustering of Microarray Data




        Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’07]
Other Applications – Supervised Classification



                    Premises: Data    lie on an
                    arrangement of subspaces




    Unsupervised Clustering           Supervised Classification
    – Generalized PCA                 – Sparse Representation
Other Applications – Robust Face Recognition




        Robust Face Recognition via Sparse Representation, to appear in PAMI 2008
Other Applications: Robust Motion Segmentation (on Hopkins155)




 Dealing with incomplete or mistracked features, with the dataset up to 80%
 corrupted!
           Shankar Rao, Roberto Tron, Rene Vidal, and Yi Ma, to appear in CVPR’08
Three Measures of Sparsity: Bits, L0-Norm, and L1-Norm




Reason: High-dimensional data, like images, do have compact,
compressible, sparse structures, in terms of their geometry,
statistics, and semantics.
Conclusions

   Most imagery data are high-dimensional, statistically or
   geometrically heterogeneous, and have multi-scale
   structures.

   Imagery data require hybrid models that can adaptively
   represent different subsets of the data with different
   (sparse) linear models.

   Mathematically, it is possible to estimate and segment
   hybrid (linear) models non-iteratively. GPCA offers one such
   method.

   Hybrid models lead to new paradigms, new principles, and
   new applications for image representation, compression,
   and segmentation.
Future Directions
 Mathematical Theory
  – Subspace arrangements (algebraic properties).
  – Extension of GPCA to more complex algebraic varieties (e.g.,
    hybrid multilinear, high-order tensors).
  – Representation & approximation of vector-valued functions.

 Computation & Algorithm Development
  – Efficiency, noise sensitivity, outlier elimination.
  – Other ways to combine with wavelets and curvelets.

 Applications to Other Data
  – Medical imaging (ultra-sonic, MRI, diffusion tensor…)
  – Satellite hyper-spectral imaging.
  – Audio, video, faces, and digits.
  – Sensor networks (location, temperature, pressure, RFID…)
  – Bioinformatics (gene expression data…)
Acknowledgement

People
 – Wei Hong, Allen Yang, John Wright, University of Illinois
 – Rene Vidal of Biomedical Engineering Dept., Johns Hopkins
   University
 – Kun Huang of Biomedical & Informatics Science Dept., Ohio-
   State University

Funding
 – Research Board, University of Illinois at Urbana-Champaign
 – National Science Foundation (NSF CAREER IIS-0347456)
 – Office of Naval Research (ONR YIP N000140510633)
 – National Science Foundation (NSF CRS-EHS0509151)
 – National Science Foundation (NSF CCF-TF0514955)
Generalized Principal Component Analysis:
Modeling and Segmentation of Multivariate Mixed
Data

Rene Vidal, Yi Ma, and Shankar Sastry
Springer-Verlag, to appear

                   Thank You!



Yi Ma, CVPR 2008

CVPR2008 tutorial generalized pca

  • 1. Generalized Principal Component Analysis Tutorial @ CVPR 2008 Yi Ma René Vidal ECE Department Center for Imaging Science University of Illinois Institute for Computational Medicine Urbana Champaign Johns Hopkins University
  • 2. Data segmentation and clustering • Given a set of points, separate them into multiple groups • Discriminative methods: learn boundary • Generative methods: learn mixture model, using, e.g. Expectation Maximization
  • 3. Dimensionality reduction and clustering • In many problems data is high-dimensional: can reduce dimensionality using, e.g. Principal Component Analysis • Image compression • Recognition – Faces (Eigenfaces) • Image segmentation – Intensity (black-white) – Texture
  • 4. Segmentation problems in dynamic vision • Segmentation of video and dynamic textures • Segmentation of rigid-body motions
  • 5. Segmentation problems in dynamic vision • Segmentation of rigid-body motions from dynamic textures
  • 6. Clustering data on non Euclidean spaces • Clustering data on non Euclidean spaces – Mixtures of linear spaces – Mixtures of algebraic varieties – Mixtures of Lie groups • “Chicken-and-egg” problems – Given segmentation, estimate models – Given models, segment the data – Initialization? • Need to combine – Algebra/geometry, dynamics and statistics
  • 7. Outline of the tutorial • Introduction (8.00-8.15) • Part I: Theory (8.15-9.45) – Basic GPCA theory and algorithms (8.15-9.00) – Advanced statistical methods for GPCA (9.00-9.45) • Questions (9.45-10.00) • Break (10.00-10.30) • Part II: Applications (10.30-12.00) – Applications to motion and video segmentation (10.30-11.15) – Applications to image representation & segmentation (11.15-12.00) • Questions (12.00-12.15)
  • 8. Part I: Theory • Introduction to GPCA (8.00-8.15) • Basic GPCA theory and algorithms (8.15-9.00) – Review of PCA and extensions – Introductory cases: line, plane and hyperplane segmentation – Segmentation of a known number of subspaces – Segmentation of an unknown number of subspaces • Advanced statistical and methods for GPCA (9.00-9.45) – Lossy coding of samples from a subspace – Minimum coding length principle for data segmentation – Agglomerative lossy coding for subspace clustering
  • 9. Part II: Applications in computer vision • Applications to motion & video segmentation (10.30-11.15) – 2-D and 3-D motion segmentation – Temporal video segmentation – Dynamic texture segmentation • Applications to image representation and segmentation (11.15-12.00) – Multi-scale hybrid linear models for sparse image representation – Hybrid linear models for image segmentation
  • 11. Slides, MATLAB code, papers Slides: http://www.vision.jhu.edu/gpca/cvpr08-tutorial-gpca.htm Code: http://perception.csl.uiuc.edu/gpca
  • 12. Part I Generalized Principal Component Analysis René Vidal Center for Imaging Science Institute for Computational Medicine Johns Hopkins University
  • 13. Principal Component Analysis (PCA) • Given a set of points x1, x2, …, xN – Geometric PCA: find a subspace S passing through them – Statistical PCA: find projection directions that maximize the variance • Solution (Beltrami’1873, Jordan’1874, Hotelling’33, Eckart-Householder-Young’36) Basis for S • Applications: data compression, regression, computer vision (eigenfaces), pattern recognition, genomics
  • 14. Extensions of PCA • Higher order SVD (Tucker’66, Davis’02) • Independent Component Analysis (Common ‘94) • Probabilistic PCA (Tipping-Bishop ’99) – Identify subspace from noisy data – Gaussian noise: standard PCA – Noise in exponential family (Collins et al.’01) • Nonlinear dimensionality reduction – Multidimensional scaling (Torgerson’58) – Locally linear embedding (Roweis-Saul ’00) – Isomap (Tenenbaum ’00) • Nonlinear PCA (Scholkopf-Smola-Muller ’98) – Identify nonlinear manifold by applying PCA to data embedded in high-dimensional space • Principal Curves and Principal Geodesic Analysis (Hastie-Stuetzle’89, Tishbirany ‘92, Fletcher ‘04)
  • 15. Generalized Principal Component Analysis • Given a set of points lying in multiple subspaces, identify – The number of subspaces and their dimensions – A basis for each subspace – The segmentation of the data points • “Chicken-and-egg” problem – Given segmentation, estimate subspaces – Given subspaces, segment the data
  • 16. Prior work on subspace clustering • Iterative algorithms: – K-subspace (Ho et al. ’03), – RANSAC, subspace selection and growing (Leonardis et al. ’02) • Probabilistic approaches: learn the parameters of a mixture model using e.g. EM – Mixtures of PPCA: (Tipping-Bishop ‘99): – Multi-Stage Learning (Kanatani’04) • Initialization – Geometric approaches: 2 planes in R3 (Shizawa-Maze ’91) – Factorization approaches: independent subspaces of equal dimension (Boult-Brown ‘91, Costeira-Kanade ‘98, Kanatani ’01) – Spectral clustering based approaches: (Yan-Pollefeys’06)
  • 17. Basic ideas behind GPCA • Towards an analytic solution to subspace clustering – Can we estimate ALL models simultaneously using ALL data? – When can we do so analytically? In closed form? – Is there a formula for the number of models? • Will consider the most general case – Subspaces of unknown and possibly different dimensions – Subspaces may intersect arbitrarily (not only at the origin) • GPCA is an algebraic geometric approach to data segmentation – Number of subspaces = degree of a polynomial – Subspace basis = derivatives of a polynomial – Subspace clustering is algebraically equivalent to • Polynomial fitting • Polynomial differentiation
  • 18. Applications of GPCA in computer vision • Geometry – Vanishing points • Image compression • Segmentation – Intensity (black-white) – Texture – Motion (2-D, 3-D) – Video (host-guest) • Recognition – Faces (Eigenfaces) • Man - Woman – Human Gaits – Dynamic Textures • Water-bird • Biomedical imaging • Hybrid systems identification
  • 19. Introductory example: algebraic clustering in 1D • Number of groups?
  • 20. Introductory example: algebraic clustering in 1D • How to compute n, c, b’s? – Number of clusters – Cluster centers – Solution is unique if – Solution is closed form if
  • 21. Introductory example: algebraic clustering in 2D • What about dimension 2? • What about higher dimensions? – Complex numbers in higher dimensions? – How to find roots of a polynomial of quaternions? • Instead – Project data onto one or two dimensional space – Apply same algorithm to projected data
  • 22. Representing one subspace • One plane • One line • One subspace can be represented with – Set of linear equations – Set of polynomials of degree 1
  • 23. Representing n subspaces • Two planes • One plane and one line – Plane: – Line: De Morgan’s rule • A union of n subspaces can be represented with a set of homogeneous polynomials of degree n
  • 24. Fitting polynomials to data points • Polynomials can be written linearly in terms of the vector of coefficients by using polynomial embedding Veronese map • Coefficients of the polynomials can be computed from nullspace of embedded data – Solve using least squares – N = #data points
  • 25. Finding a basis for each subspace • Case of hyperplanes: – Only one polynomial – Number of subspaces – Basis are normal vectors Polynomial Factorization (GPCA-PFA) [CVPR 2003] • Find roots of polynomial of degree in one variable • Solve linear systems in variables • Solution obtained in closed form for • Problems – Computing roots may be sensitive to noise – The estimated polynomial may not perfectly factor with noisy – Cannot be applied to subspaces of different dimensions • Polynomials are estimated up to change of basis, hence they may not factor, even with perfect data
  • 26. Finding a basis for each subspace Polynomial Differentiation (GPCA-PDA) [CVPR’04] • To learn a mixture of subspaces we just need one positive example per class
  • 27. Choosing one point per subspace • With noise and outliers – Polynomials may not be a perfect union of subspaces – Normals can estimated correctly by choosing points optimally • Distance to closest subspace without knowing segmentation?
  • 28. GPCA for hyperplane segmentation • Coefficients of the polynomial can be computed from null space of embedded data matrix – Solve using least squares – N = #data points • Number of subspaces can be computed from the rank of embedded data matrix • Normal to the subspaces can be computed from the derivatives of the polynomial
  • 29. GPCA for subspaces of different dimensions • There are multiple polynomials fitting the data • The derivative of each polynomial gives a different normal vector • Can obtain a basis for the subspace by applying PCA to normal vectors
  • 30. GPCA for subspaces of different dimensions • Apply polynomial embedding to projected data • Obtain multiple subspace model by polynomial fitting – Solve to obtain – Need to know number of subspaces • Obtain bases & dimensions by polynomial differentiation • Optimally choose one point per subspace using distance
  • 31. An example • Given data lying in the union of the two subspaces • We can write the union as • Therefore, the union can be represented with the two polynomials
  • 32. An example • Can compute polynomials from • Can compute normals from
  • 33. Dealing with high-dimensional data • Minimum number of points – K = dimension of ambient space – n = number of subspaces Subspace 1 • In practice the dimension of each subspace ki is much smaller than K Subspace 2 – Number and dimension of the subspaces is preserved by a linear projection onto a subspace of dimension • Open problem: how to choose – Can remove outliers by robustly projection? fitting the subspace – PCA?
  • 34. GPCA with spectral clustering • Spectral clustering – Build a similarity matrix between pairs of points – Use eigenvectors to cluster data • How to define a similarity for subspaces? – Want points in the same subspace to be close – Want points in different subspace to be far • Use GPCA to get basis • Distance: subspace angles
  • 35. Comparison of PFA, PDA, K-sub, EM 18 PFA K−sub 16 PDA Error in the normals [degrees] EM 14 PDA+K−sub PDA+EM 12 PDA+K−sub+EM 10 8 6 4 2 0 0 1 2 3 4 5 Noise level [%]
  • 36. Dealing with outliers • GPCA with perfect data • GPCA with outliers • GPCA fails because PCA fails seek a robust estimate of Null(Ln ) where Ln = [ n (x1 ), . . . , n (xN )].
  • 37. Three approaches to tackle outliers • Probability-based: small-probability samples – Probability plots: [Healy 1968, Cox 1968] – PCs: [Rao 1964, Ganadesikan & Kettenring 1972] – M-estimators: [Huber 1981, Camplbell 1980] – Multivariate-trimming (MVT): [Ganadesikan & Kettenring 1972] • Influence-based: large influence on model parameters – Parameter difference with and without a sample: [Hampel et al. 1986, Critchley 1985] • Consensus-based: not consistent with models of high consensus. – Hough: [Ballard 1981, Lowe 1999] – RANSAC: [Fischler & Bolles 1981, Torr 1997] – Least Median Estimate (LME): [Rousseeuw 1984, Steward 1999]
  • 39. Robust GPCA Simulation on Robust GPCA (parameters fixed at = 0.3rad and = 0.4 • RGPCA – Influence (e) 12% (f) 32% (g) 48% (h) 12% (i) 32% (j) 48% • RGPCA - MVT (k) 12% (l) 32% (m) 48% (n) 12% (o) 32% (p) 48%
  • 40. Robust GPCA Comparison with RANSAC • Accuracy (q) (2,2,1) in 3 (r) (4,2,2,1) in 5 (s) (5,5,5) in 6 • Speed Table: Average time of RANSAC and RGPCA with 24% outliers. Arrangement (2,2,1) in 3 (4,2,2,1) in 5 (5,5,5) in 6 RANSAC 44s 5.1min 3.4min MVT 46s 23min 8min Influence 3min 58min 146min
  • 41. Summary • GPCA: algorithm for clustering subspaces – Deals with unknown and possibly different dimensions – Deals with arbitrary intersections among the subspaces • Our approach is based on – Projecting data onto a low-dimensional subspace – Fitting polynomials to projected subspaces – Differentiating polynomials to obtain a basis • Applications in image processing and computer vision – Image segmentation: intensity and texture – Image compression – Face recognition under varying illumination
  • 42. For more information, Vision, Dynamics and Learning Lab @ Johns Hopkins University Thank You!
  • 43. Generalized Principal Component Analysis via Lossy Coding and Compression Yi Ma Image Formation & Processing Group, Beckman Decision & Control Group, Coordinated Science Lab. Electrical & Computer Engineering Department University of Illinois at Urbana-Champaign
  • 44. OUTLINE MOTIVATION PROBLEM FORMULATION AND EXISTING APPROACHES SEGMENTATION VIA LOSSY DATA COMPRESSION SIMULATIONS (AND EXPERIMENTS) CONCLUSIONS AND FUTURE DIRECTIONS
  • 45. MOTIVATION – Motion Segmentation in Computer Vision Goal: Given a sequence of images of multiple moving objects, determine: – 1. the number and types of motions (rigid-body, affine, linear, etc.) 2. the features that belong to the same motion. QuickTime™ and a Cinepak decompressor are needed to see this picture. The “chicken-and-egg” difficulty: – Knowing the segmentation, estimating the motions is easy; – Knowing the motions, segmenting the features is easy. A Unified Algebraic Approach to 2D and 3D Motion Segmentation, [Vidal-Ma, ECCV’
  • 46. MOTIVATION – Image Segmentation Goal: segment an image into multiple regions with homogeneous texture. feature s Computer Human Difficulty: A mixture of models of different dimensions or complexities. Multiscale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’
  • 47. MOTIVATION – Video Segmentation Goal: segmenting a video sequence into segments with “stationary” dynamics Model: different segments as outputs from different (linear) dynamical systems: QuickTime™ and a H.264 decompressor are needed to see this picture. Identification of Hybrid Linear Systems via Subspace Segmentation, [Huang-Wagner-Ma, C
  • 48. MOTIVATION – Massive Multivariate Mixed Data QuickTime™ and a BMP decompressor are needed to see this picture. Face database Hyperspectral images Articulate motions Hand written digits Microarrays
  • 49. SUBSPACE SEGMENTATION – Problem Formulation Assumption: the data are noisy samples from an arrangement of linear subspaces: noise-free samples noisy samples samples with outliers Difficulties: – the dimensions of the subspaces can be different – the data can be corrupted by noise or contaminated by outliers – the number and dimensions of subspaces may be unknown
  • 50. SUBSPACE SEGMENTATION – Statistical Approaches Assume that the data are i.i.d. samples from a mixture of probabilistic distributions: Solutions: • Expectation Maximization (EM) for the maximum-likelihood estimate [Dempster et. al.’77], e.g., Probabilistic PCA [Tipping-Bishop’99]: • K-Means for a minimax-like estimate [Forgy’65, Jancey’66, MacQueen’67], e.g., K-Subspaces [Ho and Kriegman’03]: Essentially iterate between data segmentation and model estimation.
  • 51. SUBSPACE SEGMENTATION – An Algebro-Geometric Approach Idea: a union of linear subspaces is an algebraic set -- the zero set of a set of (homogeneous) polynomials: Solution: • Identify the set of polynomials of degree n that vanish on • Gradients of the vanishing polynomials are normals to the subspaces Complexity exponential in the dimension and number of subspaces. Generalized Principal Component Analysis, [Vidal-Ma-Sastry, IEEE Transactions PAMI’0
  • 52. SUBSPACE SEGMENTATION – An Information-Theoretic Approach Problem: If the number/dimension of subspaces not given and data corrupted by noise and outliers, how to determine the optimal subspaces that fit Solutions: Model Selection Criteria? the data? – Minimum message length (MML) [Wallace-Boulton’68] – Minimum description length (MDL) [Rissanen’78] – Bayesian information criterion (BIC) – Akaike information criterion (AIC) [Akaike’77] – Geometric AIC [Kanatani’03], Robust AIC [Torr’98] Key idea (MDL): • a good balance between model complexity and data fidelity. • minimize the length of codes that describe the model and the data: with a quantization error optimal for the model.
  • 53. LOSSY DATA COMPRESSION Questions: – What is the “gain” or “loss” of segmenting or merging data? – How does tolerance of error affect segmentation results? Basic idea: whether the number of bits required to store “the whole is more than the sum of its parts”?
  • 54. LOSSY DATA COMPRESSION – Problem Formulation – A coding scheme maps a set of vectors to a sequence of bits, from which we can decode The coding length is denoted as: – Given a set of real-valued mixed data the optimal segmentation minimizes the overall coding length: where
  • 55. LOSSY DATA COMPRESSION – Coding Length for Multivariate Data Theorem. Given with is the number of bits needed to encode the data s.t. . A nearly optimal bound for even a small number of vectors drawn from a subspace or a Gaussian source. Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’
  • 56. LOSSY DATA COMPRESSION – Two Coding Schemes Goal: code s.t. a mean squared error Linear subspace Gaussian source
  • 57. LOSSY DATA COMPRESSION – Properties of the Coding Length 1. Commutative Property: For high-dimensional data, computing the coding length only needs the kernel matrix: 2. Asymptotic Property: At high SNR, this is the optimal rate distortion for a Gaussian source. 3. Invariant Property: Harmonic Analysis is useful for data compression only when the data are non-Gaussian or nonlinear ……… so is segmentation!
  • 58. LOSSY DATA COMPRESSION – Why Segment? partitioning: sifting:
  • 59. LOSSY DATA COMPRESSION – Probabilistic Segmentation? Assign the ith point to the jth group with probability Theorem. The expected coding length of the segmented data is a concave function in Π over the domain of a convex polytope. Minima are reached at the vertexes of the polytope -- no probabilistic segmentation! Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’
  • 60. LOSSY DATA COMPRESSION – Segmentation & Channel Capacity A MIMO additive white Gaussian noise (AWGN) channel has the capacity: If allowing probabilistic grouping of transmitters, the expected capacity is a concave function in Π over a convex polytope. Maximizing such a capacity is a convex problem. On Coding and Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI
  • 61. LOSSY DATA COMPRESSION – A Greedy (Agglomerative) Algorithm Objective: minimizing the overall coding length Input: “Bottom-up” merge while true do choose two sets such that is minimal QuickTime™ and a if PNG decompressor are needed to see this picture. then else break endif end Output: Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
  • 62. SIMULATIONS – Mixture of Almost Degenerate Gaussians Noisy samples from two lines and one plane in <3 Given Data Segmentation Results ε0 = 0.01 ε0 = 0.08 Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
  • 63. SIMULATIONS – “Phase Transition” #group v.s. distortion Rate v.s. distortion ε0 = 0.0 0.08 8 ice cubes steam Stability: the same segmentation water for ε across 3 magnitudes! 0.08 Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
  • 64. SIMULATIONS – Comparison with EM 100 x d uniformly distributed random samples from each subspace, corrupte with 4% noise. Classification rate averaged over 25 trials for each case. Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
  • 65. SIMULATIONS – Comparison with EM Segmenting three degenerate or non-degenerate Gaussian clusters for 50 tria Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
  • 66. SIMULATIONS – Robustness with Outliers 35.8% outliers 45.6% 71.5% 73.6% Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
  • 67. SIMULATIONS – Affine Subspaces with Outliers 35.8% outliers 45.6% 66.2% 69.1% Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
  • 68. SIMULATIONS – Piecewise-Linear Approximation of Manifolds Swiss roll Mobius strip Torus Klein bottle
  • 69. SIMULATIONS – Summary – The minimum coding length objective automatically addresses the model selection issue: the optimal solution is very stable and robust. – The segmentation/merging is physically meaningful (measured in bits). The results resemble phase transition in statistical physics. – The greedy algorithm is scalable (polynomial in both K and N) and converges well when ε is not too small w.r.t. the sample density.
  • 70. Clustering from a Classification Perspective Assumption: The training data are drawn from a distribution Goal: Construct a classifier such that the misclassification error reaches minimum. Solution: Knowing the distributions and , the optimal classifier is the maximum a posteriori (MAP) classifier: Difficulties: How to learn the two distributions from samples? (parametric, non-parametric, model selection, high-dimension, outliers…)
  • 71. MINIMUM INCREMENTAL CODING LENGTH – Problem Formulation Ideas: Using the lossy coding length as a surrogate for the Shannon lossless coding length w.r.t. true distributions. Additional bits need to encode the test sample with the jth training set is Classification Criterion: Minimum Incremental Coding Length (MICL)
  • 72. MICL (“Michael”) – Asymptotic Properties Theorem: As the number of samples goes to infinity, the MICL criterion converges with probability one to the following criterion: where ? is the “number of effective parameters” of the j-th model (class). Theorem: The MICL classifier converges to the above asymptotic form at the rate of for some constant . Minimum Incremental Coding Length (MICL), [Wright and Ma et. a.., NIPS’07]
  • 73. SIMULATIONS – Interpolation and Extrapolation via MICL MICL SVM k-NN Minimum Incremental Coding Length (MICL), [Wright and Ma et. a.., NIPS’07]
  • 74. SIMULATIONS – Improvement over MAP and RDA [Friedman1989] Two Gaussians in R2 isotropic (left) anisotropic (right) (500 trials) Three Gaussians in Rn dim = n dim = n/2 dim = 1 (500 trials) Minimum Incremental Coding Length (MICL), [Wright and Ma et. a.., NIPS’07]
  • 75. SIMULATIONS – Local and Kernel MICL Local MICL (LMICL): Applying MICL locally to the k-nearest neighbors of the test sample (frequencylist + Bayesianist). Kernel MICL (KMICL): Incorporating MICL with a nonlinear kernel naturally through the identity (“kernelized” RDA): LMICL k- KMICL-RBF SVM-RBF NN Minimum Incremental Coding Length (MICL), [Wright and Ma et. a.., NIPS’07]
  • 76. CONCLUSIONS Assumptions: Data are in a high-dimensional space but have low-dimensional structures (subspaces or submanifolds). Compression => Clustering & Classification: – Minimum (incremental) coding length subject to distortion. – Asymptotically optimal clustering and classification. – Greedy clustering algorithm (bottom-up, agglomerative). – MICL corroborates MAP, RDA, k-NN, and kernel methods. Applications (Next Lectures): – Video segmentation, motion segmentation (Vidal) – Image representation & segmentation (Ma) – Others: microarray clustering, recognition of faces and handwritten digits (Ma)
  • 77. FUTURE DIRECTIONS Theory – More complex structures: manifolds, systems, random fields… – Regularization (ridge, lasso, banding etc.) – Sparse representation and subspace arrangements Computation – Global optimality (random techniques, convex optimization…) – Scalability: random sampling, approximation… Future Application Domains – Image/video/audio classification, indexing, and retrieval – Hyper-spectral images and videos – Biomedical images, microarrays – Autonomous navigation, surveillance, and 3D mapping – Identification of hybrid linear/nonlinear systems
  • 78. REFERENCES & ACKNOWLEGMENT References: – Segmentation of Multivariate Mixed Data via Lossy Data Compression, Yi Ma, Harm Derksen, Wei Hong, John Wright, PAMI, 2007. – Classification via Minimum Incremental Coding Length (MICL), John Wright et. al., NIPS, 2007. – Website: http://perception.csl.uiuc.edu/coding/home.htm People: – John Wright, PhD Student, ECE Department, University of Illinois – Prof. Harm Derksen, Mathematics Department, University of Michigan – Allen Yang (UC Berkeley) and Wei Hong (Texas Instruments R&D) – Zhoucheng Lin and Harry Shum, Microsoft Research Asia, China Funding: – ONR YIP N00014-05-1-0633 – NSF CAREER IIS-0347456, CCF-TF-0514955, CRS-EHS-0509151
  • 79. 11/2003 “The whole is more than the sum of its parts.” -- Aristotle Questions, please? Yi Ma, CVPR 2008
  • 80. Part II Applications of GPCA in Computer Vision René Vidal Center for Imaging Science Institute for Computational Medicine Johns Hopkins University
  • 81. Part II: Applications in computer vision • Applications to motion & video segmentation (10.30-11.15) – 2-D and 3-D motion segmentation – Temporal video segmentation – Dynamic texture segmentation • Applications to image representation and segmentation (11.15-12.00) – Multi-scale hybrid linear models for sparse image representation – Hybrid linear models for image segmentation
  • 82. Applications to motion and and video segmentation René Vidal Center for Imaging Science Institute for Computational Medicine Johns Hopkins University
  • 83. 3-D motion segmentation problem • Given a set of point correspondences in multiple views, determine – Number of motion models – Motion model: affine, homography, fundamental matrix, trifocal tensor – Segmentation: model to which each pixel belongs • Mathematics of the problem depends on – Number of frames (2, 3, multiple) – Projection model (affine, perspective) – Motion model (affine, translational, homography, fundamental matrix, etc.) – 3-D structure (planar or not)
• 84. Taxonomy of problems • 2-D Layered representation – Probabilistic approaches: Jepson-Black’93, Ayer-Sawhney’95, Darrell-Pentland’95, Weiss-Adelson’96, Weiss’97, Torr-Szeliski-Anandan’99 – Variational approaches: Cremers-Soatto ICCV’03 – Initialization: Wang-Adelson’94, Irani-Peleg’92, Shi-Malik‘98, Vidal-Singaraju’05-’06 • Multiple rigid motions in two perspective views – Probabilistic approaches: Feng-Perona’98, Torr’98 – Particular cases: Izawa-Mase’92, Shashua-Levin’01, Sturm’02 – Multibody fundamental matrix: Wolf-Shashua CVPR’01, Vidal et al. ECCV’02, CVPR’03, IJCV’06 – Motions of different types: Vidal-Ma-ECCV’04, Rao-Ma-ICCV’05 • Multiple rigid motions in three perspective views – Multibody trifocal tensor: Hartley-Vidal-CVPR’04 • Multiple rigid motions in multiple affine views – Factorization-based: Costeira-Kanade’98, Gear’98, Wu et al.’01, Kanatani et al.’01-’04 – Algebraic: Yan-Pollefeys-ECCV’06, Vidal-Hartley-CVPR’04 • Multiple rigid motions in multiple perspective views – Schindler et al. ECCV’06, Li et al. CVPR’07
• 85. A unified approach to motion segmentation • Estimation of multiple motion models is equivalent to estimation of a single multibody motion model, which circumvents the chicken-and-egg problem: – Eliminate feature clustering: multiplication – Estimate a single multibody motion model: polynomial fitting – Segment the multibody motion model: polynomial differentiation
• 86. A unified approach to motion segmentation • Applies to most motion models in computer vision • All motion models can be segmented algebraically by – Fitting the multibody model: a real or complex polynomial fit to all data – Fitting individual models: differentiating the polynomial at a data point
• 87. Segmentation of 3-D translational motions • Multiple epipoles (translation) • Epipolar constraint e'(x2 × x1) = 0: a plane in R^3 – Plane normal = epipole e – Data = epipolar lines l = x2 × x1 • Multibody epipolar constraint: p(l) = (e1'l)(e2'l)···(en'l) = 0 • Epipoles are derivatives of p at epipolar lines: ∇p(l) ~ ei for a line l of motion i
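A small numerical sketch of this fit-then-differentiate pipeline on noise-free synthetic data with n = 2 translational motions (the epipoles, the `veronese` helper, and all names here are illustrative assumptions, not the tutorial's code):

```python
import numpy as np
from itertools import combinations_with_replacement

def veronese(x, n):
    """All monomials of degree n in the entries of x (the Veronese map)."""
    idx = combinations_with_replacement(range(len(x)), n)
    return np.array([np.prod(x[list(i)]) for i in idx])

# Two hypothetical epipoles; each epipolar line l satisfies (e1'l)(e2'l) = 0
e = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 1.0])]
rng = np.random.default_rng(0)
L = []
for _ in range(50):                       # sample lines orthogonal to an epipole
    ei = e[rng.integers(2)]
    v = rng.standard_normal(3)
    l = v - (v @ ei) / (ei @ ei) * ei     # project onto the plane with normal ei
    L.append(l / np.linalg.norm(l))

# Polynomial fitting: coefficients of p span the null space of the embedded data
V = np.array([veronese(l, 2) for l in L])
c = np.linalg.svd(V)[2][-1]

# Polynomial differentiation: grad p at a line of motion i points along e_i
def grad_p(l, h=1e-6):
    return np.array([(veronese(l + h*d, 2) - veronese(l - h*d, 2)) @ c
                     for d in np.eye(3)]) / (2 * h)

g = grad_p(L[0])
print(g / np.linalg.norm(g))              # one epipole, up to sign and scale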
  • 88. Segmentation of 3-D translational motions
• 89. Single-body factorization • Affine camera model: xfp = Af Xp (p = point, f = frame); Structure = 3D surface, Motion = camera position and orientation • Stack the measurements into W = [xfp] in R^(2F x P) (P = #points, F = #frames); then W = M S, with motion M in R^(2F x 4) and structure S in R^(4 x P) • Motion of one rigid body lives in a 4-D subspace of R^(2F) (Boult and Brown ’91, Tomasi and Kanade ‘92)
• 90. Multi-body factorization • Given n rigid motions, W = [W1, …, Wn] up to a permutation of the columns • Motion segmentation is obtained from – Leading singular vector of W (Boult and Brown ’91) – Shape interaction matrix Q = V V' from the SVD W = U Σ V' (Costeira & Kanade ’95, Gear ’94): Q is block diagonal, one block per motion – Number of motions from rank(W) (if fully-dimensional) • Motion subspaces need to be independent (Kanatani ’01)
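A short sketch of the shape interaction matrix on synthetic trajectories (the toy setup is an illustrative assumption; as noted above, the block-diagonal structure only holds when the motion subspaces are independent):

```python
import numpy as np

def shape_interaction_matrix(W, rank):
    """Costeira-Kanade: Q = V V' from the rank-r SVD of the measurement matrix.
    Q[i, j] ~ 0 when points i and j belong to different independent motions."""
    V = np.linalg.svd(W, full_matrices=False)[2][:rank].T   # P x rank
    return V @ V.T

# Hypothetical toy data: two independent rigid motions, a 4-D subspace each
rng = np.random.default_rng(1)
F, P = 10, 20                                     # frames, points per motion
W = np.hstack([rng.standard_normal((2*F, 4)) @ rng.standard_normal((4, P)),
               rng.standard_normal((2*F, 4)) @ rng.standard_normal((4, P))])
Q = shape_interaction_matrix(W, rank=8)
print(np.abs(Q[:P, P:]).max())                    # cross-motion block: near zero
```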
  • 91. Multi-body factorization • Sensitive to noise – Kanatani (ICCV ’01): use model selection to scale Q – Wu et al. (CVPR’01): project data onto subspaces and iterate • Fails with partially dependent motions – Zelnik-Manor and Irani (CVPR’03) • Build similarity matrix from normalized Q • Apply spectral clustering to similarity matrix – Yan and Pollefeys (ECCV’06) • Local subspace estimation + spectral clustering – Kanatani (ECCV’04) • Assume degeneracy is known: pure translation in the image • Segment data by multi-stage optimization (multiple EM problems) • Cannot handle missing data – Gruber and Weiss (CVPR’04) • Expectation Maximization
• 92. PowerFactorization+GPCA • A motion segmentation algorithm that – Is provably correct with perfect data – Handles both independent and degenerate motions – Handles both complete and incomplete data • Project trajectories onto a 5-D subspace of R^(2F) – Complete data: PCA or SVD – Incomplete data: PowerFactorization • Cluster projected subspaces using GPCA – Handles both independent and degenerate motions – Non-iterative: can be used to initialize EM
• 93. Projection onto a 5-D subspace • Motion of one rigid body lives in a 4-D subspace of R^(2F) • By projecting onto a 5-D subspace of R^(2F) – Number and dimensions of subspaces are preserved – Motion segmentation is equivalent to clustering subspaces of dimension 2, 3 or 4 in R^5 – Minimum #frames = 3 (CK needs a minimum of 2n frames for n motions) • What projection to use? – PCA: 5 principal components – RPCA: can remove outliers by robustly fitting the 5-D subspace using Robust SVD (DeLaTorre-Black)
• 94. Projection onto a 5-D subspace • PowerFactorization algorithm: given W, factor it as W ≈ A B' of rank r – Given A, solve for B (a linear problem), then orthonormalize B – Given B, solve for A (a linear problem) – Iterate • Complete data: converges to the rank-r approximation of W at a rate governed by the singular-value ratio σr+1/σr • Incomplete data: diverges in some cases, but works well with up to 30% of missing data
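A sketch of the alternation for incomplete data (function and variable names are hypothetical; this follows the scheme described on the slide, not the authors' implementation):

```python
import numpy as np

def power_factorization(W, mask, r, iters=100, seed=0):
    """Rank-r factorization W ~ A @ B.T using only observed entries
    (mask[i, j] = True when W[i, j] is known)."""
    m, n = W.shape
    A = np.random.default_rng(seed).standard_normal((m, r))
    for _ in range(iters):
        B = np.zeros((n, r))
        for j in range(n):                       # given A, solve for B (linear)
            obs = mask[:, j]
            B[j] = np.linalg.lstsq(A[obs], W[obs, j], rcond=None)[0]
        B = np.linalg.qr(B)[0]                   # orthonormalize B
        A = np.zeros((m, r))
        for i in range(m):                       # given B, solve for A (linear)
            obs = mask[i]
            A[i] = np.linalg.lstsq(B[obs], W[i, obs], rcond=None)[0]
    return A, B

# Hypothetical test: a rank-4 matrix with roughly 30% of the entries missing
rng = np.random.default_rng(1)
W = rng.standard_normal((40, 4)) @ rng.standard_normal((4, 60))
mask = rng.random(W.shape) > 0.3
A, B = power_factorization(W, mask, r=4)
print(np.linalg.norm((W - A @ B.T)[mask]) / np.linalg.norm(W[mask]))
```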
• 95. Motion segmentation using GPCA • Apply the polynomial embedding (Veronese map) to the 5-D points: νn(x) maps x to the vector of all monomials of degree n in the entries of x
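For concreteness, a sketch of the embedding applied to projected 5-D points (the data here are random placeholders; in the algorithm, the multibody polynomial coefficients come from the null space of the embedded data matrix):

```python
import numpy as np
from math import comb
from itertools import combinations_with_replacement

def veronese_map(X, n):
    """Embed each row of X by all monomials of degree n in its entries."""
    N, K = X.shape
    idx = list(combinations_with_replacement(range(K), n))
    assert len(idx) == comb(n + K - 1, n)        # dimension of the embedding
    return np.column_stack([np.prod(X[:, list(i)], axis=1) for i in idx])

X = np.random.default_rng(0).standard_normal((100, 5))  # placeholder 5-D points
print(veronese_map(X, n=3).shape)                # (100, 35) for n = 3 motions
```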
  • 96. Hopkins 155 motion segmentation database • Collected 155 sequences – 120 with 2 motions – 35 with 3 motions • Types of sequences – Checkerboard sequences: mostly full dimensional and independent motions – Traffic sequences: mostly degenerate (linear, planar) and partially dependent motions – Articulated sequences: mostly full dimensional and partially dependent motions • Point correspondences – In few cases, provided by Kanatani & Pollefeys – In most cases, extracted semi-automatically with OpenCV
  • 97. Experimental results: Hopkins 155 database • 2 motions, 120 sequences, 266 points, 30 frames
  • 98. Experimental results: Hopkins 155 database • 3 motions, 35 sequences, 398 points, 29 frames
  • 99. Experimental results: missing data sequences • There is no clear correlation between amount of missing data and percentage of misclassification • This could be because convergence of PF depends more on “where” missing data is located than on “how much” missing data there is
• 100. Conclusions • For two motions – Algebraic methods (GPCA and LSA) are more accurate than statistical methods (RANSAC and MSL) – LSA performs better on full and independent sequences, while GPCA performs better on degenerate and partially dependent sequences – LSA is sensitive to dimension of projection: d=4n better than d=5 – MSL is very slow, RANSAC and GPCA are fast • For three motions – GPCA is not very accurate, but is very fast – MSL is the most accurate, but it is very slow – LSA is almost as accurate as MSL and almost as fast as GPCA
  • 101. Segmentation of Dynamic Textures René Vidal Center for Imaging Science Institute for Computational Medicine Johns Hopkins University
• 102. Modeling a dynamic texture: fixed boundary • Examples of dynamic textures • Model temporal evolution as the output of a linear dynamical system (LDS) (Soatto et al. ‘01): zt+1 = A zt + vt (dynamics), yt = C zt + wt (images; C encodes appearance)
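A minimal identification sketch in the spirit of this model (the SVD-based suboptimal procedure; the reshaping convention and the state dimension k are assumptions for illustration):

```python
import numpy as np

def fit_lds(Y, k):
    """Fit y_t = C z_t + w_t, z_{t+1} = A z_t + v_t to a (pixels x frames)
    matrix Y: C from the top-k left singular vectors, A by least squares."""
    U, s, Vt = np.linalg.svd(Y - Y.mean(axis=1, keepdims=True),
                             full_matrices=False)
    C = U[:, :k]                          # appearance (output map)
    Z = np.diag(s[:k]) @ Vt[:k]           # states z_1..z_F (k x F)
    A = np.linalg.lstsq(Z[:, :-1].T, Z[:, 1:].T, rcond=None)[0].T  # dynamics
    return A, C, Z

# Hypothetical usage on a video volume of shape (H, W, F):
#   A, C, Z = fit_lds(video.reshape(H * W, F), k=10)
```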
• 103. Segmenting non-moving dynamic textures • One dynamic texture (zt+1 = A zt + vt, yt = C zt + wt) lives in the observability subspace spanned by the columns of [C; CA; CA^2; …] • Multiple textures (e.g., water and steam) live in multiple subspaces • Cluster the data using GPCA
  • 105. Segmenting moving dynamic textures Ocean-bird
• 106. Level-set intensity-based segmentation • Chan-Vese energy functional: E(C, c1, c2) = μ length(C) + λ1 ∫inside(C) (I − c1)² + λ2 ∫outside(C) (I − c2)² • Implicit methods – Represent C as the zero level set of an implicit function φ, i.e. C = {(x, y) : φ(x, y) = 0} • Solution – Gradient descent on φ: φt = δ(φ) [ μ div(∇φ/|∇φ|) − λ1 (I − c1)² + λ2 (I − c2)² ] – c1 and c2 are the mean intensities inside and outside the contour C.
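A sketch of one gradient-descent step of this scheme, assuming the standard piecewise-constant Chan-Vese formulation (parameter values are placeholders):

```python
import numpy as np

def chan_vese_step(phi, I, mu=0.2, lam1=1.0, lam2=1.0, dt=0.5, eps=1.0):
    """One gradient-descent step on the Chan-Vese functional.
    phi: level-set function (contour C = {phi = 0}); I: grayscale image."""
    inside, outside = phi >= 0, phi < 0
    c1 = I[inside].mean() if inside.any() else 0.0    # mean intensity inside C
    c2 = I[outside].mean() if outside.any() else 0.0  # mean intensity outside C
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx**2 + gy**2) + 1e-10
    curv_y, _ = np.gradient(gy / norm)                # div(grad phi / |grad phi|)
    _, curv_x = np.gradient(gx / norm)
    curvature = curv_x + curv_y
    delta = eps / (np.pi * (eps**2 + phi**2))         # smoothed Dirac delta
    return phi + dt * delta * (mu * curvature
                               - lam1 * (I - c1)**2 + lam2 * (I - c2)**2)
```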
• 107. Dynamics & intensity-based energy • We represent the intensities of the pixels in the images as the output of a mixture of AR models of order p: yt(x) = a1(x) yt−1(x) + … + ap(x) yt−p(x) + wt(x) • We propose a spatial-temporal extension of the Chan-Vese energy functional in which the intensity residuals (I − cj)² are replaced by the AR prediction errors of each region’s model
• 108. Variational segmentation of dynamic textures • Given the ARX parameters, we can solve for the implicit function φ by gradient descent on the corresponding PDE • Given the implicit function φ, we can solve for the ARX parameters of the jth region by solving a linear system (least squares over the pixels of that region)
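A rough sketch of the two alternating steps, assuming per-region AR(p) models fit by pooled least squares (all names are illustrative; the PDE step would reuse a level-set update like the one sketched above, with the intensity terms replaced by AR residuals):

```python
import numpy as np

def fit_ar(Y, region, p):
    """Least-squares AR(p) fit pooled over the pixels selected by `region`.
    Y is (frames x pixels); returns coefficients a_1..a_p."""
    Z = Y[:, region]
    T = Z.shape[0]
    X = np.column_stack([Z[p - i:T - i].ravel() for i in range(1, p + 1)])
    return np.linalg.lstsq(X, Z[p:].ravel(), rcond=None)[0]

def ar_residual(Y, a):
    """Mean squared AR prediction error per pixel."""
    p = len(a)
    pred = sum(a[i - 1] * Y[p - i:Y.shape[0] - i] for i in range(1, p + 1))
    return ((Y[p:] - pred) ** 2).mean(axis=0)

# Alternation (outline):
#   repeat until convergence:
#     a_in  = fit_ar(Y, phi.ravel() >= 0, p)     # ARX parameters, region 1
#     a_out = fit_ar(Y, phi.ravel() <  0, p)     # ARX parameters, region 2
#     evolve phi by the level-set PDE, with (I - c1)^2 and (I - c2)^2
#     replaced by ar_residual(Y, a_in) and ar_residual(Y, a_out)
```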
  • 109. Variational segmentation of dynamic textures • Fixed boundary segmentation results and comparison Ocean-smoke Ocean-dynamics Ocean-appearance
  • 110. Variational segmentation of dynamic textures • Moving boundary segmentation results and comparison Ocean-fire
  • 111. Variational segmentation of dynamic textures • Results on a real sequence Raccoon on River
• 112. Temporal video segmentation • Segmenting N=30 frames of a sequence containing n=3 scenes – Host – Guest – Both • Image intensities are the output of a linear system: xt+1 = A xt + vt (dynamics), yt = C xt + wt (images; C encodes appearance) • Apply GPCA to fit n=3 observability subspaces
• 113. Temporal video segmentation • Segmenting N=60 frames of a sequence containing n=3 scenes – Burning wheel – Burnt car with people – Burning car • Image intensities are the output of a linear system: xt+1 = A xt + vt (dynamics), yt = C xt + wt (images; C encodes appearance) • Apply GPCA to fit n=3 observability subspaces
  • 114. Conclusions • Many problems in computer vision can be posed as subspace clustering problems – Temporal video segmentation – 2-D and 3-D motion segmentation – Dynamic texture segmentation – Nonrigid motion segmentation • These problems can be solved using GPCA: an algorithm for clustering subspaces – Deals with unknown and possibly different dimensions – Deals with arbitrary intersections among the subspaces • GPCA is based on – Projecting data onto a low-dimensional subspace – Recursively fitting polynomials to projected subspaces – Differentiating polynomials to obtain a basis
• 115. For more information: Vision, Dynamics and Learning Lab @ Johns Hopkins University. Thank You!
  • 116. Generalized Principal Component Analysis for Image Representation & Segmentation Yi Ma Control & Decision, Coordinated Science Laboratory Image Formation & Processing Group, Beckman Department of Electrical & Computer Engineering University of Illinois at Urbana-Champaign
  • 117. INTRODUCTION GPCA FOR LOSSY IMAGE REPRESENTATION IMAGE SEGMENTATION VIA LOSSY COMPRESSION OTHER APPLICATIONS CONCLUSIONS AND FUTURE DIRECTIONS
• 118. Introduction – Image Representation via Linear Transformations • Pixel-based representation: three matrices of RGB values • Better representations? A more compact representation via a linear transformation
• 119. Introduction • Fixed Orthogonal Bases (representation, approximation, compression) - Discrete Fourier transform (DFT) or discrete cosine transform (DCT) (Ahmed ’74): JPEG. - Wavelets (multi-resolution) (Daubechies’88, Mallat’92): JPEG-2000. - Curvelets and contourlets (Candes & Donoho’99, Do & Vetterli’00) [Figure: DFT vs. wavelet approximation from 6.25% of the coefficients] • Non-orthogonal Bases (for redundant representations) - Extended lapped transforms, frames, sparse representations (Lp geometry)…
• 120. Introduction • Adaptive Bases (optimal if imagery data are uni-modal) - Karhunen-Loeve transform (KLT), also known as PCA (Pearson’1901, Hotelling’33, Jolliffe’86): stack image blocks as vectors and learn adaptive bases from the data
• 121. Introduction – Principal Component Analysis (PCA) • Dimensionality Reduction: find a low-dimensional representation (model) for high-dimensional data • Principal Component Analysis (Pearson’1901, Hotelling’1933, Eckart & Young’1936) or Karhunen-Loeve transform (KLT): a basis for the subspace S is obtained from the SVD of the data matrix • Variations of PCA – Nonlinear Kernel PCA (Scholkopf-Smola-Muller’98) – Probabilistic PCA (Tipping-Bishop’99, Collins et al.’01) – Higher-Order SVD (HOSVD) (Tucker’66, Davis’02) – Independent Component Analysis (Hyvarinen-Karhunen-Oja’01)
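For concreteness, a minimal PCA/KLT sketch via the SVD (the data layout is an assumption; e.g. 8x8 image blocks stacked as 64-vectors):

```python
import numpy as np

def pca(X, d):
    """X is (ambient dim x samples). Returns an orthonormal basis B for the
    d-dim subspace S, the d-dim coordinates Y, and the mean mu."""
    mu = X.mean(axis=1, keepdims=True)
    B = np.linalg.svd(X - mu, full_matrices=False)[0][:, :d]
    Y = B.T @ (X - mu)                    # compact representation
    return B, Y, mu                       # reconstruction: mu + B @ Y
```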
  • 122. Hybrid Linear Models – Multi-Modal Characteristics Distribution of the first three principal components of the Baboon image: A clear multi-modal distribution
  • 123. Hybrid Linear Models – Multi-Modal Characteristics Vector Quantization (VQ) - multiple 0-dimensional affine subspaces (i.e. cluster means) - existing clustering algorithms are iterative (EM, K-means)
• 124. Hybrid Linear Models – Versus Linear Models [Figure: a single linear model (linear stack) vs. hybrid linear models (hybrid linear stack)]
• 125. Hybrid Linear Models – Characteristics of Natural Images • Desired characteristics: 1D, 2D, high-dimensional/multivariate (vector-valued), hybrid (multi-modal), and hierarchical (multi-scale) • Each existing representation captures only a subset of these: Fourier (DCT) and wavelets two each, curvelets one, random fields and PCA/KLT three each, VQ four; hybrid linear models cover all five • We need a new & simple paradigm to effectively account for all these characteristics simultaneously.
• 126. Hybrid Linear Models – Subspace Estimation and Segmentation Hybrid Linear Models (or Subspace Arrangements) – the number of subspaces is unknown – the dimensions of the subspaces are unknown – the bases of the subspaces are unknown – the segmentation of the data points is unknown “Chicken-and-Egg” Coupling – Given segmentation, estimate subspaces – Given subspaces, segment the data
  • 127. Hybrid Linear Models – Recursive GPCA (an Example)
• 128. Hybrid Linear Models – Effective Dimension • Model Selection (for Noisy Data): trade off model complexity against data fidelity • Effective dimension: ED = (1/N) Σj [ dj (D − dj) + Nj dj ], where n = number of subspaces, N = total number of points, D = ambient dimension, dj = dimension of each subspace, Nj = number of points in each subspace • Model selection criterion: minimize effective dimension subject to a given error tolerance (or PSNR)
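A small sketch of this criterion, assuming the effective-dimension formula given above (the example values are placeholders):

```python
def effective_dimension(D, dims, counts):
    """ED = (1/N) * sum_j [ d_j (D - d_j) + N_j d_j ]: cost of the subspace
    bases plus cost of the point coordinates, averaged per sample."""
    N = sum(counts)
    return sum(d * (D - d) + Nj * d for d, Nj in zip(dims, counts)) / N

# e.g. a plane (d=2) and a line (d=1) in R^3, 100 points each:
print(effective_dimension(3, [2, 1], [100, 100]))   # 1.52
```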
  • 129. Hybrid Linear Models – Simulation Results (5% Noise) ED=3 ED=2.0067 ED=1.6717
  • 130. Hybrid Linear Models – Subspaces of the Barbara Image
• 131. Hybrid Linear Models – Lossy Image Representation (Baboon) [Panels: Original, PCA (8x8), DCT (JPEG), Haar wavelet, GPCA (8x8)]
• 132. Multi-Scale Implementation – Algorithm Diagram Diagram for a level-3 implementation of hybrid linear models for image representation Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 133. Multi-Scale Implementation – The Baboon Image The Baboon image, downsampled by two, twice; segmentation of 2-by-2 blocks Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 134. Multi-Scale Implementation – Comparison with Other Methods The Baboon image Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 135. Multi-Scale Implementation – Image Approximation Comparison with level-3 wavelet (7.5% coefficients): level-3 bior-4.4 wavelets, PSNR=23.94; level-3 hybrid linear model, PSNR=24.64 Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 136. Multi-Scale Implementation – Block Size Effect The Baboon image Some problems with the multi-scale hybrid linear model: 1. it has a minor block effect; 2. it is computationally more costly (than Fourier, wavelets, PCA); 3. it does not exploit spatial smoothness as fully as wavelets do. Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 137. Multi-Scale Implementation – The Wavelet Domain The Baboon image in the wavelet domain (HL, LH, HH subbands); segmentation at each scale Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 138. Multi-Scale Implementation – Wavelets vs. Hybrid Linear Models The Baboon image Advantages of the hybrid linear model in the wavelet domain: 1. it eliminates the block effect; 2. it is computationally less costly (than in the spatial domain); 3. it achieves higher PSNR. Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 139. Multi-Scale Implementation – Visual Comparison Comparison among several models (7.5% coefficients): original image; wavelets, PSNR=23.94; hybrid model in the spatial domain, PSNR=24.64; hybrid model in the wavelet domain, PSNR=24.88 Multi-Scale Hybrid Linear Models for Lossy Image Representation, [Hong-Wright-Ma, TIP’06]
• 140. Image Segmentation – via Lossy Data Compression
• 141. APPLICATIONS – Texture-Based Image Segmentation Naïve approach: – Take a 7x7 Gaussian window around every pixel. – Stack these windows as vectors. – Cluster the vectors using our algorithm. A few results: Segmentation of Multivariate Mixed Data via Lossy Coding and Compression, [Ma-Derksen-Hong-Wright, PAMI’07]
• 142. APPLICATIONS – Distribution of Texture Features Question: why does such a simple algorithm work at all? Answer: compression (MDL/MCL) is well suited to mid-level texture segmentation. Using a single representation (e.g. windows, filterbank responses) for textures of different complexity ⇒ redundancy and degeneracy, which can be exploited for clustering / compression. [Figure: singular values of feature vectors from two different segments of the image at left.]
• 143. APPLICATIONS – Compression-based Texture Merging (CTM) Problem with the naïve approach: strong edges and segment boundaries. Solution: low-level, edge-preserving over-segmentation into small homogeneous regions. Simple features: stacked Gaussian windows (7x7 in our experiments). Merge adjacent regions to minimize coding length (“compress” the features).
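A sketch of the coding-length computation that drives the merging (following the lossy coding length of [Ma-Derksen-Hong-Wright, PAMI'07] as we understand it; the greedy loop is summarized in the comments):

```python
import numpy as np

def coding_length(X, eps):
    """Approximate bits needed to code the N columns of the D x N matrix X
    up to distortion eps (Gaussian/subspace-like codebook)."""
    D, N = X.shape
    mu = X.mean(axis=1, keepdims=True)
    Sigma = (X - mu) @ (X - mu).T / N
    logdet = np.linalg.slogdet(np.eye(D) + (D / eps**2) * Sigma)[1]
    return (N + D) / 2 * logdet / np.log(2) \
         + D / 2 * np.log2(1 + (mu.T @ mu).item() / eps**2)

# Greedy CTM merging (outline): repeatedly merge the pair of *adjacent*
# regions (i, j) that most decreases the total length, i.e. maximizes
#   coding_length(Xi, eps) + coding_length(Xj, eps) - coding_length(Xij, eps)
# plus the saving in label-coding cost; stop when no merge decreases it.
```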
  • 144. APPLICATIONS – Hierarchical Image Segmentation via CTM ε = 0.1 ε = 0.2 ε = 0.4 Lossy coding with varying distortion ε => hierarchy of segmentations
  • 145. APPLICATIONS – CTM: Qualitative Results
• 146. APPLICATIONS – CTM: Quantitative Evaluation and Comparison Berkeley Image Segmentation Database PRI: Probabilistic Rand Index [Pantofaru 2005] VoI: Variation of Information [Meila 2005] GCE: Global Consistency Error [Martin 2001] BDE: Boundary Displacement Error [Freixenet 2002] Unsupervised Segmentation of Natural Images via Lossy Data Compression, CVIU, 2008
• 147. Other Applications: Multiple Motion Segmentation (on Hopkins155) Two motions: MSL 4.14%, LSA 3.45%, ALC 2.40%; ALC works with up to 25% outliers. Three motions: MSL 8.32%, LSA 9.73%, ALC 6.26%. Shankar Rao, Roberto Tron, Rene Vidal, and Yi Ma, to appear in CVPR’08
• 148. Other Applications – Clustering of Microarray Data Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’07]
• 149. Other Applications – Clustering of Microarray Data Segmentation of Multivariate Mixed Data, [Ma-Derksen-Hong-Wright, PAMI’07]
• 150. Other Applications – Supervised Classification Premises: data lie on an arrangement of subspaces. Unsupervised clustering – Generalized PCA. Supervised classification – Sparse representation.
  • 151. Other Applications – Robust Face Recognition Robust Face Recognition via Sparse Representation, to appear in PAMI 2008
• 152. Other Applications: Robust Motion Segmentation (on Hopkins155) Dealing with incomplete or mistracked features, even with 80% of the dataset corrupted! Shankar Rao, Roberto Tron, Rene Vidal, and Yi Ma, to appear in CVPR’08
• 153. Three Measures of Sparsity: Bits, the L0-Norm, and the L1-Norm Reason: high-dimensional data, like images, do have compact, compressible, sparse structures, in terms of their geometry, statistics, and semantics.
  • 154. Conclusions Most imagery data are high-dimensional, statistically or geometrically heterogeneous, and have multi-scale structures. Imagery data require hybrid models that can adaptively represent different subsets of the data with different (sparse) linear models. Mathematically, it is possible to estimate and segment hybrid (linear) models non-iteratively. GPCA offers one such method. Hybrid models lead to new paradigms, new principles, and new applications for image representation, compression, and segmentation.
  • 155. Future Directions Mathematical Theory – Subspace arrangements (algebraic properties). – Extension of GPCA to more complex algebraic varieties (e.g., hybrid multilinear, high-order tensors). – Representation & approximation of vector-valued functions. Computation & Algorithm Development – Efficiency, noise sensitivity, outlier elimination. – Other ways to combine with wavelets and curvelets. Applications to Other Data – Medical imaging (ultra-sonic, MRI, diffusion tensor…) – Satellite hyper-spectral imaging. – Audio, video, faces, and digits. – Sensor networks (location, temperature, pressure, RFID…) – Bioinformatics (gene expression data…)
• 156. Acknowledgement People – Wei Hong, Allen Yang, John Wright, University of Illinois – Rene Vidal of the Biomedical Engineering Dept., Johns Hopkins University – Kun Huang of the Biomedical & Informatics Science Dept., Ohio State University Funding – Research Board, University of Illinois at Urbana-Champaign – National Science Foundation (NSF CAREER IIS-0347456) – Office of Naval Research (ONR YIP N000140510633) – National Science Foundation (NSF CRS-EHS0509151) – National Science Foundation (NSF CCF-TF0514955)
  • 157. Generalized Principal Component Analysis: Modeling and Segmentation of Multivariate Mixed Data Rene Vidal, Yi Ma, and Shankar Sastry Springer-Verlag, to appear Thank You! Yi Ma, CVPR 2008