Object Recognition In Probabilistic 3-D Volumetric Scenes


                         Maria Isabel Restrepo
                         Brandon A. Mayer
                         Joseph L. Mundy
Goal: Automated Scene Description




Maria Isabel Restrepo. February 7, 2012                                       2
Goal: Automated Scene Description




Maria Isabel Restrepo. February 7, 2012                                       3
Related Work: 3-d Object Retrieval

     Scale-Invariant Heat Kernel Signatures (SI-HKS): A. M. Bronstein, et al., 2011

     Hough Transforms and 3D SURF for robust three dimensional classification: J. Knopp, et al., 2010

     R. Toldo, U. Castellani, and A. Fusiello, 2009

     [Slide shows excerpts and figures from the cited papers.]
 Maria Isabel Restrepo. February 7, 2012                                                                                                              4
                 Related Work: Scene Description In LIDAR

     Strip Histogram Grid: T. Korah (Nokia Research Center, Hollywood), S. Medasani and
     Y. Owechko (HRL Labs, Malibu), 2011. A segmentation approach for LIDAR data from urban
     scenes that encodes the scene as a grid of vertical 3D population histograms rising up
     from the locally detected ground. Applied to areas spanning several kilometers in multiple
     cities, with data collected from both aerial and ground sensors: almost a billion points
     spanning an area of 3.3 km² were processed in less than an hour on a regular desktop, as
     part of a 3D recognition system that demonstrated over 60% accuracy on 40 classes.

     Object Detection from Large-Scale 3D Datasets: A. Patterson and P. Mordohai, 2008.

     Car detection in urban point clouds (precision-recall on a ground-truth area with 1221
     manually labeled cars): A. Golovinskiy, et al., 2009.

Maria Isabel Restrepo. February 7, 2012                                       5
Challenges Of Multi-View Stereo




Maria Isabel Restrepo. February 7, 2012                                     6
Challenges Of Multi-View Stereo
     Scene Ambiguity:




Maria Isabel Restrepo. February 7, 2012                                     6
Challenges Of Multi-View Stereo
     Scene Ambiguity:




  Scene Uncertainty:




Maria Isabel Restrepo. February 7, 2012                                                      6
Probabilistic 3-d Volumetric Model: PVM
                Probabilistic representation of 3-d scenes based on
                         volumetric units: the voxel.

     [Diagram: a camera with center C observes intensity I_X in image I; the pixel
     back-projects along the ray R_X through the voxel volume; a voxel X' on the surface S
     stores a distribution P(I_X | V = X') over intensity.]

                                          Pollard and Mundy, 2007
Maria Isabel Restrepo. February 7, 2012                                                             7
Probabilistic 3-d Volumetric Modeling


     [Diagram: camera center C, image intensity I_X, projection ray R_X through the voxel
     volume, surface S, and the appearance distribution P(I_X | V = X') over intensity stored
     at voxel X'.]



Maria Isabel Restrepo. February 7, 2012                                                 8
Probabilistic 3-d Volumetric Modeling


            Surface probability is given by on-line Bayesian learning
$$P^{N+1}(X \in S \mid I_X^{N+1}) \;=\; P^{N}(X \in S)\,\frac{p^{N}\!\left(I_X^{N+1} \mid X \in S\right)}{p^{N}\!\left(I_X^{N+1}\right)}$$

     [Diagram: camera center C, image intensity I_X, projection ray R_X through the voxel
     volume, surface S, and the appearance distribution P(I_X | V = X') at voxel X'.]




Maria Isabel Restrepo. February 7, 2012                                                                 9
               Probabilistic 3-d Volumetric Modeling

          Update using information along a projection ray

$$P^{N+1}(X \in S) \;=\; P^{N}(X \in S)\,\frac{p^{N}\!\left(I_X^{N+1} \mid X \in S\right)}{p^{N}\!\left(I_X^{N+1}\right)} \qquad (3)$$

where both terms expand over the voxels X' of the projection ray R_X:

$$P^{N+1}(X \in S) \;=\; P^{N}(X \in S)\,\frac{\sum_{X' \in R_X} p^{N}\!\left(I_X^{N+1} \mid V = X'\right) P\!\left(V = X' \mid X \in S\right)}{\sum_{X' \in R_X} p^{N}\!\left(I_X^{N+1} \mid V = X'\right) P^{N}\!\left(V = X'\right)} \qquad (4)$$

Intuitively, the surface probability of voxel X increases when the Gaussian mixture (1) at
that voxel explains the intensity observed in the N+1 image better than any other voxel along
the projection ray. To make the PVM representation clear, a term-by-term explanation of update
equation (4) is outlined:

  • The term p^N(I_X^{N+1} | V = X') is computed using the mixture-of-Gaussians model stored
    at the voxel X'.

  • The probability of a voxel X' producing the color in the image is interpreted
    geometrically: a voxel produces the intensity seen in the image if it is a surface element
    and it is not occluded by other voxels along the ray. Thus,

$$P^{N}(V = X') \;=\; P^{N}(X' \in S)\,P^{N}(X' \text{ is not occluded}) \qquad (5)$$

    The probability of occlusion is defined as the probability that all voxels between X' and
    the sensor are empty.

Maria Isabel Restrepo. February 7, 2012                                                      10
               Probabilistic 3-d Volumetric Modeling

            Every voxel contains appearance information

Each voxel's appearance is modeled with a Gaussian mixture, given by (1). I refers to the
grey-level intensity, but can be considered a vector with various channels of color. The
quantities µ_k, σ_k and ω_k are the mean, standard deviation and mixing parameter associated
with the k-th distribution; W is the sum of ω_k over all mixture components.

$$p(I) \;=\; \sum_{k=1}^{3} \frac{\omega_k}{W}\,\frac{1}{\sqrt{2\pi\sigma_k^2}}\;\exp\!\left(-\frac{(I-\mu_k)^2}{2\sigma_k^2}\right) \qquad (1)$$

This mixture gives the probability of the observed intensity, given that the voxel produced
the color seen in the image.

Maria Isabel Restrepo. February 7, 2012                                                      11
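To make equation (1) concrete, here is a minimal NumPy sketch that evaluates a voxel's
appearance mixture; the helper name `mixture_likelihood` and the example parameter values are
illustrative, not the authors' implementation.

```python
import numpy as np

def mixture_likelihood(intensity, mu, sigma, w):
    """Evaluate the per-voxel appearance mixture of eq. (1).

    mu, sigma, w: arrays of length K (K <= 3 on the slides) holding the mean,
    standard deviation and un-normalized mixing weight of each Gaussian
    component; W is the sum of the weights w_k.
    """
    mu, sigma, w = map(np.asarray, (mu, sigma, w))
    W = w.sum()
    gauss = np.exp(-(intensity - mu) ** 2 / (2.0 * sigma ** 2)) \
            / np.sqrt(2.0 * np.pi * sigma ** 2)
    return float(np.sum(w / W * gauss))

# Example: a voxel whose appearance is mostly dark, with one bright mode.
p = mixture_likelihood(0.8, mu=[0.2, 0.7, 0.9], sigma=[0.1, 0.05, 0.08],
                       w=[3.0, 1.0, 1.0])
print(p)
```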
               Probabilistic 3-d Volumetric Modeling

  • The probability that a voxel X' is not occluded is the probability that all voxels between
    X' and the sensor are empty, namely:

$$P^{N}(X' \text{ is not occluded}) \;=\; \prod_{X'' < X'} \left(1 - P^{N}(X'' \in S)\right) \qquad (6)$$

  • The term P^N(V = X' | X ∈ S) is computed analogously to P^N(V = X'); however, any
    instances of P^N(X ∈ S) are set to one, since X is conditioned to be a surface element.

  • The mixture parameters are learned using a modified on-line expectation-maximization (EM)
    algorithm similar to that used in background modeling [45]; the update applies a learning
    weight, d_ω, upon observing image N+1.

Maria Isabel Restrepo. February 7, 2012                                                      12
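The update of equations (3) to (6) can be read as a single pass over the discretized voxels of
a projection ray. The sketch below is a hedged reconstruction under simplifying assumptions: a
single intensity observation per ray, voxels ordered from the sensor outward, per-voxel
appearance densities passed in as callables (e.g. the mixture of eq. (1)), and the conditional
term of (4) computed by setting P(X ∈ S) to one for the conditioned voxel, as described above.
It is not the authors' code.

```python
import numpy as np

def update_ray(P_surf, appearance, I_obs):
    """One Bayesian update of surface probabilities along a projection ray.

    P_surf     : array of P^N(X' in S) for the ray's voxels, sensor outward.
    appearance : list of callables; appearance[i](I) returns p^N(I | V = X'_i).
    I_obs      : intensity observed along this ray in image N+1.
    Returns the posterior P^{N+1}(X in S) for every voxel (eqs. 3-6).
    """
    P_surf = np.asarray(P_surf, dtype=float)
    n = len(P_surf)

    # Eq. (6): visibility = probability all voxels nearer the sensor are empty.
    vis = np.concatenate(([1.0], np.cumprod(1.0 - P_surf)[:-1]))

    # Eq. (5): P^N(V = X') = P^N(X' in S) * P^N(X' not occluded).
    P_V = P_surf * vis

    # Per-voxel appearance likelihoods p^N(I | V = X').
    like = np.array([appearance[i](I_obs) for i in range(n)])

    # Denominator of eq. (4): total evidence along the ray.
    evidence = np.sum(like * P_V)

    post = np.empty(n)
    for i in range(n):
        # P(V = X' | X_i in S): condition by setting P(X_i in S) = 1, so X_i
        # is visible with probability vis[i] and blocks every voxel behind it.
        P_V_given = P_V.copy()
        P_V_given[i] = vis[i]
        P_V_given[i + 1:] = 0.0
        post[i] = P_surf[i] * np.sum(like * P_V_given) / evidence
    return np.clip(post, 0.0, 1.0)

# Example: three voxels; the middle one explains the observation I = 0.8.
gauss = lambda mu, s: (lambda I: np.exp(-(I - mu) ** 2 / (2 * s * s))
                       / np.sqrt(2 * np.pi * s * s))
print(update_ray([0.1, 0.5, 0.2],
                 [gauss(0.2, 0.1), gauss(0.8, 0.1), gauss(0.5, 0.1)],
                 I_obs=0.8))  # middle voxel's surface probability rises
```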
Spatial Optimization: Octree


                                                     empty space




                                                              surface



Maria Isabel Restrepo. February 7, 2012                                 13
Spatial Optimization: Octree


                                                     empty space




                                                              surface



Maria Isabel Restrepo. February 7, 2012                                 14
Spatial Optimization: Octree




                 [Plots: per-voxel appearance distributions, p(intensity) vs. intensity.]




        Crispell, Mundy and Taubin 2011
         Miller, Jain and Mundy 2011
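As a rough sketch of the octree idea on these slides (coarse cells in empty space, fine cells
near surfaces), the toy code below splits cells whose surface probability exceeds a threshold
until a minimum cell size is reached. The splitting criterion, thresholds and class layout are
assumptions for illustration only; see Crispell, Mundy and Taubin 2011 and Miller, Jain and
Mundy 2011 for the actual data structure.

```python
from dataclasses import dataclass, field

@dataclass
class OctreeCell:
    origin: tuple          # (x, y, z) of the cell's min corner
    size: float            # edge length
    p_surface: float       # current surface probability of the cell
    children: list = field(default_factory=list)

def refine(cell, p_thresh=0.3, min_size=0.25):
    """Recursively split cells that plausibly contain surface.

    Cells with low surface probability (empty space) stay coarse; cells
    above `p_thresh` are split into 8 children until `min_size` is reached.
    In a real model the children's probabilities would be re-estimated.
    """
    if cell.p_surface < p_thresh or cell.size <= min_size:
        return
    half = cell.size / 2.0
    x0, y0, z0 = cell.origin
    for dx in (0, half):
        for dy in (0, half):
            for dz in (0, half):
                child = OctreeCell((x0 + dx, y0 + dy, z0 + dz), half,
                                   cell.p_surface)
                cell.children.append(child)
                refine(child, p_thresh, min_size)

def count_leaves(c):
    return 1 if not c.children else sum(count_leaves(ch) for ch in c.children)

root = OctreeCell((0.0, 0.0, 0.0), 1.0, p_surface=0.6)
refine(root)
print(count_leaves(root))  # 64 leaves for a uniformly "surface-like" root
```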

Maria Isabel Restrepo. February 7, 2012                                                         15
Probabilistic 3-d Volumetric Modeling




                                                     Demo:
                                          https://vimeo.com/43729866




Maria Isabel Restrepo. February 7, 2012                                              16
Geometry And Appearance




                                                     Demo:
                                          https://vimeo.com/43690883
                                          https://vimeo.com/45322168




Maria Isabel Restrepo. February 7, 2012                                       17
Expected Appearance Volume Model: EVM

          Voxel’s Expected Appearance $\;=\; E(I_X \mid V = X')\,P(X' \in S)$
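A small sketch of how this per-voxel quantity could be computed, assuming the conditional
expectation E(I_X | V = X') is the weighted mean of the voxel's Gaussian mixture; names and
values are illustrative.

```python
import numpy as np

def expected_appearance(P_surf, mu, w):
    """E(I_X | V = X') * P(X' in S) for one voxel.

    P_surf : surface probability P(X' in S) of the voxel.
    mu, w  : means and mixing weights of the voxel's Gaussian mixture; the
             conditional expectation of the mixture is sum_k (w_k / W) mu_k.
    """
    mu, w = np.asarray(mu), np.asarray(w)
    E_I = float(np.sum(w / w.sum() * mu))
    return E_I * P_surf

print(expected_appearance(0.9, mu=[0.2, 0.8], w=[1.0, 3.0]))  # -> 0.585
```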




Maria Isabel Restrepo. February 7, 2012                                       18
Object Categorization: Bag Of Volumetric Words



                                       Categories: Parking, Car, Plane, Building, House

        Input:       Feature sampling:      Descriptor:      Volumetric Vocabulary:      Classifier:
         EVM              Dense             Taylor, PCA            K-means               Naive Bayes



Maria Isabel Restrepo. February 7, 2012                                                        19
Experiments: Data Collection




                   http://vision.lems.brown.edu/project_desc/Object-Recognition-in-Probabilistic-3D-Scenes

Maria Isabel Restrepo. February 7, 2012                                                                      20
Experiments: Train And Test Sites



                 Site 1                   Site 2    Site 3           Site 5            Site 6




                 Site 7                   Site 8    Site 10           Site 11          Site 12




                 Site 16                  Site 18   Site 21          Site 22           Site 23




                                          Site 25   Site 26           Site 27

                    http://vision.lems.brown.edu/project_desc/Object-Recognition-in-Probabilistic-3D-Scenes
Maria Isabel Restrepo. February 7, 2012                                                                       21
Experiments: The Input




             Camera matrices were recovered using Bundler: Snavely, N., Seitz, S. M., and Szeliski, R. (2006). Photo tourism:
                                      exploring photo collections in 3D. ACM Transactions on Graphics.
Maria Isabel Restrepo. February 7, 2012                                                                                                      22
Feature Description

     Global Features
       Spherical Harmonics: D. Saupe and D. V. Vranić, 2001
       3D Zernike Moments: M. Novotni and R. Klein, 2003

     Local Features
       Spin Images: Johnson and Hebert, 1999
       3-D SURF: J. Knopp, et al., 2010
       3-D Shape Context (Recognizing Objects in Range Data Using Regional Point
       Descriptors): Frome, et al., 2004

Maria Isabel Restrepo. February 7, 2012                                                      23
Feature Formation
                       Volumetric Form of          Vector Form of Voxel
                      Voxel Neighborhoods             Neighborhoods




               $E(I_X \mid V = X')\,P(X' \in S)$




                                              24

Maria Isabel Restrepo. February 7, 2012
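The mapping from volumetric form to vector form amounts to cropping a cubic window of
expected-appearance values around each voxel and flattening it; the sketch below shows that
step for a dense grid. The window radius is an assumption.

```python
import numpy as np

def neighborhood_vectors(evm, radius=2):
    """Flatten cubic voxel neighborhoods of an EVM grid into row vectors.

    evm    : 3-d array of expected appearance values E(I|V=X') P(X' in S).
    radius : half-width of the cubic neighborhood (window side = 2r+1).
    Returns an (n_samples, (2r+1)^3) matrix, one row per interior voxel.
    """
    r = radius
    nx, ny, nz = evm.shape
    rows = []
    for i in range(r, nx - r):
        for j in range(r, ny - r):
            for k in range(r, nz - r):
                rows.append(evm[i-r:i+r+1, j-r:j+r+1, k-r:k+r+1].ravel())
    return np.vstack(rows)

X = neighborhood_vectors(np.random.rand(8, 8, 8))
print(X.shape)  # (64, 125)
```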
                 Feature Description: PCA Features

             1-dimensional space                   d-dimensional space

The principal axes are obtained by the eigenvalue decomposition of the sample scatter matrix
S. In the PCA space, every neighborhood (represented by a d-dimensional feature vector x) can
be exactly expressed as

$$x \;=\; \bar{x} + \sum_{i=1}^{d} a_i e_i,$$

where the $e_i$ are principal axes associated with the d eigenvalues, and the $a_i$ are the
corresponding coefficients. A k-dimensional (k < d) approximation of the neighborhoods can be
obtained by using the first k principal components, i.e.

$$\tilde{x} \;=\; \bar{x} + \sum_{i=1}^{k} a_i e_i$$

(in the 1-dimensional case, $x \approx \bar{x} + a_1 e_1$). Section V presents a detailed
analysis of the reconstruction error of local neighborhoods, namely $|x - \tilde{x}|^2$, as a
function of dimension and training set size. In the remainder of this paper, the vector
arrangement of projection coefficients in the PCA space is referred to as a PCA feature.
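A compact NumPy sketch of the construction above: eigen-decompose the sample scatter matrix S
and keep the first k projection coefficients as the PCA feature. Data shapes and k are
placeholders.

```python
import numpy as np

def pca_features(X, k):
    """Project neighborhood vectors onto the first k principal axes.

    X : (n, d) matrix of neighborhood vectors.
    k : number of principal components to keep (k < d).
    Returns (coeffs, mean, axes) with coeffs of shape (n, k), so that
    x_tilde = mean + coeffs @ axes.T approximates each row of X.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    S = Xc.T @ Xc                      # sample scatter matrix
    evals, evecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    axes = evecs[:, ::-1][:, :k]       # top-k principal axes e_1..e_k
    coeffs = Xc @ axes                 # projection coefficients a_1..a_k
    return coeffs, mean, axes

X = np.random.rand(200, 125)
a, m, E = pca_features(X, k=10)
X_tilde = m + a @ E.T
print(np.mean(np.sum((X - X_tilde) ** 2, axis=1)))  # reconstruction error |x - x~|^2
```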
Maria Isabel Restrepo. February 7, 2012                                                    25
                 Feature Description: Taylor Features

In the EVM, the computation of derivatives can be expressed as a least-square-error
minimization of the following energy function.

Minimize:

$$E \;=\; \sum_{i=-n_i}^{n_i}\,\sum_{j=-n_j}^{n_j}\,\sum_{k=-n_k}^{n_k} \left[\tilde{V}(i,j,k) - V(i,j,k)\right]^2 \qquad (6)$$

where $\tilde{V}(i,j,k)$ is the Taylor series approximation of the expected 3-d appearance of
a volume V centered on the point (i, j, k). Using the second-degree Taylor expansion about
(0, 0, 0), (6) becomes

$$E \;=\; \sum_{x} \left[V(x) - V_0 - x^{T}G - \frac{1}{2!}\,x^{T}Hx\right]^2 \qquad (7)$$

where $V_0$, G, H are the zeroth derivative, the gradient vector and the Hessian matrix of the
volume of expected 3-d appearances about the point (0, 0, 0), respectively. Coefficients for
3-d derivative operators can be found by minimizing (7) with respect to the zeroth, first and
second order derivatives. The computed derivative operators are applied algebraically to
neighborhoods in the EVM.
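The minimization of (7) is linear in the ten Taylor coefficients, so each neighborhood reduces
to one least-squares solve: every voxel offset x contributes a row of Taylor monomials, and
the solution stacks V_0, the gradient G and the (symmetric) Hessian H. A sketch under the
assumption of a cubic window:

```python
import numpy as np

def taylor_coefficients(vol):
    """Fit V(x) ~ V0 + x^T G + 0.5 x^T H x over a cubic neighborhood.

    vol : (2r+1, 2r+1, 2r+1) array of expected appearances.
    Returns the 10 coefficients (V0, Gx, Gy, Gz, Hxx, Hyy, Hzz, Hxy, Hxz, Hyz)
    minimizing the squared error of eq. (7) over the window.
    """
    r = vol.shape[0] // 2
    offs = np.arange(-r, r + 1)
    i, j, k = np.meshgrid(offs, offs, offs, indexing="ij")
    x, y, z = i.ravel(), j.ravel(), k.ravel()
    # Design matrix: one column per Taylor monomial (0.5 factors come from
    # expanding 0.5 x^T H x with a symmetric H).
    A = np.column_stack([np.ones_like(x), x, y, z,
                         0.5 * x**2, 0.5 * y**2, 0.5 * z**2,
                         x * y, x * z, y * z]).astype(float)
    coef, *_ = np.linalg.lstsq(A, vol.ravel(), rcond=None)
    return coef

vol = np.random.rand(5, 5, 5)
print(taylor_coefficients(vol)[:4])  # V0 and the gradient G
```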
 Maria Isabel Restrepo. February 7, 2012                             26
Learning The Codebook

               Learn Volumetric Vocabulary using K-Means Clustering:
          ✤    Determine the best number of means: Heuristically
          ✤    Convergence depends on initialization: P. S. Bradley
               and U. M. Fayyad. 1998
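A minimal pure-NumPy sketch of the vocabulary step (Lloyd's algorithm); in practice the number
of words is chosen heuristically and several initializations are tried, per the Bradley and
Fayyad reference above. Function names and the restart policy are illustrative.

```python
import numpy as np

def kmeans_vocabulary(features, k, iters=50, seed=0):
    """Learn k volumetric words by Lloyd's algorithm.

    features : (n, d) descriptor matrix (PCA or Taylor features).
    Returns (centers, labels); the centers are the visual words v_1..v_k.
    Convergence depends on initialization, so in practice the clustering is
    restarted from several (sub-sampled) initializations.
    """
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(features[:, None, :] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            members = features[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return centers, labels

centers, labels = kmeans_vocabulary(np.random.rand(500, 10), k=20)
print(centers.shape)  # (20, 10): twenty volumetric words
```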




Maria Isabel Restrepo. February 7, 2012                                 27
Vocabulary: Twenty Volumetric Words


  PCA based




  Taylor based



Maria Isabel Restrepo. February 7, 2012                                     28
Learning Class Distributions Using Bayes

Let the vocabulary of 3-d expected appearance patterns be defined as V = \bigcup_{i=1}^{k} v_i, where k is the number of cluster centers in the vocabulary. From the quantization step a count c_{ij} is obtained of the number of times a cluster center v_i occurs in object o_j. Let O_l be the set of all objects with class label l; the set of all objects is then O = \bigcup_{l=1}^{N_c} O_l, where N_c is the number of classes. Using Bayes' formula, the a posteriori class probability is given by:

P(C_l \mid o_i) \propto P(o_i \mid C_l) P(C_l)    (8)

The likelihood of an object is given by the product of the likelihoods of the independent entries of the vocabulary, P(v_j \mid C_l), which are estimated during learning. The full expression for the class posterior becomes:

P(C_l \mid o_i) \propto P(C_l) \prod_{j=1}^{k} P(v_j \mid C_l)^{c_{ji}}    (9)

P(C_l \mid o_i) \propto P(C_l) \prod_{j=1}^{k} \left( \frac{\sum_{m : o_m \in O_l} c_{jm}}{\sum_{n=1}^{k} \sum_{m : o_m \in O_l} c_{nm}} \right)^{c_{ji}}    (10)

Maria Isabel Restrepo. February 7, 2012                                   29
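
Equations (8)-(10) reduce to simple counting, evaluated in log space to avoid underflow. The sketch below is illustrative, not the authors' code; the small epsilon stands in for whatever smoothing the original implementation may use, which the slides do not specify.

import numpy as np

def train_likelihoods(counts, labels, n_classes):
    """Estimate P(v_j | C_l) as relative word frequencies per class, eq. (10).

    counts[j, i] is c_ji: occurrences of word v_j in training object o_i.
    """
    k = counts.shape[0]
    like = np.zeros((n_classes, k))
    for l in range(n_classes):
        class_counts = counts[:, labels == l].sum(axis=1)
        like[l] = class_counts / class_counts.sum()
    return like

def classify(word_counts, like, priors):
    """argmax_l of log P(C_l) + sum_j c_ji log P(v_j | C_l), eqs. (8)-(9)."""
    log_post = np.log(priors) + word_counts @ np.log(like + 1e-12).T
    return int(np.argmax(log_post))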
Classification: Bayes Classifier

At classification time the same formulation (equations 8-10) is applied: a test object's descriptors are quantized against the learned vocabulary, and the object is assigned the class label that maximizes the posterior P(C_l \mid o_i).

Maria Isabel Restrepo. February 7, 2012                                   30
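
Continuing the illustrative sketches above, classifying one test object amounts to quantizing its descriptors, histogramming the word ids, and taking the maximum-posterior class (all names remain hypothetical):

# Estimated from the training split.
like = train_likelihoods(train_counts, train_labels, n_classes=5)
priors = np.bincount(train_labels) / len(train_labels)

# Quantize a test object's descriptors and evaluate the posterior.
test_words = kmeans.predict(test_descriptors)          # nearest word ids
hist = np.bincount(test_words, minlength=20).astype(float)
predicted_class = classify(hist, like, priors)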
Learning Class Distributions

The word likelihoods P(v_j \mid C_l) of equation (10) are estimated from the Train objects.

Maria Isabel Restrepo. February 7, 2012                                   31
Learning Class Distributions

The likelihoods estimated from the Train objects are then used to evaluate the posterior for the held-out Test objects.

Maria Isabel Restrepo. February 7, 2012                                   32
Results: PCA Classes
                                          Buildings




                                           Planes




Maria Isabel Restrepo. February 7, 2012                                  33
Results: Taylor Classes
                                          Buildings




                                           Planes




Maria Isabel Restrepo. February 7, 2012                                     34
Experiments: Number Of Objects

                Table 2: Number of objects in every category during training and classification.

                                     Planes   Cars   Houses   Buildings   Parking Lots
                Train                  18      54     61         24            27
                Test                   16      29     45         15            17

           Two measurements were used to evaluate the classification performance: (i) classifier accuracy (i.e. the fraction of correctly classified objects), and (ii) the confusion matrix. During the classification experiments, run over the 18 probabilistic sites, the number of clusters in the codebook was varied from k = 2 to k = 100. Figure 4 presents classification accuracy as a function of the number of clusters for both Taylor-based and PCA-based features.
Maria Isabel Restrepo. February 7, 2012                                                  35
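
A sketch of that sweep, reusing the earlier illustrative helpers (histogram_per_object is a hypothetical helper that builds the per-object word-count matrix; the protocol details are assumptions, not the authors' exact procedure):

# Hypothetical accuracy-vs-k sweep over the codebook size.
priors = np.bincount(train_labels) / len(train_labels)
accuracies = {}
for k in range(2, 101):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(train_descriptors)
    train_counts = histogram_per_object(km, train_objects, k)  # hypothetical helper
    test_counts = histogram_per_object(km, test_objects, k)
    like = train_likelihoods(train_counts, train_labels, n_classes=5)
    preds = [classify(c, like, priors) for c in test_counts.T]
    accuracies[k] = float(np.mean(np.array(preds) == test_labels))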
Results: Classification Accuracy




Maria Isabel Restrepo. February 7, 2012                                 36
Results: Confusion Matrix

(a) PCA
True Class      Plane   House   Building   Car    Parking Lot
Plane            0.86    0.02     0.00     0.03       0.00
House            0.00    0.67     0.27     0.00       0.12
Building         0.00    0.31     0.67     0.00       0.00
Car              0.00    0.00     0.07     0.93       0.00
Parking Lot      0.14    0.00     0.00     0.03       0.88

(b) Taylor
True Class      Plane   House   Building   Car    Parking Lot
Plane            0.86    0.02     0.00     0.03       0.00
House            0.00    0.64     0.27     0.00       0.12
Building         0.00    0.33     0.67     0.00       0.00
Car              0.00    0.00     0.07     0.86       0.00
Parking Lot      0.14    0.00     0.00     0.10       0.88

Fig. 9. Confusion matrices for a 20-keyword codebook: PCA-based features (a) and Taylor-based features (b).

Maria Isabel Restrepo. February 7, 2012                                   37
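
For reference, the row-normalized confusion matrices of Fig. 9 can be tallied from predictions as follows (illustrative sketch):

import numpy as np

def confusion_matrix(true_labels, pred_labels, n_classes):
    """cm[t, p] = fraction of class-t objects predicted as class p."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    return cm / cm.sum(axis=1, keepdims=True)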
Future Work




     ✴ Evaluate   the effectiveness of the EVM by performing classification
         tasks on different underlying 3-d reconstruction algorithms.

     ✴ Evaluate   the performance of additional feature descriptors.

     ✴ Explore   algorithms for object detection.




Maria Isabel Restrepo. February 7, 2012                                                     38
Effectiveness Of Probabilistic Volumetric Learning




Maria Isabel Restrepo. February 7, 2012
                                                        Y. Furukawa and J. Ponce, 2010   39
Effectiveness Of Probabilistic Volumetric Learning




       Probabilistic 3-d Modeling  |  Threshold-Based 3-d Modeling
Maria Isabel Restrepo. February 7, 2012                                               40
Effectiveness Of Probabilistic Volumetric Learning




Maria Isabel Restrepo. February 7, 2012                                               41

Weitere ähnliche Inhalte

Was ist angesagt?

Ten years of expertise in integrated earth model steen agerlin petersen
Ten years of expertise in integrated earth model steen agerlin petersenTen years of expertise in integrated earth model steen agerlin petersen
Ten years of expertise in integrated earth model steen agerlin petersenStatoil
 
CS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene GraphsCS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene GraphsMark Kilgard
 
Rotman Lens Performance Analysis
Rotman Lens Performance AnalysisRotman Lens Performance Analysis
Rotman Lens Performance AnalysisIDES Editor
 
Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...
Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...
Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...Kalle
 
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Jia-Bin Huang
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesIDES Editor
 
Large Scale Parallel FDTD Simulation of Full 3D Photonic Crystal Structures
Large Scale Parallel FDTD Simulation of Full 3D Photonic Crystal StructuresLarge Scale Parallel FDTD Simulation of Full 3D Photonic Crystal Structures
Large Scale Parallel FDTD Simulation of Full 3D Photonic Crystal Structuresayubimoak
 
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...Seiya Ito
 
Grindinger Group Wise Similarity And Classification Of Aggregate Scanpaths
Grindinger Group Wise Similarity And Classification Of Aggregate ScanpathsGrindinger Group Wise Similarity And Classification Of Aggregate Scanpaths
Grindinger Group Wise Similarity And Classification Of Aggregate ScanpathsKalle
 
A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)
A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)
A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)Mark Kilgard
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...ijtsrd
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 
Geolocation techniques
Geolocation techniquesGeolocation techniques
Geolocation techniquesSpringer
 

Was ist angesagt? (20)

Ipta2010
Ipta2010Ipta2010
Ipta2010
 
C g.2010 supply
C g.2010 supplyC g.2010 supply
C g.2010 supply
 
Ten years of expertise in integrated earth model steen agerlin petersen
Ten years of expertise in integrated earth model steen agerlin petersenTen years of expertise in integrated earth model steen agerlin petersen
Ten years of expertise in integrated earth model steen agerlin petersen
 
CS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene GraphsCS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene Graphs
 
Rotman Lens Performance Analysis
Rotman Lens Performance AnalysisRotman Lens Performance Analysis
Rotman Lens Performance Analysis
 
Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...
Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...
Dorr Space Variant Spatio Temporal Filtering Of Video For Gaze Visualization ...
 
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
 
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral ImagesBand Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
Band Clustering for the Lossless Compression of AVIRIS Hyperspectral Images
 
Large Scale Parallel FDTD Simulation of Full 3D Photonic Crystal Structures
Large Scale Parallel FDTD Simulation of Full 3D Photonic Crystal StructuresLarge Scale Parallel FDTD Simulation of Full 3D Photonic Crystal Structures
Large Scale Parallel FDTD Simulation of Full 3D Photonic Crystal Structures
 
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...
 
CS 354 Shadows
CS 354 ShadowsCS 354 Shadows
CS 354 Shadows
 
Grindinger Group Wise Similarity And Classification Of Aggregate Scanpaths
Grindinger Group Wise Similarity And Classification Of Aggregate ScanpathsGrindinger Group Wise Similarity And Classification Of Aggregate Scanpaths
Grindinger Group Wise Similarity And Classification Of Aggregate Scanpaths
 
A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)
A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)
A Practical and Robust Bump-mapping Technique for Today’s GPUs (paper)
 
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
Shadow Detection and Removal using Tricolor Attenuation Model Based on Featur...
 
Fr2410361039
Fr2410361039Fr2410361039
Fr2410361039
 
Loop snakesiv05
Loop snakesiv05Loop snakesiv05
Loop snakesiv05
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Isvc08
Isvc08Isvc08
Isvc08
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Geolocation techniques
Geolocation techniquesGeolocation techniques
Geolocation techniques
 

Andere mochten auch

Chocolate caliente para le alma d elos adolescentes
Chocolate caliente para le alma d elos adolescentesChocolate caliente para le alma d elos adolescentes
Chocolate caliente para le alma d elos adolescentesmelaniemolina
 
Chocolate caliente para el alma
Chocolate caliente para el almaChocolate caliente para el alma
Chocolate caliente para el almaRamiro Alejo
 
Chocolate caliente para el alma de los adolescentes 2
Chocolate caliente para el alma de los adolescentes 2Chocolate caliente para el alma de los adolescentes 2
Chocolate caliente para el alma de los adolescentes 2mily_20
 
Comprension lectora
Comprension lectoraComprension lectora
Comprension lectorademamoro2
 
Chocolate caliente para el alma de los adolescentes los dos primeros capitulos
Chocolate caliente para el alma de los adolescentes los dos primeros capitulosChocolate caliente para el alma de los adolescentes los dos primeros capitulos
Chocolate caliente para el alma de los adolescentes los dos primeros capitulosAlisson Reynoso Gonzales
 
Chocolate caliente para el alma de los adolescentes
Chocolate caliente para el alma de los adolescentesChocolate caliente para el alma de los adolescentes
Chocolate caliente para el alma de los adolescentesmily_20
 
Sample slides by Garr Reynolds
Sample slides by Garr ReynoldsSample slides by Garr Reynolds
Sample slides by Garr Reynoldsgarr
 

Andere mochten auch (7)

Chocolate caliente para le alma d elos adolescentes
Chocolate caliente para le alma d elos adolescentesChocolate caliente para le alma d elos adolescentes
Chocolate caliente para le alma d elos adolescentes
 
Chocolate caliente para el alma
Chocolate caliente para el almaChocolate caliente para el alma
Chocolate caliente para el alma
 
Chocolate caliente para el alma de los adolescentes 2
Chocolate caliente para el alma de los adolescentes 2Chocolate caliente para el alma de los adolescentes 2
Chocolate caliente para el alma de los adolescentes 2
 
Comprension lectora
Comprension lectoraComprension lectora
Comprension lectora
 
Chocolate caliente para el alma de los adolescentes los dos primeros capitulos
Chocolate caliente para el alma de los adolescentes los dos primeros capitulosChocolate caliente para el alma de los adolescentes los dos primeros capitulos
Chocolate caliente para el alma de los adolescentes los dos primeros capitulos
 
Chocolate caliente para el alma de los adolescentes
Chocolate caliente para el alma de los adolescentesChocolate caliente para el alma de los adolescentes
Chocolate caliente para el alma de los adolescentes
 
Sample slides by Garr Reynolds
Sample slides by Garr ReynoldsSample slides by Garr Reynolds
Sample slides by Garr Reynolds
 

Ähnlich wie ICPRAM 2012

MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)npinto
 
Mit6870 template matching and histograms
Mit6870 template matching and histogramsMit6870 template matching and histograms
Mit6870 template matching and histogramszukun
 
Keynote Virtual Efficiency Congress 2012
Keynote Virtual Efficiency Congress 2012Keynote Virtual Efficiency Congress 2012
Keynote Virtual Efficiency Congress 2012Christian Sandor
 
Semantic Mapping of Road Scenes
Semantic Mapping of Road ScenesSemantic Mapping of Road Scenes
Semantic Mapping of Road ScenesSunando Sengupta
 
30th コンピュータビジョン勉強会@関東 DynamicFusion
30th コンピュータビジョン勉強会@関東 DynamicFusion30th コンピュータビジョン勉強会@関東 DynamicFusion
30th コンピュータビジョン勉強会@関東 DynamicFusionHiroki Mizuno
 
Image Splicing Detection involving Moment-based Feature Extraction and Classi...
Image Splicing Detection involving Moment-based Feature Extraction and Classi...Image Splicing Detection involving Moment-based Feature Extraction and Classi...
Image Splicing Detection involving Moment-based Feature Extraction and Classi...IDES Editor
 
Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...
Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...
Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...Rana Basheer
 
Feature Tracking of Objects in Underwater Video Sequences
Feature Tracking of Objects in Underwater Video SequencesFeature Tracking of Objects in Underwater Video Sequences
Feature Tracking of Objects in Underwater Video SequencesIDES Editor
 
Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...
Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...
Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...IDES Editor
 
Fcv scene hebert
Fcv scene hebertFcv scene hebert
Fcv scene hebertzukun
 
Visual Odomtery(2)
Visual Odomtery(2)Visual Odomtery(2)
Visual Odomtery(2)Ian Sa
 
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...Susang Kim
 
A Diffusion Wavelet Approach For 3 D Model Matching
A Diffusion Wavelet Approach For 3 D Model MatchingA Diffusion Wavelet Approach For 3 D Model Matching
A Diffusion Wavelet Approach For 3 D Model Matchingrafi
 
Geospatial Data Acquisition Using Unmanned Aerial Systems
Geospatial Data Acquisition Using Unmanned Aerial SystemsGeospatial Data Acquisition Using Unmanned Aerial Systems
Geospatial Data Acquisition Using Unmanned Aerial SystemsIEREK Press
 
SIGGRAPH 2014論文紹介 - Sound & Light + Fabrication Session
SIGGRAPH 2014論文紹介 - Sound & Light + Fabrication SessionSIGGRAPH 2014論文紹介 - Sound & Light + Fabrication Session
SIGGRAPH 2014論文紹介 - Sound & Light + Fabrication Sessionyamo_o
 
HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES
HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES
HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES sipij
 

Ähnlich wie ICPRAM 2012 (20)

MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
MIT 6.870 - Template Matching and Histograms (Nicolas Pinto, MIT)
 
Mit6870 template matching and histograms
Mit6870 template matching and histogramsMit6870 template matching and histograms
Mit6870 template matching and histograms
 
Keynote Virtual Efficiency Congress 2012
Keynote Virtual Efficiency Congress 2012Keynote Virtual Efficiency Congress 2012
Keynote Virtual Efficiency Congress 2012
 
Semantic Mapping of Road Scenes
Semantic Mapping of Road ScenesSemantic Mapping of Road Scenes
Semantic Mapping of Road Scenes
 
30th コンピュータビジョン勉強会@関東 DynamicFusion
30th コンピュータビジョン勉強会@関東 DynamicFusion30th コンピュータビジョン勉強会@関東 DynamicFusion
30th コンピュータビジョン勉強会@関東 DynamicFusion
 
Image Splicing Detection involving Moment-based Feature Extraction and Classi...
Image Splicing Detection involving Moment-based Feature Extraction and Classi...Image Splicing Detection involving Moment-based Feature Extraction and Classi...
Image Splicing Detection involving Moment-based Feature Extraction and Classi...
 
Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...
Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...
Localization of Objects Using Cross-Correlation of Shadow Fading Noise and Co...
 
Feature Tracking of Objects in Underwater Video Sequences
Feature Tracking of Objects in Underwater Video SequencesFeature Tracking of Objects in Underwater Video Sequences
Feature Tracking of Objects in Underwater Video Sequences
 
Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...
Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...
Land Cover Feature Extraction using Hybrid Swarm Intelligence Techniques - A ...
 
Fcv scene hebert
Fcv scene hebertFcv scene hebert
Fcv scene hebert
 
Visual Odomtery(2)
Visual Odomtery(2)Visual Odomtery(2)
Visual Odomtery(2)
 
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
[Paper] GIRAFFE: Representing Scenes as Compositional Generative Neural Featu...
 
56 58
56 5856 58
56 58
 
A Diffusion Wavelet Approach For 3 D Model Matching
A Diffusion Wavelet Approach For 3 D Model MatchingA Diffusion Wavelet Approach For 3 D Model Matching
A Diffusion Wavelet Approach For 3 D Model Matching
 
Geospatial Data Acquisition Using Unmanned Aerial Systems
Geospatial Data Acquisition Using Unmanned Aerial SystemsGeospatial Data Acquisition Using Unmanned Aerial Systems
Geospatial Data Acquisition Using Unmanned Aerial Systems
 
De24686692
De24686692De24686692
De24686692
 
Image transforms
Image transformsImage transforms
Image transforms
 
SIGGRAPH 2014論文紹介 - Sound & Light + Fabrication Session
SIGGRAPH 2014論文紹介 - Sound & Light + Fabrication SessionSIGGRAPH 2014論文紹介 - Sound & Light + Fabrication Session
SIGGRAPH 2014論文紹介 - Sound & Light + Fabrication Session
 
HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES
HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES
HUMAN ACTION RECOGNITION IN VIDEOS USING STABLE FEATURES
 
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 

Kürzlich hochgeladen

9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 

Kürzlich hochgeladen (20)

9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 

ICPRAM 2012

  • 1. Object Recognition In Probabilistic 3-D Volumetric Scenes Maria Isabel Restrepo Brandon A. Mayer Joseph L. Mundy
  • 2. Goal: Automated Scene Description Maria Isabel Restrepo. February 7, 2012 2
  • 3. Goal: Automated Scene Description Maria Isabel Restrepo. February 7, 2012 3
  • 4. Related Work : 3-d Object Retrieval EC 120 R. Toldo, U. Castellani, and A. Fusiel R SH Bronstein et al. Hough Transforms and 3D SURF for robust three dimensional classification 3 of transformations pdif (x) = (log Kατ2 (x, x) − log Kατ1 (x, x), . . . , log Kατ m (x, x) − log Kατm−1 (x, x)), ˆ p(x) = |(F pdif (x))(ω1 , . . . , ωn )|, (3) where F is the discrete Fourier transform, and ω1 , . . . , ωn denotes a set of frequencies at which the transformed vector is sampled. Taking differences of logarithms removes the scaling constant, and the Fourier transform converts the scale-space shift into a complex A. M. Bronstein, et.al. phase, which is removed by taking the absolute value. Typically, (a) (b) J.Knopp, et.al. (c) a large m is used to make the representation insensitive to large R. Toldo, et.al. Fig. 2. Illustration of the detection of 3D SURF features. The shape (a) is voxelized 2011 scaling factors and edge effects. Such a descriptor was dubbed 2010 into the cube grid (side of length 256) (b). 3D SURF features are detected and back- projected to the shape (c), where detected features are represented as Kokkinoswith Scale-Invariant HKS (SI-HKS) [Bronstein and spheres and 2010]. 2009 the radius illustrating the feature scale. 3.3 Numerical Computation of HKS Maria Isabel Restrepo. February 7, 2012 4
  • 5. crucial because it allows further tasks such as recognition, navigation, and data compression to exploit contextual in- Related Work: Scene Description In LIDAR Thommen Korah formation. A keySwarup Medasani contribution is our novel Strip Histogram Yuri Owechko Grid representation that encodes the scene as a grid of ver- Nokia Research Center, Hollywood HRL Labs, Malibu tical 3D population histograms rising up from the locally {thommen.korah}@nokia.com {smedasani,yowechko}@hrl.com detected ground. This scheme captures the nature of the real world, thereby making segmentation tasks intuitive and efficient. Our algorithms work across a large spectrum of Abstract urban objects ranging from buildings and forested areas to cars and other small street side objects. The methods have As part of a large-scale 3D recognition system applied to areas spanning several kilometers in mul- been for LI- DAR data from urban scenes, we describe an tiple citiesfor data collected from both aerial and ground approach with sensors exhibiting different properties. We processed almost segmenting millions of points into coherent regions that ide- ally belong to a single real-world object. Segmentation is spanning an area of 3.3 km in less than an a billion points 2 crucial because it allows further tasks such ashour on a regular desktop. recognition, navigation, and data compression to exploit contextual in- formation. A key contribution is our novel Strip Histogram Grid representation that encodes the scene as a grid of ver- 1. Introduction tical 3D population histograms rising up from the locally detected ground. This scheme captures the nature of the describes an approach for segmenting 3D ob- This work real world, thereby making segmentation tasksjects from high-resolution scans of complex urban environ- intuitive and efficient. Our algorithms work across a largements. Advances in sensor technology have enabled such spectrum of Object Detection from Large-Scale 3D Datasets Light Standard buildings and forested areaspoint clouds to be routinely collected using both urban objects ranging from colorized to 56 Figure 1: Top image is an input pointcloud for a 100x100 cars and other small street side objects. The methods have and airborne LIDAR platforms. The push ground-based T. Korah, et.al. 2011 towards location-based services has increased demand for been applied to areas spanning several kilometers in mul- square meter tile color-mapped by height. Bottom shows the result of segmentation. Each colored region ideally cor- tiple cities with data collected from both aerial and ground digital maps of urban environments. The highly accurate responds to a physical object. This tile has over 3 million sensors exhibiting different properties. We processed3D data contains millions of data points p = (x, y, z) 1 input almost points. a billion points spanning an area of 3.3 km2 in lessstore the spatial coordinates and possibly RGB color that than an hour on a regular desktop. 0.9 Car information. Segmentation can provide valuable contextual information to Post Short subsequent recognition or scene understand- linearly with the number of points. As a key part of our 3D 0.8 recognition system that demonstrated over 60% accuracy on ing modules, making these tasks more efficient. Millions Newspaper Box of 3D points need to be reduced to perceptually “mean- 40 classes, segmentation took less than an hour on a regular 1. Introduction 0.7 PC to process a collection of nearly 1 billion points. ingful” groupings. 
To be effective for target recognition, This work describes an approach0.6 segmenting Carob- disaster planning, processing must scale sub- for simulation, or 3D Detailed geometric data at city-scales has not been pos- jects from high-resolution scans of complex urban environ- 0.5 Traffic Light ments. Advances in sensor technology have enabled such 74 Car colorized point clouds to be routinely collected using both 0.4 Figure 1: Top image is an input pointcloud for a 100x100 (c) Zoomed0.4 view The push ground-based and airborne LIDAR platforms. 0.6 0.8 1 square meter tile color-mapped by height. Bottom shows towards location-based services has increased demand for the result of segmentation. Each colored region ideally cor- 00 manually labeled objects et.al. truth area in the A. Golovinskiy,Left: The precision-recall curve for carand P. Mordohai,million points con A. Patterson detection on 200 2008 highly accurate digital maps of urban environments. The responds to a physical object. This tile has over 3 million Fig. 6. input 3D data contains millions of data points p = (x, y, z) d points, with colors representing labels.) A points. 2009 1221 cars. (Precision is the x-axis and recall the y-axis.) Right: Screenshot o that store the spatial coordinates and possibly RGB color taining information. Segmentation can provide valuable contextual s on bottom, is shown in (c).(Automatically information to subsequent recognition or scene understand- linearly with the number of points. As a key part of our 3D detected cars. Cars are in random colors and the background in original colors. ing modules, making these tasks more efficient. Millions recognition system that demonstrated over 60% accuracy on Maria Isabel Restrepo. February 7, 2012 of 3D points need to be reduced to perceptually “mean- 40 classes, segmentation took less than an hour on a regular 5
  • 6. Challenges Of Multi-View Stereo Maria Isabel Restrepo. February 7, 2012 6
  • 7. Challenges Of Multi-View Stereo Scene Ambiguity: Maria Isabel Restrepo. February 7, 2012 6
  • 8. Challenges Of Multi-View Stereo Scene Ambiguity: Maria Isabel Restrepo. February 7, 2012 6
  • 9. Challenges Of Multi-View Stereo Scene Ambiguity: Scene Uncertainty: 5 (a) (a) (b) (b) (c) (a) (c) (d) (b) (d) (e) (c) (d) Maria Isabel Restrepo. February 7, 2012 6
  • 10. Probabilistic 3-d Volumetric Model: PVM Probabilistic representation of 3-d scenes based on volumetric units -voxel. C RX I IX Voxel Volume! V S X' P(IX|V=X’)! Intesity! Pollard and Mundy, 2007 Maria Isabel Restrepo. February 7, 2012 7
  • 11. Probabilistic 3-d Volumetric Modeling C RX I IX Voxel Volume! V S X' P(IX|V=X’)! Intesity! Maria Isabel Restrepo. February 7, 2012 8
  • 12. Probabilistic 3-d Volumetric Modeling Surface probability is given by on-line Bayesian learning pN (Ix +1 |X 2 S) N P N +1 (X 2 S|Ix +1 ) = P N (X 2 S) N pN (Ix +1 ) N C RX I IX Voxel Volume! V S X' P(IX|V=X’)! Intesity! Maria Isabel Restrepo. February 7, 2012 9
  • 13. observed image intensity, as well the Gaussian mixture (1) at that voxel explains the intensity to contain the observed surface observed in the N+1 image better than any other voxel along usion. The process of updating the Probabilistic 3-d Volumetric Modeling the projection ray. pancy probabilities is explained in pN (IX +1 |X 2 S) N Update using information along a projection ray P N +1 (X 2 S) = P N (X 2 S) p N (I N +1 ) (3) X e model X pN (IX +1 |V = X 0 )P (V = X 0 |X 2 S) N voxel is modeledpwith N +1 |X 2 S) N (IX Gaussian a N N X 0 2RX en P (X 2 S) by (1). I, refers to the +1 N (I N grey- = P (X 2 S) X considered a vector pwith X ) various pN (IX +1 |V = X 0 )P N (V = X 0 ) N X 0 2RX or. The quantities, µk , k and !k , (4) and mixing parameters associated C ution. W is the sum of !k for all To make the PVM representation clear, a term by term R is given by k; for this particular explanation of the update equation in 4 is outlined. I X xture components. I X N N +1 • The term p (IX |V = X 0 ) is computed using the Voxel Volume! ! 1 (I µk )2 mixture of Gaussians model stored at the voxel X 0 . 2 2 p 2 exp k (1) • The probability of a voxel X producing the color in 0 2⇡ k V the image is interpreted geometrically, where a voxel mixture S learned using a modi- are produces the intensity seen in the image if it is a surface on (EM) algorithm similar to that element and it is not occluded by other voxels along the X' modeling [45]. The update of |V=X’)! P(I the X ray. Thus, Intesity! P N (V = X 0 ) = P N (X 0 2 S)P N (X 0 is not occluded) (5) +1 The probability of occlusion is defined as the probability that all voxels between X 0 and the sensor are empty,10 ! Maria Isabel Restrepo. February 7, 2012
  • 14. observed image intensity, as well the Gaussian mixture (1) at that voxel explains the intensity to contain the observed surface observed in the N+1 image better than any other voxel along usion. The process of updating the Probabilistic 3-d Volumetric Modeling the projection ray. pancy probabilities is explained in pN (IX +1 |X 2 S) N Every voxel contains appearance information P N +1 (X 2 S) = P N (X 2 S) p N (I N +1 ) (3) X e model X pN (IX +1 |V = X 0 )P (V = X 0 |X 2 S) N voxel is modeledpwith N +1 |X 2 S) N (IX Gaussian a N N X 0 2RX en P (X 2 S) by (1). I, refers to the +1 N (I N grey- = P (X 2 S) X considered a vector pwith X ) various pN (IX +1 |V = X 0 )P N (V = X 0 ) N X 0 2RX or. The quantities, µk , k and !k , (4) and mixing parameters associated C ution. W is the sum of !k for all To make the PVM representation of the a term by term Probability clear, observed R is given by k; for this particular explanation of the update equation given that the I X intensity, in 4 is outlined. xture components. I • The term p (IX voxels produced the color X N N +1 |V = X 0 ) is computed using the Voxel Volume! ! 1 (I µk )2 mixture of Gaussians model the image voxel X 0 . seen in stored at the 2 p 2 exp 2 k (1) • The probability of a voxel X producing the color in 0 2⇡ k V the image is interpreted geometrically, where a voxel 3 ! X wk (I µk )2 mixture S learned using a modi- are produces the intensity seen in1the image if it is a surface 2 2 on (EM) algorithm similar to that p e element and it is not occluded by2other voxels along the k X' W 2⇡ k modeling [45]. The update of |V=X’)! P(I the X ray. Thus, k=1 Intesity! P N (V = X 0 ) = P N (X 0 2 S)P N (X 0 is not occluded) (5) +1 The probability of occlusion is defined as the probability that all voxels between X 0 and the sensor are empty,11 ! Maria Isabel Restrepo. February 7, 2012
• 15. Probabilistic 3-d Volumetric Modeling: the appearance model. Each voxel's appearance is modeled with the Gaussian mixture (1). The mixture parameters are learned using a modified expectation-maximization (EM) algorithm similar to that used in background modeling [45]; upon observing image N+1, the components are updated online with a learning weight $d_\omega$ (2). Continuing the term-by-term explanation of the update:
• The probability of occlusion is defined as the probability that all voxels between X' and the sensor are empty, namely:

$P^N(X' \text{ is not occluded}) = \prod_{X'' < X'} \left(1 - P^N(X'' \in S)\right)$   (6)

• The term $P^N(V = X' \mid X \in S)$ is computed analogously to $P^N(V = X')$; however, […]. Maria Isabel Restrepo. February 7, 2012 12
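A sketch of evaluating the per-voxel appearance likelihood $p(I \mid V = X')$ from the stored three-component mixture (1). The example parameter values are hypothetical, and the online modified-EM parameter update (2) is deliberately left out since the slide does not fully specify it.

```python
import numpy as np

def mixture_likelihood(I, mu, sigma, w):
    """Evaluate the per-voxel appearance model, the Gaussian mixture (1).

    I            -- observed grey-scale intensity (assumed in [0, 1])
    mu, sigma, w -- component means, standard deviations and weights w_k.
    The online modified-EM update of these parameters with learning
    weight d_w (eq. 2) is not reproduced here.
    """
    W = w.sum()  # normalizing constant: sum of w_k over all components
    comp = w / np.sqrt(2.0 * np.pi * sigma**2) * \
           np.exp(-(I - mu)**2 / (2.0 * sigma**2))
    return comp.sum() / W

# Hypothetical voxel that has mostly observed dark roof pixels:
mu = np.array([0.20, 0.50, 0.80])
sigma = np.array([0.05, 0.10, 0.10])
w = np.array([0.70, 0.20, 0.10])
p = mixture_likelihood(0.22, mu, sigma, w)  # high: intensity fits mode 1
```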
• 16. Spatial Optimization: Octree. [Diagram labels: empty space, surface.] Maria Isabel Restrepo. February 7, 2012 13
• 17. Spatial Optimization: Octree. [Diagram labels: empty space, surface.] Maria Isabel Restrepo. February 7, 2012 14
• 18. Spatial Optimization: Octree. [Two plots: p(intensity) vs. intensity.] Crispell, Mundy and Taubin 2011; Miller, Jain and Mundy 2011. Maria Isabel Restrepo. February 7, 2012 15
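The octree idea can be sketched in a few lines: cells whose occupancy looks surface-like are subdivided, while empty space stays coarse. This toy `Cell` class, its threshold and its depth limit are illustrative assumptions only; they are not the data structures of the Crispell or Miller implementations.

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    """Toy octree cell: refine near surfaces, stay coarse in empty space."""
    p_surface: float            # occupancy estimate stored in this cell
    depth: int = 0
    children: list = field(default_factory=list)

    def refine(self, thresh=0.3, max_depth=4):
        # Thresholds are illustrative choices, not the published values.
        if self.p_surface > thresh and self.depth < max_depth:
            # Split into 8 octants; each child starts from the parent's
            # estimate and is refined recursively.
            self.children = [Cell(self.p_surface, self.depth + 1)
                             for _ in range(8)]
            for c in self.children:
                c.refine(thresh, max_depth)

root = Cell(p_surface=0.6)
root.refine()   # surface-like cells subdivide; empty cells stay coarse
```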
  • 19. Probabilistic 3-d Volumetric Modeling Demo: https://vimeo.com/43729866 Maria Isabel Restrepo. February 7, 2012 16
  • 20. Geometry And Appearance Demo: https://vimeo.com/43690883 https://vimeo.com/45322168 Maria Isabel Restrepo. February 7, 2012 17
• 21. Expected Appearance Volume Model: EVM. Voxel's Expected Appearance $= E(I_X \mid V = X')\,P(X' \in S)$. Maria Isabel Restrepo. February 7, 2012 18
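A sketch of forming the EVM from the quantities already defined: the expectation $E(I_X \mid V = X')$ is taken as the weight-averaged mean of the voxel's Gaussian mixture, and multiplying by the occupancy $P(X' \in S)$ suppresses empty space. Function and argument names are ours.

```python
import numpy as np

def expected_appearance(mu, w, p_surface):
    """Per-voxel expected appearance for the EVM (illustrative sketch).

    mu, w     -- arrays of shape (..., K): mixture component means/weights
    p_surface -- array of occupancy probabilities P(X' in S), shape (...)
    E(I | V = X') is computed as the weight-averaged component mean.
    """
    w = w / w.sum(axis=-1, keepdims=True)   # normalize mixture weights
    E_I = np.sum(w * mu, axis=-1)           # E(I_X | V = X')
    return E_I * p_surface                  # expected appearance volume
```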
• 22. Object Categorization: Bag Of Volumetric Words. Classes: Parking Lot, Car, Plane, Building, House. Pipeline: Input (EVM, dense sampling) → Feature Descriptor (PCA, Taylor) → Volumetric Vocabulary (K-means) → Classifier (Naive Bayes). Maria Isabel Restrepo. February 7, 2012 19
  • 23. Experiments: Data Collection http://vision.lems.brown.edu/project_desc/Object-Recognition-in-Probabilistic-3D-Scenes Maria Isabel Restrepo. February 7, 2012 20
  • 24. Experiments: Train And Test Sites Site 1 Site 2 Site 3 Site 5 Site 6 Site 7 Site 8 Site 10 Site 11 Site 12 Site 16 Site 18 Site 21 Site 22 Site 23 Site 25 Site 26 Site 27 http://vision.lems.brown.edu/project_desc/Object-Recognition-in-Probabilistic-3D-Scenes Maria Isabel Restrepo. February 7, 2012 21
  • 25. Experiments: The Input Camera matrices were recovered using Bundler: Snavely, N. and Seitz, S. (2006). Photo tourism: exploring photo collections in 3D. ACM Transactions on Graphics. Maria Isabel Restrepo. February 7, 2012 22
• 26. Feature Description. Global Features — Spherical Harmonics: D. Saupe and D. V. Vranić, 2001; Zernike Moments: M. Novotni and R. Klein, 2003. Local Features — Spin Images: Johnson and Hebert, 1999; 3D SURF: Knopp et al., 2010; 3D Shape Context: Frome et al., 2004. [Figure excerpts from the cited papers omitted.] Maria Isabel Restrepo. February 7, 2012 23
• 27. Feature Formation: from the Volumetric Form of Voxel Neighborhoods to the Vector Form of Voxel Neighborhoods, sampled on the expected appearance $E(I_X \mid V = X')\,P(X' \in S)$. Maria Isabel Restrepo. February 7, 2012 24
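A sketch of the densely sampled vector form: each voxel's neighborhood in the EVM is flattened into one feature vector. The 5×5×5 window (giving 125-d vectors) is an assumption consistent with the 125-dimensional space mentioned later in the talk.

```python
import numpy as np

def dense_neighborhoods(evm, r=2):
    """Flatten local EVM neighborhoods into feature vectors (sketch).

    evm -- 3-d array of expected-appearance values E(I|V=X')P(X' in S)
    r   -- neighborhood radius; r=2 gives 5x5x5 = 125-d vectors (the
           5x5x5 shape itself is an assumption).
    Returns an (n_samples, (2r+1)**3) matrix, one row per interior voxel.
    """
    d = 2 * r + 1
    nx, ny, nz = evm.shape
    feats = []
    for i in range(r, nx - r):
        for j in range(r, ny - r):
            for k in range(r, nz - r):
                patch = evm[i-r:i+r+1, j-r:j+r+1, k-r:k+r+1]
                feats.append(patch.ravel())
    return np.asarray(feats)
```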
• 31. Feature Description: PCA Features. In the PCA space given by the eigenvalue decomposition of the sample scatter matrix S, every neighborhood (represented by a d-dimensional feature vector x) can be exactly expressed as $x = \bar{x} + \sum_{i=1}^{d} a_i e_i$, where $e_i$ are the principal axes associated with the d eigenvalues and $a_i$ are the corresponding coefficients. A k-dimensional (k < d) approximation of the neighborhoods can be obtained by using the first k principal components, i.e. $\tilde{x} = \bar{x} + \sum_{i=1}^{k} a_i e_i$. Section V presents a detailed analysis of the reconstruction error of local neighborhoods, namely $|x - \tilde{x}|^2$, as a function of dimension and training-set size. In the remainder of this paper, the vector arrangement of projection coefficients in the PCA space is referred to as a PCA feature. [Figure: a d-dimensional neighborhood projected onto a 1-dimensional space, $x \approx \bar{x} + a_1 e_1$.] Maria Isabel Restrepo. February 7, 2012 25
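A numpy sketch of the PCA feature, following the slide's definitions: eigendecompose the sample scatter matrix S, keep the first k principal axes, and use the projection coefficients $a_i$ as the descriptor. The function name and return convention are ours; the talk uses d = 125 and k = 10.

```python
import numpy as np

def pca_features(X, k=10):
    """Project neighborhood vectors onto the top-k principal axes.

    X -- (n, d) matrix of flattened neighborhoods (d = 125 in the talk)
    k -- target dimension (the talk reports k = 10)
    Returns (coeffs, mean, axes): the k projection coefficients per
    sample -- the 'PCA feature' -- plus the model needed to reconstruct.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Eigen-decomposition of the sample scatter matrix S = Xc^T Xc.
    S = Xc.T @ Xc
    evals, evecs = np.linalg.eigh(S)     # eigenvalues in ascending order
    axes = evecs[:, ::-1][:, :k]         # top-k principal axes e_i
    coeffs = Xc @ axes                   # coefficients a_i
    return coeffs, mean, axes

# Reconstruction error |x - x~|^2 of the k-term approximation can be
# checked with: x_approx = mean + coeffs @ axes.T
```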
• 32. Feature Description: Taylor Features. The computation of derivatives in the volume V can be expressed as a least-square-error minimization of the following energy function:

Minimize $E = \sum_{i=-n_i}^{n_i} \sum_{j=-n_j}^{n_j} \sum_{k=-n_k}^{n_k} \left( V(i,j,k) - \tilde{V}(i,j,k) \right)^2$   (6)

where $\tilde{V}(i,j,k)$ is the Taylor series approximation of the volume V, centered on the point (i, j, k). Using the second-degree Taylor expansion about (0, 0, 0), (6) becomes

$E = \sum_{x} \left( V(x) - \left[ V_0 + x^T G + \frac{1}{2!}\, x^T H x \right] \right)^2$   (7)

where $V_0$, G, H are the zeroth derivative, the gradient vector and the Hessian matrix of the volume of expected 3-d appearances about the point (0, 0, 0), respectively. Coefficients for the 3-d derivative operators can be found by minimizing (7) with respect to the zeroth-, first- and second-order derivatives. The computed derivative operators are applied algebraically to neighborhoods in the EVM. Maria Isabel Restrepo. February 7, 2012 26
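A sketch of the Taylor descriptor as a linear least-squares fit of the second-degree model $V_0 + x^T G + \frac{1}{2}x^T H x$ over a cubic neighborhood, yielding 10 coefficients (the zeroth derivative, 3 gradient terms and 6 unique Hessian terms). Since the pseudo-inverse of the design matrix is fixed for a given window size, its rows act as fixed linear 3-d kernels, which matches the slide's remark that the derivative operators are applied algebraically to EVM neighborhoods; the basis ordering below is our choice.

```python
import numpy as np

def taylor_descriptor(patch):
    """Fit a 2nd-degree Taylor model to a cubic EVM neighborhood (sketch).

    Solves min_E sum_x (V(x) - [V0 + x^T G + 0.5 x^T H x])^2 by linear
    least squares over a monomial basis, returning 10 coefficients
    (V0, 3 gradient terms, 6 unique Hessian terms) as the descriptor.
    """
    r = patch.shape[0] // 2
    coords = np.array([(x, y, z)
                       for x in range(-r, r + 1)
                       for y in range(-r, r + 1)
                       for z in range(-r, r + 1)], dtype=float)
    x, y, z = coords.T
    # Monomial design matrix: 1, x, y, z, x^2/2, y^2/2, z^2/2, xy, xz, yz
    A = np.column_stack([np.ones_like(x), x, y, z,
                         0.5 * x**2, 0.5 * y**2, 0.5 * z**2,
                         x * y, x * z, y * z])
    b = patch.ravel()
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs   # [V0, Gx, Gy, Gz, Hxx, Hyy, Hzz, Hxy, Hxz, Hyz]
```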
  • 33. Learning The Codebook Learn Volumetric Vocabulary using K-Means Clustering: ✤ Determine the best number of means: Heuristically ✤ Convergence depends on initialization: P. S. Bradley and U. M. Fayyad. 1998 Maria Isabel Restrepo. February 7, 2012 27
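A sketch of the vocabulary-learning step. Multiple random restarts with the lowest final distortion kept stand in for the initialization-refinement scheme of Bradley and Fayyad (1998) that the slide cites; k is chosen heuristically, as stated.

```python
import numpy as np

def learn_codebook(F, k=20, n_init=10, n_iter=50, seed=0):
    """Learn the volumetric vocabulary with k-means (illustrative sketch).

    F -- (n, d) matrix of descriptors pooled over all training objects.
    Since convergence depends on initialization, several random restarts
    are run and the lowest-distortion solution is kept.
    """
    rng = np.random.default_rng(seed)
    best_centers, best_cost = None, np.inf
    for _ in range(n_init):
        centers = F[rng.choice(len(F), size=k, replace=False)].copy()
        for _ in range(n_iter):
            d2 = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):       # leave empty clusters alone
                    centers[j] = F[labels == j].mean(axis=0)
        cost = ((F - centers[labels]) ** 2).sum()   # final distortion
        if cost < best_cost:
            best_centers, best_cost = centers, cost
    return best_centers     # the k 'volumetric words'
```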
  • 34. Vocabulary: Twenty Volumetric Words PCA based Taylor based Maria Isabel Restrepo. February 7, 2012 28
• 35. Learning Class Distributions. Let the vocabulary of 3-d expected appearance patterns be defined as $V = \bigcup_{i=1}^{k} v_i$, where k is the number of cluster centers in the vocabulary, and let the set of all objects be $O = \bigcup_{l=1}^{N_c} O_l$, where $N_c$ is the number of classes. From the quantization step a count $c_{ij}$ is obtained of the number of times a cluster center $v_i$ occurs in object $o_j$. Using Bayes formula, the a posteriori class probability is given by:

$P(C_l \mid o_i) \propto P(o_i \mid C_l)\,P(C_l)$   (8)

The likelihood of an object is given by the product of the likelihoods of the independent entries of the vocabulary, $P(v_j \mid C_l)$, which are estimated during learning. The full expression for the class posterior becomes:

$P(C_l \mid o_i) \propto P(C_l) \prod_{j=1}^{k} P(v_j \mid C_l)^{c_{ji}}$   (9)

$P(C_l \mid o_i) \propto P(C_l) \prod_{j=1}^{k} \left( \frac{\sum_{m:\,o_m \in O_l} c_{jm}}{\sum_{n=1}^{k} \sum_{m:\,o_m \in O_l} c_{nm}} \right)^{c_{ji}}$   (10)

Maria Isabel Restrepo. February 7, 2012 29
• 36. Classification: Bayes Classifier. At classification, an object is assigned the class label with the highest a posteriori probability (10). Maria Isabel Restrepo. February 7, 2012 30
• 37. Learning Class Distributions: the class-conditional word frequencies $P(v_j \mid C_l)$ are estimated from the cluster-center counts of the Train objects. Maria Isabel Restrepo. February 7, 2012 31
• 38. Learning Class Distributions: Train / Test. Maria Isabel Restrepo. February 7, 2012 32
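A sketch of the Naive Bayes training and classification steps, implementing the count-ratio estimate of equation (10) and the product-of-likelihoods posterior (9) in log space. The epsilon smoothing is our addition to guard against zero counts; the slides do not specify how empty bins are handled.

```python
import numpy as np

def train_naive_bayes(counts, labels, n_classes):
    """Estimate class priors and word frequencies P(v_j | C_l) (sketch).

    counts -- (n_objects, k) matrix; counts[i, j] = c_ji, the number of
              times word v_j occurs in object o_i.
    labels -- class index of each training object.
    """
    eps = 1e-9                       # assumed smoothing, not in the talk
    k = counts.shape[1]
    log_pvj = np.zeros((n_classes, k))
    log_prior = np.zeros(n_classes)
    for l in range(n_classes):
        cl = counts[labels == l].sum(axis=0) + eps   # numerator of (10)
        log_pvj[l] = np.log(cl / cl.sum())           # normalized over words
        log_prior[l] = np.log(np.mean(labels == l))  # P(C_l)
    return log_prior, log_pvj

def classify(counts, log_prior, log_pvj):
    """Assign each object the class maximizing the log of posterior (9)."""
    scores = log_prior + counts @ log_pvj.T
    return scores.argmax(axis=1)
```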
  • 39. Results: PCA Classes Buildings Planes Maria Isabel Restrepo. February 7, 2012 33
  • 40. Results: Taylor Classes Buildings Planes Maria Isabel Restrepo. February 7, 2012 34
• 41. Experiments: Number Of Objects.

Table 2: Number of objects in every category.
          Planes  Cars  Houses  Buildings  Parking Lots
  Train     18     54     61       24          27
  Test      16     29     45       15          17

Two measurements were used to evaluate the classification performance: (i) classifier accuracy (i.e. the fraction of correctly classified objects), and (ii) the confusion matrix. During the classification experiments, the number of clusters in the codebook was varied from k = 2 to k = 100. Figure 4 presents classification accuracy as a function of the number of clusters for both Taylor-based and PCA-based features. 18 Probabilistic Sites. Maria Isabel Restrepo. February 7, 2012 35
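A sketch of the two reported measurements, with the confusion matrix normalized per true class (columns), matching the orientation of the matrices shown on the Results: Confusion Matrix slide. Function names are ours.

```python
import numpy as np

def evaluate(y_true, y_pred, n_classes):
    """Compute (i) accuracy and (ii) a column-normalized confusion matrix.

    Rows index the assigned class, columns the true class, so each
    column sums to 1 (up to rounding), as in the slides' tables.
    """
    acc = np.mean(np.asarray(y_true) == np.asarray(y_pred))
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        C[p, t] += 1.0                 # rows: predicted, cols: true
    C /= np.maximum(C.sum(axis=0, keepdims=True), 1.0)
    return acc, C
```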
  • 42. Results: Classification Accuracy Maria Isabel Restrepo. February 7, 2012 36
• 43. Results: Confusion Matrix. Rows give the assigned class, columns the true class.

(a) PCA
  True Class:  Plane  House  Building  Car   Parking Lot
  Plane        0.86   0.02   0.00      0.03  0.00
  House        0.00   0.67   0.27      0.00  0.12
  Building     0.00   0.31   0.67      0.00  0.00
  Car          0.00   0.00   0.07      0.93  0.00
  Parking Lot  0.14   0.00   0.00      0.03  0.88

(b) Taylor
  True Class:  Plane  House  Building  Car   Parking Lot
  Plane        0.86   0.02   0.00      0.03  0.00
  House        0.00   0.64   0.27      0.00  0.12
  Building     0.00   0.33   0.67      0.00  0.00
  Car          0.00   0.00   0.07      0.86  0.00
  Parking Lot  0.14   0.00   0.00      0.10  0.88

Fig. 9. Confusion matrix for a 20-keyword codebook of PCA-based features on the left and Taylor-based features on the right. Maria Isabel Restrepo. February 7, 2012 37
• 44. Future Work ✴ Evaluation of the effectiveness of the EVM by performing classification tasks on different underlying 3-d reconstruction algorithms. ✴ Performance evaluation of additional feature descriptors. ✴ Exploration of algorithms for detection. Maria Isabel Restrepo. February 7, 2012 38
  • 45. Effectiveness Of Probabilistic Volumetric Learning Maria Isabel Restrepo. February 7, 2012 Y. Furukawa and J. Ponce, 2010 39
  • 46. Effectiveness Of Probabilistic Volumetric Learning Probabilistic 3-d Modeling Threshold Based 3-d Modeling Maria Isabel Restrepo. February 7, 2012 40
  • 47. Effectiveness Of Probabilistic Volumetric Learning Maria Isabel Restrepo. February 7, 2012 41

Editor's notes

1. Welcome everyone. My name is M.R. I come from Brown U and I am pleased to present our work on ORIP3-dS. This is joint work with BM and Mundy.
2. Let me start by explaining the goal of our work: Suppose we are given an image sequence or video of a realistic, large-scale scene, where images are collected under unrestricted conditions in terms of illumination, weather, resolution and so on.
3. Then we are interested in characterizing the information of the three-dimensional scene, such that we can provide automated descriptions of the objects present in this three-dimensional world. We are interested in being able to tell where the buildings, the streets, the trees, the water and so on are. Before I move on to explain the set of methods that we propose to achieve this goal, I would like to briefly discuss some related work in the area of 3-d object recognition.
4. In recent years, there has been an exponential growth in the number of 3-d models. Typically these models are obtained from 3-d scanners or CAD models. Therefore, a lot of the work in 3-d shape understanding has focused on the problem of object retrieval, where, given a query object, an algorithm needs to retrieve the closest match in the database. The main difference between these works and ours is that we are not working with isolated objects, and shape information is collected under very different conditions. Also, we are not doing instance recognition but class recognition.
5. Another body of work that is more in line with what we want to achieve is the area of segmentation and object recognition in large-scale point clouds obtained using LIDAR sensors. Very encouraging results have recently been reported in this area. As mentioned, we have a very similar goal, but we operate on geometry that is learned from images. We believe that future models could use a combination of imagery and LIDAR to achieve more accurate representations.
6. The first challenge is known as scene ambiguity, and it can be described as follows: Suppose we have a surface with three regions of constant appearance, observed from two different viewpoints. The surfaces in the constant-color regions can be reconstructed anywhere within the diamond-shaped regions. Therefore, whenever featureless surfaces are observed, the 3-d geometry cannot be precisely modeled. Of course, the number and positions of the cameras determine the area of the ambiguous regions. The second difficulty is known as scene uncertainty, and it happens when the same 3-d structure has a different appearance in different viewpoints, due to transient objects, reflectivity properties or sensor noise. Finally, I would like to emphasize that when choosing a 3-d reconstruction technique it is important to handle or model the ambiguities just explained.
11. In our work we propose to model scene geometry and appearance through a probabilistic volumetric model. This model was first proposed by Pollard and Mundy in 2007. In its original form, a region of 3-d space is decomposed into regular 3-d cells called voxels. A voxel contains information about the geometry and appearance of that portion of space, and the information in voxels is learned using input images, calibrated camera matrices and corresponding projection rays.
12. The problem setup is as follows: For every pixel in an image there is an associated projection ray, here denoted R_X. At every voxel along that ray, the geometry and appearance models are updated using the intensity of the corresponding pixel. Along the ray there is only one voxel that produces the color seen in the image.
13. Geometry is modeled as surface probability. At every point in time a voxel has two possible states: it is a surface element or it is not. The probability of a voxel being a surface is updated with the information in an image using Bayesian learning. The update is done in an online fashion, by which I mean using only one image at a time. This allows the model to adapt to the ever-changing world surfaces.
14. The Bayesian update can be expressed using the probability and appearance information along the projection ray. The interpretation of this equation is that surface probability at a particular voxel increases if the appearance model at that voxel explains the given intensity better than any other voxel along the projection ray.
15. At every voxel, appearance is modeled by a mixture of Gaussians that is updated using expectation-maximization.
16. The other term in this equation corresponds to the probability of a voxel being the one that caused the intensity seen in the image. This term is interpreted geometrically: a voxel causes the color in the image if it is a surface element and it is not occluded. Occlusion is modeled as the probability that the space between that voxel and the sensor is empty.
17. One difficulty of Pollard's model is that the storage requirements are high. In practice most of the voxels in a scene correspond to empty space.
18. Ideally, we would like to represent the information near surfaces with high resolution and use coarse voxels in empty space.
19. In 2011, Crispell, Mundy and Taubin proposed a variable-resolution model. Crispell's model is based on an octree subdivision of space, where geometry and appearance information is stored at the leaf cells. The screenshot on the bottom right compares details of two models, one reconstructed with a regular grid and the other using the octree model. Finally, Miller, Jain and Mundy proposed a GPU implementation of Crispell's model, and that is the implementation we use in our current work.
20. This is a volumetric rendering of a scene's reconstructed geometry, where white corresponds to surface probability 1 and black to empty space.
21. Even more exciting is to show you renderings of the combined surface and appearance information; pay attention to the level of detail and sharpness achieved by the model. Explain the streaks in the air.
22. Now that we have learned the geometry and appearance at every voxel, we would like to combine this information. To do so, the expected appearance is multiplied by the occupancy. This allows us to explore interesting features not only in the geometry of objects but also in their appearance.
23. At this point let me move on to explain the object categorization pipeline. In this work we propose to use a bag-of-features representation to learn and classify objects. I have explained how input objects are represented using the volumetric expected appearance. For each object, neighborhoods are sampled in a dense manner and described, motivated by the success of bag-of-features methods in 2-d images.
24. Before explaining the details of our bag-of-features model, let me talk about the data and inputs used in the experiments. For this work, we had some fun and flew in a helicopter.
26. We only use grey-scale images.
27. In general, objects can be described using either local features or global ones. Global features describe the overall shape of the object; examples are 3-d moments and spherical harmonics, among others. Local features, on the other hand, describe neighborhoods with local support. Local features tend to be more robust in the presence of occlusions. Also, global features depend on successful pre-segmentation …
28. Let me explain how the volumetric rendering of voxel neighborhoods should be interpreted: First, recall that the underlying function is the expected appearance of a voxel. White regions in these volumes correspond …
38. The first feature we propose is based on the principal component analysis of local neighborhoods. PCA finds the directions that best represent the data in the least-squares sense. The data can be expressed exactly using all principal directions, or approximated using a smaller number of them. In our experiments the original space had 125 dimensions, and the approximation was achieved using 10 dimensions.
39. The second type of feature that we propose is based on the Taylor series approximation of the local volumetric function. Differential kernels are found by minimizing the squared distance between the volumetric function and its Taylor series approximation. At every location in space we assign a descriptor that is made from the PCA projection coefficients of the 10 responses to the Taylor kernels.
40. After computing descriptors at every location, for all objects in a training set, we would like to find a small number of descriptors that represent all samples.
41. Mostly composed of slowly varying first-order derivatives.
51. Just as a preview of further work that we have performed… if there is time.
54. Explain the error bars better.