SlideShare ist ein Scribd-Unternehmen logo
1 von 10
Downloaden Sie, um offline zu lesen
Foundations & Core

Workshop on Frontiers in Computer Vision

          MIT, Cambridge, MA

             Aug. 22, 2011

       Harpreet S. Sawhney

  Vision & Learning Technologies
     SRI-Sarnoff, Princeton, NJ
Inspirations

“Work on Important Problems, not just Interesting ones!”

 – Curt Carlson



“Vision is Hard, Vision is Fun, Vision is Probabilistic, and Vision Works!”

 – Andrew Blake at his ICCV Achievement Award acceptance




                                Š SRI International Sarnoff.
Impressive Computer Vision Apps
… behind each lie fundamental advances in Theory & Practice of Vision…


    • Human-Machine Interaction with Kinect like sensors

    • “Building Rome in a Day” class of applications

    • MatchMove with 3D camera motion estimation

    • Visual Odometry for Robot Navigation

    • DARPA Rural and Urban Grand Challenges

    • SnapTell’s “Snap a Book Cover and Find the Book” app.

    • Geo-Registration of Aerial Video

    • First down line in Football

    • Emerging Augmented Reality Apps

                                            Š SRI International Sarnoff.   3
Most Impressive Computer Vision Apps
… behind each lie fundamental advances in Theory & Practice of Vision…


    • Human-Machine Interaction with Kinect like sensors

    • “Building Rome in a Day” class of applications

    • MatchMove with 3D camera motion estimation

    • Visual Odometry for Robot Navigation

    • DARPA Rural and Urban Grand Challenges

    • SnapTell’s “Snap a Book Cover and Find the Book” app.

    • Geo-Registration of Aerial Video
                              Will Employ both Geometry &
    • First down line in Football     Recognition

    • Emerging Augmented Reality Apps

                                            Š SRI International Sarnoff.   4
Fundamental Techniques at the Core of Computer Vision I
– Low-level image processing functions
  • Gradients, Edges, Histograms, etc.
– Aggregate feature computation
  • SIFT / HoG / MSER and their variants.
– Short-range video / image alignment
  • Optical flow, (Quasi-)Parametric image matching
– Image Warping:
  • Parametric, Quasi-parametric and Non-parametric warping.
– Image / Video Mosaicking:
  • Alignment, warping, stitching and blending.
– Long-range / Wide Baseline Matching:
  • Features, RANSAC and matching models.
– Stereo Processing:
  • Constraints, Matching Models and features, MRF like algorithms.
– Camera Calibration
  • Intrinsic and Extrinsic, Single camera, Stereo, and Multi-camera rigs.
– Two-/Three-frame Monocular Camera Motion Models and Algorithms
  • Fundamental / Essential Matrix, Tri-focal tensor, Closed-form methods, Optimization methods, Common /
    practical ambiguities, Numerical Considerations.

                                              Š SRI International Sarnoff.                                  5
Fundamental Techniques at the Core of Computer Vision II
– Multi-frame Structure and Motion Estimation:
  • Camera and 3D Point initializations, Bundle-block adjustment, Practical considerations.
– Moving Object Detection and Tracking:
  • Statistical Background Modeling with Static and Moving Camera; Moving Object Detection with Static and
    Moving Cameras; Online models for objects: Appearance, Geometry, Motion, Bayesian Frameworks for
    Tracking: K-filter, Condensation, Layers, Online optimization, etc.
– Color/Texture/Shape/Motion Segmentation:
  • Bottom-up techniques; Segmentation by classification; Graph models and optimization.
– Non-linear Optimization Techniques:
  • Closed-form, Iterative methods, Numerical considerations, Convex Constrained Methods.
– Basic Statistical Learning Techniques:
  • Understanding of probabilities, distributions, marginals, conditionals, etc.; Regression, Linear / Non-linear
    classifiers, PCA / LDA, FLD, etc.; SVMs; Boosting, bagging and their variants; Practical considerations
– High-dimensional Indexing and Structures for Efficient / Accurate Processing:
  • Randomized trees and forests, and variants; Locality-Sensitive Hashing (LSH), and variants; Vocabulary trees
    and variants
– Basic Graph Modeling, Learning and Inferencing Techniques:
  • Formulating a model; Learning and Inferencing with HMMs, CRFs, MRFs, and their variants; Bayesian networks,
    their variants and Learning and Inferencing techniques.



                                              Š SRI International Sarnoff.                                          6
Fundamental Techniques at the Core of Computer Vision II
– Multi-frame Structure and Motion Estimation:
  • Camera and 3D Point initializations, Bundle-block adjustment, Practical considerations.
– Moving Object Detection and Tracking:
  • Statistical Background Modeling with Static and Moving Camera; Moving Object Detection with Static and
    Moving Cameras; Online models for objects: Appearance, Geometry, Motion, Bayesian Jay
                                                                        Gould, Stephen Frameworks for
                                                            Misunderstanding of probability may
    Tracking: K-filter, Condensation, Layers, Online optimization, etc.
– Color/Texture/Shape/Motion Segmentation:be the greatest of all impediments to
                                                                    scientific literacy.
  • Bottom-up techniques; Segmentation by classification; Graph models and optimization.
– Non-linear Optimization Techniques:
  • Closed-form, Iterative methods, Numerical considerations, Convex Constrained Methods.
– Basic Statistical Learning Techniques:
  • Understanding of probabilities, distributions, marginals, conditionals, etc.; Regression, Linear / Non-linear
    classifiers, PCA / LDA, FLD, etc.; SVMs; Boosting, bagging and their variants; Practical considerations
– High-dimensional Indexing and Structures for Efficient / Accurate Processing:
  • Randomized trees and forests, and variants; Locality-Sensitive Hashing (LSH), and variants; Vocabulary trees
    and variants
– Basic Graph Modeling, Learning and Inferencing Techniques:
  • Formulating a model; Learning and Inferencing with HMMs, CRFs, MRFs, and their variants; Bayesian networks,
    their variants and Learning and Inferencing techniques.



                                              Š SRI International Sarnoff.                                          7
Fundamental Techniques at the Core of Computer Vision III
– Object Recognition:
  • Representations and Models: Bag-of-Words, Appearance and Geometry, Part-based, Global, etc.
  • Features and processing
  • Optimization for Training and Inferencing.
– Action / Activity Recognition:
  • Representations and Models: Bag-of-Words, Motion, Appearance, and Geometry, Part-based,
    Global, etc.
  • Features and processing
  • Optimization for Training and Inferencing.
– Basic Applications:
  •   Image / Video Mosaicking
  •   Geo-registration: Alignment of images / videos to reference imagery
  •   Match Move: Insertion of Objects/Patches in Videos.
  •   Moving Object Detection and Tracking
  •   Frontal Face Recognition




                                         Š SRI International Sarnoff.                             8
Challenges
• High Quality 3D Understanding / Modeling from Images/Videos
  – Mensuration, manipulation, recognition etc. need the world to be represented in terms of crisp
    boundaries of objects, surfaces and their connectivity and topology


• Large Scale (Classes and Scale of Data) Multi-Class Object / Scene / Material Detection
  & Classification
  – The world is getting used to applications that work at the scale of the Web and the Earth. This
    problem will require not just the extension of Bag-of-Words techniques and large-scale classifiers
    but insights into representations of classes of objects in terms of their geometry and appearance.


• Event Modeling, Detection and Classification
  – Videos are probably the single fastest growing data type on the Web that is most relevant to
    Computer Vision. Modeling of Events in videos is a relatively unexplored area.


• Computer Vision in Education and Training
  – We as teachers, mentors, and social beings routinely use our eyes and ears to sense and assess
    the level of engagement, proficiency and areas of deficiency in the people we interact with in our
    professions and life.
  – Converting our human prowess into science and engineering of education and training will
    require capturing, modeling, analyzing and reporting on human movements, actions, expressions,
    and interactions.
                                          Š SRI International Sarnoff.                                   9
Hurdles

• Computer Vision needs to whole-heartedly embrace Applications and
  Systems as first class activities within the community and not as activities
  to do once you leave the fold of vision.
  – Learn from Graphics, Signal Processing, Medical communities


• Significantly greater effort needs to be devoted to collecting, organizing,
  annotating and disseminating large scale datasets that are representative
  of the diversity and complexity of the real world in terms of scenes, objects,
  activities, events, 3D structure, illumination and viewpoints.

• Computer Vision approaches need to have distinct Representation /
  Modeling, Algorithms and Implementation steps so that the strengths and
  weaknesses of approaches can be assessed at various levels.
  – Many current Input-BlackBox-Output approaches do not follow this discipline.
    Hence, it is hard to assess research advances and their impact on applications.

                                   Š SRI International Sarnoff.                       10

Weitere ähnliche Inhalte

Andere mochten auch

Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 

Andere mochten auch (8)

Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 

Ähnlich wie Fcv core sawhney

Computer vision and robotics
Computer vision and roboticsComputer vision and robotics
Computer vision and roboticsBiniam Asnake
 
Fcv core szeliski_zisserman
Fcv core szeliski_zissermanFcv core szeliski_zisserman
Fcv core szeliski_zissermanzukun
 
Word
WordWord
Wordbutest
 
SĂŠminaire IA & VA- Yassine Ruichek, UTBM
SĂŠminaire IA & VA- Yassine Ruichek, UTBMSĂŠminaire IA & VA- Yassine Ruichek, UTBM
SĂŠminaire IA & VA- Yassine Ruichek, UTBMMahdi Zarg Ayouna
 
Application of image processing.ppt
Application of image processing.pptApplication of image processing.ppt
Application of image processing.pptDevesh448679
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingYu Huang
 
Lecture 1 computer vision introduction
Lecture 1 computer vision introductionLecture 1 computer vision introduction
Lecture 1 computer vision introductioncairo university
 
Movie on face recognition in e attendace
Movie on face recognition in e attendaceMovie on face recognition in e attendace
Movie on face recognition in e attendacesbk50000
 
Artificial Intelligence_Strategy.pptx
Artificial Intelligence_Strategy.pptxArtificial Intelligence_Strategy.pptx
Artificial Intelligence_Strategy.pptxSureshMaddi1
 
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...Tulipp. Eu
 
2022 COMP4010 Lecture4: AR Interaction
2022 COMP4010 Lecture4: AR Interaction2022 COMP4010 Lecture4: AR Interaction
2022 COMP4010 Lecture4: AR InteractionMark Billinghurst
 
Face Recongnition using Machine Learning
Face Recongnition using Machine LearningFace Recongnition using Machine Learning
Face Recongnition using Machine Learningroysarthak272002
 
Object tracking final
Object tracking finalObject tracking final
Object tracking finalMrsShwetaBanait1
 
Object tracking presentation
Object tracking  presentationObject tracking  presentation
Object tracking presentationMrsShwetaBanait1
 
Context is King: AR, AI, Salience, and the Constant Next Scenario
Context is King: AR, AI, Salience, and the Constant Next ScenarioContext is King: AR, AI, Salience, and the Constant Next Scenario
Context is King: AR, AI, Salience, and the Constant Next ScenarioClark Dodsworth
 
[0122]seunghyeong
[0122]seunghyeong[0122]seunghyeong
[0122]seunghyeongivaderivader
 

Ähnlich wie Fcv core sawhney (20)

Computer vision and robotics
Computer vision and roboticsComputer vision and robotics
Computer vision and robotics
 
Fcv core szeliski_zisserman
Fcv core szeliski_zissermanFcv core szeliski_zisserman
Fcv core szeliski_zisserman
 
Word
WordWord
Word
 
SĂŠminaire IA & VA- Yassine Ruichek, UTBM
SĂŠminaire IA & VA- Yassine Ruichek, UTBMSĂŠminaire IA & VA- Yassine Ruichek, UTBM
SĂŠminaire IA & VA- Yassine Ruichek, UTBM
 
PPT s01-machine vision-s2
PPT s01-machine vision-s2PPT s01-machine vision-s2
PPT s01-machine vision-s2
 
ICS1020 CV
ICS1020 CVICS1020 CV
ICS1020 CV
 
Application of image processing.ppt
Application of image processing.pptApplication of image processing.ppt
Application of image processing.ppt
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous Driving
 
ICS1020CV_2022.pdf
ICS1020CV_2022.pdfICS1020CV_2022.pdf
ICS1020CV_2022.pdf
 
Lecture 1 computer vision introduction
Lecture 1 computer vision introductionLecture 1 computer vision introduction
Lecture 1 computer vision introduction
 
Movie on face recognition in e attendace
Movie on face recognition in e attendaceMovie on face recognition in e attendace
Movie on face recognition in e attendace
 
Artificial Intelligence_Strategy.pptx
Artificial Intelligence_Strategy.pptxArtificial Intelligence_Strategy.pptx
Artificial Intelligence_Strategy.pptx
 
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
 
2022 COMP4010 Lecture4: AR Interaction
2022 COMP4010 Lecture4: AR Interaction2022 COMP4010 Lecture4: AR Interaction
2022 COMP4010 Lecture4: AR Interaction
 
Face Recongnition using Machine Learning
Face Recongnition using Machine LearningFace Recongnition using Machine Learning
Face Recongnition using Machine Learning
 
Object tracking final
Object tracking finalObject tracking final
Object tracking final
 
Object tracking presentation
Object tracking  presentationObject tracking  presentation
Object tracking presentation
 
Context is King: AR, AI, Salience, and the Constant Next Scenario
Context is King: AR, AI, Salience, and the Constant Next ScenarioContext is King: AR, AI, Salience, and the Constant Next Scenario
Context is King: AR, AI, Salience, and the Constant Next Scenario
 
[0122]seunghyeong
[0122]seunghyeong[0122]seunghyeong
[0122]seunghyeong
 
Anits dip
Anits dipAnits dip
Anits dip
 

Mehr von zukun

Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featureszukun
 
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...zukun
 
Quoc le tera-scale deep learning
Quoc le   tera-scale deep learningQuoc le   tera-scale deep learning
Quoc le tera-scale deep learningzukun
 
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...zukun
 
Lecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learningLecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learningzukun
 
Lecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognitionLecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognitionzukun
 
P05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for visionP05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for visionzukun
 
P06 motion and video cvpr2012 deep learning methods for vision
P06 motion and video cvpr2012 deep learning methods for visionP06 motion and video cvpr2012 deep learning methods for vision
P06 motion and video cvpr2012 deep learning methods for visionzukun
 

Mehr von zukun (20)

Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant features
 
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
 
Quoc le tera-scale deep learning
Quoc le   tera-scale deep learningQuoc le   tera-scale deep learning
Quoc le tera-scale deep learning
 
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
 
Lecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learningLecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learning
 
Lecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognitionLecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognition
 
P05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for visionP05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for vision
 
P06 motion and video cvpr2012 deep learning methods for vision
P06 motion and video cvpr2012 deep learning methods for visionP06 motion and video cvpr2012 deep learning methods for vision
P06 motion and video cvpr2012 deep learning methods for vision
 

KĂźrzlich hochgeladen

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel AraĂşjo
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

KĂźrzlich hochgeladen (20)

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Fcv core sawhney

  • 1. Foundations & Core Workshop on Frontiers in Computer Vision MIT, Cambridge, MA Aug. 22, 2011 Harpreet S. Sawhney Vision & Learning Technologies SRI-Sarnoff, Princeton, NJ
  • 2. Inspirations “Work on Important Problems, not just Interesting ones!” – Curt Carlson “Vision is Hard, Vision is Fun, Vision is Probabilistic, and Vision Works!” – Andrew Blake at his ICCV Achievement Award acceptance Š SRI International Sarnoff.
  • 3. Impressive Computer Vision Apps … behind each lie fundamental advances in Theory & Practice of Vision… • Human-Machine Interaction with Kinect like sensors • “Building Rome in a Day” class of applications • MatchMove with 3D camera motion estimation • Visual Odometry for Robot Navigation • DARPA Rural and Urban Grand Challenges • SnapTell’s “Snap a Book Cover and Find the Book” app. • Geo-Registration of Aerial Video • First down line in Football • Emerging Augmented Reality Apps Š SRI International Sarnoff. 3
  • 4. Most Impressive Computer Vision Apps … behind each lie fundamental advances in Theory & Practice of Vision… • Human-Machine Interaction with Kinect like sensors • “Building Rome in a Day” class of applications • MatchMove with 3D camera motion estimation • Visual Odometry for Robot Navigation • DARPA Rural and Urban Grand Challenges • SnapTell’s “Snap a Book Cover and Find the Book” app. • Geo-Registration of Aerial Video Will Employ both Geometry & • First down line in Football Recognition • Emerging Augmented Reality Apps Š SRI International Sarnoff. 4
  • 5. Fundamental Techniques at the Core of Computer Vision I – Low-level image processing functions • Gradients, Edges, Histograms, etc. – Aggregate feature computation • SIFT / HoG / MSER and their variants. – Short-range video / image alignment • Optical flow, (Quasi-)Parametric image matching – Image Warping: • Parametric, Quasi-parametric and Non-parametric warping. – Image / Video Mosaicking: • Alignment, warping, stitching and blending. – Long-range / Wide Baseline Matching: • Features, RANSAC and matching models. – Stereo Processing: • Constraints, Matching Models and features, MRF like algorithms. – Camera Calibration • Intrinsic and Extrinsic, Single camera, Stereo, and Multi-camera rigs. – Two-/Three-frame Monocular Camera Motion Models and Algorithms • Fundamental / Essential Matrix, Tri-focal tensor, Closed-form methods, Optimization methods, Common / practical ambiguities, Numerical Considerations. Š SRI International Sarnoff. 5
  • 6. Fundamental Techniques at the Core of Computer Vision II – Multi-frame Structure and Motion Estimation: • Camera and 3D Point initializations, Bundle-block adjustment, Practical considerations. – Moving Object Detection and Tracking: • Statistical Background Modeling with Static and Moving Camera; Moving Object Detection with Static and Moving Cameras; Online models for objects: Appearance, Geometry, Motion, Bayesian Frameworks for Tracking: K-filter, Condensation, Layers, Online optimization, etc. – Color/Texture/Shape/Motion Segmentation: • Bottom-up techniques; Segmentation by classification; Graph models and optimization. – Non-linear Optimization Techniques: • Closed-form, Iterative methods, Numerical considerations, Convex Constrained Methods. – Basic Statistical Learning Techniques: • Understanding of probabilities, distributions, marginals, conditionals, etc.; Regression, Linear / Non-linear classifiers, PCA / LDA, FLD, etc.; SVMs; Boosting, bagging and their variants; Practical considerations – High-dimensional Indexing and Structures for Efficient / Accurate Processing: • Randomized trees and forests, and variants; Locality-Sensitive Hashing (LSH), and variants; Vocabulary trees and variants – Basic Graph Modeling, Learning and Inferencing Techniques: • Formulating a model; Learning and Inferencing with HMMs, CRFs, MRFs, and their variants; Bayesian networks, their variants and Learning and Inferencing techniques. Š SRI International Sarnoff. 6
  • 7. Fundamental Techniques at the Core of Computer Vision II – Multi-frame Structure and Motion Estimation: • Camera and 3D Point initializations, Bundle-block adjustment, Practical considerations. – Moving Object Detection and Tracking: • Statistical Background Modeling with Static and Moving Camera; Moving Object Detection with Static and Moving Cameras; Online models for objects: Appearance, Geometry, Motion, Bayesian Jay Gould, Stephen Frameworks for Misunderstanding of probability may Tracking: K-filter, Condensation, Layers, Online optimization, etc. – Color/Texture/Shape/Motion Segmentation:be the greatest of all impediments to scientific literacy. • Bottom-up techniques; Segmentation by classification; Graph models and optimization. – Non-linear Optimization Techniques: • Closed-form, Iterative methods, Numerical considerations, Convex Constrained Methods. – Basic Statistical Learning Techniques: • Understanding of probabilities, distributions, marginals, conditionals, etc.; Regression, Linear / Non-linear classifiers, PCA / LDA, FLD, etc.; SVMs; Boosting, bagging and their variants; Practical considerations – High-dimensional Indexing and Structures for Efficient / Accurate Processing: • Randomized trees and forests, and variants; Locality-Sensitive Hashing (LSH), and variants; Vocabulary trees and variants – Basic Graph Modeling, Learning and Inferencing Techniques: • Formulating a model; Learning and Inferencing with HMMs, CRFs, MRFs, and their variants; Bayesian networks, their variants and Learning and Inferencing techniques. Š SRI International Sarnoff. 7
  • 8. Fundamental Techniques at the Core of Computer Vision III – Object Recognition: • Representations and Models: Bag-of-Words, Appearance and Geometry, Part-based, Global, etc. • Features and processing • Optimization for Training and Inferencing. – Action / Activity Recognition: • Representations and Models: Bag-of-Words, Motion, Appearance, and Geometry, Part-based, Global, etc. • Features and processing • Optimization for Training and Inferencing. – Basic Applications: • Image / Video Mosaicking • Geo-registration: Alignment of images / videos to reference imagery • Match Move: Insertion of Objects/Patches in Videos. • Moving Object Detection and Tracking • Frontal Face Recognition Š SRI International Sarnoff. 8
  • 9. Challenges • High Quality 3D Understanding / Modeling from Images/Videos – Mensuration, manipulation, recognition etc. need the world to be represented in terms of crisp boundaries of objects, surfaces and their connectivity and topology • Large Scale (Classes and Scale of Data) Multi-Class Object / Scene / Material Detection & Classification – The world is getting used to applications that work at the scale of the Web and the Earth. This problem will require not just the extension of Bag-of-Words techniques and large-scale classifiers but insights into representations of classes of objects in terms of their geometry and appearance. • Event Modeling, Detection and Classification – Videos are probably the single fastest growing data type on the Web that is most relevant to Computer Vision. Modeling of Events in videos is a relatively unexplored area. • Computer Vision in Education and Training – We as teachers, mentors, and social beings routinely use our eyes and ears to sense and assess the level of engagement, proficiency and areas of deficiency in the people we interact with in our professions and life. – Converting our human prowess into science and engineering of education and training will require capturing, modeling, analyzing and reporting on human movements, actions, expressions, and interactions. Š SRI International Sarnoff. 9
  • 10. Hurdles • Computer Vision needs to whole-heartedly embrace Applications and Systems as first class activities within the community and not as activities to do once you leave the fold of vision. – Learn from Graphics, Signal Processing, Medical communities • Significantly greater effort needs to be devoted to collecting, organizing, annotating and disseminating large scale datasets that are representative of the diversity and complexity of the real world in terms of scenes, objects, activities, events, 3D structure, illumination and viewpoints. • Computer Vision approaches need to have distinct Representation / Modeling, Algorithms and Implementation steps so that the strengths and weaknesses of approaches can be assessed at various levels. – Many current Input-BlackBox-Output approaches do not follow this discipline. Hence, it is hard to assess research advances and their impact on applications. Š SRI International Sarnoff. 10