Promising avenues for interdisciplinary research in vision
Dr. Oge Marques
      Associate Professor
Computer Science and Engineering
   Florida Atlantic University
     Boca Raton, FL (USA)

           June 2009
Take-home message

   We postulate that many
   challenging problems in
   human and computer vision
   research can be approached
   in a truly interdisciplinary
   way and show examples of
   recent work on the topic of
   “objects in context” that
   support our claim.
Outline
•  Background and motivation
•  Visual perception
•  Object detection and recognition
•  Scene recognition and analysis
•  The role of context
•  Representative work
•  Concluding remarks
Background and motivation
Computer vision is not as easy as it seemed 40+ years ago
Background and motivation
•  Computer vision has many open research
   questions
   –  Object detection, recognition, and categorization
   –  Scene analysis, recognition, and understanding
   –  Objects in context

•  Research in human vision has grown
   tremendously
   –  Computational models of selected visual processes
      have emerged

•  A truly interdisciplinary effort can help bring the
   best of human vision research into selected
   problems in computer vision.
The fundamental question of
          vision


   How are we able so quickly and
  effortlessly to perceive meaningful,
       coherent, 3D scenes from
   incomplete, 2D patterns of light
          that enter our eyes?
A selected related question

   How are we able to perceive,
   detect, categorize and recognize
         objects and scenes?


Vision Science
Interdisciplinary study of many areas of visual
processing and function

  Areas of Research        Disciplines
  •  Detection             •  Psychology
  •  Attention             •  Neuroscience
  •  Memory                •  Biology
  •  Recognition           •  Computer Science
  •  Motion perception     •  Engineering
  •  etc.                  •  etc.

Reverse engineering the
         perceptual system

•  We know the
   visual system
   works
•  But how?
We don’t ‘see’ with our eyes




       We see with our brains!
The hierarchical nature of the
scientific knowledge of the visual
              system




The deeper in the system you go, the less we know…
What do we know about visual
        perception?
 Not much compared to what we don’t know



               Ignorance




                 Knowledge
Outline
•  Background and motivation
•  Visual perception
•  Object detection and recognition
•  Scene recognition and analysis
•  The role of context
•  Representative work
•  Concluding remarks
The perceptual process




           Source: E.B. Goldstein, “Sensation and Perception”
Four Stages of Visual Perception
Inspired by work by David Marr
  (1945-1980)

•  One of the most influential neuroscientists of vision.
•  Thought of vision as an information-processing
   task.
•  In his book Vision (1982), he distinguished three
   different levels of description involved in
   understanding complex information processing
   systems:
    –  Computational level
    –  Algorithmic level
    –  Implementation level
         •  An important point is that the levels can be
            considered independently.
Four Stages of Visual Perception

           “cup”
Outline
•  Background and motivation
•  Visual perception
•  Object detection and recognition
•  Scene recognition and analysis
•  The role of context
•  Representative work
•  Concluding remarks
The challenge of object
           recognition
•  Why is it so difficult for computers to carry
   out object recognition tasks that humans
   can perform easily?

•  Although most human visual perception
   appears to be almost effortless, it involves
   complex “behind the scenes” processes.
The challenge of object
     recognition
         Human vision scientist: “Let’s
          look at selected behavioral
           and neural processes that
         make it possible for people to
            perceive (i.e., detect and
              recognize) objects.”




            Computer vision scientist:
          “Let’s model what is known –
           and reasonable – and try it
           out on standard databases
         containing real-world images.”
The challenge of object
             perception
•  The stimulus on the receptors is ambiguous




            The inverse projection problem
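
The ambiguity can be made concrete with a pinhole-camera sketch. The function name `project` and the focal length value are illustrative assumptions, not part of the original slides. Every 3D point along the same ray through the pinhole lands on the same 2D image point, so the image alone cannot tell a small near object from a large far one.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points on the same ray (hypothetical coordinates).
near_small = (1.0, 2.0, 4.0)
far_large = (2.0, 4.0, 8.0)  # twice as far, twice as large

print(project(near_small))
print(project(far_large))  # same 2D point: the projection cannot be inverted uniquely
```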
The challenge of object
             perception
•  The stimulus on the receptors is ambiguous




          http://users.skynet.be/J.Beever/pave.htm
The challenge of object
            perception
•  Objects can be hidden or blurred



  Can you find…
   - the pencil?
  - the glasses?
The challenge of object
            perception
•  Objects can be hidden or blurred


           Who are these people?
The challenge of object
                perception
Objects look different from different viewpoints

   The ability of humans to recognize an object seen from different
   viewpoints is called viewpoint invariance.
The challenge of object
                perception
Objects look different from different viewpoints

   Q: Which two faces correspond to the same person?




   A1 (human): (a) and (c)
   A2 (computer): (a) and (b)
Research question
  How do we recognize objects from different
                viewpoints?
Structural-Description Models
   Propose that our ability to recognize 3D objects is based on 3D volumes
   (called volumetric features) that can be combined to create the overall
   shape of an object.

Image-Description Models
   Propose that our ability to recognize objects from different viewpoints is
   based on stored 2D views of the object as it would appear from different
   viewpoints.




  Which Model Is Correct? The actual mechanism for object recognition probably
  involves elements of both the structural-description and image-description models
  (Palmeri & Gauthier, 2004)
Why do we care about object
       recognition?


Because object recognition leads
              to
   perception of function.
So, which do we use: direct or
           indirect?
“It seems exceedingly unlikely (though
logically possible) that we categorize
everything in our visual fields”, Palmer.

Hypothesis: we categorize the objects
that are relevant for a specific task that we
have at hand, but we only extract
affordances from the others.
Object detection and the
“Head in the coffee beans
        problem”
“Head in the coffee beans
        problem”
     Can you find the head in this image?
So what does object recognition involve?




                            Slide by Fei-Fei, Fergus, Torralba
Verification: is that a lamp?




                                Slide by Fei-Fei, Fergus, Torralba
Detection: are there people?




                               Slide by Fei-Fei, Fergus, Torralba
Identification: is that Potala Palace?




                              Slide by Fei-Fei, Fergus, Torralba
Object categorization

         Labels: mountain, tree, building, banner, street lamp, vendor, people
                                Slide by Fei-Fei, Fergus, Torralba
Scene and context categorization
                        •  outdoor
                        •  city
                        •  …




                               Slide by Fei-Fei, Fergus, Torralba
Is this space large or small?
How far are the buildings in the back?




                             Slide by Fei-Fei, Fergus, Torralba
Activity




What is this person doing?
What are these two doing?




                                             Slide by Fei-Fei, Fergus, Torralba
Outline
•  Background and motivation
•  Visual perception
•  Object detection and recognition
•  Scene recognition and analysis
•  The role of context
•  Representative work
•  Concluding remarks
What is a scene?
•  A scene is a view of a real-world
   environment that contains multiple
   surfaces and objects, organized in a
   meaningful way.

  –  A tour of scene understanding literature:
     http://cvcl.mit.edu/SUNSarticles.htm
The “gist” of a scene
•  Mary Potter (1975, 1976) demonstrated that
   during a rapid serial visual presentation (100
   msec per image), a novel scene picture is indeed
   instantly understood and observers seem to
   comprehend a lot of visual information, but a
   delay of a few hundred msec (~300 msec) is
   required for the picture to be consolidated in
   memory.
•  The “gist” (a summary) refers to the visual
   information perceived after/during a glance at an
   image.
•  To simplify, the gist is often synonymous with the
   basic level category of the scene or event (e.g.
   wedding, bathroom, beach, forest, street)
What we (don’t) know about scene
   analysis, recognition, and
          classification
•  Humans are very good at recognizing and
   classifying scenes
•  We are also very fast (100 ms or less)
•  We often sacrifice accuracy in the name of
   speed (we capture the gist but miss many
   details)

•  How exactly do we do it?
What is the basis for scene
         identification?

•  Different schools of thought:
  – Scene-centered
  – Part-based (i.e., object-centered)
  – Holistic
Outline
•  Background and motivation
•  Visual perception
•  Object detection and recognition
•  Scene recognition and analysis
•  The role of context
•  Representative work
•  Concluding remarks
Objects in context
•  Objects do not exist isolated from a
   context

•  Torralba’s challenge: “How far can you
   go without using an object detector?”
Objects in context
The multiple personalities of a blob
Look-Alikes by Joan Steiner
Why is context important?
What are the hidden objects?
Biederman 1982

•  Pictures shown for 150
   ms.
•  Objects in appropriate
   context were detected
   more accurately than
   objects in an
   inappropriate context.
•  Scene consistency
   affects object detection.
Objects and Scenes
Biederman’s violations (1981):
Support
Interposition
Size
Position, Probability
Biederman’s classes in Computer Vision
            Galleguillos & Belongie, Tech Report (2008)



•  Interposition and support can be coded by
   reference to physical space.
•  Probability, position and size are defined as
   semantic relations because they require access
   to the referential meaning of the object.
•  Semantic relations include information about
   detailed interactions among objects in the scene
   and they are often used as contextual features.
Dreaming of an ideal computer vision solution…
           Galleguillos & Belongie, Tech Report (2008)
Types of context
                Galleguillos & Belongie, Tech Report (2008)


•  Contextual features can be grouped into 3
   categories:
   –  semantic context (probability)
   –  spatial context (position)
   –  scale context (size).

•  Contextual knowledge can be any information that is
   not directly produced by the appearance of an object.

•  It can be obtained from:
   –  the nearby image data;
   –  image tags or annotations;
   –  the presence and location of other objects.
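
As a rough illustration of semantic context, co-occurrence probabilities can be estimated from image annotations. This is a minimal sketch, not the method of the cited report: the function name `cooccurrence_probabilities` and the toy annotations are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_probabilities(annotations):
    """Estimate P(both labels appear in the same image) for every label pair."""
    pair_counts = Counter()
    for labels in annotations:
        # Count each unordered pair once per image, in alphabetical order.
        for a, b in combinations(sorted(set(labels)), 2):
            pair_counts[(a, b)] += 1
    n_images = len(annotations)
    return {pair: count / n_images for pair, count in pair_counts.items()}


# Hypothetical image annotations (toy data, not from any dataset).
toy_annotations = [
    ["car", "road", "building"],
    ["car", "road"],
    ["boat", "water"],
    ["road", "building"],
]

probs = cooccurrence_probabilities(toy_annotations)
print(probs[("car", "road")])         # cars and roads often appear together
print(probs.get(("boat", "road"), 0))  # boats and roads never co-occur here
```

A recognizer could use such a table to prefer label combinations that are likely to appear together.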
Acquiring and modeling context
            Galleguillos & Belongie, Tech Report (2008)


•  Which of the three should one use?
  –  Spatial and scale context are the types of
     context most exploited by recognition
     frameworks.
  –  Generally, semantic context is implicitly
     present in spatial context, as information about
     object co-occurrences comes from identifying
     objects for the spatial relations in the scene.
  –  The same happens with scale context, as scale
     is measured with respect to other objects.
  –  Therefore, using spatial and scale context
     involves using all forms of contextual
     information in the scene.
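
A small sketch of how spatial and scale context might be read off two detected objects. The bounding-box format, the function name `spatial_and_scale_relation`, and the example boxes are assumptions for illustration only. Both measurements are defined relative to a second object, which is why co-occurrence information comes along for free.

```python
def spatial_and_scale_relation(box_a, box_b):
    """Describe box_a relative to box_b.

    Boxes are (x_min, y_min, x_max, y_max) in image coordinates,
    with y growing downward (an assumed convention).
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    center_a_y = (ay0 + ay1) / 2
    center_b_y = (by0 + by1) / 2
    if center_a_y < center_b_y:
        vertical = "above"   # spatial context
    elif center_a_y > center_b_y:
        vertical = "below"
    else:
        vertical = "level"
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    return vertical, area_a / area_b  # scale context as an area ratio


# Hypothetical detections.
sky = (0, 0, 100, 40)
car = (30, 60, 60, 80)
relation, size_ratio = spatial_and_scale_relation(sky, car)
print(relation, size_ratio)
```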
Outline
•  Background and motivation
•  Visual perception
•  Object detection and recognition
•  Scene recognition and analysis
•  The role of context
•  Representative work
•  Concluding remarks
Representative work
•  There are many research groups working
   on the intersection of human and
   computer vision in numerous topics,
   including “objects in context”.
•  A prominent example: work by
   Aude Oliva and Antonio Torralba (and
   collaborators) at MIT.
Representative work
•  A case study:

  –  L.W. Renninger and J. Malik (2004). When is
     scene recognition just texture
     recognition? Vision Research, 44,
     2301-2311.
Renninger and Malik

•  Basic idea
  –  Consider texture as an
     early cue for scene
     perception.
     •  It’s simple
     •  It’s fast (pre-attentive)
        (Julesz, 1981)
Renninger and Malik
•  Approach
  –  How well do humans discriminate scenes with
     very limited exposure?
  –  Build a texture-based model for scene
     discrimination.
  –  Compare performance!
Scene categories
Renninger and Malik
•  Task
  –  2AFC
  –  Subjects are shown an image
     •  Image exposure time: 37, 50 and 69 ms
  –  Image followed by a jumbled scene mask
  –  The task is to select one of two word choices
     that best describes the image
  –  Subject performance: 77%, 82% and 92%
     correct
•  Get ready…
Texture Discrimination Model
 –  Cluster response distributions from V1-like
    filters to get prototypical responses (textons)
 –  Remember what types of textons occur in
    particular scenes (build histogram)
 –  Label new image using a nearest neighbor
    classifier
    •  Compare texton histogram for new image to stored
       representations (χ² distance)
                                             (Malik and Perona, 1990)
                                                   (Malik et al., 1999)
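
The histogram and nearest-neighbor steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the clustering step is skipped and the textons are assumed to be given as prototype vectors; all function names, the scene labels, and the toy numbers are hypothetical.

```python
def nearest_texton(response, textons):
    """Index of the texton (prototype vector) closest to one filter response."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(textons)), key=lambda k: sq_dist(response, textons[k]))


def texton_histogram(responses, textons):
    """Normalized histogram of texton labels over all responses of an image."""
    counts = [0] * len(textons)
    for r in responses:
        counts[nearest_texton(r, textons)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def chi2_distance(h1, h2):
    """Chi-square distance between two normalized histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def classify(histogram, labeled_histograms):
    """Label of the stored histogram nearest to `histogram` (1-nearest neighbor)."""
    return min(labeled_histograms, key=lambda item: chi2_distance(histogram, item[1]))[0]


# Toy data: three 2D textons and two stored scene histograms (all hypothetical).
textons = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
stored = [("forest", [0.1, 0.2, 0.7]), ("street", [0.6, 0.3, 0.1])]
new_image_responses = [(0.1, 0.9), (0.0, 1.1), (0.9, 0.1), (0.2, 0.8)]

hist = texton_histogram(new_image_responses, textons)
print(hist, classify(hist, stored))
```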
Texture Discrimination Model
•  V1-like filters
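
The slides do not say which filters were used. As an assumption, the sketch below uses Gabor functions, one common model of V1 simple-cell receptive fields, to build a small oriented filter bank. The function name `gabor_kernel` and all parameter values are illustrative.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma):
    """Return a size x size even-symmetric Gabor kernel as a list of rows."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# A small bank at four orientations (0, 45, 90, 135 degrees).
bank = [
    gabor_kernel(size=9, wavelength=4.0, theta=k * math.pi / 4, sigma=2.0)
    for k in range(4)
]
print(len(bank), len(bank[0]), len(bank[0][0]))
```

Convolving an image with each kernel gives one response per orientation at every pixel; stacking those responses yields the vectors that a texton model would cluster.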
Texture Discrimination Model
•  Textons
Confusion matrix

                 Natural   Outdoor/MM   Indoor
  Natural         50.56      33.26      16.19
  Outdoor/MM      23.14      46.54      30.33
  Indoor           8.12      18.69      73.18
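
A quick way to read the matrix: the diagonal holds the percentage of correct responses per category. The sketch below assumes rows are the presented category and columns the response, in percent (each row sums to about 100, which is consistent with that reading but not stated on the slide).

```python
labels = ["Natural", "Outdoor/MM", "Indoor"]
confusion = [
    [50.56, 33.26, 16.19],  # row: Natural
    [23.14, 46.54, 30.33],  # row: Outdoor/MM
    [8.12, 18.69, 73.18],   # row: Indoor
]

# Per-category accuracy is the diagonal entry of each row.
per_class = {label: confusion[i][i] for i, label in enumerate(labels)}
mean_accuracy = sum(per_class.values()) / len(labels)
chance = 100 / len(labels)  # three-way guessing baseline

print(per_class)
print(round(mean_accuracy, 2), round(chance, 2))
```

Under this reading, Indoor is the most reliably identified category, and Natural is confused with Outdoor/MM about a third of the time.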
Discrimination of
Superordinate Categories
Renninger and Malik
•  Conclusion
  –  Early scene identification can be mostly
     explained by a simple texture model
Outline
•  Background and motivation
•  Visual perception
•  Object detection and recognition
•  Scene recognition and analysis
•  The role of context
•  Representative work
•  Concluding remarks
Our experience
•  Working with Dept of Psychology @ FAU
  –  Two joint graduate-level courses
  –  Joint student supervision
  –  Joint grant proposals
  –  Joint papers (in preparation)
  –  Constant discussions
  –  Promising days ahead…
•  Imaging Science & Technology Center
•  Multidisciplinary Vision Program
Our focus
•  To establish quantitative measures
   of the importance of context
  – Method: present subjects with degraded
    (blocky, blurry, etc.) objects against a
    context and ask them to recognize the
    object as it becomes progressively more
    visible.
  – Human vision: behavioral experiments
  – Computer vision: stimuli creation
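
One possible way to create "blocky" stimuli that become progressively more visible is block averaging at decreasing block sizes. This is only a sketch under assumptions: the function names, the block sizes, and the 4 x 4 toy image are hypothetical, and the actual stimuli may have been produced differently.

```python
def pixelate(image, block):
    """Replace each block x block tile of a grayscale image by its mean value."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            mean = sum(image[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out


def progressive_sequence(image, blocks=(4, 2, 1)):
    """Coarse-to-fine versions of the image; block size 1 reproduces the original."""
    return [pixelate(image, b) for b in blocks]


# Toy 4 x 4 grayscale "object" with pixel values 0..15.
toy = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
seq = progressive_sequence(toy)
for frame in seq:
    print(frame[0])
```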
Concluding remarks
•  Great potential
•  Cultural barriers
•  Open problems and challenges on both
   sides
•  The time is ripe for interdisciplinary
   research on vision, particularly “objects in
   context”
Acknowledgments
•  Thanks to Prof. Elan Barenholtz (Dept of
   Psychology, FAU) for allowing me to use some
   of his slides and for the many interesting
   discussions on the topics presented in this talk.

•  Many slides for this talk contain material made
   publicly available on the Web by Antonio
   Torralba and Aude Oliva (MIT) and Fei-Fei Li
   (UIUC).
Thank you for attending my talk!

            Questions?




                Email: omarques@fau.edu

Miniatures Design for Tabletop Games.pdf
 
Distributed Systems in the Post-Moore Era.pptx
Distributed Systems in the Post-Moore Era.pptxDistributed Systems in the Post-Moore Era.pptx
Distributed Systems in the Post-Moore Era.pptx
 
Don't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptxDon't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptx
 
Engineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdfEngineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdf
 
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdfThe Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
 
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
 
Towards a data driven identification of teaching patterns.pdf
Towards a data driven identification of teaching patterns.pdfTowards a data driven identification of teaching patterns.pdf
Towards a data driven identification of teaching patterns.pdf
 
Förderverein Technische Fakultät.pptx
Förderverein Technische Fakultät.pptxFörderverein Technische Fakultät.pptx
Förderverein Technische Fakultät.pptx
 
The Computing Continuum.pdf
The Computing Continuum.pdfThe Computing Continuum.pdf
The Computing Continuum.pdf
 
East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...
 
Machine Learning in Finance via Randomization
Machine Learning in Finance via RandomizationMachine Learning in Finance via Randomization
Machine Learning in Finance via Randomization
 
IT does not stop
IT does not stopIT does not stop
IT does not stop
 
Advances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial NetworksAdvances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial Networks
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdfIndustriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
 
Introduction to 5G from radio perspective
Introduction to 5G from radio perspectiveIntroduction to 5G from radio perspective
Introduction to 5G from radio perspective
 

Kürzlich hochgeladen

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 

Kürzlich hochgeladen (20)

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 

Promising avenues for interdisciplinary research in vision

  • 1. Dr. Oge Marques Associate Professor Computer Science and Engineering Florida Atlantic University Boca Raton, FL (USA) June 2009
  • 2. Take-home message We postulate that many challenging problems in human and computer vision research can be approached in a truly interdisciplinary way and show examples of recent work on the topic of “objects in context” that support our claim.
  • 3. Outline •  Background and motivation •  Visual perception •  Object detection and recognition •  Scene recognition and analysis •  The role of context •  Representative work •  Concluding remarks
  • 4. Background and motivation Computer vision is not as easy as it seemed 40+ years ago
  • 5. Background and motivation •  Computer vision has many open research questions –  Object detection, recognition, and categorization –  Scene analysis, recognition, and understanding –  Objects in context •  Research in human vision has grown tremendously –  Computational models of selected visual processes have emerged •  A truly interdisciplinary effort can help bring the best of human vision research into selected problems in computer vision.
  • 6. The fundamental question of vision How are we able so quickly and effortlessly to perceive meaningful, coherent, 3D scenes from incomplete, 2D patterns of light that enter our eyes?
  • 7. A selected related question How are we able to perceive, detect, categorize and recognize objects and scenes?
  • 8. Vision Science Interdisciplinary study of many areas of visual processing and function. Areas of research: detection, attention, memory, recognition, motion perception, etc. Disciplines: psychology, neuroscience, biology, computer science, engineering, etc.
  • 9. Reverse engineering the perceptual system •  We know the visual system works •  But how?
  • 10. We don’t ‘see’ with our eyes We see with our brains!
  • 11. The hierarchical nature of the scientific knowledge of the visual system The deeper in the system you go, the less we know…
  • 12. What do we know about visual perception? Not much compared to what we don’t know Ignorance Knowledge
  • 13. Outline •  Background and motivation •  Visual perception •  Object detection and recognition •  Scene recognition and analysis •  The role of context •  Representative work •  Concluding remarks
  • 14. The perceptual process Source: E.B. Goldstein, “Sensation and Perception”
  • 15. Four Stages of Visual Perception Inspired by work by David Marr (1945-1980) •  One of the most influential neuroscientists of vision. •  Thought of vision as an information-processing task. •  In his book Vision (1982), he distinguished three different levels of description involved in understanding complex information processing systems: –  Computational level –  Algorithmic level –  Implementation level •  An important point is that the levels can be considered independently.
  • 20. Four Stages of Visual Perception “cup”
  • 21. Outline •  Background and motivation •  Visual perception •  Object detection and recognition •  Scene recognition and analysis •  The role of context •  Representative work •  Concluding remarks
  • 22. The challenge of object recognition •  Why is it so difficult for computers to carry out object recognition tasks that humans can perform easily? •  Although most human visual perception appears to be almost effortless, it involves complex “behind the scenes” processes.
  • 23. The challenge of object recognition Human vision scientist: “Let’s look at selected behavioral and neural processes that make it possible for people to perceive (i.e., detect and recognize) objects.” Computer vision scientist: “Let’s model what is known – and reasonable – and try it out on standard databases containing real-world images.”
  • 24. The challenge of object perception •  The stimulus on the receptors is ambiguous
  • 25. The challenge of object perception •  The stimulus on the receptors is ambiguous The inverse projection problem
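The inverse projection problem named on this slide can be made concrete with a small sketch. It assumes an idealized pinhole camera with focal length `f`; the function name `project` and the sample points are illustrative and do not come from the talk. Every 3D point along one ray through the camera center lands on the same image point, so the 2D stimulus alone cannot determine the 3D scene.

```python
def project(point, f=1.0):
    """Perspective-project a 3D point (x, y, z), z > 0, onto the image plane z = f."""
    x, y, z = point
    return (f * x / z, f * y / z)

# Three different 3D points on the same ray through the camera center
# (hypothetical values chosen for illustration).
near = (1.0, 2.0, 4.0)
mid = (2.0, 4.0, 8.0)
far = (4.0, 8.0, 16.0)

images = [project(p) for p in (near, mid, far)]

# All three projections coincide: depth is lost in the image.
print(images)
```

Going from the image point back to a unique 3D point requires extra assumptions or extra cues, which is one reason context matters later in the talk.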
  • 26. The challenge of object perception •  The stimulus on the receptors is ambiguous
  • 27. The challenge of object perception •  The stimulus on the receptors is ambiguous http://users.skynet.be/J.Beever/pave.htm
  • 28. The challenge of object perception •  Objects can be hidden or blurred Can you find… - the pencil? - the glasses?
  • 29. The challenge of object perception •  Objects can be hidden or blurred Who are these people?
  • 30. The challenge of object perception Objects look different from different viewpoints The ability of humans to recognize an object seen from different viewpoints is called viewpoint invariance.
  • 31. The challenge of object perception Objects look different from different viewpoints Q: Which two faces correspond to the same person? A1 (human): (a) and (c) A2 (computer): (a) and (b)
  • 32. Research question How do we recognize objects from different viewpoints? •  Structural-Description Models: propose that our ability to recognize 3D objects is based on 3D volumes (called volumetric features) that can be combined to create the overall shape of an object. •  Image-Description Models: propose that our ability to recognize objects from different viewpoints is based on stored 2D views of the object as it would appear from different viewpoints. Which Model Is Correct? The actual mechanism for object recognition probably involves elements of both the structural-description and image-description models (Palmeri & Gauthier, 2004).
  • 33. Why do we care about object recognition? Because object recognition leads to perception of function.
  • 37. So, what do we use: direct or indirect perception? “It seems exceedingly unlikely (though logically possible) that we categorize everything in our visual fields” (Palmer). Hypothesis: we categorize the objects that are relevant for a specific task that we have at hand, but we only extract affordances from the others.
  • 38. Object detection and the “Head in the coffee beans problem”
  • 39. “Head in the coffee beans problem” Can you find the head in this image?
  • 40. “Head in the coffee beans problem” Can you find the head in this image?
  • 41. So what does object recognition involve? Slide by Fei-Fei, Fergus, Torralba
  • 42. Verification: is that a lamp? Slide by Fei-Fei, Fergus, Torralba
  • 43. Detection: are there people? Slide by Fei-Fei, Fergus, Torralba
  • 44. Identification: is that Potala Palace? Slide by Fei-Fei, Fergus, Torralba
  • 45. Object categorization mountain tree building banner street lamp vendor people Slide by Fei-Fei, Fergus, Torralba
  • 46. Scene and context categorization •  outdoor •  city •  … Slide by Fei-Fei, Fergus, Torralba
  • 47. Is this space large or small? How far are the buildings in the back? Slide by Fei-Fei, Fergus, Torralba
  • 48. Activity What is this person doing? What are these two doing?? Slide by Fei-Fei, Fergus, Torralba
  • 49. Outline •  Background and motivation •  Visual perception •  Object detection and recognition •  Scene recognition and analysis •  The role of context •  Representative work •  Concluding remarks
  • 50. What is a scene? •  A scene is a view of a real-world environment that contains multiple surfaces and objects, organized in a meaningful way. –  A tour of scene understanding literature: http://cvcl.mit.edu/SUNSarticles.htm
  • 52. The “gist” of a scene •  Mary Potter (1975, 1976) demonstrated that during a rapid sequential visual presentation (100 msec per image), a novel scene picture is understood almost instantly and observers seem to comprehend a lot of visual information, but a delay of a few hundred msec (~ 300 msec) is required for the picture to be consolidated in memory. •  The “gist” (a summary) refers to the visual information perceived after/during a glance at an image. •  To simplify, the gist is often synonymous with the basic-level category of the scene or event (e.g. wedding, bathroom, beach, forest, street).
  • 54. What we (don’t) know about scene analysis, recognition, and classification •  Humans are very good at recognizing and classifying scenes •  We are also very fast (100 ms or less) •  We often sacrifice accuracy in the name of speed (we capture the gist but miss many details) •  How exactly do we do it?
  • 60. What is the basis for scene identification? •  Different schools of thought: – Scene-centered – Part-based (i.e., object-centered) – Holistic
  • 63. Outline •  Background and motivation •  Visual perception •  Object detection and recognition •  Scene recognition and analysis •  The role of context •  Representative work •  Concluding remarks
  • 64. Objects in context •  Objects do not exist isolated from a context •  Torralba’s challenge: “How far can you go without using an object detector?”
  • 76. Why is context important?
  • 78. What are the hidden objects?
  • 79. What are the hidden objects?
  • 83. Biederman 1982 •  Pictures shown for 150 ms. •  Objects in appropriate context were detected more accurately than objects in an inappropriate context. •  Scene consistency affects object detection.
  • 84. Objects and Scenes Biederman’s violations (1981):
  • 87. Size
  • 89. Biederman’s classes in Computer Vision Galleguillos & Belongie, Tech Report (2008) •  Interposition and support can be coded by reference to physical space. •  Probability, position and size are defined as semantic relations because they require access to the referential meaning of the object. •  Semantic relations include information about detailed interactions among objects in the scene and they are often used as contextual features.
  • 90. Dreaming of an ideal computer vision solution… Galleguillos & Belongie, Tech Report (2008)
  • 91. Types of context Galleguillos & Belongie, Tech Report (2008) •  Contextual features can be grouped into 3 categories: –  semantic context (probability) –  spatial context (position) –  scale context (size). •  Contextual knowledge can be any information that is not directly produced by the appearance of an object. •  It can be obtained from: –  the nearby image data; –  image tags or annotations; –  the presence and location of other objects.
  • 92. Acquiring and modeling context Galleguillos & Belongie, Tech Report (2008) •  Which of the three should one use? –  Spatial and scale context are the types of context most exploited by recognition frameworks. –  Generally, semantic context is implicitly present in spatial context, as information about object co-occurrences comes from identifying objects for the spatial relations in the scene. –  The same happens with scale context, as scale is measured with respect to other objects. –  Therefore, using spatial and scale context involves using all forms of contextual information in the scene.
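As a rough illustration of semantic context (object co-occurrence), the sketch below estimates how likely one object label is, given that another label appears in the same scene. The toy annotations and the helper name `cooccurrence_prob` are made up for illustration; they are not taken from Galleguillos & Belongie.

```python
from collections import Counter
from itertools import permutations

# Hypothetical scene annotations: each scene is the set of object labels it contains.
scenes = [
    {"car", "road", "building"},
    {"car", "road", "tree"},
    {"sofa", "lamp", "table"},
    {"table", "lamp", "cup"},
]

label_counts = Counter()  # number of scenes containing each label
pair_counts = Counter()   # number of scenes containing each ordered pair (a, b), a != b
for labels in scenes:
    label_counts.update(labels)
    pair_counts.update(permutations(sorted(labels), 2))

def cooccurrence_prob(target, given):
    """Estimate P(target in scene | given in scene) from the annotation counts."""
    if label_counts[given] == 0:
        return 0.0
    return pair_counts[(target, given)] / label_counts[given]

print(cooccurrence_prob("road", "car"))  # road appears in every toy scene that has a car
print(cooccurrence_prob("lamp", "car"))  # lamp never appears with a car in the toy data
```

A recognition framework could use such estimates to down-weight detections that are improbable given the other objects found, which corresponds to the "probability" relation in Biederman's classes.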
  • 93. Outline •  Background and motivation •  Visual perception •  Object detection and recognition •  Scene recognition and analysis •  The role of context •  Representative work •  Concluding remarks
  • 94. Representative work •  There are many research groups working on the intersection of human and computer vision in numerous topics, including “objects in context”. •  A prominent example: work by Aude Oliva and Antonio Torralba (and collaborators) at MIT.
  • 95. Representative work •  A case study: –  L.W. Renninger and J. Malik (2004). When is scene recognition just texture recognition? Vision Research, 44, 2301-2311.
  • 96. Renninger and Malik •  Basic idea –  Consider texture as an early cue for scene perception. •  It’s simple •  It’s fast (pre-attentive) (Julesz, 1981)
  • 97. Renninger and Malik •  Approach –  How well do humans discriminate scenes with very limited exposure? –  Build a texture-based model for scene discrimination. –  Compare performance!
  • 100. Renninger and Malik •  Task –  2AFC –  Subjects are shown an image •  Image exposure time: 37, 50 and 69ms –  Image followed by a jumbled scene mask –  The task is to select one of two word choices that best describes the image –  Subject performance: 77%, 82% and 92% correct •  Get ready…
  • 106. Texture Discrimination Model –  Cluster response distributions from V1-like filters to get prototypical responses (textons) –  Remember what types of textons occur in particular scenes (build histogram) –  Label new image using a nearest neighbor classifier •  Compare texton histogram for new image to stored representations (χ² distance) (Malik and Perona, 1990) (Malik et al., 1999)
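The last two steps of the model (texton histograms and nearest-neighbor labeling under a χ² distance) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Renninger and Malik's implementation: the filter bank and the clustering step are skipped, so the filter responses and the textons are supplied as toy 2-D vectors, and all function names are hypothetical. The χ² form with the factor 1/2 is one common convention.

```python
def nearest_texton(response, textons):
    """Index of the texton (prototype filter response) closest to `response`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(textons)), key=lambda k: sq_dist(response, textons[k]))

def texton_histogram(responses, textons):
    """Normalized histogram of texton occurrences (assumes at least one response)."""
    counts = [0.0] * len(textons)
    for r in responses:
        counts[nearest_texton(r, textons)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def chi2_distance(h, g):
    """Chi-square distance between two histograms, skipping bins empty in both."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h, g) if a + b > 0)

def classify(hist, labeled_hists):
    """Label of the stored histogram nearest to `hist` under chi-square distance."""
    return min(labeled_hists, key=lambda item: chi2_distance(hist, item[1]))[0]

# Toy data (hypothetical): two textons and a few per-pixel filter responses.
textons = [(0.0, 0.0), (1.0, 1.0)]
stored = [
    ("forest", texton_histogram([(0.1, 0.0), (0.0, 0.2), (0.9, 1.0)], textons)),
    ("kitchen", texton_histogram([(1.0, 0.9), (1.1, 1.0), (0.1, 0.1)], textons)),
]
new_image = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.1), (1.0, 1.0)]

print(classify(texton_histogram(new_image, textons), stored))
```

In a fuller version, the textons would come from clustering (for example k-means) the responses of a bank of oriented filters over the training images.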
  • 110. Confusion matrix (values in %; columns in the same order as the rows: Natural, Outdoor/MM, Indoor) –  Natural: 50.56, 33.26, 16.19 –  Outdoor/MM: 23.14, 46.54, 30.33 –  Indoor: 8.12, 18.69, 73.18
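One way to read this matrix, assuming each row is one category and the columns follow the same order (Natural, Outdoor/MM, Indoor): the entries in each row sum to about 100, so the diagonal gives the percent correct per category. The sketch below uses only the values on the slide; which axis holds the true category is an assumption, since the slide does not say.

```python
labels = ["Natural", "Outdoor/MM", "Indoor"]
confusion = [
    [50.56, 33.26, 16.19],
    [23.14, 46.54, 30.33],
    [8.12, 18.69, 73.18],
]

# Each row should sum to roughly 100 if the entries are row-normalized percentages.
row_sums = [sum(row) for row in confusion]

# Diagonal entries: percent of each category assigned to itself.
per_class = {labels[i]: confusion[i][i] for i in range(len(labels))}

# Unweighted mean over the three categories.
mean_correct = sum(per_class.values()) / len(per_class)

print(row_sums)
print(per_class)
print(round(mean_correct, 2))
```

Under this reading, Indoor is the best-separated category, and most confusions fall between neighboring categories (Natural with Outdoor/MM, Outdoor/MM with Indoor).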
  • 113. Renninger and Malik •  Conclusion –  Early scene identification can be mostly explained by a simple texture model
  • 114. Outline •  Background and motivation •  Visual perception •  Object detection and recognition •  Scene recognition and analysis •  The role of context •  Representative work •  Concluding remarks
  • 115. Our experience •  Working with Dept of Psychology @ FAU –  Two joint graduate-level courses –  Joint student supervision –  Joint grant proposals –  Joint papers (in preparation) –  Constant discussions –  Promising days ahead… •  Imaging Science & Technology Center •  Multidisciplinary Vision Program
  • 116. Our focus •  To establish quantitative measures of the importance of context – Method: present subjects with degraded (blocky, blurry, etc.) objects against a context and ask them to recognize the object as it becomes progressively more visible. – Human vision: behavioral experiments – Computer vision: stimuli creation
  • 117. Concluding remarks •  Great potential •  Cultural barriers •  Open problems and challenges on both sides •  The time is ripe for interdisciplinary research on vision, particularly “objects in context”
  • 118. Acknowledgments •  Thanks to Prof. Elan Barenholtz (Dept of Psychology, FAU) for allowing me to use some of his slides and for the many interesting discussions on the topics presented in this talk. •  Many slides for this talk contain material made publicly available on the Web by Antonio Torralba and Aude Oliva (MIT) and Fei-Fei Li (UIUC).
  • 119. Thank you for attending my talk! Questions? Email: omarques@fau.edu