The document summarizes a talk on visual perception and optical illusions. It discusses how vision researchers study human vision through experiments, brain imaging, and studies of patients with brain or eye issues. It gives examples of common optical illusions and ways our perception can be fooled, such as ambiguous images, brightness and contrast effects, and the influence of prior knowledge on interpretation. It also covers applications of this research, including computational models of visual attention, image retrieval, and a game the speaker developed to detect objects using semantic and location clues. It closes by asking whether, given the limitations of human perception, we can truly trust what we see.
Can you trust what you see? The magic of visual perception
1. Can you trust what you see?
The magic of visual perception
Oge Marques, PhD
Professor
College of Engineering and Computer Science
Florida Atlantic University – Boca Raton, FL (USA)
2. The Distinguished Speakers Program
is made possible by
For additional information, please visit http://dsp.acm.org/
3. About ACM
ACM, the Association for Computing Machinery, is the world’s largest
educational and scientific computing society, uniting educators, researchers and
professionals to inspire dialogue, share resources and address the field’s
challenges.
ACM strengthens the computing profession’s collective voice through strong
leadership, promotion of the highest standards, and recognition of technical
excellence.
ACM supports the professional growth of its members by providing
opportunities for life-long learning, career development, and professional
networking.
With over 100,000 members from over 100 countries, ACM works to advance
computing as a science and a profession. www.acm.org
4. A man enters a room…
Source: https://www.youtube.com/watch?v=zNbF006Y5x4
6. My background
• Oge Marques, PhD
– Professor of Engineering and Computer Science at
FAU
– Research focus: Intelligent processing of visual
information (blend of image processing, computer
vision, human vision, artificial intelligence and
machine learning).
– 10 years ago, I decided to study human vision and
actively interact with researchers in the field.
– Here are some of the things I’ve learned along the
way…
Facebook: https://www.facebook.com/ProfessorOgeMarques
7. Goals of this talk
• To explore together several visual
perception phenomena that challenge
our common assumptions about how well we
make decisions based on the information that
arrives at our brain through our eyes.
• To examine possible applications of
human vision knowledge to the
solution of computer vision
research questions.
8. Visual illusions
• Serious vision research
– “Errors of perception
(phenomena of illusions) can
be due to knowledge being
inappropriate or being
misapplied. So illusions are
important for investigating
cognitive processes of
vision.”
(Richard Gregory)
• Fun (party tricks)
– “Tricks work only because
magicians know, at an
intuitive level, how we look at
the world. […] Magicians
were taking advantage of
these cognitive illusions long
before any scientist identified
them.” (Stephen Macknik and
Susana Martinez-Conde)
9. Speaking of fun tricks…
Source: https://www.youtube.com/watch?v=r6h02WuxmVY
23. Things we DO know
• Vision for ACTION vs. vision for RECOGNITION
What?
Where?
24. Example of what we DON’T know
(yet)
• The moon seems
larger when it is
near the horizon
than when it is high
in the sky. Why?
• It fools the human
brain, but cannot
be captured in a
photo.
• Many competing
theories, no
consensus.
Source: https://freethoughtblogs.com/singham/files/2014/02/moonrise-timelapse-over-la.jpg
The moon illusion
25. How scientists learn about
human vision
• Patients with brain damage or eye conditions
• Direct access to the brain
– Single-cell recording
– Modern brain imaging and activity recording
devices
• Controlled experiments
– Calibrated monitors and rooms
– Eye-tracking devices
– Psychophysics
26. Can you trust your brain?
• “Our brains are brilliant instruments, able to reason,
synthesize, remember and imagine at an extraordinary
pitch and rate. We trust them immediately and
innately – and have reasons to be deeply proud of
them too.
• However, these brains […] are also very subtly and
dangerously flawed machines, flawed in ways that
typically don’t announce themselves to us and
therefore give us few clues as to how on guard we
should be about our mental processes.”
(Alain de Botton, “The faulty walnut”)
Source: http://www.thebookoflife.org/the-faulty-walnut/
30. Sometimes we must make a ‘best guess’
Source: http://www.slideshare.net/mrg3515/optical-illusions-8167051/3
31. Sometimes we even combine
two or more illusions
Source: Goldstein (2002)
32. Sometimes we have trouble with
(relative) brightness and contrast
Source: Wikimedia Commons
33. Sometimes we have trouble with
(relative) brightness and contrast
Source: Wikimedia Commons
34. Sometimes we have trouble with
color (constancy)
Source: http://www.lottolab.org/
35. “The dress”
• On Feb 26, 2015 this
dress “broke the
Internet”
– #whiteandgold
or
– #blackandblue?
36. “The dress”: a simplified
explanation
• Most of the time, our visual system does a
remarkable job of inferring the ambient
lighting conditions at any given time and
discounting their contribution to color
computations.
• But in this image, the cues to the lighting
conditions are particularly ambiguous.
• Is the light illuminating the dress bright
and yellowish or is it dim and blueish? Your
brain has to make a guess.
Source: http://web.mit.edu/bcs/nklab/what_color_is_the_dress.shtml
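The "guess" about the lighting can be made concrete with a von Kries-style correction, in which the estimated surface color is the observed pixel divided, channel by channel, by the assumed illuminant. The sketch below is only an illustration of that idea: the pixel value and the two illuminant colors are made-up numbers, not measurements from the dress photo, and the function name is hypothetical.

```python
def discount_illuminant(pixel, illuminant):
    """Estimate surface color by dividing each RGB channel of the observed
    pixel by the assumed illuminant (von Kries-style scaling).
    All values are in the 0..1 range; results are clipped to 1.0."""
    return tuple(min(1.0, p / i) for p, i in zip(pixel, illuminant))


# Hypothetical observed pixel: a light, slightly bluish gray.
observed = (0.55, 0.60, 0.75)

# Two hypothetical guesses about the light falling on the dress.
bluish_dim_light = (0.60, 0.65, 0.85)        # assumption: dim, bluish light
yellowish_bright_light = (1.00, 0.95, 0.70)  # assumption: bright, yellowish light

as_if_bluish_light = discount_illuminant(observed, bluish_dim_light)
as_if_yellowish_light = discount_illuminant(observed, yellowish_bright_light)

print("assuming bluish light:   ", [round(c, 2) for c in as_if_bluish_light])
print("assuming yellowish light:", [round(c, 2) for c in as_if_yellowish_light])
```

With these made-up numbers, the same pixel comes out close to neutral (whitish) when a bluish light is discounted, and clearly blue when a yellowish light is discounted, which mirrors the two readings of the dress.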
37. “The dress” meets the color
cube
• An experiment by Rosa Lafer-Sousa (Kanwisher Lab, MIT)
combined the dress with Beau Lotto’s color cube.
Here are the results:
Source: http://web.mit.edu/bcs/nklab/what_color_is_the_dress.shtml
38. “The dress”
• But what color is it?
• Think the controversy is over? Think again!
55. Sometimes we know that what we’re
seeing is not what is there…
… but we still can’t help it.
Source: Gregory (2006)
56. Applications to multimedia research
• Computational modeling of visual attention
– Image retrieval
– Object detection
• Face recognition
– Game: Guess That Face
58. We can only pay attention to
part of the visual scene
Which
part?
Source: Yarbus (1967)
59. We can only pay attention to
part of the visual scene
• Contemporary computer models
Source: http://www.saliencytoolbox.net/
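To give a feel for what a bottom-up attention model computes, here is a toy sketch, not the algorithm of the toolbox cited above. It assumes a single intensity channel and a 3x3 neighborhood, and scores each pixel by how much it differs from its surroundings (a crude center-surround contrast); published models typically combine several feature channels at multiple scales. All function names are hypothetical.

```python
def center_surround_saliency(image):
    """Return a map where each value is the absolute difference between a
    pixel's intensity and the mean intensity of its (up to 8) neighbors."""
    rows, cols = len(image), len(image[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            # A 1x1 image has no neighbors; treat its contrast as zero.
            surround = sum(neighbors) / len(neighbors) if neighbors else image[r][c]
            saliency[r][c] = abs(image[r][c] - surround)
    return saliency


def most_salient_location(saliency):
    """Winner-take-all: return (row, col) of the largest saliency value."""
    positions = (
        (r, c) for r in range(len(saliency)) for c in range(len(saliency[0]))
    )
    return max(positions, key=lambda rc: saliency[rc[0]][rc[1]])


# Toy 5x5 grayscale image: uniform dark background with one bright pixel.
image = [[0.1] * 5 for _ in range(5)]
image[1][3] = 0.9

saliency_map = center_surround_saliency(image)
print(most_salient_location(saliency_map))
```

The bright pixel wins because it differs most from its neighborhood, which is the intuition behind "pop-out" in bottom-up attention.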
60. Our work
Visual Attention + Image Retrieval
Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2007, Article ID 43450, 17 pages
doi:10.1155/2007/43450
Research Article
An Attention-Driven Model for Grouping Similar
Images with Image Retrieval Applications
Oge Marques,1 Liam M. Mayron,1 Gustavo B. Borba,2 and Humberto R. Gamba2
1 Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431-0991, USA
2 Programa de Pós-Graduação em Engenharia Elétrica e Informática Industrial, Universidade Tecnológica Federal do Paraná (UTFPR),
Curitiba, Paraná 80230-901, Brazil
Received 1 December 2005; Revised 3 August 2006; Accepted 26 August 2006
Recommended by Gloria Menegaz
Recent work in the computational modeling of visual attention has demonstrated that a purely bottom-up approach to identifying
salient regions within an image can be successfully applied to diverse and practical problems from target recognition to the
placement of advertisement. This paper proposes an application of a combination of computational models of visual attention to
the image retrieval problem. We demonstrate that certain shortcomings of existing content-based image retrieval solutions can
be addressed by implementing a biologically motivated, unsupervised way of grouping together images whose salient regions of
interest (ROIs) are perceptually similar regardless of the visual contents of other (less relevant) parts of the image. We propose a
model in which only the salient regions of an image are encoded as ROIs whose features are then compared against previously seen
ROIs and assigned cluster membership accordingly. Experimental results show that the proposed approach works well for several
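The idea in the abstract above, comparing each new ROI's features against previously seen ROIs and assigning cluster membership, can be sketched with a generic incremental nearest-centroid scheme. This is an assumption made for illustration: the excerpt does not say which features or clustering method the paper uses, and the 2-D descriptors, the distance threshold, and the function name below are all hypothetical.

```python
import math


def assign_to_cluster(features, clusters, max_distance):
    """Assign a feature vector to the nearest existing cluster if that
    cluster's centroid is within max_distance; otherwise start a new one.
    Each cluster is a dict with a 'centroid' and a list of 'members'.
    Returns the index of the cluster the vector was assigned to."""
    best_index, best_distance = None, float("inf")
    for index, cluster in enumerate(clusters):
        distance = math.dist(features, cluster["centroid"])
        if distance < best_distance:
            best_index, best_distance = index, distance

    if best_index is None or best_distance > max_distance:
        clusters.append({"centroid": list(features), "members": [list(features)]})
        return len(clusters) - 1

    cluster = clusters[best_index]
    cluster["members"].append(list(features))
    # Recompute the centroid as the mean of all members.
    count = len(cluster["members"])
    cluster["centroid"] = [
        sum(member[d] for member in cluster["members"]) / count
        for d in range(len(features))
    ]
    return best_index


# Hypothetical 2-D ROI descriptors, one per salient region.
roi_features = [(0.10, 0.20), (0.12, 0.18), (0.90, 0.85), (0.11, 0.22), (0.88, 0.80)]

clusters = []
labels = [assign_to_cluster(f, clusters, max_distance=0.3) for f in roi_features]
print(labels)
```

Because no cluster count is fixed in advance, the grouping stays unsupervised: a region that resembles nothing seen so far simply opens a new cluster.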
61. Our work
Visual attention + object detection (using a game)
Ask’nSeek: a new game for object detection and labeling
Axel Carlier1, Oge Marques2, and Vincent Charvillat1
1 IRIT-ENSEEIHT, University of Toulouse, France
{Axel.Carlier, Vincent.Charvillat}@enseeiht.fr
2 Florida Atlantic University, USA omarques@fau.edu
Abstract. This paper proposes a novel approach to detect and label objects within
images and describes a two-player web-based guessing game – Ask’nSeek – that
supports these tasks in a fun and interactive way. Ask’nSeek asks users to guess
the location of a hidden region within an image with the help of semantic and
topological clues. The information collected from game logs is combined with
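As a rough illustration of how topological clues can narrow down where a hidden region lies, the sketch below intersects simple directional constraints. The clue vocabulary ("above", "below", "left of", "right of"), the data format, and the function name are assumptions made for this example; the excerpt does not specify how the game represents clues, and semantic clues (object labels) are left out.

```python
def narrow_search_region(image_size, clues):
    """Shrink the rectangle that may contain the hidden location.

    Each clue is (relation, (x, y)), where (x, y) is a reference point and
    relation is 'above', 'below', 'left of' or 'right of'.
    Coordinates follow image convention: y grows downward.
    Returns (x_min, y_min, x_max, y_max)."""
    width, height = image_size
    x_min, y_min, x_max, y_max = 0, 0, width, height
    for relation, (x, y) in clues:
        if relation == "above":
            y_max = min(y_max, y)
        elif relation == "below":
            y_min = max(y_min, y)
        elif relation == "left of":
            x_max = min(x_max, x)
        elif relation == "right of":
            x_min = max(x_min, x)
        else:
            raise ValueError(f"unknown relation: {relation}")
    return x_min, y_min, x_max, y_max


# Hypothetical clues collected during one round on a 640x480 image.
clues = [("below", (320, 100)), ("left of", (400, 300)), ("above", (150, 350))]
region = narrow_search_region((640, 480), clues)
print(region)
```

Each clue removes part of the image from consideration, so a handful of them can leave a fairly tight candidate rectangle.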
62. Face Recognition
We seem to be particularly good at recognizing
famous / familiar faces even when they’re blurry
Unlike current machine-based systems, human observers are able to handle significant degradations in face images. For instance
ts are able to recognize more than half of all familiar faces shown to them at the resolution depicted here. Individuals shown in
are: Michael Jordan, Woody Allen, Goldie Hawn, Bill Clinton, Tom Hanks, Saddam Hussein, Elvis Presley, Jay Leno,
Hoffman, Prince Charles, Cher, and Richard Nixon.
Sinha et al.: Face Recognition by Humans: Nineteen Results Researchers Should Know About