The document provides an overview of the human visual system and digital cameras. It discusses the key components of the human eye, including the cornea, sclera, choroid, lens, and retina. It also describes how images are formed on the retina through the lens and light receptors. For digital cameras, the document outlines the basic components and image formation process, including the aperture, optical system, and imaging sensor. It also provides equations to convert between camera and image plane coordinates.
4. The Human Eye
• Diameter: approximately 20 mm
• 3 membranes enclose the eye
• Cornea & sclera
• Choroid
• Lens
• Retina
• The human eye is a camera:
photoreceptor cells (rods and
cones) in the retina have the
same function as
film/CCD/CMOS sensor in a
camera
5. The Human Eye
The Choroid
• The choroid contains blood
vessels for eye nutrition and is
heavily pigmented to reduce
extraneous light entrance and
backscatter.
• It is divided into the ciliary body
and the iris diaphragm, which
controls the amount of light that
enters the pupil (diameter varies from 2 mm to 8 mm).
6. The Human Eye
The Lens
• The lens is made up of fibrous
cells and is suspended by fibers
that attach it to the ciliary body.
• It is slightly yellow and absorbs
approx. 8% of the visible light
spectrum
7. The Human Eye
The Retina
• The retina lines the entire posterior portion of the eye's interior wall.
• Discrete light receptors are
distributed over the surface of
the retina:
– cones (6-7 million per eye) and
– rods (75-150 million per eye)
8. Light Receptors
• Cones
– Cones are located in the fovea and
are sensitive to color
– Each one is connected to its own
nerve end
– Cone vision is called photopic (or
bright-light vision)
• Rods
– Rods give a general, overall picture of the field of view and are not involved in color vision
– Several rods are connected to a
single nerve and are sensitive to
low levels of illumination (scotopic
or dim-light vision)
9. Light Receptors Distribution
• The distribution of receptors is
radially symmetric about the
fovea.
• Cones are most dense in the
center of the fovea while rods
increase in density from the
center out to approximately 20° off axis and then decrease.
10. The Fovea
• The fovea is circular (1.5 mm in
diameter) but can be assumed to
be a square sensor array (1.5 mm
x 1.5 mm).
• The density of cones: 150,000 elements/mm², i.e. about 337,000 elements over the fovea.
• A CCD imaging chip of medium resolution needs a 5 mm × 5 mm area for this number of elements
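The element count quoted above follows directly from the density times the fovea's area; a quick arithmetic check in Python, using the slide's own figures:

```python
# Fovea approximated as a 1.5 mm x 1.5 mm square sensor array
cone_density = 150_000          # cones per mm^2
fovea_side = 1.5                # mm

cone_count = cone_density * fovea_side ** 2
print(cone_count)               # 337500.0 -> the ~337,000 elements quoted above
```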
11. Image Formation in the Eye
• The eye lens (if compared to an
optical lens) is flexible
• It is controlled by the fibers of the ciliary body; to focus on distant objects it becomes flatter (and vice versa)
• Distance between the center of the
lens and the retina (focal length):
– varies from about 17 mm to 14 mm as the refractive power of the lens goes from its minimum to its maximum
– objects farther than 3 m are focused with the minimum refractive power (and vice versa)
12. Image Formation in the Eye
• Perception takes place by the relative excitation of light receptors
• These receptors transform radiant energy into electrical impulses that
are ultimately decoded by the brain
• Example: calculation of retinal image of an object
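As a sketch of such a calculation (the object height and distance below are assumed purely for illustration), similar triangles give h/f = H/Z:

```python
# Retinal image height by similar triangles: h / f = H / Z
f = 17.0    # mm, lens-to-retina distance when viewing a distant object (> 3 m)
H = 15.0    # m, object height (assumed for this example)
Z = 100.0   # m, object distance (assumed for this example)

h = f * H / Z   # retinal image height, in mm
print(h)        # 2.55 (mm)
```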
13. Brightness Adaptation &
Discrimination
• Range of light intensity levels
to which HVS (human visual
system) can adapt: on the
order of 10^10
• Subjective brightness (i.e.
intensity as perceived by the
HVS) is a logarithmic function
of the light intensity incident
on the eye
14. Brightness Adaptation &
Discrimination
• The HVS cannot operate over
such a range simultaneously
• If one adapts to a bright intensity Ba outdoors and then walks into a dark theater, he/she can at first only distinguish intensities down to Bb; it takes much longer for the eye to adapt and for scotopic vision to take over
• For any given set of conditions,
the current sensitivity level of
HVS is called the brightness
adaptation level
• The typical observer can discern one to two dozen different intensity changes
15. Brightness Adaptation &
Discrimination
• Eye response to the signal
intensity is not determined by
the nominal change in physical
stimulus (light energy), but rather by
its change relative to its initial
level
• In general, there is a minimum
required change in signal
intensity needed to produce
change in sensation, and the
latter is not necessarily
proportional to the former
• Eye brightness response is not proportional to the light's nominal (physical) intensity change, but to that change relative to the current intensity level
• Weber measured this threshold ratio, ΔIc/I (the Weber ratio, where ΔIc is the just-noticeable increment over background intensity I), and found it to be about 1/64 (around 1.5%)
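A tiny sketch of what the Weber ratio implies: the just-noticeable intensity change scales with the current adaptation level (using the 1/64 figure quoted above):

```python
WEBER_RATIO = 1 / 64   # ~1.5%, the threshold ratio quoted above

def just_noticeable_change(intensity):
    """Smallest intensity increment detectable at this adaptation level."""
    return WEBER_RATIO * intensity

print(just_noticeable_change(100.0))    # 1.5625
print(just_noticeable_change(1000.0))   # 15.625 -- ten times the level, ten times the threshold
```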
16. Brightness Adaptation &
Discrimination
• Overall intensity discrimination is broad because a different set of incremental changes can be detected at each new adaptation level
• Perceived brightness is not a
simple function of intensity
– Scalloped effect, Mach band
pattern
– Simultaneous contrast
20. First Photograph
• Oldest surviving photograph by Joseph Niepce, 1826
• It took an 8-hour exposure on a pewter plate
• It is stored at UT Austin
21. Image Formation
• Images are typically generated by illuminating a scene and absorbing
the energy reflected by the objects in that scene
22. Imaging Device
• Basic elements of an imaging device:
– Aperture: an opening (or “pupil”) that limits the amount of light and the angle of incoming light rays
– Optical system (lenses): focuses light from a scene point to a single image point
– Imaging photosensitive surface: film or sensors, usually a plane
23. Pinhole Camera
• Light enters a darkened chamber through a pinhole opening and forms an inverted image on the opposite surface
f = focal length, c = center of the camera
27. Camera vs Image Plane
Coordinates
Camera coordinate system {C}
• A 3D coordinate system (X,Y,Z) – units
say, in meters
• Origin at the center of projection
• Z axis points outward along optical axis
• X points right, Y points down
Image plane coordinate system {π}
• A 2D coordinate system (x,y) – units in
mm
• Origin at the intersection of the optical
axis with the image plane
• In real systems, this is where the CCD or
CMOS plane is
28. Examples
• Assume focal length = 5 mm
• A scene point is located at (X,Y,Z) = (1m, 2m, 5m)
• What are the image plane coordinates (x,y) in mm?
x = f X/Z = (5 mm) (1 m)/(5 m) = 1 mm
y = f Y/Z = (5 mm) (2 m)/(5 m) = 2 mm
• If the image plane is 10mm x 10mm, what is the field of view?
tan(θ/2) = (w/2)/f = 5/5 = 1
θ/2 = 45 deg, fov is 90x90 deg
• A building is 100m wide. How far away do we have to be in order that
it fills the field of view?
tan(θ/2) = (W/2)/Z= 50/Z
so Z = 50 m
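The worked examples above can be reproduced with a small pinhole-projection sketch (function names are illustrative):

```python
import math

def project(f_mm, X, Y, Z):
    """Pinhole projection: scene point (X, Y, Z) in meters -> image plane (x, y) in mm."""
    return f_mm * X / Z, f_mm * Y / Z

def field_of_view_deg(sensor_width_mm, f_mm):
    """Full angular field of view: theta = 2 * atan((w/2) / f)."""
    return 2 * math.degrees(math.atan((sensor_width_mm / 2) / f_mm))

x, y = project(5.0, 1.0, 2.0, 5.0)
print(x, y)                           # 1.0 2.0 (mm), as in the example
print(field_of_view_deg(10.0, 5.0))   # ~90 degrees

# Distance at which a 100 m wide building fills a 90-degree field of view
Z = (100.0 / 2) / math.tan(math.radians(45.0))
print(Z)                              # ~50 m
```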
29. Image Buffer
Image plane
• The real image is formed on the
CCD plane
• (x,y) units in mm
• Origin in center (principal point)
Image buffer
• Digital (or pixel) image
• (row,col) indices
• We can also use (xim,yim)
• Origin in upper left
30. Conversion between real image
and pixel image coordinates
Assume
• The image center (principal
point) is located at pixel (cx,cy) in
the pixel image
• The spacing of the pixels is (sx,sy)
in millimeters
Then
• x=(xim –cx)sx
• xim = x/sx + cx
• y=(yim –cy)sy
• yim = y/sy +cy
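These conversions can be written as a pair of helper functions (the principal point and pixel-spacing values in the usage example are assumed):

```python
def mm_to_pixels(x, y, cx, cy, sx, sy):
    """Image-plane (x, y) in mm -> pixel coordinates (xim, yim)."""
    return x / sx + cx, y / sy + cy

def pixels_to_mm(xim, yim, cx, cy, sx, sy):
    """Pixel coordinates (xim, yim) -> image-plane (x, y) in mm."""
    return (xim - cx) * sx, (yim - cy) * sy

# Example: principal point at pixel (320, 240), 0.01 mm pixel spacing (assumed values)
xim, yim = mm_to_pixels(1.0, 2.0, 320, 240, 0.01, 0.01)
print(xim, yim)                                       # 420.0 440.0
print(pixels_to_mm(xim, yim, 320, 240, 0.01, 0.01))   # back to (1.0, 2.0)
```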
31. Note on Focal Length
• Recall
x=(xim –cx)sx
y=(yim –cy)sy
• or
xim = x/sx + cx
yim = y/sy + cy
• and
x = f X/Z
y = f Y/Z
• So
xim = (f/sx) X/Z + cx
yim = (f/sy) Y/Z + cy
• All we really need is
fx = (f/sx)
fy = (f/sy)
• We don’t need to know the
actual values of f and sx,sy; just
their ratios
• We can alternatively express
focal length in units of pixels
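A minimal sketch of the combined mapping, with fx = f/sx and fy = f/sy expressed in pixels (the numeric values below are assumed for illustration):

```python
def project_to_pixels(X, Y, Z, fx, fy, cx, cy):
    """Scene point (X, Y, Z) in meters -> pixel coordinates, focal lengths in pixel units."""
    return fx * X / Z + cx, fy * Y / Z + cy

# f = 5 mm and 0.01 mm pixel spacing -> fx = fy = 500 pixels (assumed values)
fx = fy = 5.0 / 0.01
print(project_to_pixels(1.0, 2.0, 5.0, fx, fy, 320, 240))   # (420.0, 440.0)
```

Note that only the ratios f/sx and f/sy appear in the code; f and the pixel spacing are never needed individually.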
32. Image Sensing
• Incoming energy lands on a
sensor material responsive to
that type of energy and this
generates a voltage
• Collections of sensors are
arranged to capture images
• Currently there are two kinds of popular image sensors: CCD (Charge-Coupled Device) and CMOS (Complementary Metal-Oxide-Semiconductor)
34. Image Sampling
and Quantization
• A digital sensor can only measure a limited number of samples at a
discrete set of energy levels
• Quantization is the process of converting a continuous analogue signal
into a digital representation of this signal
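Uniform quantization onto 256 grey levels can be sketched as follows (an illustrative mapping, not any specific sensor's behavior):

```python
def quantize(value, v_min, v_max, levels=256):
    """Map a continuous value in [v_min, v_max] to an integer level in [0, levels-1]."""
    value = min(max(value, v_min), v_max)                               # clamp to sensor range
    return int((value - v_min) / (v_max - v_min) * (levels - 1) + 0.5)  # round to nearest level

print(quantize(0.0, 0.0, 1.0))   # 0   (black)
print(quantize(0.5, 0.0, 1.0))   # 128
print(quantize(1.0, 0.0, 1.0))   # 255 (white)
```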
37. Digital Image Representation
• Digital image is composed of M
rows and N columns of pixels
each storing a value
• Pixel values are most
often grey levels in the
range 0-255 (black-white)
• Images can easily
be represented as
matrices
• A pixel value is indexed as f(row, col)
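The matrix view can be sketched with a tiny grey-level image, where f(row, col) is just a matrix lookup (sizes and values are illustrative):

```python
# A tiny M x N grey-level image stored as a nested list (matrix)
M, N = 3, 4
image = [[0] * N for _ in range(M)]   # M rows, N columns, all black (0)

image[1][2] = 255                     # set one pixel to white

def f(row, col):
    """Pixel value at (row, col), as in the slide's f(row, col) notation."""
    return image[row][col]

print(f(1, 2))   # 255
print(f(0, 0))   # 0
```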
40. Acknowledgment
Some of the slides in this PowerPoint presentation are adapted from various sources; many thanks to:
1. Professor Peggy Agouris, Department of Geography and
GeoInformation Science, George Mason University
(http://ggs.gmu.edu/People/Agouris/Agouris.html)
2. Professor William Hoff, Department of Electrical Engineering &
Computer Science (http://inside.mines.edu/~whoff/)
3. Dr. Brian Mac Namee, School of Computing at the Dublin Institute of
Technology (http://www.comp.dit.ie/bmacnamee/gaip.htm)