This slidecast takes an informal approach to image processing using the MATLAB environment.
Very little math is involved, to keep things simple, but the full essence of DIP is only felt once the math is brought in.
1. Digital Image Processing (DIP)
The Fundamentals - A MATLAB assisted non-mathematical approach
By Abhishek Sharma (EEE 2k7)
2. Materials
Some taken from slides by Varun Nagaraja, a friend from NIT Surathkal. He worked in the same lab at IISc Bangalore, and joined the University of Maryland, College Park as a PhD student a week back.
Some material taken from online works of Sabih D. Khan, a
French-Pakistani researcher.
Some prepared by me!
3. Objective
To take a quasi-non-mathematical, MATLAB-assisted approach to DIP.
I cut out the equations because…
1) I don’t want to scare you even before we start the climb
2) I am pathetic at elucidating maths concepts
3) I don’t have a lot of time.
4) You don’t need maths to start working in DIP.
If I am able to make you fall in love with DIP, I hope that, like real-world love, you will be able to accept it with its shortcomings: maths!
8. SixthSense
by Pranav Mistry, MIT Media
Labs
'SixthSense' is a wearable gestural interface that
augments the physical world around us with digital
information and lets us use natural hand gestures to
interact with that information.
10. Image Processing - a definition
Image processing generally involves the extraction of useful information from an image.
This useful information may be the dimensions of an engineering component, the size of a diagnosed tumor, or even a 3D view of an unborn baby.
11. Intro to DIP with MATLAB
Images can be conveniently represented as matrices in MATLAB.
One can open an image as a matrix using the imread command.
The matrix may be a simple m x n array, a 3-dimensional array, or an indexed matrix, depending on the image type.
Image processing may then be done simply by matrix calculation or matrix manipulation.
An image may be displayed with the imshow command.
Changes to an image may then be saved with the imwrite command.
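Since the slides stay away from actual code, here is a minimal NumPy sketch of the key idea, that an image is just a matrix you can do arithmetic on (in MATLAB you would use imread/imshow/imwrite; the tiny array below is made up rather than read from a file):

```python
import numpy as np

# A tiny 2 x 3 greyscale "image" as a matrix (MATLAB: A = imread('pic.png'))
img = np.array([[0, 128, 255],
                [64, 192, 32]], dtype=np.uint8)

# Image processing as plain matrix arithmetic: brighten by 50,
# clipping at 255 just as uint8 arithmetic saturates in MATLAB.
brighter = np.clip(img.astype(int) + 50, 0, 255).astype(np.uint8)
```

Displaying (imshow) and saving (imwrite) are then separate steps on the same matrix.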
12. Image types
Images may be of three types: black & white, grey scale, and color.
In MATLAB, however, there are four types of images.
Black & white images are called binary images, containing 1 for white and 0 for black.
Grey scale images are called intensity images, containing numbers in the range 0 to 255 or 0 to 1.
Color images may be represented as RGB images or indexed images.
13. Image types
An RGB image consists of three intensity planes.
The first plane contains the red portion of the image, the second the green, and the third the blue portion.
So for a 640 x 480 sized image the matrix will be 640 x 480 x 3.
An alternative method of color image representation is the indexed image.
It actually consists of two matrices, namely the image matrix and the map matrix.
Each color in the image is given an index number, and in the image matrix each pixel is represented by an index number.
The map matrix contains the database of which index number belongs to which color.
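The image-matrix/map-matrix split can be sketched in a few lines of NumPy (a hypothetical 2 x 2 image with a 3-color map; MATLAB's ind2rgb does the same lookup):

```python
import numpy as np

# Image matrix: each pixel stores an index number, not a color
image_matrix = np.array([[0, 1],
                         [2, 1]])

# Map matrix: one RGB row per index, values in [0, 1] as in MATLAB colormaps
map_matrix = np.array([[1.0, 0.0, 0.0],   # index 0 -> red
                       [0.0, 1.0, 0.0],   # index 1 -> green
                       [0.0, 0.0, 1.0]])  # index 2 -> blue

# The ind2rgb idea in one line: look every index up in the map
rgb = map_matrix[image_matrix]            # shape (2, 2, 3)
```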
14. MATLAB – image conversions
RGB Image to Intensity Image (rgb2gray)
RGB Image to Indexed Image (rgb2ind)
RGB Image to Binary Image (im2bw)
Indexed Image to RGB Image (ind2rgb)
Indexed Image to Intensity Image (ind2gray)
Indexed Image to Binary Image (im2bw)
Intensity Image to Indexed Image (gray2ind)
Intensity Image to Binary Image (im2bw)
Intensity Image to RGB Image (gray2ind, ind2rgb)
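As an illustration of one of these conversions, here is rgb2gray as a one-liner; the weights are the luminance weights MATLAB's rgb2gray documents, and the sample pixels are made up:

```python
import numpy as np

def rgb_to_gray(rgb):
    """Luminance-weighted average, the same weights MATLAB's rgb2gray uses."""
    return rgb[..., 0]*0.2989 + rgb[..., 1]*0.5870 + rgb[..., 2]*0.1140

# Pure red, green and blue pixels map to different grey levels:
# green contributes most to perceived brightness, blue least.
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=float)
gray = rgb_to_gray(pixels)
```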
15. Image Histograms
There are a number of ways to get statistical information about the data in an image.
The image histogram is one such way.
An image histogram is a chart that shows the distribution of intensities in an image.
Each intensity level is a point on the x-axis, and the y-axis shows the number of times that level occurs in the image.
The histogram may be viewed with the imhist command.
Sometimes all the important information in an image lies in only a small range of intensities, and it is then difficult to extract information out of that image.
To balance the brightness levels, we carry out an image processing operation termed histogram equalization. Use MATLAB's histeq command.
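A rough NumPy sketch of what histeq does, remapping intensities through the cumulative histogram (MATLAB's histeq is more elaborate, so treat this as the idea rather than the exact algorithm):

```python
import numpy as np

def equalize(img, levels=256):
    """Histogram equalization: map each level through the cumulative histogram."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / img.size                  # cumulative distribution, 0..1
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[img]                                 # apply the lookup table

# A low-contrast image squeezed into levels 100..103 spreads out after equalization
img = np.array([[100, 100, 101, 101],
                [102, 102, 103, 103]], dtype=np.uint8)
eq = equalize(img)
```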
17. Simple character recognition
code
Detect only particular
characters and numbers in an
image.
Characters are in white and of
a fixed size.
Background is black in color.
The image is in binary format.
We will explore various DIP
concepts while we do this….
Let’s start
19. Let’s have some fun… I know my
sense of humor sucks!
Now read the image ‘same color.jpg’ and display it in a window.
Once the image is displayed in the window, select Tools – Data Cursor, or use the shortcut on the toolbar.
Click on point A as shown on the image. It displays three values (RGB) since it is a color image. You can try reading pixel values for the previous image; each will be 0 or 1 since it is a binary image.
Hold Alt and click on point B. This creates something called a datatip.
Now for some fun
What are the RGB values at the two points?
20. Morphological Operations
These are image processing operations done on
binary images based on certain morphologies or
shapes.
The value of each pixel in the output is based on the
corresponding input pixel and its neighbors.
By choosing appropriately shaped neighbors one can
construct an operation that is sensitive to a certain
shape in the input image.
22. Skeletonize
It creates a skeleton of an object by removing pixels on the boundaries without allowing the object to break apart.
It is an extremely important operation in image processing, as it removes complexity from an image without losing detail.
23. Erosion and Dilation
These are the most fundamental binary morphological operations.
In dilation, if any pixel in the input pixel’s neighborhood is on, the output pixel is on; otherwise it is off.
In effect, dilation grows the area of the object; small holes in the object are removed.
In erosion, if every pixel in the input pixel’s neighborhood is on, the output pixel is on; otherwise it is off.
In effect, this shrinks the object’s area, so small isolated regions disappear.
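The any-neighbor/every-neighbor rules above can be sketched directly in NumPy with a fixed 3 x 3 neighborhood (MATLAB's imdilate/imerode take an arbitrary structuring element; this hard-codes the simplest one, with zero padding at the border):

```python
import numpy as np

def dilate(bw):
    """3x3 dilation: output pixel is on if ANY neighbour is on."""
    p = np.pad(bw, 1)
    out = np.zeros_like(bw)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            out |= p[1+dr:1+dr+bw.shape[0], 1+dc:1+dc+bw.shape[1]]
    return out

def erode(bw):
    """3x3 erosion: output pixel is on only if EVERY neighbour is on."""
    p = np.pad(bw, 1)
    out = np.ones_like(bw)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            out &= p[1+dr:1+dr+bw.shape[0], 1+dc:1+dc+bw.shape[1]]
    return out

# A single on-pixel grows to a 3x3 block under dilation...
bw = np.zeros((5, 5), dtype=np.uint8)
bw[2, 2] = 1
grown = dilate(bw)
# ...and erosion removes it entirely (a small isolated region disappears)
shrunk = erode(bw)
```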
24. Dilation….
Dilation does not necessarily mean dilation of the holes as well; the holes get contracted, as shown above.
Also try image erosion. Use MATLAB’s help.
25. Dilation…
adds pixels to the boundaries of objects in an image.
The number of pixels added to the objects depends on the size and shape of the structuring element.
The function strel(…) can be used to generate the SEs.
26. Structuring Elements
In mathematical morphology, a structuring element is a shape used to probe or interact with a given image, with the purpose of drawing conclusions about how this shape fits or misses the shapes in the image.
Check out help on strel for various
combinations
27. Continuing with the algo…
When the dilated image of the character is
subtracted from the original we get something
like…
Next we create such images for all the
characters that we want to recognize.
(For all those individual character images in
the folder)
28. Hit or miss?
The function bwhitmiss is employed to check whether a particular character is present in the given image.
bwhitmiss(BW1, SE1, SE2) performs the hit-miss operation defined by the structuring elements SE1 and SE2. The hit-miss operation preserves pixels whose neighborhoods match the shape of SE1 and don’t match the shape of SE2.
If the matrix returned by bwhitmiss contains nonzero elements, then the character is found in the image.
Also note the use of the functions isempty and nonzeros.
You can now use charrec.m to recognize a few characters in a crude way.
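A small NumPy sketch of the hit-miss idea, keeping pixels whose neighborhood matches SE1 on the foreground and SE2 on the background (the image and SEs below are made up; MATLAB's bwhitmiss handles this in general):

```python
import numpy as np

def fits(bw, se):
    """Erosion by an arbitrary SE: on where every 1 in se lands on an on-pixel."""
    h, w = se.shape
    p = np.pad(bw, ((h // 2,) * 2, (w // 2,) * 2))
    out = np.ones_like(bw)
    for r in range(h):
        for c in range(w):
            if se[r, c]:
                out &= p[r:r + bw.shape[0], c:c + bw.shape[1]]
    return out

def hitmiss(bw, se1, se2):
    """bwhitmiss(BW, SE1, SE2): neighbourhood matches se1 AND misses se2."""
    return fits(bw, se1) & fits(1 - bw, se2)

# Find pixels that are on, with the pixel directly above them off:
# only the top of the vertical stroke qualifies.
bw = np.array([[0, 0, 0],
               [0, 1, 0],
               [0, 1, 0]], dtype=np.uint8)
se1 = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.uint8)  # hit: centre on
se2 = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=np.uint8)  # miss: above off
found = hitmiss(bw, se1, se2)
```

As on the slide, a nonzero result means the pattern was found somewhere in the image.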
31. Image Segmentation
The goal of image segmentation is to cluster pixels
into salient image regions, i.e., regions corresponding
to individual surfaces, objects, or natural parts of
objects.
A segmentation could be used for object recognition,
image compression, image editing, or image database
look-up.
34. Image Segmentation - Global
Thresholding
The disadvantage appears when the objects and the background span multiple intensity levels, so no single global threshold separates them.
35. Otsu’s Method
Based on a very simple idea: Find the threshold that
minimizes the weighted within-class variance.
This turns out to be the same as maximizing the
between-class variance.
Operates directly on the gray level histogram [e.g.
256 numbers, P(i)], so it’s fast (once the histogram is
computed).
36. The weighted within-class variance is:
σ_w²(t) = q1(t) σ1²(t) + q2(t) σ2²(t)
Where the class probabilities are estimated as:
q1(t) = Σ_{i=1}^{t} P(i)    q2(t) = Σ_{i=t+1}^{I} P(i)
And the class means are given by:
μ1(t) = Σ_{i=1}^{t} i P(i) / q1(t)    μ2(t) = Σ_{i=t+1}^{I} i P(i) / q2(t)
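A direct, unoptimized NumPy implementation of these formulas, trying every threshold t and keeping the one with the smallest weighted within-class variance (MATLAB's graythresh implements the same criterion):

```python
import numpy as np

def otsu_threshold(img, levels=256):
    """Exhaustive Otsu: pick t minimizing q1(t)*var1(t) + q2(t)*var2(t)."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()                               # P(i)
    best_t, best_var = 0, np.inf
    for t in range(1, levels):
        q1, q2 = p[:t].sum(), p[t:].sum()               # class probabilities
        if q1 == 0 or q2 == 0:
            continue
        mu1 = (np.arange(t) * p[:t]).sum() / q1         # class means
        mu2 = (np.arange(t, levels) * p[t:]).sum() / q2
        var1 = (((np.arange(t) - mu1) ** 2) * p[:t]).sum() / q1
        var2 = (((np.arange(t, levels) - mu2) ** 2) * p[t:]).sum() / q2
        within = q1 * var1 + q2 * var2                  # weighted within-class variance
        if within < best_var:
            best_t, best_var = t, within
    return best_t

# Two well-separated clusters: the threshold lands between them
img = np.array([10, 12, 11, 10, 200, 202, 201, 200])
t = otsu_threshold(img)
```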
41. This was again a very crude method, since we depend only on the value of the area, which might not remain constant if the camera changes position.
Most of the time the standard features available from regionprops() are not sufficient, and we have to write our own code to extract features.
Also, we used hard thresholds on the areas to classify the connected components. Again, this is rarely done in practice; classifiers using pattern recognition techniques are employed instead.
42. Why edges?
Reduce dimensionality of data
Preserve content information
Useful in applications such as:
◦ object detection
◦ structure from motion
◦ tracking
43. Why not edges?
But, sometimes not that useful, why?
Difficulties:
1. Modeling assumptions
2. Parameters
3. Multiple sources of information
(brightness, color, texture, …)
4. Real world conditions
Is edge detection even well defined?
47. Canny difficulties
1. Modeling assumptions
Step edges, junctions, etc.
2. Parameters
Scales, threshold, etc.
3. Multiple sources of information
Only handles brightness
4. Real world conditions
Gaussian iid noise? Texture…
48. Edge Detection
Edge detection extracts the edges of objects from an image.
There are a number of algorithms for this, but they may be classified as derivative based or gradient based.
In derivative-based edge detection the algorithm takes the first or second derivative at each pixel in the image.
With the first derivative, there is a rapid change of intensity at an edge.
With the second derivative, there is a zero pixel value at the edge, termed a zero crossing.
In gradient-based edge detection a gradient over consecutive pixels is taken in the x and y directions.
50. Taking a derivative at each and every pixel of the image consumes a lot of computer resources and hence is not practical.
So usually an operation called a kernel operation is carried out.
A kernel is a small matrix that slides over the image matrix; its coefficients are multiplied with the corresponding image matrix elements and their sum is placed at the target pixel.
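The kernel operation described above can be sketched in NumPy; here the kernel is a Sobel kernel, one of the gradient-based edge detectors mentioned earlier (zero padding at the border is an assumption; implementations differ):

```python
import numpy as np

def apply_kernel(img, k):
    """Slide a 3x3 kernel over the image; the weighted sum of each
    neighbourhood lands on the target pixel (zero padding at the border)."""
    p = np.pad(img.astype(float), 1)
    out = np.zeros_like(img, dtype=float)
    for r in range(3):
        for c in range(3):
            out += k[r, c] * p[r:r + img.shape[0], c:c + img.shape[1]]
    return out

# Sobel kernel for horizontal gradients: responds strongly at vertical edges
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

# Dark left half, bright right half -> strong response near the middle columns
img = np.zeros((5, 6))
img[:, 3:] = 10
gx = apply_kernel(img, sobel_x)
```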
53. Best one is – Canny’s Method
Uses both the derivative and the gradient to perform
edge detection
The maths is a bit complex
You can go through the PDF in your folder later.
Pass ‘canny’ as a parameter to the edge function to perform Canny edge detection.