7. It is tiresome to do all this by hand!
• What if we could do it automatically?
That is, given the Taj image as input, we get the “keypoints”-marked image as the output.
8. SIFT does exactly this!
Formally,
SIFT is a method to detect distinctive, invariant image feature
points, which can be matched between images to perform tasks
such as object detection and recognition, or to compute
geometrical transformations between images.
9. SIFT does exactly this!
Remember this!!! The whole Taj example was based on this definition.
10. SIFT does exactly this!
Let’s work out the “invariant” term in the definition!
11. Understanding Invariance.
SIFT keypoints are said to be scale and orientation invariant.
That means the change in scale does not affect the keypoints detected.
In other words, the same keypoints will be generated for two images of the same object at different scales.
12. An Eiffel Tower will remain an Eiffel Tower in the two following images taken at different distances!
Similarly, if the keypoints remain the same even when a zoomed image is provided,
we say our keypoints are scale invariant!
16. Why are we doing all this?
The distinctive invariant keypoints that we have been extracting so far can be used to perform reliable matching between different views of an object or scene.
19. Step by Step… Internals of SIFT
• Scale-space extrema detection.
• Keypoint localization.
• Orientation assignment.
• Keypoint descriptor.
20. Step by Step… Internals of SIFT
1. Scale-space extrema detection.
The first stage of keypoint detection is to identify locations and scales that can be repeatably assigned under differing views of the same object.
How do we do it?
Detecting locations that are invariant to scale changes of the image can be accomplished by searching for stable features across all possible scales, using a continuous function of scale known as scale space.
21. Step by Step… Internals of SIFT
1. Scale-space extrema detection.
• Convolve the input image with Gaussian kernels of different widths.
• Subtract each blurred image from the adjacent one. These output images are usually called
DoG (Difference of Gaussians) images.
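The two bullets above can be sketched in NumPy. This is a minimal illustration, not Lowe's full octave-based pyramid: the kernel truncation and the 2^(1/3) scale step are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian kernel, truncated at about 3 sigma
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    # Separable convolution: blur along rows, then along columns
    k = gaussian_kernel(sigma)
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, blurred)
    return blurred

def dog_pyramid(img, sigmas):
    # Blur with increasing sigmas, then subtract adjacent blurred images
    blurred = [gaussian_blur(img, s) for s in sigmas]
    return [blurred[i + 1] - blurred[i] for i in range(len(blurred) - 1)]

img = np.random.rand(64, 64)
sigmas = [1.6 * 2 ** (i / 3) for i in range(5)]  # scales a factor 2^(1/3) apart
dogs = dog_pyramid(img, sigmas)
print(len(dogs), dogs[0].shape)  # 4 (64, 64) — four DoG layers
```

Five blur levels give four DoG layers; real SIFT repeats this per octave, downsampling between octaves.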
22. Step by Step… Internals of SIFT
1. Scale-space extrema detection.
Maxima and minima of the difference-of-Gaussian images are detected
by comparing a pixel (marked with X) to its 26 neighbors in 3x3 regions
at the current and adjacent scales (marked with circles).
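The 26-neighbor comparison can be sketched as follows. This is a toy NumPy version; the three-layer synthetic DoG stack at the end is just for illustration.

```python
import numpy as np

def is_extremum(dog_prev, dog_cur, dog_next, r, c):
    # Gather the 3x3x3 cube spanning the current and two adjacent scales
    cube = np.stack([d[r-1:r+2, c-1:c+2] for d in (dog_prev, dog_cur, dog_next)])
    centre = dog_cur[r, c]
    neighbours = np.delete(cube.ravel(), 13)  # drop the centre itself (index 13 of 27)
    # Extremum: strictly greater than, or strictly less than, all 26 neighbours
    return (centre > neighbours).all() or (centre < neighbours).all()

def find_extrema(dogs):
    # Scan interior pixels of each middle DoG layer
    points = []
    for s in range(1, len(dogs) - 1):
        h, w = dogs[s].shape
        for r in range(1, h - 1):
            for c in range(1, w - 1):
                if is_extremum(dogs[s-1], dogs[s], dogs[s+1], r, c):
                    points.append((r, c, s))
    return points

# Synthetic stack with a single peak at the middle scale
dogs = [np.zeros((8, 8)) for _ in range(3)]
dogs[1][4, 4] = 1.0
print(find_extrema(dogs))  # [(4, 4, 1)]
```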
24. Step by Step… Internals of SIFT
2. Keypoint Localization.
Once a keypoint candidate has been found by comparing a pixel to its
neighbors, the next step is to perform a detailed fit to the nearby data
for location, scale, and ratio of principal curvatures.
This information allows points to be rejected that have low contrast
(and are therefore sensitive to noise) or are poorly localized along an
edge.
25. Step by Step… Internals of SIFT
2. Keypoint Localization.
Essentially, we are removing some of the outliers here.
26. Step by Step… Internals of SIFT
3. Orientation Assignment.
Here we assign a consistent orientation to each keypoint based on local
image properties.
The keypoint descriptor can be represented relative to this orientation
and therefore achieve invariance to image rotation.
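A minimal NumPy sketch of orientation assignment: the 36-bin magnitude-weighted histogram follows Lowe, but peak interpolation and multiple-peak handling are omitted here.

```python
import numpy as np

def dominant_orientation(patch, bins=36):
    # Image gradients (np.gradient returns d/d-row, d/d-column)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    # 36-bin orientation histogram, weighted by gradient magnitude
    hist, _ = np.histogram(ang, bins=bins, range=(0, 360), weights=mag)
    peak = np.argmax(hist)
    return (peak + 0.5) * (360.0 / bins)  # centre of the winning bin, in degrees

# A horizontal intensity ramp: gradient points along +x, so the
# dominant orientation lands in the bin centred near 0 degrees
patch = np.tile(np.arange(8.0), (8, 1))
print(dominant_orientation(patch))  # 5.0 (centre of the 0-10 degree bin)
```

The descriptor is then computed relative to this angle, which is what gives it rotation invariance.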
28. Step by Step… Internals of SIFT
4. Keypoint Descriptor.
• A keypoint descriptor is created by first computing the gradient
magnitude and orientation at each image sample point in a region
around the keypoint location, as shown below.
• These are weighted by a Gaussian window, indicated by the overlaid
circle.
29. Step by Step… Internals of SIFT
4. Keypoint Descriptor.
These samples are then accumulated into orientation histograms summarizing the contents over 4x4
subregions, as shown on the right, with the length of each arrow corresponding to the sum of the
gradient magnitudes near that direction within the region.
8 + 8 + 8 + 8 = 32-variable vector (a 2x2 array of 8-bin orientation histograms).
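The 2x2-subregion, 8-bin descriptor above can be sketched in NumPy. This is illustrative only: real SIFT adds trilinear interpolation between bins and typically uses a 4x4 grid of subregions, giving 128 dimensions.

```python
import numpy as np

def sift_descriptor(patch, grid=2, bins=8):
    # patch: square region around the keypoint (e.g. 8x8 samples),
    # assumed already rotated to the keypoint's dominant orientation
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    # Gaussian weighting centred on the keypoint (the "overlaid circle")
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    sigma = h / 2.0
    mag = mag * np.exp(-((yy - h/2)**2 + (xx - w/2)**2) / (2 * sigma**2))
    # One 8-bin orientation histogram per subregion, concatenated
    desc, sh, sw = [], h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            a = ang[i*sh:(i+1)*sh, j*sw:(j+1)*sw]
            m = mag[i*sh:(i+1)*sh, j*sw:(j+1)*sw]
            hist, _ = np.histogram(a, bins=bins, range=(0, 360), weights=m)
            desc.append(hist)
    desc = np.concatenate(desc)
    return desc / (np.linalg.norm(desc) + 1e-12)  # normalise against illumination change

patch = np.random.rand(8, 8)
d = sift_descriptor(patch)
print(d.shape)  # (32,) — with grid=4 this would be (128,)
```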
30. Finally …
• Each keypoint now has these attributes:
• Keypoint location (x, y coordinates)
• Orientation direction
• A 32- (or 128-) variable descriptor vector associated with it
These attributes make each point distinct and identifiable from the others.
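Matching such descriptor vectors between two images is typically done with a nearest-neighbour ratio test; a minimal NumPy sketch follows (the 0.8 distance ratio is from Lowe's paper, and the random descriptors are synthetic stand-ins).

```python
import numpy as np

def match_ratio_test(desc_a, desc_b, ratio=0.8):
    # For each descriptor in desc_a, find its two nearest neighbours in
    # desc_b; accept the match only if the closest is clearly better
    # than the second closest (Lowe's ratio test rejects ambiguous matches)
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        first, second = np.argsort(dists)[:2]
        if dists[first] < ratio * dists[second]:
            matches.append((i, int(first)))
    return matches

rng = np.random.default_rng(0)
desc_b = rng.random((5, 32))          # five synthetic 32-variable descriptors
desc_a = desc_b[[2]] + 0.001          # a slightly perturbed copy of descriptor 2
print(match_ratio_test(desc_a, desc_b))  # [(0, 2)]
```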
31. Scene matching is quite old …
There is some pretty amazing work happening today where these keypoints are used.
One such application is digital makeup.
We identify the keypoints and mark them. Taking them as reference points, we project light in a coherent manner that makes it look like makeup.
This saves huge costs in movie production!!
33. References
• Lowe, David G. (1999). "Object Recognition from Local Scale-Invariant Features". Proceedings of the International Conference on Computer Vision, vol. 2, pp. 1150–1157. doi:10.1109/ICCV.1999.790410.
• Lowe, David G. "Method and Apparatus for Identifying Scale Invariant Features in an Image and Use of Same for Locating an Object in an Image". U.S. patent for the SIFT algorithm, March 23, 2004.
• Beis, J., and Lowe, D. G. (1997). "Shape Indexing Using Approximate Nearest-Neighbour Search in High-Dimensional Spaces". Conference on Computer Vision and Pattern Recognition, Puerto Rico, pp. 1000–1006.