Creation and Optimization of a Logo Recognition System
Haozhi Qi, Owen Richfield, Xiaohui Zeng, Michael Zhao
Academic Mentor: Dr. Albert Ku
Industrial Mentor: Mr. Sun Lin
August 6, 2015
Qi, Richfield, Zeng, Zhao
RIPS-HK: Lenovo
Problem Description
Problem: What if there were an app that could provide a smartphone user with information about a company just by recognizing that company’s logo in an image?
Goal: Create this app.
Outline
Model Introduction
Bag of Features Model
Convolutional Neural Network
Model Testing and Results
Application Demonstration
Conclusions and Future Work
Bag of Features Model
Feature Extraction and Description: SURF
Interest point detection
Rotation- and scale-invariant features
Interest point description
A good representation of the image
SURF: Interest Point Detection
Use the determinant of the Hessian to detect blob-like structures
Use box filters to approximate the second-order derivatives of the Gaussian filter
Take advantage of the integral image
Apply scale-space analysis to choose the appropriate scale for each interest point
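The integral-image trick behind the box filters can be sketched in a few lines of NumPy (a minimal illustration of the idea, not SURF itself): after one precomputation pass, the sum of any rectangular region costs only four lookups.

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of img[:y+1, :x+1] (cumulative sum over rows, then columns)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] in O(1), independent of box size."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
print(box_sum(ii, 1, 1, 2, 2))  # 5 + 6 + 9 + 10 = 30.0
```

This constant-time box sum is what makes evaluating box filters at every position and scale cheap.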
SURF: Interest Point Description
Calculate the dominant orientation based on Haar wavelet responses
Build the descriptor over a 4×4 grid of subregions
Basics of K-means
Clustering method in N-dimensional space
Algorithmic steps:
With a given set of data, choose k cluster centers
Calculate the distance between each data point and each cluster center
Assign each point to the cluster at minimum distance
Recalculate the cluster centers:
$v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_j$
where $v_i$ is the new center of the ith cluster, $c_i$ is the number of data points in the ith cluster, and $x_j$ is the jth data point in the ith cluster.
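The steps above can be sketched directly in NumPy (an illustrative toy on 2-D points; in the real system the data are 64-dimensional SURF descriptors):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Lloyd's algorithm, following the steps on the slide."""
    rng = np.random.default_rng(seed)
    # Step 1: with the given data, choose k cluster centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Step 2: distance between every data point and every center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        # Step 3: cluster points based on minimum distance
        labels = dists.argmin(axis=1)
        # Step 4: recalculate centers: v_i = (1/c_i) * sum of points in cluster i
        centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                            else centers[i] for i in range(k)])
    return centers, labels

# Two well-separated 2-D blobs stand in for descriptor data
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(5, 0.1, (10, 2))])
centers, labels = kmeans(X, k=2)
```

With separated blobs like these, each blob ends up in its own cluster and the centers converge near the blob means.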
Bag of Words and Hierarchical K-means
[Figure: feature vectors are quantized by descending a tree of cluster centers; at each level a descriptor moves to its nearest center, until it reaches a leaf node, its visual word]
Bag of Words and Hierarchical K-means
[Figure: histogram of matches per visual word for an example image]
word 1: 3 matches
word 2: 8 matches
word 3: 2 matches
word 4: 5 matches
word 5: 1 match
Inverted File Index
word 1: image 1, image 3, image 5, ...
word 2: image 4, image 9, image 16, ...
word 3: image 4, image 12, image 13, ...
word 4: image 1, image 5, image 7, ...
word 5: image 2, image 3, image 9, ...
word 6: image 7, image 12, image 17, ...
...
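An inverted file index like the one above can be sketched as a plain dictionary from visual word to the set of images containing it (a minimal illustration with toy data, not our production index):

```python
from collections import defaultdict

def build_index(image_words):
    """Map each visual word to the set of images containing it."""
    index = defaultdict(set)
    for image, words in image_words.items():
        for w in words:
            index[w].add(image)
    return index

def query(index, query_words):
    """Vote: count how many query words each candidate image shares,
    then rank candidates by vote count, best first."""
    votes = defaultdict(int)
    for w in query_words:
        for image in index.get(w, ()):
            votes[image] += 1
    return sorted(votes, key=votes.get, reverse=True)

images = {"image 1": {1, 4}, "image 3": {1, 5}, "image 4": {2, 3}}
index = build_index(images)
results = query(index, {1, 4, 5})
```

Only images sharing at least one visual word with the query are ever touched, which is why this beats scanning the whole database.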
Classification: Inverted File Index
Benefit: retrieval via the inverted file is faster than searching every image
Drawback: lack of spatial accuracy
Additional verification is needed to re-rank the retrieved images
Re-ranking of Returned Images
Match the descriptors of the query image to the descriptors of each image in the returned list.
Simple algorithm:
Match each descriptor in the query image to its nearest-neighbor descriptor in the list image.
Compare the L2 norm of the pair to the norm of the query descriptor and every other descriptor in the list image.
If the original norm is significantly smaller, count the pair as a “match”.
Sum the number of “matches” for each list image and divide by the total number of features.
The returned list is then re-ranked based on this “match ratio” and returned to the user.
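This nearest-neighbor distance-ratio check can be sketched as follows (an illustrative NumPy version on tiny 2-D “descriptors”; the 0.7 threshold is a placeholder for “significantly smaller”):

```python
import numpy as np

def match_ratio(query_desc, list_desc, ratio=0.7):
    """Fraction of query descriptors whose nearest neighbor in the list image
    is significantly closer than every other descriptor there."""
    matches = 0
    for q in query_desc:
        dists = np.linalg.norm(list_desc - q, axis=1)  # L2 norm to each list descriptor
        nearest, second = np.partition(dists, 1)[:2]   # two smallest distances
        if nearest < ratio * second:                   # "significantly smaller"
            matches += 1
    return matches / len(query_desc)

query = np.array([[0.0, 0.0], [5.0, 5.0]])
good = np.array([[0.0, 0.1], [5.0, 5.1], [9.0, 9.0]])  # clean, unambiguous matches
bad = np.array([[2.0, 2.0], [2.1, 2.1], [9.0, 9.0]])   # ambiguous matches
reranked = sorted([("bad", bad), ("good", good)],
                  key=lambda item: match_ratio(query, item[1]), reverse=True)
```

Sorting the returned list by this ratio is exactly the re-ranking step: images with many unambiguous descriptor matches rise to the top.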
Convolutional Neural Networks
Convolutional neural networks are neural networks with an additional biological inspiration. Each layer is of one of two basic types: convolution and pooling.
Convolution is the process of convolving an image with a kernel. This idea comes from image processing, where it has been used for tasks like edge detection. Here, we want to learn kernels specific to the data.
Pooling refers to the process of providing a statistical summary of the outputs of several nearby “neurons”, e.g. by taking an average or max.
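Both building blocks fit in a few lines of NumPy (a didactic sketch; like most CNN frameworks, the “convolution” here does not flip the kernel, i.e. it is cross-correlation):

```python
import numpy as np

def conv2d(img, kernel):
    """'Valid' 2D convolution (no flip) of an image with a small kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

def max_pool(img, size=2):
    """Non-overlapping max pooling: summarize each size x size block by its max."""
    h, w = img.shape
    return img[:h // size * size, :w // size * size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.zeros((4, 4))
img[:, 2:] = 1.0                                  # image with one vertical edge
edge = conv2d(img, np.array([[-1.0, 1.0]]))       # responds only at the edge
pooled = max_pool(np.arange(16.0).reshape(4, 4))  # 4x4 summarized as 2x2
```

In a CNN the kernel entries are learned parameters rather than a hand-picked edge detector, which is the point the slide makes.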
Figure: Illustration of the convolution process, from http://www.songho.ca/dsp/convolution/files/conv2d_matrix.jpg.
Implementation and Architecture
For our implementation of CNNs, we used Caffe [?]. We only had around 16,000 images, so we fine-tuned two pre-trained models:
AlexNet [?], the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012.
GoogLeNet [?], the winner of ILSVRC 2014.
Both are provided in Caffe’s Model Zoo, with a file that stores the weights of each model after training on ImageNet.
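Fine-tuning in Caffe is driven by a solver definition plus the pre-trained weights file. A hedged sketch (the file names and hyperparameter values here are illustrative, not the settings we actually used):

```
# solver.prototxt (illustrative values only)
net: "train_val.prototxt"    # network definition; final layer resized to 167 classes
base_lr: 0.001               # small learning rate, since we only fine-tune
lr_policy: "step"
stepsize: 20000
max_iter: 100000
snapshot_prefix: "logo_finetune"
```

Training then starts from the Model Zoo weights rather than from scratch, e.g. `caffe train -solver solver.prototxt -weights bvlc_alexnet.caffemodel`.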
AlexNet
Figure: Image of the AlexNet architecture (from [?]). This also illustrates how the original network was split to train on two GPUs.
GoogLeNet
Figure: Image of the GoogLeNet architecture (from [?]). Deeper than AlexNet, with 12x fewer parameters.
Dataset Construction
We gathered a dataset of logo images for 167 brands using the Bing Search API (on average, 100 images per brand), searching for terms like “<brand>”, “<brand> building”, “<brand> <product>”. One problem we faced was that some downloaded images were mislabeled or irrelevant. We filtered the dataset using two methods:
compute the proportion of matching SIFT descriptors between the downloaded image and a reference image for that brand, and toss the image if it doesn’t meet some threshold
import ManualLabor
Testing the Original Pipeline
Parameter tuning
Cross validation
Parameter Tuning
BoW structure: how to choose the vocabulary size:
words = B^L
B: number of branches; L: number of levels
Too large: lack of generalization, overfitting
Too small: lack of discrimination, mismatches
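With words = B^L, the branching factor and depth of the vocabulary tree fix the vocabulary size; a quick check shows why the branch/level settings reported later land in the half-million range:

```python
# Vocabulary size of a hierarchical k-means tree: words = B ** L
levels = 5
for branches in (10, 14, 15):
    print(f"B={branches}, L={levels}: {branches ** levels:,} words")
```

B = 14 gives 537,824 words and B = 15 gives 759,375, both inside the 500,000 to 800,000 optimum found in testing.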
Parameter Tuning
Vocabulary size
How to choose the number of images returned by the inverted file index search:
accuracy
the computation time of re-ranking
How to choose the number of images shown on the client side:
accuracy
a mobile application: the size of the screen
Parameter Tuning
Vocabulary size
The number of images returned by searching
The number of images shown
Re-ranking: how to determine the weight factor w in the weighted scoring function:
score = w * I + (1 - w) * F
I: number of inliers
F: frequency of the brand among the returned images
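The weighted score is a one-liner; a sketch (the default w is only a placeholder, since w is exactly the parameter being tuned):

```python
def score(inliers, frequency, w=0.5):
    """Weighted re-ranking score: score = w * I + (1 - w) * F.

    inliers   -- I, the number of inliers (spatially consistent matches)
    frequency -- F, how often the brand occurs among the returned images
    w         -- weight factor chosen by testing (0.5 is a placeholder)
    """
    return w * inliers + (1 - w) * frequency
```

At w = 1 the ranking depends only on inliers; at w = 0 only on brand frequency, so tuning w trades off the two signals.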
Parameters for Evaluation
Vocabulary size:
number of branches
number of levels
The number of images returned by searching
The number of images shown
The weight factor w in the weighted function
Calculation of the accuracy:
one correct return gives accuracy = 1
Cross Validation
Randomly divide the data into K equal-sized parts.
Leave out part k, fit the model to the other K−1 parts (combined), and then obtain predictions for the left-out kth part.
This is done in turn for each part k = 1, 2, ..., K, and then the results are combined.
We choose K = 5.
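The procedure above can be sketched generically (an illustration with a toy “model”; `fit` and `evaluate` stand in for training and scoring the real pipeline):

```python
import numpy as np

def k_fold_indices(n, k=5, seed=0):
    """Randomly divide n samples into k (nearly) equal-sized parts."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def cross_validate(fit, evaluate, data, k=5):
    """Leave out part j, fit on the remaining k-1 parts (combined),
    score on part j, for each j; then combine the results."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for j in range(k):
        train = np.concatenate([folds[i] for i in range(k) if i != j])
        model = fit(data[train])
        scores.append(evaluate(model, data[folds[j]]))
    return np.mean(scores), np.std(scores)

# Toy stand-in: the "model" is the training mean, scored by negative squared error
data = np.arange(100.0)
mean_score, spread = cross_validate(lambda d: d.mean(),
                                    lambda m, d: -np.mean((d - m) ** 2), data)
```

The standard deviation across folds is the stability measure quoted in the testing summary.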
Testing Result
Test on vocabulary size:
optimal number of words: 500,000 to 800,000
number of branches = 14 or 15
number of levels = 5
Testing Result
With the other parameters fixed, test on:
the weight factor
the number of returned images
the number of images shown on the client side
Testing Summary
Optimal parameter setting:
number of words: 500,000 to 800,000
number of images returned: 15
number of images shown: 6
The stability of the system was also tested:
the standard deviation over 5-fold cross validation ranges from 0.005 to 0.007
Evaluation of the Deep Learning Framework
Cross-validation for AlexNet
Final accuracy (AlexNet):
Top-1 Accuracy: 93.33%
Top-5 Accuracy: 96.73%
Evaluation of the Deep Learning Framework
Cross-validation for GoogLeNet (Top-5 Accuracy)
Evaluation of the Deep Learning Framework
Cross-validation for AlexNet and GoogLeNet
Final accuracy (GoogLeNet):
Top-1 Accuracy: 94.05%
Top-5 Accuracy: 97.39%
Evaluation of the Deep Learning Framework
Final comparison:

                       GoogLeNet   AlexNet   Visual Bag of Words
Accuracy (Top-5)       97.39%      96.73%    87.6%
Preprocessing time     8.47 ms     7.5 ms    6 ms
Classification time    17.7 ms     6.94 ms   24 ms (SURF feature extraction)
Total time*            129 ms      170 ms    281 ms

*Including some system-level operations.
Future Development
There are still things we can do to improve the system:
Enlarge the dataset (currently 167 classes and 16,000 images).
Test different deep learning frameworks.
Combine locally hand-crafted features and globally deep-learned features to achieve better accuracy.
We would like to thank
Mr. Sun Lin and Lenovo-Hong Kong.
Professor Shingyu Leung, Dr. Ku Yin Bon, and the Hong Kong University of Science and Technology.
Professor Susanna Serna and the Institute for Pure and Applied Mathematics.
The National Science Foundation for program funding (Grant DMS #0931852).