SlideShare ist ein Scribd-Unternehmen logo
1 von 9
Downloaden Sie, um offline zu lesen
CS676A Project
Report
Learning with Relative Attributes
Submitted in partial fulfillment of
the requirements for the course CS676A
April 15, 2016
Submitted by
13683 Shubham Jain
13788 Vikas Jain
Under the guidance of
Prof Vinay P. Namboodiri
Department of Computer Science and
Engineering
Indian Institute of Technology Kanpur
Kanpur, U.P. , India – 208 016
July 2015
Contents
1 Introduction 2
2 Methodology 3
2.1 Feature representation . . . . . . . . . . . . . . . . 3
2.2 Ranking Methods . . . . . . . . . . . . . . . . . . . 3
2.2.1 RankSVM . . . . . . . . . . . . . . . . . . . . 3
2.2.2 RankNet [2] . . . . . . . . . . . . . . . . . . . 4
2.3 Zero Shot Learning . . . . . . . . . . . . . . . . . . 5
3 Dataset 6
4 Code Used 6
5 Results 6
5.1 Correctly ordered pairs: . . . . . . . . . . . . . . . . 6
5.2 Class Prediction: . . . . . . . . . . . . . . . . . . . . 7
6 Conclusions 7
7 Scope and Future Work 7
1
1 Introduction
Human nameable attributes are more intuitive to understand,
e.g. open natural scene, white young face. Due to their simplic-
ity, human nameable attributes are used for recognition tasks.
But sometimes, it becomes difficult to assign categorical labels
to the images i.e binary labels – yes or no if attribute is present or
not respectively. Relative attributes[1] are more understandable
and intuitive, e.g. a person in a image is taller than some other
person in other image and shorter than an another person in an-
other image. In this project, we aim to learn relative attributes
associated with face images namely Masculine, white, young,
chubby, visible-forehead, bushy-eyebrows, narrow-eyes, pointy-
nose, big-lips and round-face. We used PubFig dataset[5] for
the task. We extracted features using Convolutional Neural Net-
work(VGG16 model)[4]. We implemented and trained Ranknet
model to predict absolute ranking of attributes give the image
features. We also explored zero shot learning for unseen classes
by building probabilistic model of each seen and unseen class
using attribute ranking for each image of seen class and relative
description of unseen classes with seen classes respectively.
Figure 1: Relative Attribute [1]
2
2 Methodology
The main idea in a simplified version is as follows: Find a good
feature representation of the image which somehow incorpo-
rates the strength of features not just their presence. Now using
these features learn a model which when given a feature repre-
sentation of a new image tries to predict a ranking for the image
for that particular attribute. We will now describe the feature
representation that we used and the ranking models.
2.1 Feature representation
Parikh et. al. [1] in their paper used GIST features of the images
for the learning task. Instead of using handcrafted features, We
used features extracted from Convolutional Neural Network. The
VGG16 [4] model for extracting features from the images. We
used learned model of VGG16 trained on ImageNet dataset[6].
The last layer(softmax) from the network is removed and the out-
put of layer before it(FC-4096) is used as feature vector of each
image. The dimension of each feature vector is 4096.
2.2 Ranking Methods
For ranking we have compared two methods. One is based on
an approximation algorithm to solve the problem via Support
Vector Machines (SVM). The latter is based on neural networks.
2.2.1 RankSVM
The idea was to learn given a query image and user preferences
which are incorporated to retrieve more relevant images in the
search results. It uses a set of comparisons between certain
images with respect to an attribute. The exact equation is NP-
Hard and an approximate solution is obtained by using negative
3
slack variables. The mathematical description is as follows:
Rank SVM formalization[1]
2.2.2 RankNet [2]
It is a Neural network with two hidden layers and back-propagation
can be extended to train on these pairs. The first hidden layer
had 2048 neurons and the second hidden layer had 512 neurons.
The model trains on pairs of examples to learn a ranking func-
tion that maps to the real numbers. It uses a natural probabilis-
tic cost function on pairs of examples. The network as described
in the paper is as follows :
For the ith training sample, denote the outputs of net by
oi, the targets by ti, let the transfer function of each node in the
jth layer of nodes be gj, and let the cost function be q
i=1 f(oi, ti)
If αk are the parameters of the model, then a gradient descent
step amounts to ∂αk = -ηk
∂f
∂αk
, where the ηk are positive learning
rates. The net embodies the function
oi = g3
(
j
w32
ij g2
(
k
w21
jkxk + b2
j ) + b3
i ) ≡ g3
i (1)
where for the weights w and offsets b, the upper indices index
the node layer, and the lower indices index the nodes within each
corresponding layer. The cost function becomes a function of the
4
difference of the outputs of two consecutive training samples: f
(o2 − o1). The update equations (back propagation) are as follows
(f’ ≡ f’ (o2 − o1)):
∂f
∂b3
= f (g
3
2 − g
3
1) ≡ 3
2 − 3
1 (2)
∂f
∂w32
m
= 3
2g2
2m − 3
1g2
1m (3)
∂f
∂b2
m
= 3
2w32
m g
2
2m − 3
1w32
m g
2
1m (4)
∂f
∂w21
m
= 2
2mg1
1n − 2
1mg2
1m (5)
2.3 Zero Shot Learning
Once the ranking of each attribute of the classes are learnt. The
feature vector of each image of each class is represented in the
attribute space. Using the features vector of each image of a
particular class, the parameters of normal distribution is found
to obtain a probabilistic model of the class.
The probabilistic model for the unseen class is found using the
relative description of unseen class with the seen classes. The
method to find the Normal Distribution parameters is same as
given in [1]
Once the probabilistic model of seen and unseen classes are
obtained, for a query image, fist the rank of attributes for the
images is found and we obtain a vector ¯x in the attribute space.
The image is labelled with the class with maximum probability
among all seen and unseen classes.
c∗
= max
j∈1,...,N
P(¯xi|µi, Σi) (6)
5
3 Dataset
We used PubFig[5] Dataset for this project. The dataset consist
of face images of celebrity. The images from 8 classes were taken
for the learning tasks. Around 100 images from each class were
used, out of which 85 were used for training and rest for testing.
For zero shot learning, we took images from 3 different classes.
Around 30 images of each unseen class were used for prediction
the class.
4 Code Used
• Feature Extraction - From [4] https://gist.github.com/
baraldilorenzo/07d7802847aaad0a35d3
• RankNet - Adapted from https://github.com/shiba24/learning2rank
• RankNet - From [1]
5 Results
5.1 Correctly ordered pairs:
The result obtained for the learned relative ranking per attribute
in the 8 classes are shown in Table 1.
Attribute Acc. Attribute Acc.
Masculine 80.35 Bushy-eyebrows 80.81
Young 79.14 Pointy-nose 79.25
Chubby 69.68 Big-lips 88.25
Forehead 66.48 Round-face 83.66
Table 1: Accuracy of learned attributes
6
5.2 Class Prediction:
The results of predicted class by formulating Normal Distribu-
tion of each class are shown in Table 2.
Class RankNet Acc. RankSVM Acc.
1 0.80 0.67
2 0.91 0.36
3 1.00 0.75
4 0.89 0.56
5 0.75 0.67
6 0.83 0.58
7 0.85 0.77
8 0.92 0.85
9(Unseen) 0.03 –
10(Unseen) 0.00 –
11(Unseen) 0.10 –
Table 2:
6 Conclusions
It can be inferred that feature vectors extracted from Convolu-
tional Neural Network and RankNet model used for learning rel-
ative attributes are performing better than the techniques used
in [1] of GIST features and RankSVM model.
However, for zero shot learning task, while the prediction of seen
classes are better for RankNet as compared to RankSVM, it per-
forms significantly poor in case of unseen classes.
7 Scope and Future Work
For zero shot we can instead do unsupervised step like cluster-
ing instead of using the comparison of attributes but that is one
thing that the paper wanted to incorporate so that it kind of
7
comes out in a natural way. Also, we need to find a better way
to fins out the parameters of the unseen classes as the one that
the paper used we found not to work so well.
References
[1] Parikh, Devi, and Kristen Grauman. ”Relative attributes.”
Computer Vision (ICCV), 2011 IEEE International Confer-
ence on. IEEE, 2011.
[2] Burges, Chris, et al. ”Learning to rank using gradient de-
scent.” Proceedings of the 22nd international conference on
Machine learning. ACM, 2005.
[3] Joachims, Thorsten. ”Optimizing search engines using
clickthrough data.” Proceedings of the eighth ACM SIGKDD
international conference on Knowledge discovery and data
mining. ACM, 2002.
[4] Simonyan, Karen, and Andrew Zisserman. ”Very deep
convolutional networks for large-scale image recognition.”
arXiv preprint arXiv:1409.1556 (2014).
[5] ”Attribute and Simile Classifiers for Face Verification,”
Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and
Shree K. Nayar, International Conference on Computer Vi-
sion (ICCV), 2009.
[6] Deng, Jia, et al. ”Imagenet: A large-scale hierarchical im-
age database.” Computer Vision and Pattern Recognition,
2009. CVPR 2009. IEEE Conference on. IEEE, 2009.
8

Weitere ähnliche Inhalte

Was ist angesagt?

DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...
DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...
DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...cscpconf
 
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)Universitat Politècnica de Catalunya
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)Fellowship at Vodafone FutureLab
 
4213ijaia05
4213ijaia054213ijaia05
4213ijaia05ijaia
 
Neural network in matlab
Neural network in matlab Neural network in matlab
Neural network in matlab Fahim Khan
 
Paper id 21201483
Paper id 21201483Paper id 21201483
Paper id 21201483IJRAT
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
TRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNN
TRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNNTRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNN
TRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNNijaia
 
Knowledge distillation deeplab
Knowledge distillation deeplabKnowledge distillation deeplab
Knowledge distillation deeplabFrozen Paradise
 
Black-box modeling of nonlinear system using evolutionary neural NARX model
Black-box modeling of nonlinear system using evolutionary neural NARX modelBlack-box modeling of nonlinear system using evolutionary neural NARX model
Black-box modeling of nonlinear system using evolutionary neural NARX modelIJECEIAES
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningMLAI2
 
Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]SubhradeepMaji
 
Offline Character Recognition Using Monte Carlo Method and Neural Network
Offline Character Recognition Using Monte Carlo Method and Neural NetworkOffline Character Recognition Using Monte Carlo Method and Neural Network
Offline Character Recognition Using Monte Carlo Method and Neural Networkijaia
 
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence Matrix
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence MatrixSteganalysis of LSB Embedded Images Using Gray Level Co-Occurrence Matrix
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence MatrixCSCJournals
 
Ml10 dimensionality reduction-and_advanced_topics
Ml10 dimensionality reduction-and_advanced_topicsMl10 dimensionality reduction-and_advanced_topics
Ml10 dimensionality reduction-and_advanced_topicsankit_ppt
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketakiKetaki Patwari
 
Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network  Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network Tomoki Hayashi
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 

Was ist angesagt? (20)

DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...
DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...
DESIGN AND IMPLEMENTATION OF BINARY NEURAL NETWORK LEARNING WITH FUZZY CLUSTE...
 
Neural networks
Neural networksNeural networks
Neural networks
 
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
Methodology (DLAI D6L2 2017 UPC Deep Learning for Artificial Intelligence)
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
4213ijaia05
4213ijaia054213ijaia05
4213ijaia05
 
Neural network in matlab
Neural network in matlab Neural network in matlab
Neural network in matlab
 
Paper id 21201483
Paper id 21201483Paper id 21201483
Paper id 21201483
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
TRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNN
TRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNNTRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNN
TRANSFER LEARNING BASED IMAGE VISUALIZATION USING CNN
 
Knowledge distillation deeplab
Knowledge distillation deeplabKnowledge distillation deeplab
Knowledge distillation deeplab
 
Black-box modeling of nonlinear system using evolutionary neural NARX model
Black-box modeling of nonlinear system using evolutionary neural NARX modelBlack-box modeling of nonlinear system using evolutionary neural NARX model
Black-box modeling of nonlinear system using evolutionary neural NARX model
 
Representational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual LearningRepresentational Continuity for Unsupervised Continual Learning
Representational Continuity for Unsupervised Continual Learning
 
nn network
nn networknn network
nn network
 
Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]Handwritten Digit Recognition and performance of various modelsation[autosaved]
Handwritten Digit Recognition and performance of various modelsation[autosaved]
 
Offline Character Recognition Using Monte Carlo Method and Neural Network
Offline Character Recognition Using Monte Carlo Method and Neural NetworkOffline Character Recognition Using Monte Carlo Method and Neural Network
Offline Character Recognition Using Monte Carlo Method and Neural Network
 
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence Matrix
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence MatrixSteganalysis of LSB Embedded Images Using Gray Level Co-Occurrence Matrix
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence Matrix
 
Ml10 dimensionality reduction-and_advanced_topics
Ml10 dimensionality reduction-and_advanced_topicsMl10 dimensionality reduction-and_advanced_topics
Ml10 dimensionality reduction-and_advanced_topics
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
 
Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network  Prediction of Exchange Rate Using Deep Neural Network
Prediction of Exchange Rate Using Deep Neural Network
 
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)
 

Ähnlich wie Learning with Relative Attributes

Image Classification using Deep Learning
Image Classification using Deep LearningImage Classification using Deep Learning
Image Classification using Deep LearningIRJET Journal
 
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an ObjectAnkur Tyagi
 
ShawnQuinnCSS581FinalProjectReport
ShawnQuinnCSS581FinalProjectReportShawnQuinnCSS581FinalProjectReport
ShawnQuinnCSS581FinalProjectReportShawn Quinn
 
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...IRJET Journal
 
Faster Training Algorithms in Neural Network Based Approach For Handwritten T...
Faster Training Algorithms in Neural Network Based Approach For Handwritten T...Faster Training Algorithms in Neural Network Based Approach For Handwritten T...
Faster Training Algorithms in Neural Network Based Approach For Handwritten T...CSCJournals
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slideswolf
 
Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Fwdays
 
Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecognIlyas CHAOUA
 
Visualizing the Model Selection Process
Visualizing the Model Selection ProcessVisualizing the Model Selection Process
Visualizing the Model Selection ProcessBenjamin Bengfort
 
IRJET - Hand Gesture Recognition to Perform System Operations
IRJET -  	  Hand Gesture Recognition to Perform System OperationsIRJET -  	  Hand Gesture Recognition to Perform System Operations
IRJET - Hand Gesture Recognition to Perform System OperationsIRJET Journal
 
A study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithmsA study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithmsManje Gowda
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsIRJET Journal
 
K-Means Clustering in Moving Objects Extraction with Selective Background
K-Means Clustering in Moving Objects Extraction with Selective BackgroundK-Means Clustering in Moving Objects Extraction with Selective Background
K-Means Clustering in Moving Objects Extraction with Selective BackgroundIJCSIS Research Publications
 
Rapid object detection using boosted cascade of simple features
Rapid object detection using boosted  cascade of simple featuresRapid object detection using boosted  cascade of simple features
Rapid object detection using boosted cascade of simple featuresHirantha Pradeep
 
Computer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and PythonComputer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and PythonAkash Satamkar
 
cvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxcvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxPyariMohanJena
 
ppt 20BET1024.pptx
ppt 20BET1024.pptxppt 20BET1024.pptx
ppt 20BET1024.pptxManeetBali
 
Decomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesisDecomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesisNaeem Shehzad
 

Ähnlich wie Learning with Relative Attributes (20)

Image Classification using Deep Learning
Image Classification using Deep LearningImage Classification using Deep Learning
Image Classification using Deep Learning
 
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object3D Reconstruction from Multiple uncalibrated 2D Images of an Object
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
 
ShawnQuinnCSS581FinalProjectReport
ShawnQuinnCSS581FinalProjectReportShawnQuinnCSS581FinalProjectReport
ShawnQuinnCSS581FinalProjectReport
 
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
 
Faster Training Algorithms in Neural Network Based Approach For Handwritten T...
Faster Training Algorithms in Neural Network Based Approach For Handwritten T...Faster Training Algorithms in Neural Network Based Approach For Handwritten T...
Faster Training Algorithms in Neural Network Based Approach For Handwritten T...
 
FEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITION
FEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITIONFEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITION
FEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITION
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
 
research_paper
research_paperresearch_paper
research_paper
 
Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"Yulia Honcharenko "Application of metric learning for logo recognition"
Yulia Honcharenko "Application of metric learning for logo recognition"
 
Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecogn
 
Visualizing the Model Selection Process
Visualizing the Model Selection ProcessVisualizing the Model Selection Process
Visualizing the Model Selection Process
 
IRJET - Hand Gesture Recognition to Perform System Operations
IRJET -  	  Hand Gesture Recognition to Perform System OperationsIRJET -  	  Hand Gesture Recognition to Perform System Operations
IRJET - Hand Gesture Recognition to Perform System Operations
 
A study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithmsA study and comparison of different image segmentation algorithms
A study and comparison of different image segmentation algorithms
 
Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather Conditions
 
K-Means Clustering in Moving Objects Extraction with Selective Background
K-Means Clustering in Moving Objects Extraction with Selective BackgroundK-Means Clustering in Moving Objects Extraction with Selective Background
K-Means Clustering in Moving Objects Extraction with Selective Background
 
Rapid object detection using boosted cascade of simple features
Rapid object detection using boosted  cascade of simple featuresRapid object detection using boosted  cascade of simple features
Rapid object detection using boosted cascade of simple features
 
Computer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and PythonComputer Vision - Real Time Face Recognition using Open CV and Python
Computer Vision - Real Time Face Recognition using Open CV and Python
 
cvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptxcvpresentation-190812154654 (1).pptx
cvpresentation-190812154654 (1).pptx
 
ppt 20BET1024.pptx
ppt 20BET1024.pptxppt 20BET1024.pptx
ppt 20BET1024.pptx
 
Decomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesisDecomposing image generation into layout priction and conditional synthesis
Decomposing image generation into layout priction and conditional synthesis
 

Kürzlich hochgeladen

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Kürzlich hochgeladen (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

Learning with Relative Attributes

  • 1. CS676A Project Report Learning with Relative Attributes Submitted in partial fulfillment of the requirements for the course CS676A April 15, 2016 Submitted by 13683 Shubham Jain 13788 Vikas Jain Under the guidance of Prof Vinay P. Namboodiri Department of Computer Science and Engineering Indian Institute of Technology Kanpur Kanpur, U.P. , India – 208 016 July 2015
  • 2. Contents 1 Introduction 2 2 Methodology 3 2.1 Feature representation . . . . . . . . . . . . . . . . 3 2.2 Ranking Methods . . . . . . . . . . . . . . . . . . . 3 2.2.1 RankSVM . . . . . . . . . . . . . . . . . . . . 3 2.2.2 RankNet [2] . . . . . . . . . . . . . . . . . . . 4 2.3 Zero Shot Learning . . . . . . . . . . . . . . . . . . 5 3 Dataset 6 4 Code Used 6 5 Results 6 5.1 Correctly ordered pairs: . . . . . . . . . . . . . . . . 6 5.2 Class Prediction: . . . . . . . . . . . . . . . . . . . . 7 6 Conclusions 7 7 Scope and Future Work 7 1
  • 3. 1 Introduction Human nameable attributes are more intuitive to understand, e.g. open natural scene, white young face. Due to their simplic- ity, human nameable attributes are used for recognition tasks. But sometimes, it becomes difficult to assign categorical labels to the images i.e binary labels – yes or no if attribute is present or not respectively. Relative attributes[1] are more understandable and intuitive, e.g. a person in a image is taller than some other person in other image and shorter than an another person in an- other image. In this project, we aim to learn relative attributes associated with face images namely Masculine, white, young, chubby, visible-forehead, bushy-eyebrows, narrow-eyes, pointy- nose, big-lips and round-face. We used PubFig dataset[5] for the task. We extracted features using Convolutional Neural Net- work(VGG16 model)[4]. We implemented and trained Ranknet model to predict absolute ranking of attributes give the image features. We also explored zero shot learning for unseen classes by building probabilistic model of each seen and unseen class using attribute ranking for each image of seen class and relative description of unseen classes with seen classes respectively. Figure 1: Relative Attribute [1] 2
  • 4. 2 Methodology The main idea in a simplified version is as follows: Find a good feature representation of the image which somehow incorpo- rates the strength of features not just their presence. Now using these features learn a model which when given a feature repre- sentation of a new image tries to predict a ranking for the image for that particular attribute. We will now describe the feature representation that we used and the ranking models. 2.1 Feature representation Parikh et. al. [1] in their paper used GIST features of the images for the learning task. Instead of using handcrafted features, We used features extracted from Convolutional Neural Network. The VGG16 [4] model for extracting features from the images. We used learned model of VGG16 trained on ImageNet dataset[6]. The last layer(softmax) from the network is removed and the out- put of layer before it(FC-4096) is used as feature vector of each image. The dimension of each feature vector is 4096. 2.2 Ranking Methods For ranking we have compared two methods. One is based on an approximation algorithm to solve the problem via Support Vector Machines (SVM). The latter is based on neural networks. 2.2.1 RankSVM The idea was to learn given a query image and user preferences which are incorporated to retrieve more relevant images in the search results. It uses a set of comparisons between certain images with respect to an attribute. The exact equation is NP- Hard and an approximate solution is obtained by using negative 3
  • 5. slack variables. The mathematical description is as follows: Rank SVM formalization[1] 2.2.2 RankNet [2] It is a Neural network with two hidden layers and back-propagation can be extended to train on these pairs. The first hidden layer had 2048 neurons and the second hidden layer had 512 neurons. The model trains on pairs of examples to learn a ranking func- tion that maps to the real numbers. It uses a natural probabilis- tic cost function on pairs of examples. The network as described in the paper is as follows : For the ith training sample, denote the outputs of net by oi, the targets by ti, let the transfer function of each node in the jth layer of nodes be gj, and let the cost function be q i=1 f(oi, ti) If αk are the parameters of the model, then a gradient descent step amounts to ∂αk = -ηk ∂f ∂αk , where the ηk are positive learning rates. The net embodies the function oi = g3 ( j w32 ij g2 ( k w21 jkxk + b2 j ) + b3 i ) ≡ g3 i (1) where for the weights w and offsets b, the upper indices index the node layer, and the lower indices index the nodes within each corresponding layer. The cost function becomes a function of the 4
  • 6. difference of the outputs of two consecutive training samples: f (o2 − o1). The update equations (back propagation) are as follows (f’ ≡ f’ (o2 − o1)): ∂f ∂b3 = f (g 3 2 − g 3 1) ≡ 3 2 − 3 1 (2) ∂f ∂w32 m = 3 2g2 2m − 3 1g2 1m (3) ∂f ∂b2 m = 3 2w32 m g 2 2m − 3 1w32 m g 2 1m (4) ∂f ∂w21 m = 2 2mg1 1n − 2 1mg2 1m (5) 2.3 Zero Shot Learning Once the ranking of each attribute of the classes are learnt. The feature vector of each image of each class is represented in the attribute space. Using the features vector of each image of a particular class, the parameters of normal distribution is found to obtain a probabilistic model of the class. The probabilistic model for the unseen class is found using the relative description of unseen class with the seen classes. The method to find the Normal Distribution parameters is same as given in [1] Once the probabilistic model of seen and unseen classes are obtained, for a query image, fist the rank of attributes for the images is found and we obtain a vector ¯x in the attribute space. The image is labelled with the class with maximum probability among all seen and unseen classes. c∗ = max j∈1,...,N P(¯xi|µi, Σi) (6) 5
  • 7. 3 Dataset We used PubFig[5] Dataset for this project. The dataset consist of face images of celebrity. The images from 8 classes were taken for the learning tasks. Around 100 images from each class were used, out of which 85 were used for training and rest for testing. For zero shot learning, we took images from 3 different classes. Around 30 images of each unseen class were used for prediction the class. 4 Code Used • Feature Extraction - From [4] https://gist.github.com/ baraldilorenzo/07d7802847aaad0a35d3 • RankNet - Adapted from https://github.com/shiba24/learning2rank • RankNet - From [1] 5 Results 5.1 Correctly ordered pairs: The result obtained for the learned relative ranking per attribute in the 8 classes are shown in Table 1. Attribute Acc. Attribute Acc. Masculine 80.35 Bushy-eyebrows 80.81 Young 79.14 Pointy-nose 79.25 Chubby 69.68 Big-lips 88.25 Forehead 66.48 Round-face 83.66 Table 1: Accuracy of learned attributes 6
  • 8. 5.2 Class Prediction: The results of predicted class by formulating Normal Distribu- tion of each class are shown in Table 2. Class RankNet Acc. RankSVM Acc. 1 0.80 0.67 2 0.91 0.36 3 1.00 0.75 4 0.89 0.56 5 0.75 0.67 6 0.83 0.58 7 0.85 0.77 8 0.92 0.85 9(Unseen) 0.03 – 10(Unseen) 0.00 – 11(Unseen) 0.10 – Table 2: 6 Conclusions It can be inferred that feature vectors extracted from Convolu- tional Neural Network and RankNet model used for learning rel- ative attributes are performing better than the techniques used in [1] of GIST features and RankSVM model. However, for zero shot learning task, while the prediction of seen classes are better for RankNet as compared to RankSVM, it per- forms significantly poor in case of unseen classes. 7 Scope and Future Work For zero shot we can instead do unsupervised step like cluster- ing instead of using the comparison of attributes but that is one thing that the paper wanted to incorporate so that it kind of 7
  • 9. comes out in a natural way. Also, we need to find a better way to fins out the parameters of the unseen classes as the one that the paper used we found not to work so well. References [1] Parikh, Devi, and Kristen Grauman. ”Relative attributes.” Computer Vision (ICCV), 2011 IEEE International Confer- ence on. IEEE, 2011. [2] Burges, Chris, et al. ”Learning to rank using gradient de- scent.” Proceedings of the 22nd international conference on Machine learning. ACM, 2005. [3] Joachims, Thorsten. ”Optimizing search engines using clickthrough data.” Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2002. [4] Simonyan, Karen, and Andrew Zisserman. ”Very deep convolutional networks for large-scale image recognition.” arXiv preprint arXiv:1409.1556 (2014). [5] ”Attribute and Simile Classifiers for Face Verification,” Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar, International Conference on Computer Vi- sion (ICCV), 2009. [6] Deng, Jia, et al. ”Imagenet: A large-scale hierarchical im- age database.” Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009. 8