Semantic Retrieval and Automatic Annotation:
Linear Transformations, Correlation and Semantic Spaces
Jonathon Hare & Paul Lewis
School of Electronics and Computer Science
University of Southampton
Introduction and Motivation
• Introduce a new, simple linear-transform based annotation/retrieval technique
• Compare against a number of similar existing techniques for automatic annotation & semantic retrieval that:
  • Represent images by a fixed length histogram (of visual-term occurrences)
  • Optionally use SVD for noise reduction
  • Are deterministic (no randomness)
  • Are (relatively) computationally efficient
• Reflect on real-world performance
Singular Value Decomposition
• SVD can be used to filter noise by producing a rank-k estimate of the original data matrix
• The rank-k estimate is optimal in the least-squares sense
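The rank-k estimate can be sketched with NumPy as follows. This is a minimal illustration, not code from the talk: the function name `rank_k_estimate` and the toy data are assumptions made for the example.

```python
import numpy as np

def rank_k_estimate(A, k):
    """Return the rank-k estimate of A built from its k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then recombine with the first k right singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Hypothetical data: an exactly rank-2 matrix with a little noise added.
rng = np.random.default_rng(0)
clean = rng.random((6, 2)) @ rng.random((2, 8))
noisy = clean + 0.01 * rng.standard_normal((6, 8))

estimate = rank_k_estimate(noisy, k=2)
```

The least-squares (Frobenius) error of the estimate equals the square root of the sum of the squared discarded singular values, which is what makes the truncation optimal.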
Nomenclature
• F is a visual-term occurrence matrix (columns represent images, rows visual-terms)
• W is a keyword occurrence matrix (columns represent images, rows keywords)
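As a small illustration of this layout, the sketch below builds F and W from per-image lists of term indices. The helper `occurrence_matrix` and the toy indices are hypothetical and only show the row/column convention.

```python
import numpy as np

def occurrence_matrix(items_per_image, vocabulary_size):
    """Count occurrences: one row per vocabulary item, one column per image."""
    M = np.zeros((vocabulary_size, len(items_per_image)))
    for image, items in enumerate(items_per_image):
        for item in items:
            M[item, image] += 1
    return M

# Hypothetical toy collection of three images.
visual_terms = [[0, 0, 2], [1], [2, 2, 2]]   # visual-term indices seen in each image
keywords = [[0], [1], [0, 1]]                # keyword indices attached to each image

F = occurrence_matrix(visual_terms, vocabulary_size=3)   # 3 visual-terms x 3 images
W = occurrence_matrix(keywords, vocabulary_size=2)       # 2 keywords x 3 images
```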
Technique: linear transform
Assume that visual-term occurrence vectors can be related to keyword occurrence vectors by a simple linear transformation, T.
FT = W
T can be estimated using the pseudo-inverse (calculated using the SVD, which allows noise reduction) given a training set with known F and W. The unknown W* for unannotated images can then be calculated from F* and T.
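A minimal sketch of this estimate is given below. Two assumptions are made for the example: images are stored one per row (the transpose of the layout in the Nomenclature slide) so that the product F T conforms as written, and noise reduction is done by keeping only the k largest singular values in the pseudo-inverse. The function names are hypothetical.

```python
import numpy as np

def estimate_transform(F_train, W_train, k=None):
    """Least-squares T with F_train @ T ~= W_train (rows are images).

    If k is given, only the k largest singular values of F_train are
    used when forming the pseudo-inverse (assumed noise-reduction step).
    """
    U, s, Vt = np.linalg.svd(F_train, full_matrices=False)
    if k is None:
        k = int(np.sum(s > 1e-10 * s[0]))   # keep all non-negligible values
    F_pinv = (Vt[:k].T / s[:k]) @ U[:, :k].T
    return F_pinv @ W_train

def annotate(F_new, T):
    """Predict keyword scores W* for unannotated images F*."""
    return F_new @ T

# Hypothetical data: 20 training images, 5 visual-terms, 3 keywords.
rng = np.random.default_rng(0)
F_train = rng.random((20, 5))
T_true = rng.random((5, 3))
W_train = F_train @ T_true

T = estimate_transform(F_train, W_train)
W_star = annotate(rng.random((4, 5)), T)
```

Each column of T is fitted separately, which matches the "words independent" note in the technique summary.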
Technique: Semantic Spaces
• Based around the factorisation $\begin{bmatrix} F \\ W \end{bmatrix} = TD$
• Calculated using truncated SVD
• Rows of T represent coordinates of the features and words in a vector space
• Columns of D represent coordinates of images in the same space
• Similar objects have similar locations in the space, so it is possible to rank images on their distance to a given word
Hare, J. S., Lewis, P. H., Enser, P. G. B., and Sandom, C. J., “A Linear-Algebraic Technique with an
Application in Semantic Image Retrieval,” in CIVR 2006, Sundaram, H., Naphade, M., Smith, J. R., and
Rui, Y., eds., LNCS 4071, 31–40, Springer (2006).
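The semantic-space idea can be sketched as below, with F stacked above W and columns as images. Several details are assumptions made for the example rather than taken from the slides: unannotated images are folded into the space by treating their keyword part as zero, cosine similarity stands in for the distance measure, and all function names are hypothetical.

```python
import numpy as np

def build_semantic_space(F, W, k):
    """Factorise the stacked matrix [F; W] ~= T @ D with a truncated SVD."""
    O = np.vstack([F, W])                 # (visual-terms + keywords) x images
    U, s, Vt = np.linalg.svd(O, full_matrices=False)
    T = U[:, :k]                          # rows: coordinates of terms and words
    D = s[:k, None] * Vt[:k]              # columns: coordinates of images
    return T, D

def project_images(F_new, T, n_terms):
    """Fold in unannotated images, assuming their keyword part is zero."""
    return T[:n_terms].T @ F_new

def rank_images_for_word(word_index, T, D_new, n_terms):
    """Order image indices by cosine similarity to the word's position."""
    word = T[n_terms + word_index]
    norms = np.linalg.norm(word) * np.linalg.norm(D_new, axis=0) + 1e-12
    return np.argsort(-(word @ D_new) / norms)

# Hypothetical toy data: term 0 co-occurs with keyword 0, term 1 with keyword 1.
F = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])
W = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])
T, D = build_semantic_space(F, W, k=2)

F_new = np.array([[1.0, 0.0], [0.0, 1.0]])   # two unannotated images
D_new = project_images(F_new, T, n_terms=2)
ranking = rank_images_for_word(0, T, D_new, n_terms=2)
```

Because words and visual terms share one space, a query word ends up close to images containing the visual terms it co-occurred with in training.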
Technique: Correlation
• Pan et al. defined four techniques for building translation tables between visual terms and keywords [i.e. the elements of the table/matrix represent $p(w_i, f_j)$].
  • The Corr method used $W^{T}F$ to build the table
  • The Cos method used the cosine of $w_i$ and $f_j$
  • The SVDCorr and SVDCos methods filtered the tables from the Corr and Cos methods, reducing the rank using the SVD
Pan, J.-Y., Yang, H.-J., Duygulu, P., and Faloutsos, C., “Automatic image captioning,” IEEE International Conference
on Multimedia and Expo 2004 (ICME ’04), Vol. 3 (27-30 June 2004).
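The table constructions can be sketched as below. The example assumes one image per row so that the product of the transposed keyword matrix with the visual-term matrix conforms, and it applies the rank reduction to the table itself, as the bullet above describes. Function names and toy counts are hypothetical.

```python
import numpy as np

def corr_table(W, F):
    """Corr: keyword x visual-term co-occurrence table (rows of W, F are images)."""
    return W.T @ F

def cos_table(W, F, eps=1e-12):
    """Cos: cosine between each keyword column and each visual-term column."""
    w_norm = np.linalg.norm(W, axis=0)[:, None]
    f_norm = np.linalg.norm(F, axis=0)[None, :]
    return (W.T @ F) / (w_norm * f_norm + eps)

def svd_filter(table, k):
    """SVDCorr / SVDCos as described above: reduce the table's rank to k."""
    U, s, Vt = np.linalg.svd(table, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Hypothetical toy data: 3 images, 2 keywords, 3 visual-terms.
W = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
F = np.array([[2.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 3.0, 1.0]])

corr = corr_table(W, F)
cos = cos_table(W, F)
corr_rank1 = svd_filter(corr, k=1)
```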
Technique Summary
| Technique | Variables | Notes |
| --- | --- | --- |
| Transform | feature-weighting, dimensionality reduction | Words independent |
| Corr, Cos | feature-weighting | Words independent |
| SVDCorr, SVDCos | feature-weighting, dimensionality reduction | Words independent |
| Semantic Space | feature-weighting, dimensionality reduction | Inter-word dependencies |
Image Features
• Two types of visual-term feature considered:
• Segmented-blob based (using shape, colour, texture
descriptors) [500 terms]
• Quantised DCT-based [500 terms]
Experimental Protocol
• 5000 image Corel data-set
  • 4000 training images
  • 500 validation images (for optimising reduced rank)
  • 500 test images
• Two weighting types: unweighted and IDF
• Evaluation performed as a hypothetical retrieval experiment
  • Unannotated test images retrieved in response to using each word in turn as a query
  • Mean-average precision used for comparison
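The evaluation measure can be sketched as follows: each word is used as a query, the test images are ranked by their score for that word, and the average precision is averaged over words. Skipping words with no relevant test image is an assumption of this sketch, and the function names and toy scores are hypothetical.

```python
import numpy as np

def average_precision(ranked_relevance):
    """Average precision of one ranked list of 0/1 relevance labels."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    # Average the precision values at the ranks of the relevant images.
    return float((precision_at_k * rel).sum() / rel.sum())

def mean_average_precision(scores, truth):
    """scores, truth: keywords x images; truth is binary relevance."""
    aps = []
    for word_scores, word_truth in zip(scores, truth):
        if word_truth.sum() == 0:
            continue                      # assumption: skip words with no relevant image
        order = np.argsort(-word_scores)  # best-scoring images first
        aps.append(average_precision(word_truth[order]))
    return float(np.mean(aps))

# Hypothetical scores for 2 query words over 4 test images.
scores = np.array([[0.9, 0.1, 0.8, 0.2], [0.1, 0.9, 0.2, 0.3]])
truth = np.array([[1, 0, 1, 0], [1, 0, 0, 0]])
m_ap = mean_average_precision(scores, truth)
```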
Results
Real-world performance
• ~20% mAP might sound low, but in reality many
queries will work quite well (reasonable initial
precision, but drops fast)
• Choice of image features is very important
• It would be difficult to learn the concept of “sun”
from grey-level SIFT features!
• See the paper for some more reflection on real-world
performance...
Conclusions
• We have described a set of auto-annotation/semantic
retrieval algorithms
• Performance is below the state of the art, but this is
partially explained by the use of different image
features (see our MIR 2010 paper)
• However, the methods:
• Are computationally inexpensive (although this is
proportional to the amount of training data)
• Are deterministic, and don’t rely on algorithms such
as EM which might get stuck in local minima/maxima

Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlation and Semantic Spaces

  • 1. Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlation and Semantic Spaces. Jonathon Hare & Paul Lewis, School of Electronics and Computer Science, University of Southampton
  • 2. Introduction and Motivation • Introduce a new, simple linear-transform based annotation/retrieval technique • Compare against a number of similar existing techniques for automatic annotation & semantic retrieval that: • Represent images by a fixed length histogram (of visual-term occurrences) • Optionally use SVD for noise reduction • Are deterministic (no randomness) • Are (relatively) computationally efficient • Reflect on real-world performance
  • 3. Singular Value Decomposition • SVD can be used to filter noise by producing a rank-k estimate of the original data matrix • The rank-k estimate is optimal in the least-squares sense
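
The rank-k estimate described on this slide can be sketched with NumPy as follows. This is a minimal illustration, not the authors' code: the function name `rank_k_estimate` and the synthetic matrices are assumptions made for the example.

```python
import numpy as np

def rank_k_estimate(A, k):
    """Return the rank-k estimate of A obtained by truncating its SVD.

    By the Eckart-Young theorem this is the closest rank-k matrix to A
    in the least-squares (Frobenius norm) sense.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Hypothetical data: an exactly rank-2 matrix plus a little noise.
rng = np.random.default_rng(0)
low_rank = rng.random((6, 2)) @ rng.random((2, 8))
noisy = low_rank + 0.01 * rng.standard_normal((6, 8))

denoised = rank_k_estimate(noisy, k=2)
singular_values = np.linalg.svd(noisy, compute_uv=False)
residual = np.linalg.norm(noisy - denoised)  # Frobenius norm of what was discarded
```

The residual equals the root sum of squares of the discarded singular values, which is what "optimal in the least-squares sense" means in practice.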
  • 4. Nomenclature • F is a visual-term occurrence matrix (columns represent images, rows visual-terms) • W is a keyword occurrence matrix (columns represent images, rows keywords)
  • 5. Technique: linear transform Assume that visual-term occurrence vectors can be related to keyword occurrence vectors by a simple linear transformation, T: $FT = W$. T can be estimated using the pseudo-inverse (calculated using the SVD, which allows noise reduction) given a training set with known F and W; unknown W* can then be calculated from F* (from unannotated images) and T.
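
A hedged sketch of the linear-transform technique follows. It assumes images are stored as rows (the transpose of the nomenclature slide's layout), so that the product F T = W is dimensionally consistent. The function names, the choice of k, and the synthetic data are all hypothetical.

```python
import numpy as np

def truncated_pinv(F, k):
    """Pseudo-inverse of F built from its k largest singular values.

    Dropping the small singular values is the noise-reduction step.
    """
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    # F^+ = V_k diag(1/s_k) U_k^T
    return (Vt[:k, :].T / s[:k]) @ U[:, :k].T

def estimate_transform(F_train, W_train, k):
    """Least-squares estimate of T in F T = W from annotated training data."""
    return truncated_pinv(F_train, k) @ W_train

def annotate(F_new, T):
    """Predict keyword scores W* for unannotated images with features F*."""
    return F_new @ T

# Hypothetical data: 20 images, 5 visual terms, 3 keywords (images as rows).
rng = np.random.default_rng(1)
F_train = rng.random((20, 5))
T_true = rng.random((5, 3))
W_train = F_train @ T_true

T_est = estimate_transform(F_train, W_train, k=5)  # k=5 keeps full rank here
W_star = annotate(rng.random((4, 5)), T_est)        # scores for 4 new images
```

With real data k would be chosen below full rank, for example on a validation set, so that the smallest singular values are discarded.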
  • 6. Technique: Semantic Spaces • Based around the factorisation $\begin{bmatrix} F \\ W \end{bmatrix} = TD$ • Calculated using truncated SVD • Rows of T represent coordinates of the features and words in a vector space • Columns of D represent coordinates of images in the same space • Similar objects have similar locations in the space, so it is possible to rank images on their distance to a given word Hare, J. S., Lewis, P. H., Enser, P. G. B., and Sandom, C. J., “A Linear-Algebraic Technique with an Application in Semantic Image Retrieval,” in CIVR 2006, Sundaram, H., Naphade, M., Smith, J. R., and Rui, Y., eds., LNCS 4071, 31–40, Springer (2006).
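
The factorisation on this slide might be sketched as below, with columns as images as in the nomenclature slide. Several choices here are assumptions rather than details taken from the slides: putting the singular values into D, using cosine similarity as the distance, and projecting an unannotated image by zero-padding its missing keyword entries. All function names and the toy data are hypothetical.

```python
import numpy as np

def build_semantic_space(F, W, k):
    """Factorise the stacked matrix [F; W] ~= T D using a rank-k truncated SVD."""
    O = np.vstack([F, W])                # rows: visual terms, then keywords
    U, s, Vt = np.linalg.svd(O, full_matrices=False)
    T = U[:, :k]                         # rows: coordinates of terms and words
    D = s[:k, None] * Vt[:k, :]          # columns: coordinates of images
    return T, D

def cosine_to_word(T, D, word_row):
    """Cosine similarity between one word (a row of T) and each column of D."""
    w = T[word_row]
    norms = np.linalg.norm(w) * np.linalg.norm(D, axis=0)
    return (w @ D) / (norms + 1e-12)

def project_unannotated(T, f, n_keywords):
    """Assumed projection: zero-pad the unknown keyword entries, then apply T^T."""
    o = np.concatenate([f, np.zeros(n_keywords)])
    return T.T @ o

# Toy data: images 0-2 share terms 0-1 and keyword 0; images 3-5 the rest.
F = np.array([[1, 1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 1]], dtype=float)
W = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float)

T, D = build_semantic_space(F, W, k=2)
sims_word0 = cosine_to_word(T, D, word_row=F.shape[0] + 0)
ranking = np.argsort(-sims_word0)        # images ordered by closeness to keyword 0

d_new = project_unannotated(T, np.array([1.0, 1.0, 0.0, 0.0]), n_keywords=2)
new_sims = [cosine_to_word(T, d_new[:, None], F.shape[0] + j)[0] for j in range(2)]
```

In the toy data the three images annotated with keyword 0 rank above the others, and the unannotated image that contains only terms 0 and 1 lands closer to keyword 0 than to keyword 1.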
  • 7. Technique: Correlation • Pan et al. defined four techniques for building translation tables between visual terms and keywords [i.e. the elements of the table/matrix represent $p(w_i, f_j)$]. • The Corr method used $W^{T}F$ to build the table • The Cos method used the cosine of $w_i$ and $f_j$ • The SVDCorr and SVDCos methods filtered the tables from the Corr and Cos methods by reducing their rank using the SVD Pan, J.-Y., Yang, H.-J., Duygulu, P., and Faloutsos, C., “Automatic image captioning,” IEEE International Conference on Multimedia and Expo 2004 (ICME ’04). Vol. 3 (27-30 June 2004).
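
The Corr and Cos tables can be illustrated with a small sketch. It assumes images are stored as rows, so that W^T F yields a keywords-by-visual-terms table; the function names and toy counts are hypothetical, and the SVD-filtered variants are not shown.

```python
import numpy as np

def corr_table(W, F):
    """Corr: co-occurrence table W^T F (keywords x visual terms)."""
    return W.T @ F

def cos_table(W, F, eps=1e-12):
    """Cos: cosine between each keyword column w_i and visual-term column f_j."""
    Wn = W / (np.linalg.norm(W, axis=0, keepdims=True) + eps)
    Fn = F / (np.linalg.norm(F, axis=0, keepdims=True) + eps)
    return Wn.T @ Fn

# Toy data: 4 images (rows), 2 keywords, 3 visual terms.
W = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]], dtype=float)
F = np.array([[2, 0, 1],
              [1, 0, 1],
              [0, 3, 1],
              [0, 1, 1]], dtype=float)

corr = corr_table(W, F)
cos = cos_table(W, F)
```

In this toy example visual term 0 only co-occurs with keyword 0 and term 1 only with keyword 1, while term 2 appears everywhere and is therefore shared equally by both keywords in the Corr table.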
  • 8. Technique Summary
      Technique        | Variables                                   | Notes
      Transform        | feature-weighting, dimensionality reduction | Words independent
      Corr, Cos        | feature-weighting                           | Words independent
      SVDCorr, SVDCos  | feature-weighting, dimensionality reduction | Words independent
      Semantic Space   | feature-weighting, dimensionality reduction | Inter-word dependencies
  • 9. Image Features • Two types of visual-term feature considered: • Segmented-blob based (using shape, colour, texture descriptors) [500 terms] • Quantised DCT-based [500 terms]
  • 10. Experimental Protocol • 5000 image Corel data-set • 4000 training images • 500 validation images (for optimising reduced rank) • 500 test images • Two weighting types: unweighted and IDF • Evaluation performed as a hypothetical retrieval experiment • Unannotated test images retrieved in response to using each word in turn as a query • Mean-average precision used for comparison
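
The evaluation measure named on this slide, mean-average precision over single-word queries, could be computed along these lines. This is an illustrative sketch only: the function names, the decision to skip words with no relevant test images, and the toy scores are assumptions.

```python
import numpy as np

def average_precision(ranked_relevance):
    """Average precision of one ranked list of 0/1 relevance judgements."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    # Precision at each rank, averaged over the ranks holding relevant images.
    precision_at_rank = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    return float((precision_at_rank * rel).sum() / rel.sum())

def mean_average_precision(scores, relevance):
    """Mean of per-word average precision.

    scores and relevance are (n_words x n_images) arrays; each word is used
    in turn as a query and images are ranked by descending score.
    """
    aps = []
    for word_scores, word_rel in zip(scores, relevance):
        if word_rel.sum() == 0:
            continue  # assumption: words with no relevant images are skipped
        order = np.argsort(-word_scores)
        aps.append(average_precision(word_rel[order]))
    return float(np.mean(aps))

# Toy example: 2 query words, 4 test images.
scores = np.array([[0.9, 0.1, 0.8, 0.2],
                   [0.1, 0.9, 0.2, 0.3]])
relevance = np.array([[1, 0, 1, 0],
                      [0, 0, 0, 1]])
m_ap = mean_average_precision(scores, relevance)
```

Here the first word ranks both of its relevant images at the top (average precision 1.0) and the second ranks its only relevant image second (0.5), so the mean is 0.75.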
  • 12. Real-world performance • ~20% mAP might sound low, but in reality many queries will work quite well (reasonable initial precision, though it drops off quickly) • Choice of image features is very important • It would be difficult to learn the concept of “sun” from grey-level SIFT features! • See the paper for some more reflection on real-world performance...
  • 13. Conclusions • We have described a set of auto-annotation/semantic retrieval algorithms • Performance is below the state of the art, but this is partially explained by the use of different image features (see our MIR 2010 paper) • However, the methods: • Are computationally inexpensive (although the cost is proportional to the amount of training data) • Are deterministic, and don’t rely on algorithms such as EM, which might get stuck in local minima/maxima