MMRetrieval.net
A Multimodal Search Engine
Multimodal Information
 Single-language, text-only retrieval has reached its limits.
 Content-based Image Retrieval is computationally
costly and still in its infancy.
 Digital Information is increasingly becoming
multimodal
 Example: Wikipedia
Modality
 Dictionary: A tendency to conform to a general
pattern or belong to a particular group or
category.
 Definition of Modality in Information Retrieval
 It is unclear, fuzzy
 1st Definition: Modality = Media
 2nd Definition: Modality = Data Stream
MMRetrieval.net
 A Product of Cooperation
 Started June, 2010
 Avi Arampatzis, Lecturer D.U.T.H.
 Konstantinos Zagoris, Ph.D., D.U.T.H.
 Savvas A. Chatzichristofis, Ph.D. candidate, D.U.T.H.
ImageCLEF 2010
Wikipedia Retrieval Task
 ImageCLEF 2010 Wikipedia Collection
 Consisting of 237,434 items
 Images are the primary media
 Noisy and Incomplete User Supplied Textual
Annotations
 Wikipedia Articles Containing the Images
 Written in any combination of English, German,
French, or any other unidentified language
Wikipedia Collection
<image id="244845" file="images/25/244845.jpg">
<name>Balloons Festival - Chateaux d'Oex.jpg</name>
<text xml:lang="en">
<description/>
<comment/>
<caption article="text/en/4/331622">Balloon
festival </caption>
</text>
<text xml:lang="de">
<description/>
<comment/>
<caption/>
</text>
<text xml:lang="fr">
<description/>
<comment/>
<caption/>
</text>
<comment>(Balloon festival in Chateaux d'Oex.
Category:Chateau d'Oex Category:Hot air balloons)
</comment>
<license>GFDL</license>
</image>
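As a rough illustration of how the textual fields of such a record could be pulled out per language, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `text_fields` and the `tag[lang]` key format are assumptions made for this example, not part of the collection or of MMRetrieval.net; the sample record is abbreviated to the English and German blocks.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Abbreviated copy of the record shown above (French block omitted).
SAMPLE = """<image id="244845" file="images/25/244845.jpg">
<name>Balloons Festival - Chateaux d'Oex.jpg</name>
<text xml:lang="en">
<description/>
<comment/>
<caption article="text/en/4/331622">Balloon
festival </caption>
</text>
<text xml:lang="de">
<description/>
<comment/>
<caption/>
</text>
<comment>(Balloon festival in Chateaux d'Oex.
Category:Chateau d'Oex Category:Hot air balloons)
</comment>
<license>GFDL</license>
</image>"""


def text_fields(image_xml):
    """Return the non-empty textual fields of one image record (hypothetical helper)."""
    root = ET.fromstring(image_xml)
    fields = {
        "id": root.get("id"),
        "name": (root.findtext("name") or "").strip(),
        # Top-level comment: findtext only looks at direct children of <image>.
        "comment": " ".join((root.findtext("comment") or "").split()),
    }
    for text in root.findall("text"):
        lang = text.get(XML_LANG)
        for tag in ("description", "comment", "caption"):
            value = " ".join((text.findtext(tag) or "").split())
            if value:  # skip empty annotations, which are common in this collection
                fields[f"{tag}[{lang}]"] = value
    return fields


fields = text_fields(SAMPLE)
```

Empty annotations are skipped, so the German block contributes nothing here, which reflects how sparse the user-supplied text can be.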
ImageCLEF 2010
Wikipedia Retrieval Task
 70 test topics
 consisting of a textual and a visual part
 three title fields (one per language—English,
German, French)
 one or more example images
Wikipedia Topic
<topic>
<number>8</number>
<title xml:lang="en">tennis player on court</title>
<title xml:lang="de">tennisspieler auf dem platz</title>
<title xml:lang="fr">joueur de tennis sur le terrain</title>
<image>2197587684_94542c6fbd.jpg</image>
<image>777629689_443a25ba08.jpg</image>
</topic>
Extraction of Modalities
 Image descriptors: Joint Composite Descriptor (JCD),
Spatial Color Distribution (SpCD)
 Text fields: description, comment, caption, article, name
 Languages: English, French, German
 Lemur Toolkit V4.11 and Indri V2.11 with
the tf.idf retrieval model
MMRetrieval.net Structure
Fusion in Information Retrieval
 combining evidence about relevance from
different sources of information
 from several modalities
 fusion consists of two components
 score normalization
 score combination
Score Normalization
 the relevance scores are not comparable
 popular text retrieval models (tf.idf) can be turned to
probabilities of relevance via the score-distributional
method
 image descriptor scores do not fit this method
 MinMax (maps scores linearly to [0,1])
 Zscore (maps to the number of standard deviations it
lies above or below the mean score)
 non-linear Known-Item Aggregate Cumulative Density
Function (KIACDF)
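The two linear normalizations listed above can be sketched as follows. This is a minimal illustration, not the MMRetrieval.net implementation: the function names are made up for the example, the Zscore variant assumes the population standard deviation, and the handling of constant score lists is an arbitrary choice. KIACDF is not shown because the slides do not describe it in enough detail.

```python
from statistics import mean, pstdev


def min_max(scores):
    """Map scores linearly onto [0, 1]; a constant list maps to all zeros (assumption)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores):
    """Express each score as the number of standard deviations from the mean."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (assumption)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]
```

For example, `min_max([2.0, 4.0, 6.0])` gives `[0.0, 0.5, 1.0]`, and `z_score([1.0, 3.0])` gives `[-1.0, 1.0]`.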
Score Combination
 CompSUM
 CompMULT
 CompMAX
 CompMED
 CompWSUM
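A minimal sketch of these combination rules, assuming the names stand for the sum, product, maximum, median, and weighted sum of an item's normalized per-modality scores. The function `combine` and its `method` keywords are hypothetical names chosen for this example.

```python
from math import prod
from statistics import median


def combine(scores, method, weights=None):
    """Combine one item's normalized scores, one per modality (hypothetical helper)."""
    if method == "sum":
        return sum(scores)
    if method == "mult":
        return prod(scores)
    if method == "max":
        return max(scores)
    if method == "med":
        return median(scores)
    if method == "wsum":
        if weights is None or len(weights) != len(scores):
            raise ValueError("wsum needs exactly one weight per score")
        return sum(w * s for w, s in zip(weights, scores))
    raise ValueError(f"unknown method: {method}")
```

Because the product and the weighted sum are sensitive to scale, these rules only make sense after the scores have been normalized to a common range.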
Results
Rank Participant MAP
1 xrce 0.2765
2 unt 0.2251
3 telecom 0.2227
4 i2rcviu 0.2126
5 dcu 0.2039
6 cheshire 0.2014
7 duth 0.1998
8 uned 0.1927
9 daedalus 0.1820
10 sztaki 0.1794
11 nus 0.1581
12 rgu 0.0617
13 uaic 0.0423
Rank Participant P@10
1 xrce 0.6114
2 duth 0.5200
3 i2rcviu 0.4971
4 cheshire 0.4929
5 telecom 0.4914
6 sztaki 0.4857
7 daedalus 0.4471
8 unt 0.4314
9 dcu 0.4271
10 uned 0.4200
11 nus 0.3529
12 rgu 0.2271
13 uaic 0.1543
Rank Participant P@20
1 xrce 0.5407
2 duth 0.4836
3 telecom 0.4407
4 cheshire 0.4364
5 sztaki 0.4329
6 i2rcviu 0.4321
7 daedalus 0.4029
8 unt 0.3986
9 dcu 0.3907
10 uned 0.3671
11 nus 0.3264
12 uaic 0.1529
13 rgu 0.1514
Corrected Results
Rank Participant MAP
1 xrce 0.2765
2 duth 0.2561
3 unt 0.2251
4 telecom 0.2227
5 i2rcviu 0.2126
6 dcu 0.2039
7 cheshire 0.2014
8 uned 0.1927
9 daedalus 0.1820
10 sztaki 0.1794
11 nus 0.1581
12 rgu 0.0617
13 uaic 0.0423
Rank Participant P@10
1 xrce 0.6114
2 duth 0.5257
3 i2rcviu 0.4971
4 cheshire 0.4929
5 telecom 0.4914
6 sztaki 0.4857
7 daedalus 0.4471
8 unt 0.4314
9 dcu 0.4271
10 uned 0.4200
11 nus 0.3529
12 rgu 0.2271
13 uaic 0.1543
Rank Participant P@20
1 xrce 0.5407
2 duth 0.4900
3 telecom 0.4407
4 cheshire 0.4364
5 sztaki 0.4329
6 i2rcviu 0.4321
7 daedalus 0.4029
8 unt 0.3986
9 dcu 0.3907
10 uned 0.3671
11 nus 0.3264
12 uaic 0.1529
13 rgu 0.1514
Fusion Problems
 appropriate weighting of modalities and score
normalization/combination are not trivial
problems
 if results are assessed by visual similarity only,
fusion is not a theoretically sound method
Content-based Image Retrieval
Problems
 Content-based Image Retrieval (CBIR) with global
features is notoriously noisy for image queries of
low generality, where generality is the fraction of
relevant images in a collection.
 does not scale up well to large databases
efficiency-wise
Two – Stage Image Retrieval
 how it works: first use the secondary modality to rank the
collection then perform CBIR only on the top-K items
 assumption: primary (image) – secondary (text) modalities
 hypothesis: CBIR can do better than text retrieval in small
sets or sets of high query generality
 efficiency benefit: using a ‘cheaper’ secondary modality
also improves efficiency by cutting down on costly CBIR
operations
 possible drawback: relevant images with empty or very
noisy secondary modalities would be completely missed
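The two-stage idea above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: `two_stage`, `text_scores`, and `image_score_fn` are hypothetical names, higher scores are assumed to mean more relevant, and appending the items below the cut-off in their text order is a choice made for this example.

```python
def two_stage(text_scores, image_score_fn, k):
    """Rank by text first, then re-rank only the top-k items by visual score.

    text_scores: dict mapping item id -> secondary-modality (text) score.
    image_score_fn: callable mapping item id -> CBIR score; assumed to be costly.
    k: number of top text results passed on to the CBIR stage.
    """
    # Stage 1: rank the whole collection with the cheap secondary modality.
    ranked_by_text = sorted(text_scores, key=text_scores.get, reverse=True)
    top_k, rest = ranked_by_text[:k], ranked_by_text[k:]
    # Stage 2: the costly CBIR scoring touches only the top-k items.
    reranked = sorted(top_k, key=image_score_fn, reverse=True)
    # Items below the cut-off keep their text order (assumption).
    return reranked + rest
```

An item whose text score places it outside the top K is never scored visually, which is exactly the drawback noted in the last bullet.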
Previous Work
 Re-ranking the top results by visual content has
been tried before
 mostly in different setups
 All these approaches employed a static predefined
K for all queries
 not clear if it works
Our Two-Stage Method
 dynamic K
 calculated dynamically per query
 optimize a predefined effectiveness measure
 without using external information or training
data
Retrieval Results
[Figure: result lists for the query "cockpit of an airplane"; panels: Image Only, Text Only, Static K=25, Dynamic K]
Best Fusion Method – Max of Sums
 i the index running over example images (i=1,2,…)
 j running over the visual descriptors (𝑗∈{1,2})
 DESCji is the score against the ith example image
for the jth descriptor
 parameter w controls the relative contribution of
the two media
$$ s = (1 - w)\,\max_i \sum_j \mathrm{MinMax}(\mathrm{DESC}_{ji}) + w\,\mathrm{MinMax}(\mathrm{tf.idf}) $$
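A small sketch of how the max-of-sums score could be computed for a list of items. It assumes that MinMax is applied separately to each (descriptor, example image) score list and to the text scores, and that higher descriptor scores mean greater similarity; the function and argument names are hypothetical.

```python
def min_max(scores):
    """Map scores linearly onto [0, 1]; a constant list maps to all zeros (assumption)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def max_of_sums(desc_scores, text_scores, w):
    """Fuse visual and text scores for every item.

    desc_scores[j][i][n]: raw score of item n against example image i
                          under visual descriptor j.
    text_scores[n]:       raw tf.idf score of item n.
    w:                    weight of the text modality, between 0 and 1.
    """
    norm_desc = [[min_max(per_image) for per_image in per_desc]
                 for per_desc in desc_scores]
    norm_text = min_max(text_scores)
    n_images = len(desc_scores[0])
    fused = []
    for n in range(len(text_scores)):
        # Sum over descriptors j for each example image i, then take the max over i.
        visual = max(sum(per_desc[i][n] for per_desc in norm_desc)
                     for i in range(n_images))
        fused.append((1 - w) * visual + w * norm_text[n])
    return fused
```

With two descriptors the visual term ranges over [0, 2] while the text term stays in [0, 1], so w does not split the two media evenly at 0.5; this follows the formula as written.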
Fusion vs Two-Stage
Implementation
• developed in C# on the .NET
Framework 4.0
• HTML, CSS and JavaScript (AJAX)
technologies for the interface
• requires a fairly modern browser
Directions for Further Research
 Multi-stage retrieval for multimodal databases
based on modality hierarchy.
 Fuzzy Fusion (replace w with membership
function m).
 Create artificial modalities (not only from
relevance scores)
 pseudo relevance feedback – cross media
feedback
Publications
 Multimedia Search with Noisy Modalities: Fusion and
Multistage Retrieval. Avi Arampatzis, Savvas A.
Chatzichristofis, and Konstantinos Zagoris. In: CLEF
(Notebook Papers/LABs/Workshops), 22-23
September, Padua, Italy, 2010.
 www.MMRetrieval.net: A Multimodal Search Engine.
Konstantinos Zagoris, Avi Arampatzis, and Savvas A.
Chatzichristofis. In: Proceedings of the 3rd
International Conference on SImilarity Search and
APplications, SISAP 2010, Istanbul, Turkey, September
18-19, 2010. © Association for Computing Machinery
(ACM).
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 

MultiModal Retrieval Image

  • 2. Multimodal Information
      - Single-language, text-only retrieval has reached a limit.
      - Content-based image retrieval is computationally costly and still in its infancy.
      - Digital information is increasingly becoming multimodal.
      - Example: Wikipedia
  • 3. Modality
      - Dictionary: A tendency to conform to a general pattern or belong to a particular group or category.
      - Definition of modality in Information Retrieval
      - It is unclear, fuzzy
      - 1st definition: Modality = Media
      - 2nd definition: Modality = Data Stream
  • 4. MMRetrieval.net
      - A product of cooperation
      - Started June 2010
      - Avi Arampatzis, Lecturer, D.U.T.H.
      - Konstantinos Zagoris, Ph.D., D.U.T.H.
      - Savvas A. Chatzichristofis, Ph.D. candidate, D.U.T.H.
  • 5. ImageCLEF 2010 Wikipedia Retrieval Task
      - ImageCLEF 2010 Wikipedia Collection
      - Consists of 237,434 items
      - Images are the primary media
      - Noisy and incomplete user-supplied textual annotations
      - Wikipedia articles containing the images
      - Written in any combination of English, German, French, or any other unidentified language
  • 6. Wikipedia Collection
      <image id="244845" file="images/25/244845.jpg">
        <name>Balloons Festival - Chateaux d'Oex.jpg</name>
        <text xml:lang="en">
          <description/>
          <comment/>
          <caption article="text/en/4/331622">Balloon festival </caption>
        </text>
        <text xml:lang="de">
          <description/>
          <comment/>
          <caption/>
        </text>
        <text xml:lang="fr">
          <description/>
          <comment/>
          <caption/>
        </text>
        <comment>(Balloon festival in Chateaux d'Oex. Category:Chateau d'Oex Category:Hot air balloons)</comment>
        <license>GFDL</license>
      </image>
  • 7. ImageCLEF 2010 Wikipedia Retrieval Task
      - 70 test topics
      - Each consisting of a textual and a visual part
      - Three title fields (one per language: English, German, French)
      - One or more example images
  • 8. Wikipedia Topic
      <topic>
        <number>8</number>
        <title xml:lang="en">tennis player on court</title>
        <title xml:lang="de">tennisspieler auf dem platz</title>
        <title xml:lang="fr">joueur de tennis sur le terrain</title>
        <image>2197587684_94542c6fbd.jpg</image>
        <image>777629689_443a25ba08.jpg</image>
      </topic>
  • 9. Extraction of Modalities
      - Visual descriptors: Joint Composite Descriptor (JCD), Spatial Color Distribution (SpCD)
      - Text fields: description, comment, caption, article, name
      - Languages: English, French, German
      - Text indexing: Lemur Toolkit V4.11 and Indri V2.11 with the tf.idf retrieval model
  • 11. Fusion in Information Retrieval
      - Combining evidence about relevance from different sources of information
      - From several modalities
      - Fusion consists of two components:
      - Score normalization
      - Score combination
  • 12. Score Normalization
      - The relevance scores of different modalities are not comparable.
      - Scores of popular text retrieval models (tf.idf) can be turned into probabilities of relevance via the score-distributional method.
      - That method does not fit image descriptor scores.
      - MinMax (maps scores linearly to [0, 1])
      - Zscore (maps each score to the number of standard deviations it lies above or below the mean score)
      - Non-linear Known-Item Aggregate Cumulative Density Function (KIACDF)
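The two linear normalizations named on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `min_max` and `z_score` are hypothetical, the use of the population standard deviation is an assumption, and KIACDF is left out because the slide does not define it.

```python
from statistics import mean, pstdev


def min_max(scores):
    """Map scores linearly to [0, 1] (MinMax)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: a constant score list carries no ranking signal.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores):
    """Express each score in standard deviations from the mean (Zscore)."""
    mu = mean(scores)
    sigma = pstdev(scores)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


if __name__ == "__main__":
    raw = [2.0, 4.0, 6.0]
    print(min_max(raw))
    print(z_score(raw))
```

Either function would be applied per query and per modality, so that the resulting lists can be combined across modalities.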
  • 13. Score Combination
      - CompSUM
      - CompMULT
      - CompMAX
      - CompMED
      - CompWSUM
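The slide lists only the names of the combination methods. Reading them in the usual way (sum, product, maximum, median, and weighted sum of an item's normalized per-modality scores) is an assumption, and the function names below are hypothetical. Under that reading, a minimal sketch could be:

```python
from math import prod
from statistics import median


def comp_sum(scores):
    """CompSUM (assumed): sum of the normalized scores."""
    return sum(scores)


def comp_mult(scores):
    """CompMULT (assumed): product of the normalized scores."""
    return prod(scores)


def comp_max(scores):
    """CompMAX (assumed): highest normalized score."""
    return max(scores)


def comp_med(scores):
    """CompMED (assumed): median normalized score."""
    return median(scores)


def comp_wsum(scores, weights):
    """CompWSUM (assumed): weighted sum, one weight per modality."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


if __name__ == "__main__":
    item_scores = [0.2, 0.5, 0.8]  # one normalized score per modality
    print(comp_sum(item_scores), comp_max(item_scores), comp_med(item_scores))
    print(comp_wsum(item_scores, [0.5, 0.25, 0.25]))
```

Each function takes the scores one item received from the different modalities and returns a single fused score used for the final ranking.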
  • 14. Results
      Rank  MAP                 P@10                P@20
      1     xrce      0.2765    xrce      0.6114    xrce      0.5407
      2     unt       0.2251    duth      0.5200    duth      0.4836
      3     telecom   0.2227    i2rcviu   0.4971    telecom   0.4407
      4     i2rcviu   0.2126    cheshire  0.4929    cheshire  0.4364
      5     dcu       0.2039    telecom   0.4914    sztaki    0.4329
      6     cheshire  0.2014    sztaki    0.4857    i2rcviu   0.4321
      7     duth      0.1998    daedalus  0.4471    daedalus  0.4029
      8     uned      0.1927    unt       0.4314    unt       0.3986
      9     daedalus  0.1820    dcu       0.4271    dcu       0.3907
      10    sztaki    0.1794    uned      0.4200    uned      0.3671
      11    nus       0.1581    nus       0.3529    nus       0.3264
      12    rgu       0.0617    rgu       0.2271    uaic      0.1529
      13    uaic      0.0423    uaic      0.1543    rgu       0.1514
  • 15. Corrected Results
      Rank  MAP                 P@10                P@20
      1     xrce      0.2765    xrce      0.6114    xrce      0.5407
      2     duth      0.2561    duth      0.5257    duth      0.4900
      3     unt       0.2251    i2rcviu   0.4971    telecom   0.4407
      4     telecom   0.2227    cheshire  0.4929    cheshire  0.4364
      5     i2rcviu   0.2126    telecom   0.4914    sztaki    0.4329
      6     dcu       0.2039    sztaki    0.4857    i2rcviu   0.4321
      7     cheshire  0.2014    daedalus  0.4471    daedalus  0.4029
      8     uned      0.1927    unt       0.4314    unt       0.3986
      9     daedalus  0.1820    dcu       0.4271    dcu       0.3907
      10    sztaki    0.1794    uned      0.4200    uned      0.3671
      11    nus       0.1581    nus       0.3529    nus       0.3264
      12    rgu       0.0617    rgu       0.2271    uaic      0.1529
      13    uaic      0.0423    uaic      0.1543    rgu       0.1514
  • 16. Fusion Problems
      - Appropriate weighting of modalities and score normalization/combination are not trivial problems.
      - If results are assessed by visual similarity only, fusion is not a theoretically sound method.
  • 17. Content-based Image Retrieval Problems
      - Content-based Image Retrieval (CBIR) with global features is notoriously noisy for image queries of low generality, where generality is the fraction of relevant images in a collection.
      - CBIR does not scale up well to large databases in terms of efficiency.
  • 18. Two-Stage Image Retrieval
      - How it works: first use the secondary modality to rank the collection, then perform CBIR only on the top-K items.
      - Assumption: a primary (image) and a secondary (text) modality.
      - Hypothesis: CBIR can do better than text retrieval in small sets or sets of high query generality.
      - Efficiency benefit: using a 'cheaper' secondary modality also improves efficiency by cutting down on costly CBIR operations.
      - Possible drawback: relevant images with empty or very noisy secondary modalities would be completely missed.
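The two stages described on this slide can be sketched as below. This is only an illustration under stated assumptions: `two_stage_retrieval`, `text_scores` and `visual_score` are hypothetical names, K is passed in as a fixed value, and only the re-ranked top-K items are returned.

```python
def two_stage_retrieval(text_scores, visual_score, k):
    """Rank by the secondary (text) modality, then re-rank the top-K by CBIR.

    text_scores  -- dict mapping item id to its text retrieval score
    visual_score -- callable returning the (costly) CBIR score of an item id
    k            -- number of top text results passed to the second stage
    """
    # Stage 1: rank the whole collection with the cheap secondary modality.
    by_text = sorted(text_scores, key=text_scores.get, reverse=True)
    top_k = by_text[:k]
    # Stage 2: compute CBIR scores for the top-K items only and re-rank them.
    return sorted(top_k, key=visual_score, reverse=True)


if __name__ == "__main__":
    text = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.7}
    visual = {"a": 0.2, "b": 0.6, "c": 0.99, "d": 0.4}
    print(two_stage_retrieval(text, visual.get, k=3))
```

In this toy data, item "c" has the best visual score but a poor text score, so it never reaches the second stage, which is the drawback the slide mentions.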
  • 19. Previous Work
      - Re-ranking the best results by visual content has been seen before,
      - mostly in different setups.
      - All these approaches employed a static, predefined K for all queries.
      - It is not clear whether this works.
  • 20. Our Two-Stage Method
      - Dynamic K
      - Calculated dynamically per query
      - Optimizes a predefined effectiveness measure
      - Without using external information or training data
  • 21. Retrieval Results
      [Figure: retrieval results for the query "cockpit of an airplane"; panels: Image Only, Text Only, Static K=25, Dynamic K]
  • 22. Best Fusion Method: Max of Sums
      - i is the index running over the example images (i = 1, 2, …)
      - j runs over the visual descriptors (j ∈ {1, 2})
      - DESC_ji is the score against the i-th example image for the j-th descriptor
      - The parameter w controls the relative contribution of the two media
      $$ s = (1 - w)\,\max_i \sum_j \mathrm{MinMax}(\mathrm{DESC}_{ji}) + w\,\mathrm{MinMax}(\mathrm{tf.idf}) $$
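The fusion score on this slide can be computed for a single item as in the sketch below. The function name `max_of_sums` and its argument layout are hypothetical, and all inputs are assumed to be MinMax-normalized already.

```python
def max_of_sums(desc_scores, text_score, w):
    """Fuse visual and text evidence for one item (Max of Sums).

    desc_scores -- desc_scores[i][j] is the normalized score against the
                   i-th example image for the j-th visual descriptor
    text_score  -- normalized tf.idf score of the item
    w           -- weight of the text modality, between 0 and 1
    """
    # Sum over descriptors j for each example image i, then take the max over i.
    best_visual = max(sum(per_descriptor) for per_descriptor in desc_scores)
    return (1 - w) * best_visual + w * text_score


if __name__ == "__main__":
    # Two example images (i) and two descriptors (j), e.g. JCD and SpCD.
    desc = [[0.2, 0.4], [0.5, 0.3]]
    print(max_of_sums(desc, text_score=0.4, w=0.25))
```

Setting w = 0 ranks by visual evidence alone and w = 1 by text alone, which matches the slide's statement that w controls the relative contribution of the two media.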
  • 24. Implementation
      - Developed in the C#/.NET Framework 4.0
      - HTML, CSS and JavaScript (AJAX) technologies for the interface
      - Requires a fairly modern browser
  • 25. Directions for Further Research
      - Multi-stage retrieval for multimodal databases based on modality hierarchy
      - Fuzzy fusion (replace w with a membership function m)
      - Create artificial modalities (not only from relevance scores)
      - Pseudo-relevance feedback, cross-media feedback
  • 26. Publications
      - Multimedia Search with Noisy Modalities: Fusion and Multistage Retrieval. Avi Arampatzis, Savvas A. Chatzichristofis, and Konstantinos Zagoris. In: CLEF (Notebook Papers/LABs/Workshops), 22-23 September, Padua, Italy, 2010.
      - www.MMRetrieval.net: A Multimodal Search Engine. Konstantinos Zagoris, Avi Arampatzis, and Savvas A. Chatzichristofis. In: Proceedings of the 3rd International Conference on SImilarity Search and APplications, SISAP 2010, Istanbul, Turkey, September 18-19, 2010. © Association for Computing Machinery (ACM).