CERTH/CEA LIST at MediaEval Placing Task 2015
Giorgos Kordopatis-Zilos (1), Adrian Popescu (2), Symeon Papadopoulos (1) and
Yiannis Kompatsiaris (1)
(1) Information Technologies Institute (ITI), CERTH, Greece
(2) CEA LIST, 91190 Gif-sur-Yvette, France
MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany
Summary
Tag-based location estimation (2 runs)
• Based on a geographic Language Model
• Built upon the scheme of our 2014 participation [2] (Kordopatis-Zilos et
al., MediaEval 2014)
• Extensions from [3]: improved feature selection and weighting
(Kordopatis-Zilos et al., PAISI 2015)
Visual-based location estimation (1 run)
• Geospatial clustering scheme of the most visually similar images
Hybrid location estimation (2 runs)
• Combination of the textual and visual approaches
Training sets
• Training set released by the organisers (≈4.7M geotagged items)
• YFCC dataset, excl. images from users in test set (≈40M geotagged items)
Tag-based location estimation
• Processing steps of the approach
– Offline: language model construction
– Online: location estimation
Language Model (LM)
• LM generation scheme
– divide the earth's surface into rectangular cells with a side length of 0.01°
– calculate tag-cell probabilities based on the users that used the tag inside the cell
• LM-based estimation
– the probability of each cell is the sum of the respective tag-cell probabilities
– the Most Likely Cell (MLC) is the cell with the highest probability; it is used
to produce the estimate
Inspired by [4]: (Popescu, MediaEval 2013)
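The two LM steps above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the tag-cell probability is assumed to be the number of distinct users who used the tag in the cell divided by the tag's total user count over all cells (the slide only says the probability is based on those users).

```python
import math
from collections import defaultdict

CELL_SIDE = 0.01  # grid cell side length in degrees, as stated on the slide


def cell_of(lat, lon, side=CELL_SIDE):
    """Map a coordinate to the integer index pair of its grid cell."""
    return (math.floor(lat / side), math.floor(lon / side))


def build_language_model(items, side=CELL_SIDE):
    """Offline step. items: iterable of (user_id, lat, lon, tags).

    Returns {tag: {cell: probability}}. Assumption: the probability is the
    share of the tag's distinct users that fall inside the cell.
    """
    users = defaultdict(lambda: defaultdict(set))
    for user, lat, lon, tags in items:
        cell = cell_of(lat, lon, side)
        for tag in set(tags):
            users[tag][cell].add(user)
    model = {}
    for tag, cells in users.items():
        total = sum(len(cell_users) for cell_users in cells.values())
        model[tag] = {cell: len(cell_users) / total
                      for cell, cell_users in cells.items()}
    return model


def most_likely_cell(model, tags):
    """Online step: sum tag-cell probabilities per cell and pick the maximum."""
    scores = defaultdict(float)
    for tag in tags:
        for cell, prob in model.get(tag, {}).items():
            scores[cell] += prob
    if not scores:
        return None, {}  # none of the query tags is in the model
    mlc = max(scores, key=scores.get)
    return mlc, dict(scores)
```

Calling `most_likely_cell(model, ["london", "bridge"])` returns the winning cell together with all cell scores, which a later confidence computation can reuse.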
Feature Selection and Weighting
Feature Selection
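• The final tag set $T$ is the intersection of the accuracy-based tag set $T_a$ and the locality-based tag set $T_l$
$$T = T_a \cap T_l$$
Feature Weighting
• Locality weight function: sort the tags in $T$ by their locality score; the tag at position $j$ receives
$$w_l = \frac{|T| - (j - 1)}{|T|}$$
• Normalize the weights from the Spatial Entropy (SE) function, where $N$ is a Gaussian function of the tag entropy $e(t)$ with parameters $\mu$ and $\sigma$
$$w_{se} = \frac{N(e(t), \mu, \sigma)}{\max_{t \in T} N(e(t), \mu, \sigma)}$$
• Combine the two weighting functions
$$w = \omega \cdot w_{se} + (1 - \omega) \cdot w_l$$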
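The weighting scheme above can be sketched as follows. The function names are hypothetical, and several points are assumptions rather than details from the slides: tags are ranked in descending order of locality (so the most local tag gets weight 1), and the values of μ, σ and ω are supplied by the caller because the slides do not state them.

```python
import math


def locality_weights(locality):
    """locality: {tag: locality score}. Rank j starts at 1 for the top tag.

    Assumption: descending sort, so w_l = (|T| - (j - 1)) / |T| is 1 for
    the tag with the highest locality score.
    """
    ranked = sorted(locality, key=locality.get, reverse=True)
    size = len(ranked)
    return {tag: (size - (j - 1)) / size
            for j, tag in enumerate(ranked, start=1)}


def gaussian(x, mu, sigma):
    """Gaussian density N(x, mu, sigma)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def entropy_weights(entropy, mu, sigma):
    """entropy: {tag: spatial entropy e(t)}. Divides by the maximum density."""
    density = {tag: gaussian(e, mu, sigma) for tag, e in entropy.items()}
    peak = max(density.values())
    return {tag: d / peak for tag, d in density.items()}


def combined_weights(locality, entropy, mu, sigma, omega):
    """w = omega * w_se + (1 - omega) * w_l for every tag in the tag set."""
    w_l = locality_weights(locality)
    w_se = entropy_weights(entropy, mu, sigma)
    return {tag: omega * w_se[tag] + (1.0 - omega) * w_l[tag] for tag in w_l}
```

Both component weights lie in (0, 1], so the combined weight stays in that range for any ω between 0 and 1.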
Accuracy
• Partition the training set into p folds (p = 10)
• Withhold one partition at a time and build the LM from the remaining p − 1
• Estimate the location of every item in the withheld partition
• Accuracy score of every tag
$$\mathrm{tgeo}(t) = \frac{N_r}{N_t}$$
$N_r$: correctly geotagged items tagged with $t$
$N_t$: total items tagged with $t$
• Tags with a non-zero accuracy score form the tag set $T_a$
From [3]: Kordopatis-Zilos et al., PAISI 2015
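A minimal sketch of the cross-validated accuracy score follows. All names are hypothetical. The model builder, the estimator and the test for a "correctly geotagged" item are passed in as callables, because the slide does not state the distance criterion for correctness.

```python
from collections import Counter


def tag_accuracy(items, build_model, estimate, is_correct, p=10):
    """Return {tag: N_r / N_t} from p-fold cross-validation.

    items: list of dicts with at least a "tags" entry.
    build_model(training_items) -> model
    estimate(model, item) -> predicted location, or None
    is_correct(predicted, item) -> bool (criterion is an assumption)
    """
    n_r, n_t = Counter(), Counter()
    for fold in range(p):
        held_out = items[fold::p]  # every p-th item forms one fold
        training = [it for i, it in enumerate(items) if i % p != fold]
        model = build_model(training)
        for item in held_out:
            predicted = estimate(model, item)
            hit = predicted is not None and is_correct(predicted, item)
            for tag in set(item["tags"]):
                n_t[tag] += 1
                if hit:
                    n_r[tag] += 1
    return {tag: n_r[tag] / n_t[tag] for tag in n_t}


def accurate_tags(scores):
    """Tags with a non-zero accuracy score (the set T_a)."""
    return {tag for tag, score in scores.items() if score > 0}
```

Interleaved folds (`items[fold::p]`) are used here only for brevity; a shuffled split would serve equally well.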
Locality
• Captures the spatial awareness of tags
• When a user uses a tag, he/she is assigned to the respective location cell
• Each cell has a set of users assigned to it
• All users assigned to the same cell are considered neighbours
• Locality score of every tag
$$\mathrm{loc}(t) = N_t \cdot \frac{\sum_{c \in C} \sum_{u \in U_{t,c}} |\{u' \mid u' \in U_{t,c},\ u' \neq u\}|}{N_t^2}$$
$N_t$: total occurrences of $t$
$C$: set of all cells
$U_{t,c}$: set of users that used tag $t$ inside cell $c$
• Tags with a non-zero locality score form the tag set $T_l$
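The locality score can be computed directly from (user, tag, cell) occurrences. In this sketch (hypothetical names, illustrative only) the inner count of neighbours for a user is |U_{t,c}| − 1, so the double sum for one cell reduces to |U_{t,c}| · (|U_{t,c}| − 1).

```python
from collections import defaultdict


def locality_scores(assignments):
    """assignments: iterable of (user, tag, cell), one per tag occurrence.

    Returns {tag: loc(t)} with loc(t) = N_t * sum_c sum_u |neighbours| / N_t^2.
    """
    users = defaultdict(lambda: defaultdict(set))  # tag -> cell -> users
    occurrences = defaultdict(int)                 # tag -> N_t
    for user, tag, cell in assignments:
        users[tag][cell].add(user)
        occurrences[tag] += 1
    scores = {}
    for tag, cells in users.items():
        n_t = occurrences[tag]
        # Each user in a cell has (|U| - 1) neighbours who used the same tag.
        neighbours = sum(len(u) * (len(u) - 1) for u in cells.values())
        scores[tag] = n_t * neighbours / n_t ** 2
    return scores


def local_tags(scores):
    """Tags with a non-zero locality score (the set T_l)."""
    return {tag for tag, score in scores.items() if score > 0}
```

A tag used by a single user, or by users who never share a cell, scores zero and is dropped from the tag set.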
Locality – value distribution
• High locality scores, e.g. london (6975), paris (5452), nyc (3917)
• Low locality scores, e.g. luminancehdr (0.0035), dsc6362 (0.003), air photo (0.002)
Extensions
• Spatial Entropy (SE) function
– calculate entropy values by applying the Shannon entropy formula to the tag-cell
probabilities
– build a Gaussian weight function based on the values of the tag SE
• Internal Grid
– build an additional LM using a finer grid with a cell side length of 0.001°
– combine the MLCs of the individual language models
• Similarity search [6] (Van Laere et al., ICMR 2011)
– determine the 𝑘 most similar training images in the MLC
– their center-of-gravity is the final location estimate
From [2]: (Kordopatis-Zilos et al., MediaEval 2014)
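For the Spatial Entropy step listed above, a minimal sketch of the per-tag entropy is given here. The function name is hypothetical, and two points are assumptions: the natural logarithm is used, and the tag's probabilities over cells are expected to sum to 1.

```python
import math


def spatial_entropy(cell_probs):
    """Shannon entropy of one tag's distribution over cells.

    cell_probs: {cell: probability of the tag in that cell}.
    Cells with zero probability contribute nothing to the sum.
    """
    return -sum(p * math.log(p) for p in cell_probs.values() if p > 0)
```

A tag concentrated in one cell has entropy 0, while a tag spread evenly over n cells reaches the maximum, log(n).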
Visual-based location estimation
Model building
• CNN features adapted by fine-tuning the VGG model [5] (Simonyan & Zisserman,
ICLR 2015)
• Training: ~1K Points Of Interest (POIs), ~1200 images/POI
• Caffe [1] (Jia et al., arxiv 2014) is fed directly with the CNN features
• Outputs of the fc7 layer (4096-d) compressed to 128-d using PCA
• CNN features used to compute the image similarities $s_{vis,ij}$
Location Estimation
• Geospatial clustering of 𝑘 = 20 visually most similar images
• If the 𝑗-th image is within 1 km of the closest of the previous j − 1 images, it is
assigned to that image's cluster; otherwise it forms its own cluster
• The largest cluster (or the first in case of equal size) is selected and its centroid is
used as the location estimate
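The incremental clustering described above can be sketched as follows. The names are hypothetical. Two choices are assumptions not stated on the slide: distances use the haversine formula, and the centroid is the plain mean of latitudes and longitudes, which is reasonable only because each cluster spans roughly a kilometre.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def estimate_from_neighbours(neighbours, radius_km=1.0):
    """neighbours: (lat, lon) of the k most similar images, most similar first."""
    clusters = []
    for point in neighbours:
        closest = None  # (distance, cluster index) of the nearest earlier image
        for idx, cluster in enumerate(clusters):
            for other in cluster:
                dist = haversine_km(point, other)
                if closest is None or dist < closest[0]:
                    closest = (dist, idx)
        if closest is not None and closest[0] <= radius_km:
            clusters[closest[1]].append(point)
        else:
            clusters.append([point])
    if not clusters:
        return None
    largest = max(clusters, key=len)  # max() keeps the first cluster on ties
    lat = sum(p[0] for p in largest) / len(largest)
    lon = sum(p[1] for p in largest) / len(largest)
    return (lat, lon)
```

Because clusters are created in order of visual similarity, the tie-breaking rule favours the cluster seeded by the more similar image.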
Hybrid-based location estimation
Model building
• Combination of the textual and visual approaches
• Build the LM using the tag-based approach above and use it for MLC selection
Similarity Calculation
• Combination of the visual and textual similarities
• Normalize the visual similarities to the range [0, 1]
• Similarity between two images
$$s_{ij} = \frac{s_{tex,ij} + s_{vis,ij}}{2}$$
• The final estimate is the center-of-gravity of the 𝑘 = 5 most similar images
Low Confidence Estimations
• For test images with no estimate or with a confidence lower than 0.02 (≈10% of
the test set), the visual approach is used to produce the estimated location
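A minimal sketch of the hybrid step follows, with hypothetical names throughout. Assumptions not taken from the slides: visual similarities are min-max scaled to [0, 1], textual similarities are already in [0, 1], and the center-of-gravity is the unweighted mean of the selected coordinates.

```python
def normalise(values):
    """Min-max scale a list of similarities to [0, 1] (assumed scheme)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # no spread: visual part carries no signal
    return [(v - low) / (high - low) for v in values]


def hybrid_estimate(candidates, k=5):
    """candidates: list of (lat, lon, s_tex, s_vis) for training images in the MLC.

    Returns the mean location of the k images with the highest
    s = (s_tex + s_vis) / 2, or None when there are no candidates.
    """
    if not candidates:
        return None
    visual = normalise([c[3] for c in candidates])
    scored = [((c[2] + v) / 2, c[0], c[1]) for c, v in zip(candidates, visual)]
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    lat = sum(s[1] for s in top) / len(top)
    lon = sum(s[2] for s in top) / len(top)
    return (lat, lon)


def with_fallback(lm_estimate, confidence, visual_estimate, threshold=0.02):
    """Use the visual estimate when the LM gives none or is not confident."""
    if lm_estimate is None or confidence < threshold:
        return visual_estimate
    return lm_estimate
```

The 0.02 threshold and k = 5 are the values quoted on the slide; everything else is a placeholder for whatever the actual pipeline uses.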
Confidence
• Evaluate the confidence of the LM estimation of each query image
• Measures how localized the language model's cell estimates are, based on the
cell probabilities
• Confidence measure
$$\mathrm{conf}(i) = \frac{\sum_{c \in C} \{p(c|i) \mid \mathrm{dist}(c, \mathrm{mlc}) < l\}}{\sum_{c \in C} p(c|i)}$$
$p(c|i)$: probability of cell $c$ for image $i$
$\mathrm{dist}(c_1, c_2)$: distance between cells $c_1$ and $c_2$
mlc: Most Likely Cell
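The confidence measure can be sketched as below. The function name is hypothetical; the distance function and the threshold l (here `radius`) are supplied by the caller, since the slide gives neither the metric nor the value of l.

```python
def confidence(cell_probs, cell_distance, radius):
    """Share of the total cell probability that lies near the most likely cell.

    cell_probs: {cell: p(c|i)} for one query image.
    cell_distance(c1, c2): distance between two cells (metric is an assumption).
    radius: the threshold l from the formula.
    """
    total = sum(cell_probs.values())
    if total == 0:
        return 0.0  # no probability mass, so no basis for confidence
    mlc = max(cell_probs, key=cell_probs.get)
    near = sum(p for cell, p in cell_probs.items()
               if cell_distance(cell, mlc) < radius)
    return near / total
```

The result is 1 when all probability mass sits within l of the MLC and falls towards 0 as competing cells far from the MLC gain mass.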
Runs and Results
measure         RUN-1   RUN-2   RUN-3   RUN-4   RUN-5
acc(1m)          0.15    0.01    0.15    0.16    0.16
acc(10m)         0.61    0.08    0.62    0.75    0.76
acc(100m)        6.40    1.76    6.52    7.73    7.83
acc(1km)        24.33    5.19   24.61   27.30   27.54
acc(10km)       43.07    7.43   43.41   46.48   46.77
m. error (km)      69    5663      61      24      22
RUN-1: Tag-based location estimation + released training set
RUN-2: Visual-based location estimation + released training set
RUN-3: Hybrid location estimation + released training set
RUN-4: Tag-based location estimation + YFCC dataset
RUN-5: Hybrid location estimation + YFCC dataset
Thank you!
• Code:
https://github.com/MKLab-ITI/multimedia-geotagging
• Get in touch:
@sympapadopoulos / papadop@iti.gr
@georgekordopatis / georgekordopatis@iti.gr
References
[1] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[2] G. Kordopatis-Zilos, G. Orfanidis, S. Papadopoulos, and Y. Kompatsiaris. SocialSensor at MediaEval Placing Task 2014. In MediaEval 2014 Placing Task, 2014.
[3] G. Kordopatis-Zilos, S. Papadopoulos, and Y. Kompatsiaris. Geotagging social media content with a refined language modelling approach. In Intelligence and Security Informatics, pages 21–40, 2015.
[4] A. Popescu. CEA LIST's participation at MediaEval 2013 Placing Task. In MediaEval 2013 Placing Task, 2013.
[5] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations, 2015.
[6] O. Van Laere, S. Schockaert, and B. Dhoedt. Finding locations of Flickr resources using language models and similarity search. In ICMR ’11, pages 48:1–48:8, New York, NY, USA, 2011. ACM.

Weitere ähnliche Inhalte

Was ist angesagt?

Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Sunando Sengupta
 
unrban-building-damage-detection-by-PJLi.ppt
unrban-building-damage-detection-by-PJLi.pptunrban-building-damage-detection-by-PJLi.ppt
unrban-building-damage-detection-by-PJLi.ppt
grssieee
 
Visual Object Analysis using Regions and Local Features
Visual Object Analysis using Regions and Local FeaturesVisual Object Analysis using Regions and Local Features
Visual Object Analysis using Regions and Local Features
Universitat Politècnica de Catalunya
 
Project presentation
Project presentationProject presentation
Project presentation
Maham Sajid
 

Was ist angesagt? (9)

Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
 
CSTalks - Object detection and tracking - 25th May
CSTalks - Object detection and tracking - 25th MayCSTalks - Object detection and tracking - 25th May
CSTalks - Object detection and tracking - 25th May
 
unrban-building-damage-detection-by-PJLi.ppt
unrban-building-damage-detection-by-PJLi.pptunrban-building-damage-detection-by-PJLi.ppt
unrban-building-damage-detection-by-PJLi.ppt
 
Visual Object Analysis using Regions and Local Features
Visual Object Analysis using Regions and Local FeaturesVisual Object Analysis using Regions and Local Features
Visual Object Analysis using Regions and Local Features
 
Image formation
Image formationImage formation
Image formation
 
Hyougo iv2014 slide
Hyougo iv2014 slideHyougo iv2014 slide
Hyougo iv2014 slide
 
Project presentation
Project presentationProject presentation
Project presentation
 
Henrik Christensen - Vision for co-robot applications
Henrik Christensen  -  Vision for co-robot applicationsHenrik Christensen  -  Vision for co-robot applications
Henrik Christensen - Vision for co-robot applications
 
Simultaneous Localization and Mapping for Pedestrians using Distortions of th...
Simultaneous Localization and Mapping for Pedestrians using Distortions of th...Simultaneous Localization and Mapping for Pedestrians using Distortions of th...
Simultaneous Localization and Mapping for Pedestrians using Distortions of th...
 

Andere mochten auch

How to write a good newspaper article
How to write a good newspaper articleHow to write a good newspaper article
How to write a good newspaper article
Yevgeniya Grigoryeva
 

Andere mochten auch (18)

O absolutismo europeu
O absolutismo europeuO absolutismo europeu
O absolutismo europeu
 
50 terrifying facts about UK personal finance
50 terrifying facts about UK personal finance50 terrifying facts about UK personal finance
50 terrifying facts about UK personal finance
 
Alternatives to power point
Alternatives to power pointAlternatives to power point
Alternatives to power point
 
Macrosolutions Training: Project Communications Management
Macrosolutions Training: Project Communications ManagementMacrosolutions Training: Project Communications Management
Macrosolutions Training: Project Communications Management
 
Travel blog presentation
Travel blog presentationTravel blog presentation
Travel blog presentation
 
Predicting News Popularity by Mining Online Discussions
Predicting News Popularity by Mining Online DiscussionsPredicting News Popularity by Mining Online Discussions
Predicting News Popularity by Mining Online Discussions
 
Frictionless Bicycle Dynamo
Frictionless Bicycle DynamoFrictionless Bicycle Dynamo
Frictionless Bicycle Dynamo
 
Parcerias público-privada PPP
Parcerias público-privada PPP Parcerias público-privada PPP
Parcerias público-privada PPP
 
IR Based Home Automation
IR Based Home AutomationIR Based Home Automation
IR Based Home Automation
 
A República Populista
A República PopulistaA República Populista
A República Populista
 
Natural Enviroment
Natural EnviromentNatural Enviroment
Natural Enviroment
 
Auguste comte e o positivismo 2
Auguste comte e o positivismo 2Auguste comte e o positivismo 2
Auguste comte e o positivismo 2
 
Solar Irrigation Pumps in India: Can Electicity Buy-Back Curb Groundwater Ove...
Solar Irrigation Pumps in India: Can Electicity Buy-Back Curb Groundwater Ove...Solar Irrigation Pumps in India: Can Electicity Buy-Back Curb Groundwater Ove...
Solar Irrigation Pumps in India: Can Electicity Buy-Back Curb Groundwater Ove...
 
SIMULATION OF TEMPERATURE SENSOR USING LABVIEW
SIMULATION OF TEMPERATURE SENSOR USING LABVIEWSIMULATION OF TEMPERATURE SENSOR USING LABVIEW
SIMULATION OF TEMPERATURE SENSOR USING LABVIEW
 
Sociologia introdução fundamentos e bases
Sociologia introdução fundamentos e basesSociologia introdução fundamentos e bases
Sociologia introdução fundamentos e bases
 
Hv ppt
Hv pptHv ppt
Hv ppt
 
How to write a good newspaper article
How to write a good newspaper articleHow to write a good newspaper article
How to write a good newspaper article
 
5 Key Chart Project Management (TM) Methodology
5 Key Chart Project Management (TM) Methodology5 Key Chart Project Management (TM) Methodology
5 Key Chart Project Management (TM) Methodology
 

Ähnlich wie CERTH/CEA LIST at MediaEval Placing Task 2015

PCA and Classification
PCA and ClassificationPCA and Classification
PCA and Classification
Fatwa Ramdani
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171
Yaxin Liu
 
“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...
“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...
“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...
Edge AI and Vision Alliance
 

Ähnlich wie CERTH/CEA LIST at MediaEval Placing Task 2015 (20)

MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
 
Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...Placing Images with Refined Language Models and Similarity Search with PCA-re...
Placing Images with Refined Language Models and Similarity Search with PCA-re...
 
MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015
MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015
MediaEval 2015 - CERTH/CEA LIST at MediaEval Placing Task 2015
 
Geotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling ApproachGeotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling Approach
 
Geotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling ApproachGeotagging Social Media Content with a Refined Language Modelling Approach
Geotagging Social Media Content with a Refined Language Modelling Approach
 
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHMA ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
A ROS IMPLEMENTATION OF THE MONO-SLAM ALGORITHM
 
PCA and Classification
PCA and ClassificationPCA and Classification
PCA and Classification
 
Human action recognition with kinect using a joint motion descriptor
Human action recognition with kinect using a joint motion descriptorHuman action recognition with kinect using a joint motion descriptor
Human action recognition with kinect using a joint motion descriptor
 
Improved nonlocal means based on pre classification and invariant block matching
Improved nonlocal means based on pre classification and invariant block matchingImproved nonlocal means based on pre classification and invariant block matching
Improved nonlocal means based on pre classification and invariant block matching
 
Improved nonlocal means based on pre classification and invariant block matching
Improved nonlocal means based on pre classification and invariant block matchingImproved nonlocal means based on pre classification and invariant block matching
Improved nonlocal means based on pre classification and invariant block matching
 
Face recognition v1
Face recognition v1Face recognition v1
Face recognition v1
 
IRJET- Digital Image Forgery Detection using Local Binary Patterns (LBP) and ...
IRJET- Digital Image Forgery Detection using Local Binary Patterns (LBP) and ...IRJET- Digital Image Forgery Detection using Local Binary Patterns (LBP) and ...
IRJET- Digital Image Forgery Detection using Local Binary Patterns (LBP) and ...
 
Human Pose Estimation by Deep Learning
Human Pose Estimation by Deep LearningHuman Pose Estimation by Deep Learning
Human Pose Estimation by Deep Learning
 
Video Manifold Feature Extraction Based on ISOMAP
Video Manifold Feature Extraction Based on ISOMAPVideo Manifold Feature Extraction Based on ISOMAP
Video Manifold Feature Extraction Based on ISOMAP
 
Towards Accurate Multi-person Pose Estimation in the Wild (My summery)
Towards Accurate Multi-person Pose Estimation in the Wild (My summery)Towards Accurate Multi-person Pose Estimation in the Wild (My summery)
Towards Accurate Multi-person Pose Estimation in the Wild (My summery)
 
Feature extraction based retrieval of
Feature extraction based retrieval ofFeature extraction based retrieval of
Feature extraction based retrieval of
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171
 
Data-Driven Motion Estimation With Spatial Adaptation
Data-Driven Motion Estimation With Spatial AdaptationData-Driven Motion Estimation With Spatial Adaptation
Data-Driven Motion Estimation With Spatial Adaptation
 
NetVLAD: CNN architecture for weakly supervised place recognition
NetVLAD:  CNN architecture for weakly supervised place recognitionNetVLAD:  CNN architecture for weakly supervised place recognition
NetVLAD: CNN architecture for weakly supervised place recognition
 
“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...
“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...
“Person Re-Identification and Tracking at the Edge: Challenges and Techniques...
 

Mehr von Symeon Papadopoulos

Mehr von Symeon Papadopoulos (20)

DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
DeepFake Detection: Challenges, Progress and Hands-on Demonstration of Techno...
 
Deepfakes: An Emerging Internet Threat and their Detection
Deepfakes: An Emerging Internet Threat and their DetectionDeepfakes: An Emerging Internet Threat and their Detection
Deepfakes: An Emerging Internet Threat and their Detection
 
Knowledge-based Fusion for Image Tampering Localization
Knowledge-based Fusion for Image Tampering LocalizationKnowledge-based Fusion for Image Tampering Localization
Knowledge-based Fusion for Image Tampering Localization
 
Deepfake Detection: The Importance of Training Data Preprocessing and Practic...
Deepfake Detection: The Importance of Training Data Preprocessing and Practic...Deepfake Detection: The Importance of Training Data Preprocessing and Practic...
Deepfake Detection: The Importance of Training Data Preprocessing and Practic...
 
COVID-19 Infodemic vs Contact Tracing
COVID-19 Infodemic vs Contact TracingCOVID-19 Infodemic vs Contact Tracing
COVID-19 Infodemic vs Contact Tracing
 
Similarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia contentSimilarity-based retrieval of multimedia content
Similarity-based retrieval of multimedia content
 
Twitter-based Sensing of City-level Air Quality
Twitter-based Sensing of City-level Air QualityTwitter-based Sensing of City-level Air Quality
Twitter-based Sensing of City-level Air Quality
 
Aggregating and Analyzing the Context of Social Media Content
Aggregating and Analyzing the Context of Social Media ContentAggregating and Analyzing the Context of Social Media Content
Aggregating and Analyzing the Context of Social Media Content
 
Verifying Multimedia Content on the Internet
Verifying Multimedia Content on the InternetVerifying Multimedia Content on the Internet
Verifying Multimedia Content on the Internet
 
A Web-based Service for Image Tampering Detection
A Web-based Service for Image Tampering DetectionA Web-based Service for Image Tampering Detection
A Web-based Service for Image Tampering Detection
 
Learning to detect Misleading Content on Twitter
Learning to detect Misleading Content on TwitterLearning to detect Misleading Content on Twitter
Learning to detect Misleading Content on Twitter
 
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN LayersNear-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
Near-Duplicate Video Retrieval by Aggregating Intermediate CNN Layers
 
Verifying Multimedia Use at MediaEval 2016
Verifying Multimedia Use at MediaEval 2016Verifying Multimedia Use at MediaEval 2016
Verifying Multimedia Use at MediaEval 2016
 
Multimedia Privacy
Multimedia PrivacyMultimedia Privacy
Multimedia Privacy
 
In-depth Exploration of Geotagging Performance
In-depth Exploration of Geotagging PerformanceIn-depth Exploration of Geotagging Performance
In-depth Exploration of Geotagging Performance
 
Perceived versus Actual Predictability of Personal Information in Social Netw...
Perceived versus Actual Predictability of Personal Information in Social Netw...Perceived versus Actual Predictability of Personal Information in Social Netw...
Perceived versus Actual Predictability of Personal Information in Social Netw...
 
Web and Social Media Image Forensics for News Professionals
Web and Social Media Image Forensics for News ProfessionalsWeb and Social Media Image Forensics for News Professionals
Web and Social Media Image Forensics for News Professionals
 
Verifying Multimedia Use at MediaEval 2015
Verifying Multimedia Use at MediaEval 2015Verifying Multimedia Use at MediaEval 2015
Verifying Multimedia Use at MediaEval 2015
 
Detecting image splicing in the wild Web
Detecting image splicing in the wild WebDetecting image splicing in the wild Web
Detecting image splicing in the wild Web
 
Learning to Classify Users in Online Interaction Networks
Learning to Classify Users in Online Interaction NetworksLearning to Classify Users in Online Interaction Networks
Learning to Classify Users in Online Interaction Networks
 

Kürzlich hochgeladen

Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
FIDO Alliance
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
FIDO Alliance
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
FIDO Alliance
 

Kürzlich hochgeladen (20)

Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...Hyatt driving innovation and exceptional customer experiences with FIDO passw...
Hyatt driving innovation and exceptional customer experiences with FIDO passw...
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern Enterprise
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Decarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational PerformanceDecarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational Performance
 
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider  Progress from Awareness to Implementation.pptxTales from a Passkey Provider  Progress from Awareness to Implementation.pptx
Tales from a Passkey Provider Progress from Awareness to Implementation.pptx
 
UiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overviewUiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overview
 
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
 

CERTH/CEA LIST at MediaEval Placing Task 2015

  • 1. CERTH/CEA LIST at MediaEval Placing Task 2015 Giorgos Kordopatis-Zilos1, Adrian Popescu2, Symeon Papadopoulos1 and Yiannis Kompatsiaris1 1 Information Technologies Institute (ITI), CERTH, Greece 2 CEA LIST, 91190 Gif-sur-Yvette, France MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany
  • 2. Summary #2 Tag-based location estimation (2 runs) • Based on a geographic Language Model • Built upon the scheme of our 2014 participation [2] (Kordopatis-Zilos et al., MediaEval 2014) • Extensions from [3]: improved feature selection and weighting (Kordopatis-Zilos et al., PAISI 2015) Visual-based location estimation (1 run) • Geospatial clustering scheme of the most visually similar images Hybrid location estimation (2 run) • Combination of the textual and visual approaches Training sets • Training set released by the organisers (≈4.7M geotagged items) • YFCC dataset, excl. images from users in test set (≈40M geotagged items)
  • 3. Tag-based location estimation #3 • Processing steps of the approach – Offline: language model construction – Online: location estimation
  • 4. Language Model (LM) • LM generation scheme – divide earth surface in rectangular cells with a side length of 0.01° – calculate tag-cell probabilities based on the users that used the tag inside the cell • LM-based estimation – the probability of each cell is calculated from the summation of the respective tag-cell probabilities – Most Likely Cell (MLC) considered the cell with the highest probability and used to produce the estimation Inspired from [4]: (Popescu, MediaEval 2013) #4
  • 5. Feature Selection and Weighting Feature Selection • The final tag set 𝑇 is the intersection of the two tag sets 𝑇 = 𝑇𝑎 ∩ 𝑇𝑙 Feature Weighting • Locality weight function, sort tags in 𝑇 based on their locality score 𝑤𝑙 = 𝑇 − (𝑗 − 1) |𝑇| • Normalize the weights from the Spatial Entropy (SE) function 𝑤𝑠𝑒 = 𝑁(𝑒(𝑡), 𝜇, 𝜎) max 𝑡∈𝑇 (𝑁(𝑒(𝑡), 𝜇, 𝜎)) • Combine the two weighting functions 𝑤 = 𝜔 ∗ 𝑤𝑠𝑒 + (1 − 𝜔) ∗ 𝑤𝑙 #5 accuracy locality
  • 6. Accuracy • Partition training set into p folds (p = 10) • Keep one partition at a time, and build LM with the rest p − 1 • Estimate the location of every item of the withheld partition • Accuracy score of every tag tgeo 𝑡 = 𝑁𝑟 𝑁𝑡 𝑁𝑟: correctly geotagged items 𝑁𝑡: total items tagged with 𝑡 • Tags with non-zero accuracy score form the tag set 𝑇𝑎 From [3]: Kordopatis-Zilos et al., PAISI 2015 #6 Estimated Locations
  • 7. Locality #7 • Captures the spatial awareness of tags • When a user uses a tag, he/she is assigned to the respective location cell • Each cell has a set of users assigned to it • All users assigned to the same cell are considered neighbours • Locality score of every tag loc 𝑡 = 𝑁𝑡 ∗ 𝑐∈𝐶 𝑢∈𝑈𝑡,𝑐 |{𝑢′|𝑢′ ∈ 𝑈𝑡,𝑐, 𝑢′ ≠ 𝑢}| 𝑁𝑡 2 𝑁𝑡: total occurrences of 𝑡 𝐶 : set of all cells 𝑈𝑡,𝑐: set of users that used tag 𝑡 inside cell c • Tags with non-zero locality score form the tag set 𝑇𝑙
  • 8. Locality – value distribution #8 london (6975), paris (5452), nyc (3917) luminancehdr (0.0035), dsc6362 (0.003), air photo (0.002)
  • 9. Extensions • Spatial Entropy (SE) function – calculate entropy values applying the Shannon entropy formula in the tag-cell probabilities – build a Gaussian weight function based on the values of the tag SE #9 • Internal Grid – Built an additional LM using a finer grid, cell side length of 0.001° – combine the MLC of the individual language models • Similarity search [6] (Van Laere et al., ICMR 2011) – determine 𝑘 most similar training images in the MLC – their center-of-gravity is the final location estimation From [2]: (Kordopatis-Zilos et al., MediaEval 2014)
  • 10. Visual-based location estimation
    Model building
    • CNN features adapted by fine-tuning the VGG model [5] (Simonyan & Zisserman, ICLR 2015)
    • Training: ~1K Points Of Interest (POIs), ~1200 images/POI
    • Caffe [1] (Jia et al., arXiv 2014) is fed directly with the CNN features
    • Outputs of the fc7 layer (4096d) compressed to 128d using PCA
    • CNN features used to compute image similarities $s_{vis,ij}$
    Location Estimation
    • Geospatial clustering of the $k = 20$ visually most similar images
    • If the $j$-th image is within 1 km of the closest of the previous $j - 1$ images, it is assigned to that image's cluster; otherwise it forms its own cluster
    • The largest cluster (or the first in case of equal size) is selected and its centroid is used as the location estimate
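The incremental clustering rule above can be sketched as below. The haversine distance, the plain averaging of latitude and longitude for the centroid, and the function names are assumptions made for illustration; the input is assumed to be a non-empty list of coordinates ordered from most to least visually similar.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def estimate_location(neighbours, radius_km=1.0):
    """neighbours: (lat, lon) of the k most similar images, most similar first."""
    clusters = []  # lists of points, in creation order
    owner = []     # owner[i] = index of the cluster holding the i-th point
    for j, point in enumerate(neighbours):
        if j:
            # Closest of the previously processed images.
            nearest = min(range(j), key=lambda i: haversine_km(point, neighbours[i]))
            if haversine_km(point, neighbours[nearest]) < radius_km:
                clusters[owner[nearest]].append(point)
                owner.append(owner[nearest])
                continue
        clusters.append([point])
        owner.append(len(clusters) - 1)
    best = max(clusters, key=len)  # max() returns the first cluster on ties
    # Assumption: the centroid is the arithmetic mean of the coordinates.
    return (sum(p[0] for p in best) / len(best),
            sum(p[1] for p in best) / len(best))
```

Averaging raw coordinates is adequate for clusters that span about a kilometre, but it would need care near the antimeridian or the poles.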
  • 11. Hybrid-based location estimation
    Model building
    • Combination of the textual and visual approaches
    • Build the LM using the tag-based approach above and use it for MLC selection
    Similarity Calculation
    • Combination of the visual and textual similarities
    • Normalize the visual similarities to the range [0, 1]
    • Similarity between two images: $s_{ij} = \frac{s_{tex,ij} + s_{vis,ij}}{2}$
    • The final estimate is the center-of-gravity of the $k = 5$ most similar images
    Low Confidence Estimations
    • For test images with no estimate or with confidence lower than 0.02 (≈10% of the test set), the visual approach is used to produce the estimated locations
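A sketch of the hybrid similarity and the low-confidence fallback, under several assumptions that the slides leave open: min-max scaling is used to normalize the visual similarities, the center-of-gravity is an unweighted mean of coordinates, and the names `hybrid_estimate` and `choose_estimate` are hypothetical.

```python
def hybrid_estimate(candidates, k=5):
    """candidates: non-empty list of (s_tex, s_vis, (lat, lon)) tuples for the
    training images inside the most likely cell (MLC)."""
    vis = [c[1] for c in candidates]
    lo, hi = min(vis), max(vis)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    # s_ij = (s_tex + normalized s_vis) / 2
    scored = [((s_tex + (s_vis - lo) / span) / 2, loc)
              for s_tex, s_vis, loc in candidates]
    top = sorted(scored, key=lambda x: x[0], reverse=True)[:k]
    lat = sum(loc[0] for _, loc in top) / len(top)
    lon = sum(loc[1] for _, loc in top) / len(top)
    return lat, lon

def choose_estimate(text_estimate, conf, visual_estimate, threshold=0.02):
    """Fall back to the visual estimate when there is no text-based estimate
    or its confidence is below the threshold."""
    if text_estimate is None or conf < threshold:
        return visual_estimate
    return text_estimate
```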
  • 12. Confidence
    • Evaluate the confidence of the LM estimation of each query image
    • Measures how localized the language model cell estimates are, based on cell probabilities
    • Confidence measure: $conf(i) = \frac{\sum_{c \in C} \{p(c|i) \mid dist(c, mlc) < l\}}{\sum_{c \in C} p(c|i)}$
      – $p(c|i)$: cell probability of cell $c$ for image $i$
      – $dist(c_1, c_2)$: distance between $c_1$ and $c_2$
      – mlc: Most Likely Cell
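The confidence measure is the share of probability mass that lies within distance $l$ of the most likely cell. A minimal sketch, assuming the cell probabilities for one query image are given as a dictionary and the distance function is supplied by the caller; the slides do not give a value for $l$, so it is left as a parameter.

```python
def confidence(cell_probs, dist, l):
    """cell_probs: cell -> p(c|i) for one query image.
    dist(c1, c2): caller-supplied distance between two cells.
    l: distance threshold around the most likely cell.
    """
    total = sum(cell_probs.values())
    if total == 0:
        return 0.0  # no estimate available for this image
    mlc = max(cell_probs, key=cell_probs.get)  # Most Likely Cell
    near = sum(p for c, p in cell_probs.items() if dist(c, mlc) < l)
    return near / total
```

Values close to 1 mean that nearly all of the probability mass sits around the MLC; values close to 0 mean that the mass is scattered across distant cells.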
  • 13. Runs and Results
    measure              RUN-1   RUN-2   RUN-3   RUN-4   RUN-5
    acc(1m)               0.15    0.01    0.15    0.16    0.16
    acc(10m)              0.61    0.08    0.62    0.75    0.76
    acc(100m)             6.40    1.76    6.52    7.73    7.83
    acc(1km)             24.33    5.19   24.61   27.30   27.54
    acc(10km)            43.07    7.43   43.41   46.48   46.77
    median error (km)       69    5663      61      24      22
    • RUN-1: Tag-based location estimation + released training set
    • RUN-2: Visual-based location estimation + released training set
    • RUN-3: Hybrid location estimation + released training set
    • RUN-4: Tag-based location estimation + YFCC dataset
    • RUN-5: Hybrid location estimation + YFCC dataset
  • 14. Thank you!
    • Code: https://github.com/MKLab-ITI/multimedia-geotagging
    • Get in touch:
      – @sympapadopoulos / papadop@iti.gr
      – @georgekordopatis / georgekordopatis@iti.gr
  • 15. References
    [1] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
    [2] G. Kordopatis-Zilos, G. Orfanidis, S. Papadopoulos, and Y. Kompatsiaris. SocialSensor at MediaEval Placing Task 2014. In MediaEval 2014 Placing Task, 2014.
    [3] G. Kordopatis-Zilos, S. Papadopoulos, and Y. Kompatsiaris. Geotagging social media content with a refined language modelling approach. In Intelligence and Security Informatics, pages 21–40, 2015.
    [4] A. Popescu. CEA LIST's participation at MediaEval 2013 Placing Task. In MediaEval 2013 Placing Task, 2013.
    [5] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations, 2015.
    [6] O. Van Laere, S. Schockaert, and B. Dhoedt. Finding locations of Flickr resources using language models and similarity search. In ICMR '11, pages 48:1–48:8, New York, NY, USA, 2011. ACM.

Editor's notes

  1. Different kinds of user classification:
     • topic-oriented (e.g., interest/expertise)
     • role-based/behavioral (e.g., bot/spammer)
     • geographical location
     Useful for advertising, user recommendation, expert search, etc.
     For personal accounts, user classification raises privacy concerns.
     Challenges:
     • multi-linguality
     • brevity
     • informal language