How Spatial Segmentation improves the Multimodal Geo-Tagging


MediaEval2012


Pascal Kelm
Kelm@nue.tu-berlin.de
Communication Systems Group
www.nue.tu-berlin.de Technische Universität Berlin

Thursday, 04 October 2012

What is meant by Spatial Segmentation?

The world map is iteratively divided into segments of different sizes.
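
A minimal sketch of such an iterative division, assuming a k-d-tree-style median split over (latitude, longitude) pairs. The function name `split_segments` and the `max_points` stopping rule are illustrative assumptions, not taken from the slides. Because each cut is placed at the median, densely covered areas end up with geographically smaller segments.

```python
from statistics import median


def split_segments(points, max_points=2, depth=0):
    """Recursively split (lat, lon) points into leaf segments.

    The split axis alternates between latitude and longitude and the cut
    is placed at the median, until a segment holds at most max_points.
    """
    if len(points) <= max_points:
        return [points]
    axis = depth % 2  # 0 = latitude, 1 = longitude
    cut = median(p[axis] for p in points)
    left = [p for p in points if p[axis] < cut]
    right = [p for p in points if p[axis] >= cut]
    if not left or not right:
        # The cut does not separate the points on this axis; stop here.
        return [points]
    return (split_segments(left, max_points, depth + 1)
            + split_segments(right, max_points, depth + 1))


# Hypothetical training locations (approximate city coordinates).
points = [(48.85, 2.35), (52.52, 13.40), (40.71, -74.00),
          (34.05, -118.24), (35.68, 139.69)]
segments = split_segments(points, max_points=2)
```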

Kelm: How Spatial Segmentation improves the Multimodal Geo-

Run4: only audio/visual information

Descriptions are pooled for each spatial segment (k-d tree) at each hierarchy level.
Visual nearest-neighbour search at the lowest hierarchy level.


Visual Region Model

Returns the visually most similar areas, each represented by the mean feature vector of all training images and videos from that area.
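
The region model can be sketched roughly as follows. The Euclidean distance, the function names and the toy two-dimensional features are assumptions for illustration; the slides do not state which visual features or distance measure were used.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equally long feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_region_model(training):
    """training maps a region id to the feature vectors of its items."""
    return {region: mean_vector(vecs) for region, vecs in training.items()}


def most_similar_regions(model, query, k=1):
    """Return the k regions whose mean vector is closest to the query."""
    ranked = sorted(model, key=lambda region: math.dist(model[region], query))
    return ranked[:k]


# Hypothetical toy features for two regions.
training = {"A": [[0.0, 0.0], [2.0, 2.0]],
            "B": [[10.0, 10.0], [12.0, 10.0]]}
model = build_region_model(training)
nearest = most_similar_regions(model, [1.5, 0.5], k=1)
```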


Run4: Results

Th [km]   TUB [%]   UG-CU [%]
1         0.1       0.1
10        0.1       0.7
20        0.1       0.9
50        0.1       1.1
100       0.2       2.6
200       0.8       6.9
500       4.1       14.7
1000      14.8      21.2
2000      44.5      28.5
5000      81.0      29.6
10000     98.7      91.4
15000     100.0     95.7
20000     100.0     100.0

Figure: accuracy [%] over margin of error [km] for TUB and UG-CU.
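
A sketch of how such an accuracy-per-threshold table could be computed, assuming that accuracy at Th means the share of test items whose estimated location lies within Th km of the ground truth. The haversine great-circle distance and the names `haversine_km` and `accuracy_at` are assumptions; the slides do not specify the distance computation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def accuracy_at(thresholds_km, predicted, truth):
    """Percentage of items located within each threshold of the truth."""
    errors = [haversine_km(p, t) for p, t in zip(predicted, truth)]
    return {th: 100.0 * sum(e <= th for e in errors) / len(errors)
            for th in thresholds_km}


# Hypothetical estimates: one exact hit, one a quarter of the equator away.
predicted = [(0.0, 0.0), (0.0, 0.0)]
truth = [(0.0, 0.0), (0.0, 90.0)]
table = accuracy_at([1, 10000, 20000], predicted, truth)
```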

Run1: No additional data or gazetteers

Combines textual and visual features:
Tags and the words extracted (NLP) from the title and the description are translated.
Porter stemming and stop-word elimination are applied for each segment and granularity of the spatial segmentation.
Visual search for the k nearest segments at the lowest hierarchy level.


Term-location-distribution:

Term frequency-inverse document frequency:
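
As a rough illustration only, the standard tf-idf weighting can be sketched with each spatial segment treated as one document. This is the textbook definition and may differ from the exact formula used in the talk; the function name `tf_idf` and the toy terms are assumptions.

```python
import math
from collections import Counter


def tf_idf(segment_terms):
    """segment_terms maps a segment id to its list of (stemmed) terms.

    Returns, per segment, a dict term -> tf * idf, where tf is the relative
    term frequency in the segment and idf = log(#segments / #segments
    containing the term).
    """
    n_segments = len(segment_terms)
    doc_freq = Counter()
    for terms in segment_terms.values():
        doc_freq.update(set(terms))
    weights = {}
    for segment, terms in segment_terms.items():
        counts = Counter(terms)
        weights[segment] = {
            term: (count / len(terms)) * math.log(n_segments / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


# Hypothetical pooled terms for two segments.
segments = {"paris": ["eiffel", "tower", "france"],
            "berlin": ["tower", "germany"]}
weights = tf_idf(segments)
```

A term that occurs in every segment (here "tower") receives weight 0 and therefore does not help to discriminate between segments.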


Example

Confidence scores of the visual approach (right), restricted to the most likely spatial segment determined by the textual approach (left).
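
A minimal sketch of this restriction, assuming the textual approach yields one confidence score per segment and the visual approach one score per candidate item. The function name `fuse` and the data layout are hypothetical.

```python
def fuse(text_scores, visual_scores, segment_of):
    """Pick the best visual candidate inside the textually best segment.

    text_scores:   segment id -> textual confidence
    visual_scores: candidate id -> visual confidence
    segment_of:    candidate id -> segment id containing the candidate
    """
    best_segment = max(text_scores, key=text_scores.get)
    inside = {cand: score for cand, score in visual_scores.items()
              if segment_of[cand] == best_segment}
    if not inside:
        # No visual candidate in that segment; fall back to the segment only.
        return best_segment, None
    return best_segment, max(inside, key=inside.get)


# Hypothetical scores: img1 is visually best but lies in the wrong segment.
text_scores = {"europe": 0.7, "asia": 0.3}
visual_scores = {"img1": 0.9, "img2": 0.6, "img3": 0.4}
segment_of = {"img1": "asia", "img2": "europe", "img3": "europe"}
result = fuse(text_scores, visual_scores, segment_of)
```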


Run1: Results

Th [km]   TUB [%]
1         13.7
10        32.7
20        36.5
50        39.4
100       41.8
200       44.8
500       51.7
1000      62.4
2000      76.5
5000      92.3
10000     99.4
15000     100.0
20000     100.0

Figure: accuracy [%] over margin of error [km] for TUB.


Run2: No additional data

At the highest hierarchy level, boundary extraction using gazetteers (GeoNames, Wikipedia and Google Maps) is added for the spell-checked words.


Collaborative Systems: Example

這是我上次去巴黎。在那裡，我得到了我的城堡在迪斯尼樂園看。


Collaborative Systems: Example (continued)

這是我上次去巴黎。在那裡，我得到了我的城堡在迪斯尼樂園看。…

Which language is it?
Chinese
Translation: This was my last trip to Paris. I visited the castle in Disneyland…

Which words give us information (tags)?
Trip, Paris, Castle, Disneyland

Which of these nouns carry geographical information?
Paris, Disneyland


Geographical Ambiguity

Candidate countries per toponym:

Paris          Disneyland
France         China
Canada         USA
Puerto Rico    France
…              …

R(c_i) = rank sum
c_i = countries
N = number of toponyms
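
One possible reading of the rank sum, as a sketch: each of the N toponyms contributes the rank of a country in its candidate list, and the country with the smallest sum is chosen. The slide's exact formula is not reproduced here, so the penalty rank for countries missing from a list (list length + 1) is an assumption, as are the function name `rank_sum` and the use of only the three visible candidates per toponym.

```python
def rank_sum(candidates):
    """candidates maps a toponym to its candidate countries, best first.

    Returns country -> sum of its ranks over all toponyms. A country that
    is missing from a list gets the assumed penalty rank len(list) + 1.
    """
    countries = {c for ranked in candidates.values() for c in ranked}
    totals = {}
    for country in countries:
        total = 0
        for ranked in candidates.values():
            if country in ranked:
                total += ranked.index(country) + 1
            else:
                total += len(ranked) + 1  # assumed penalty, not from the slides
        totals[country] = total
    return totals


candidates = {"Paris": ["France", "Canada", "Puerto Rico"],
              "Disneyland": ["China", "USA", "France"]}
totals = rank_sum(candidates)
best = min(totals, key=totals.get)
```

With these lists France gets 1 + 3 = 4 and China 4 + 1 = 5, so France is selected under the stated assumptions.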


Extracted geographical items

Input words: kauii, hawaii, usa

00001: hawaii, kauai, usa
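
A sketch of how a misspelt word such as "kauii" could be mapped to a known toponym. The slides do not say how spell checking was done; `difflib.get_close_matches` from the Python standard library is used here only as a stand-in, and the three-entry gazetteer list is hypothetical (the talk names GeoNames, Wikipedia and Google Maps as gazetteers).

```python
import difflib


def normalise_toponyms(words, gazetteer, cutoff=0.6):
    """Map each word to its closest gazetteer entry, dropping non-matches."""
    found = set()
    for word in words:
        match = difflib.get_close_matches(word.lower(), gazetteer,
                                          n=1, cutoff=cutoff)
        if match:
            found.add(match[0])
    return sorted(found)


gazetteer = ["hawaii", "kauai", "usa"]  # hypothetical toponym list
items = normalise_toponyms(["kauii", "hawaii", "usa"], gazetteer)
```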


Results

Figure: accuracy [%] over margin of error [km] for Run1, Run2 and Run4.


Questions

Thanks for your attention!


Training Set: Weighting

Training Set: Features