Exploring Term Selection for Geographic
            Blind Feedback

                  Johannes Leveling

   Intelligent Information and Communication Systems (IICS)
          University of Hagen (FernUniversität in Hagen)
                      58084 Hagen, Germany
          firstname.lastname@fernuni-hagen.de


        GIR 2007 Workshop, Lisbon, Portugal
Outline

1 Introduction
2 Creating a Geographical Knowledge Base
    GeoNames Data
    PND Data
3 Experiments on Geographic Blind Feedback
    Experimental Settings
    Results
    Discussion
4 Outlook

Blind Feedback

General idea: Improve IR performance by expanding a query
1 The original query Qo is processed and an initial ranked result set Ro of documents is obtained
2 D documents from Ro are selected and presumed to be relevant
3 T terms from these documents are extracted for relevance feedback
4 Qo is modified into the final query Qf, merging the extracted terms into the query and possibly re-weighting all terms
5 The final result set Rf is retrieved with the query Qf

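The five steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the toy scoring function `search`, the frequency-based term selection, and the parameter `feedback_weight` are hypothetical and not taken from the slides, which do not say how the system ranks documents or weights terms.

```python
from collections import Counter

def search(query_weights, docs):
    """Toy ranking (assumption): score = sum of query-term weight * term frequency."""
    scores = {}
    for doc_id, text in docs.items():
        tf = Counter(text.lower().split())
        score = sum(w * tf[t] for t, w in query_weights.items())
        if score > 0:
            scores[doc_id] = score
    # Descending score; ties broken by document id so the result is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

def blind_feedback(query_terms, docs, D=5, T=30, feedback_weight=0.5):
    # Step 1: process the original query Qo and obtain the initial ranking Ro.
    q_o = {t.lower(): 1.0 for t in query_terms}
    r_o = search(q_o, docs)
    # Step 2: presume the top D documents of Ro to be relevant.
    top_docs = r_o[:D]
    # Step 3: extract up to T terms (here: the most frequent new terms).
    counts = Counter()
    for doc_id in top_docs:
        counts.update(t for t in docs[doc_id].lower().split() if t not in q_o)
    ranked_terms = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    new_terms = [t for t, _ in ranked_terms[:T]]
    # Step 4: merge the extracted terms into the final query Qf (with a lower weight).
    q_f = dict(q_o)
    for t in new_terms:
        q_f[t] = feedback_weight
    # Step 5: retrieve the final result set Rf with Qf.
    return q_f, search(q_f, docs)

# Hypothetical mini collection for illustration only.
docs = {
    "d1": "flood rhine cologne",
    "d2": "flood cologne damage",
    "d3": "cologne carnival",
    "d4": "football bremen",
}
q_f, r_f = blind_feedback(["flood"], docs, D=2, T=1)
```

In this toy run the expansion term "cologne" pulls in document d3, which the original query did not retrieve. That is the intended effect of blind feedback, and also the way it can drift when the added terms are poorly chosen.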
Application of Blind Feedback to GIR (1/2)

• Larson and Gey [2]:
  an improvement on the order of 53% to 72% MAP (mean average precision) was achieved for some monolingual German GIR topics on the GeoCLEF 2006 data (using T = 30, D = 5); no significant improvement for English
• Gey and Petras [1]:
  “the most improved queries seem to add mostly proper names and word variations and very few irrelevant words that won’t distort the search towards another direction” and “blind feedback improves precision, but it seems to do so for only a particular kind of query”

Application of Blind Feedback to GIR (2/2)

• Blind feedback (BF) is a method originating in (and intended for) ad-hoc retrieval
→ BF does not yet reflect the geographic orientation of GIR
→ novel methods for document and term selection are required, preferably based on geographic knowledge
→ BF does not generally increase performance significantly, even in standard IR
→ application to GIR without adaptations seems questionable

The Geographical Knowledge Base (GKB)

• Avoid ambiguities for location names; sacrifice coverage (i.e. focus on important places)
→ Create a small geographic knowledge base (GKB) with meronymy relations (part-whole relations)
• GKB based on two resources:
    • Linking between Wikipedia articles and authority records for persons (PND), and
    • GeoNames data for the largest cities world-wide

GeoNames data

• GeoNames provides data for populated places world-wide with more than 1,000, 5,000, or 15,000 inhabitants
• Entries contain geographic codes for the continent, country, and administrative divisions
• Data for cities with more than 5,000 inhabitants
  → meronymy relations for 41,228 entries
• Names are translated by utilizing the Wikipedia linking between articles in English and German
• Example: Nuenen is a populated place in North Brabant, in The Netherlands, in Europe
  → meronym(Nuenen, North Brabant),
  → meronym(North Brabant, The Netherlands),
  → meronym(The Netherlands, Europe)
→ A place is important if it is highly populated

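The Nuenen example suggests how one entry yields a chain of meronymy relations. A minimal sketch, assuming the geographic codes of an entry have already been resolved to names; the function `meronym_chain` and its parameters are hypothetical and do not reflect the actual GeoNames record layout:

```python
def meronym_chain(place, admin_division, country, continent):
    """Return (part, whole) pairs linking a place up to its continent."""
    levels = [place, admin_division, country, continent]
    # Skip missing levels (e.g. no administrative division) so the chain stays connected.
    levels = [name for name in levels if name]
    # Pair each level with the next larger one.
    return list(zip(levels, levels[1:]))

pairs = meronym_chain("Nuenen", "North Brabant", "The Netherlands", "Europe")
for part, whole in pairs:
    print(f"meronym({part}, {whole})")
```

Storing only the direct part-whole pairs keeps the knowledge base small; relations such as meronym(Nuenen, Europe) can then be derived by following the chain.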
PND Data

• Wikipedia articles are linked with authority records for persons from the PND (Personennamendatei)
• PND contains information such as a person’s name, his or her place and date of birth, place and date of death, and profession
• Specification of a place often encodes meronymy information
• 152,650 PND entries → 27,734 unique meronymy relations
• Example: Edsger Wybe Dijkstra was born in Rotterdam, Niederlande/the Netherlands in 1930; died in Nuenen, Niederlande/the Netherlands in 2002
  → meronym(Rotterdam, The Netherlands),
  → meronym(Nuenen, The Netherlands)
→ A place is important if some well-known person was born or died there

Towards Less Ambiguity in Geographic Resources

GeoNames cities (pop. > X)

characteristic            X=1,000    X=5,000    X=15,000
unique loc. names         124,315     83,680      57,172
ambiguous loc. names       22,616     13,133       7,551
senses per loc. name        1.587      1.455       1.345

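The slides do not define how the three characteristics were computed. The sketch below shows one plausible reading, stated as an assumption: a location name is ambiguous if it refers to more than one place, and "senses per location name" is the average number of distinct places per name. The function `ambiguity_stats` and the place identifiers are hypothetical.

```python
from collections import defaultdict

def ambiguity_stats(entries):
    """entries: iterable of (location_name, place_id) pairs.

    Returns (distinct names, ambiguous names, average senses per name)
    under the assumed definitions given in the lead-in.
    """
    senses = defaultdict(set)
    for name, place_id in entries:
        senses[name].add(place_id)
    distinct = len(senses)
    ambiguous = sum(1 for ids in senses.values() if len(ids) > 1)
    average = sum(len(ids) for ids in senses.values()) / distinct if distinct else 0.0
    return distinct, ambiguous, average

# Hypothetical entries: the name "Frankfurt" refers to two different places.
entries = [("Frankfurt", 1), ("Frankfurt", 2), ("Hagen", 3), ("Lisbon", 4)]
stats = ambiguity_stats(entries)
```

Raising the population threshold X removes small places first, which is why both the number of ambiguous names and the average number of senses fall from left to right in the table.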
The Meronymy Predicate

Transitive meronymy predicate mero? for two location names:

    mero?(L1, L2) :=  true    if L1 is a meronym of L2
                      false   otherwise

• Example:
    mero?(Berlin, Germany) returns true
    mero?(Hong Kong, France) returns false
→ Allows term selection in BF based on meronymy information in GKB
→ Geographic Blind Feedback

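A transitive predicate of this kind can be sketched as a reachability test over the direct part-whole pairs. The helper `make_mero` and the list `GKB` are hypothetical names; the relations are the ones given as examples on the GeoNames and PND slides, and the slides do not describe how the actual system implements the predicate.

```python
def make_mero(relations):
    """Build a transitive mero(l1, l2) predicate from direct (part, whole) pairs."""
    wholes = {}
    for part, whole in relations:
        wholes.setdefault(part, set()).add(whole)

    def mero(l1, l2):
        # Walk upwards from l1 through all direct wholes; 'seen' guards against cycles.
        seen, stack = set(), [l1]
        while stack:
            current = stack.pop()
            for whole in wholes.get(current, ()):
                if whole == l2:
                    return True
                if whole not in seen:
                    seen.add(whole)
                    stack.append(whole)
        return False

    return mero

# Direct relations taken from the examples in the slides.
GKB = [
    ("Nuenen", "North Brabant"),
    ("North Brabant", "The Netherlands"),
    ("The Netherlands", "Europe"),
    ("Rotterdam", "The Netherlands"),
]
mero = make_mero(GKB)
print(mero("Nuenen", "Europe"))   # reachable via North Brabant and The Netherlands
print(mero("Europe", "Nuenen"))   # the relation is directed, so this does not hold
```

Computing reachability on demand avoids materializing the transitive closure, which is a reasonable trade-off for a knowledge base of a few tens of thousands of relations.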
Experimental Setup

• GeoCLEF documents: 275,000 German newspaper articles from Frankfurter Rundschau, Schweizerische Depeschenagentur, and Der Spiegel from the years 1994 and 1995
• GeoCLEF topics: 25 topics from 2006 with a title, a short description, and a narrative part
• GIRSA system: setup similar to previous GIR experiments on GeoCLEF data [4; 3]

Experimental Settings for Retrieval Experiments (D=5)

L:  only location names are selected from the top-ranked documents as blind feedback terms
M:  location names are filtered utilizing the mero? predicate, keeping meronyms of a search term in the original query as BF terms
H:  a location name is filtered from the BF terms if there is an inverse meronymy relation to a search term in the original query (holonym)
B1: (Baseline) no blind feedback; query terms are associated with static weights
B2: (Baseline) no blind feedback; bag-of-words query; query terms are not weighted

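The three feedback strategies differ only in how candidate terms are filtered, which the following sketch tries to make concrete. Several things here are assumptions: the dictionary `PART_OF` is a hypothetical miniature knowledge base with one direct whole per place, a term counts as a location name if it occurs in that dictionary, and H is applied to the location names selected as in L (the slides do not say whether H also keeps non-location terms).

```python
# Hypothetical miniature GKB: each place maps to its direct whole.
PART_OF = {
    "Nuenen": "North Brabant",
    "North Brabant": "The Netherlands",
    "Rotterdam": "The Netherlands",
    "The Netherlands": "Europe",
    "Berlin": "Germany",
    "Germany": "Europe",
}
KNOWN_LOCATIONS = set(PART_OF) | set(PART_OF.values())

def mero(l1, l2):
    """Transitive check: is l1 (directly or indirectly) a part of l2?"""
    current, steps = PART_OF.get(l1), 0
    # The step limit protects against accidental cycles in the data.
    while current is not None and steps < len(PART_OF):
        if current == l2:
            return True
        current, steps = PART_OF.get(current), steps + 1
    return False

def select_terms(candidates, query_locations, strategy):
    """Filter candidate feedback terms according to strategy 'L', 'M', or 'H'."""
    locations = [t for t in candidates if t in KNOWN_LOCATIONS]
    if strategy == "L":
        # L: keep every location name.
        return locations
    if strategy == "M":
        # M: keep location names that are meronyms of a query location.
        return [t for t in locations if any(mero(t, q) for q in query_locations)]
    if strategy == "H":
        # H: drop location names that are holonyms of a query location.
        return [t for t in locations if not any(mero(q, t) for q in query_locations)]
    raise ValueError(f"unknown strategy: {strategy!r}")

candidates = ["flood", "Rotterdam", "Nuenen", "Berlin", "Europe"]
query_locations = ["The Netherlands"]
for strategy in ("L", "M", "H"):
    print(strategy, select_terms(candidates, query_locations, strategy))
```

For a query about The Netherlands, M narrows the feedback terms to places inside the query region, while H only removes the broader term Europe and therefore still lets the unrelated Berlin through.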
Results for Retrieval Experiments (1/2)

[Performance plot: MAP (y-axis, 0.22 to 0.25) against the number of terms T (x-axis, 5 to 40) for the experiments B1, L, H, and M]

Results for Retrieval Experiments (2/2)

            Experiment
Topic       B1      L       H       M       B2
GC028       0.38    0.24    0.22    0.41    0.28
GC030*      0.81    0.65    0.66    0.63    0.71
GC032       0.60    0.62    0.62    0.70    0.49
GC039       0.00    0.03    0.03    0.01    0.00
GC044       0.33    0.33    0.33    0.33    0.33
GC048       0.87    0.89    0.89    0.66    0.85

MAP         0.24    0.23    0.23    0.24    0.19
P@5         0.31    0.32    0.31    0.34    0.24
P@10        0.27    0.24    0.24    0.29    0.21

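For readers less familiar with the measures in the table, the sketch below uses the standard textbook definitions of precision at k, average precision, and MAP. The slides do not state which evaluation tool produced the figures, so the function names and the sample rankings are purely illustrative.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top k ranked documents that are relevant."""
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k

def average_precision(ranking, relevant):
    """Mean of the precision values at the rank of each relevant document found.

    Relevant documents that are never retrieved contribute zero.
    """
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranking, relevant_set) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical rankings for two topics.
topic_a = (["d1", "d2", "d3", "d4", "d5"], {"d1", "d3"})
topic_b = (["d2", "d1"], {"d1"})
p5 = precision_at_k(*topic_a, k=5)
ap_a = average_precision(*topic_a)
map_score = mean_average_precision([topic_a, topic_b])
```

Because MAP averages over topics, a few topics with large per-topic changes (such as GC028 or GC048 in the table) can cancel each other out and leave the overall figure almost unchanged.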
Discussion of Results

• MAP did not change considerably when using BF compared to the upper baseline B1 (0.24)
• The BF strategy M (selecting meronyms) clearly outperforms the second baseline B2 (0.24 vs. 0.19)
• Precision at five documents was increased (from 0.31/0.24 in the baseline experiments to 0.34 in the M run)
• Per-topic comparison of MAP between B1 and M: in the M run, MAP increased for nine topics and decreased for three

Discussion

• The geographic semantic relation "in" is not used in all topics. Seven topics use "near", "in a distance of", "alongside", or "around"; five of these have a MAP of less than 0.03
• GKB mostly covers cities and does not include information on rivers, seas, lakes, etc.
• The initial result set may be difficult to improve. Highest MAP for official monolingual German experiments in GeoCLEF 2006: 0.22 (see [3]); baseline experiment B1: 0.24 MAP

Outlook

• Focus on finding even more geographically oriented term and document selection criteria
• Investigate setting the parameters T and D in a flexible way
• Consider more geographic semantic relations (other than meronymy) in term selection for blind feedback

Selected References

[1] Fredric C. Gey and Vivien Petras. Berkeley2 at GeoCLEF: Cross-language geographic information retrieval of English and German documents. In Carol Peters, editor, Results of the CLEF 2005 Cross-Language System Evaluation Campaign, Vienna, Austria, 2005.
[2] Ray Larson and Fredric C. Gey. GeoCLEF text retrieval and manual expansion approaches. In Alessandro Nardi, Carol Peters, and José Luis Vicedo, editors, Results of the CLEF 2006 Cross-Language System Evaluation Campaign, Alicante, Spain, 2006.
[3] Johannes Leveling and Dirk Veiel. Experiments on the exclusion of metonymic location names from GIR. In Carol Peters, et al., editors, Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, volume 4730 of LNCS, pages 901–904. Springer, Berlin, 2007.
[4] Johannes Leveling, Sven Hartrumpf, and Dirk Veiel. Using semantic networks for geographic information retrieval. In Carol Peters, et al., editors, Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, volume 4022 of LNCS, pages 977–986. Springer, Berlin, 2006.

Weitere ähnliche Inhalte

Kürzlich hochgeladen

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Kürzlich hochgeladen (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

Empfohlen

How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
ThinkNow
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 

Empfohlen (20)

How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024

Exploring Term Selection for Geographic Blind Feedback

  • 1. Exploring Term Selection for Geographic Blind Feedback Johannes Leveling Intelligent Information and Communication Systems (IICS) University of Hagen (FernUniversität in Hagen) 58084 Hagen, Germany firstname.lastname@fernuni-hagen.de GIR 2007 Workshop, Lisbon, Portugal
  • 2. Outline
      1 Introduction
      2 Creating a Geographical Knowledge Base: GeoNames Data; PND Data
      3 Experiments on Geographic Blind Feedback: Experimental Settings; Results; Discussion
      4 Outlook
  • 3. Blind Feedback
      General idea: improve IR performance by expanding a query
      1 The original query Qo is processed and an initial ranked result set Ro of documents is obtained
      2 D documents from Ro are selected and presumed to be relevant
      3 T terms from these documents are extracted for relevance feedback
      4 Qo is modified into the final query Qf, merging the extracted terms into the query and possibly re-weighting all terms
      5 The final result set Rf is retrieved with the query Qf
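The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy document collection, the overlap-based `toy_search` ranking, and the choice of the most frequent new terms as feedback terms are all hypothetical and are not taken from the slides, which leave term selection and re-weighting open.

```python
from collections import Counter

# Hypothetical toy collection; each document is a list of tokens.
DOCS = [
    ["flood", "rhine", "cologne", "germany"],
    ["flood", "elbe", "dresden", "germany"],
    ["football", "madrid", "spain"],
]


def toy_search(terms):
    """Rank documents by the number of tokens that match a query term (assumption)."""
    scored = [(sum(tok in terms for tok in doc), i) for i, doc in enumerate(DOCS)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [DOCS[i] for score, i in ranked if score > 0]


def blind_feedback(query_terms, search, num_docs=5, num_terms=30):
    """Expand a query with terms taken from the top-ranked documents."""
    initial_results = search(query_terms)        # step 1: Ro
    feedback_docs = initial_results[:num_docs]   # step 2: top D documents
    counts = Counter(
        tok for doc in feedback_docs for tok in doc if tok not in query_terms
    )
    # Step 3: pick T terms; selecting by raw frequency is an assumption.
    expansion = [term for term, _ in counts.most_common(num_terms)]
    final_query = list(query_terms) + expansion  # step 4: Qf, no re-weighting
    return final_query, search(final_query)      # step 5: Rf


final_query, final_results = blind_feedback(["flood"], toy_search, num_docs=2, num_terms=2)
```

In this sketch D and T correspond to `num_docs` and `num_terms`; a geographic variant would replace the frequency-based selection in step 3 with a filter over location names.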
  • 4. Application of Blind Feedback to GIR (1/2)
      • Gey and Larson (2): an improvement on the order of 53% to 72% MAP (mean average precision) was achieved for some monolingual German GIR topics on the GeoCLEF 2006 data (using T = 30, D = 5); no significant improvement for English
      • Gey and Petras (1): “the most improved queries seem to add mostly proper names and word variations and very few irrelevant words that won’t distort the search towards another direction” and “blind feedback improves precision, but it seems to do so for only a particular kind of query”
  • 5. Application of Blind Feedback to GIR (2/2)
      • Blind feedback (BF) is a method originating in (and intended for) ad-hoc retrieval
      → BF does not yet reflect the geographic orientation of GIR
      → novel methods for document and term selection are required, preferably based on geographic knowledge
      → BF does not generally increase performance significantly, even in standard IR
      → application to GIR without adaptations seems questionable
  • 6. The Geographical Knowledge Base (GKB)
      • Avoid ambiguities for location names; sacrifice coverage (i.e. focus on important places)
      → Create a small geographic knowledge base (GKB) with meronymy relations (part-whole relations)
      • GKB based on two resources:
          • Linking between Wikipedia articles and authority records for persons (PND), and
          • GeoNames data for the largest cities world-wide
  • 7. GeoNames data
      • GeoNames provides data for populated places world-wide with more than 1,000, 5,000, or 15,000 inhabitants
      • Entries contain geographic codes for the continent, country, and administrative divisions
      • Data for cities with more than 5,000 inhabitants → meronymy relations for 41,228 entries
      • Names are translated by utilizing the Wikipedia linking between articles in English and German
      • Example: Nuenen is a populated place in North Brabant, in The Netherlands, in Europe
      → meronym(Nuenen, North Brabant),
      → meronym(North Brabant, The Netherlands),
      → meronym(The Netherlands, Europe)
      → A place is important if it is highly populated
  • 8. PND Data
      • Wikipedia articles are linked with authority records for persons from the PND (Personennamendatei)
      • PND contains information such as a person’s name, his or her place and date of birth, place and date of death, and profession
      • Specification of a place often encodes meronymy information
      • 152,650 PND entries → 27,734 unique meronymy relations
      • Example: Edsger Wybe Dijkstra was born in Rotterdam, Niederlande/the Netherlands in 1930; died in Nuenen, Niederlande/the Netherlands in 2002
      → meronym(Rotterdam, The Netherlands),
      → meronym(Nuenen, The Netherlands)
      → A place is important if some well-known person was born or died there
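As a rough illustration of how the GeoNames and PND examples above could both feed one set of meronymy relations, the sketch below derives direct part-whole pairs from a containment chain and from "place, country" strings. The function names and the simple comma-splitting rule are assumptions made for this example; the slides do not describe the actual extraction procedure.

```python
def pairs_from_chain(chain):
    """Turn a containment chain (smallest place first) into direct meronym pairs."""
    return set(zip(chain, chain[1:]))


def pair_from_place_string(place):
    """Split a 'City, Country' string into a (part, whole) pair; None if it does not fit."""
    if "," not in place:
        return None
    part, whole = (piece.strip() for piece in place.split(",", 1))
    return (part, whole) if part and whole else None


# GeoNames-style chain from the Nuenen example.
gkb = pairs_from_chain(["Nuenen", "North Brabant", "The Netherlands", "Europe"])

# PND-style place strings; the repeated entry collapses into one relation.
for place in ["Rotterdam, The Netherlands", "Nuenen, The Netherlands", "Rotterdam, The Netherlands"]:
    pair = pair_from_place_string(place)
    if pair is not None:
        gkb.add(pair)
```

Storing the relations in a set is what makes repeated place specifications collapse into unique relations, in the spirit of the reduction from PND entries to unique meronymy relations.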
  • 9. Towards Less Ambiguity in Geographic Resources
      GeoNames cities (pop. > X)
      characteristic          X=1,000    X=5,000    X=15,000
      unique loc. names       124,315    83,680     57,172
      ambiguous loc. names    22,616     13,133     7,551
      senses per loc. name    1.587      1.455      1.345
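One plausible way to compute the three characteristics in the table is sketched below. The definitions are assumptions, since the slide does not spell them out: a name counts as ambiguous if it refers to more than one place, and senses per name is the mean number of distinct places per name. The entries are invented toy data.

```python
from collections import defaultdict


def ambiguity_stats(entries):
    """Return (unique names, ambiguous names, mean senses per name).

    `entries` is a list of (location name, place id) pairs (hypothetical format).
    """
    senses = defaultdict(set)
    for name, place_id in entries:
        senses[name].add(place_id)
    unique = len(senses)
    ambiguous = sum(1 for ids in senses.values() if len(ids) > 1)
    mean_senses = sum(len(ids) for ids in senses.values()) / unique if unique else 0.0
    return unique, ambiguous, mean_senses


# Toy data: "Paris" refers to two different places.
stats = ambiguity_stats([("Paris", 1), ("Paris", 2), ("Berlin", 3), ("Hagen", 4)])
```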
  • 10. The Meronymy Predicate
      Transitive meronymy predicate mero? for two location names:
      mero?(L1, L2) := true if L1 is a meronym of L2; false otherwise
      • Example:
          mero?(Berlin, Germany) returns true
          mero?(Hong Kong, France) returns false
      → Allows term selection in BF based on meronymy information in GKB
      → Geographic Blind Feedback
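A transitive predicate of this kind can be sketched as a reachability check over direct part-whole pairs. The helper name `make_mero` and the toy relations are hypothetical; the slides do not say how mero? is implemented.

```python
def make_mero(direct_pairs):
    """Build a transitive mero?(l1, l2) predicate from direct (part, whole) pairs."""
    wholes = {}
    for part, whole in direct_pairs:
        wholes.setdefault(part, set()).add(whole)

    def mero(l1, l2):
        # Follow part-of links upwards from l1 until l2 is found or nothing is left.
        seen, stack = set(), [l1]
        while stack:
            current = stack.pop()
            for whole in wholes.get(current, ()):
                if whole == l2:
                    return True
                if whole not in seen:
                    seen.add(whole)
                    stack.append(whole)
        return False

    return mero


# Toy relations for illustration only.
mero = make_mero([
    ("Berlin", "Germany"),
    ("Germany", "Europe"),
    ("France", "Europe"),
    ("Hong Kong", "China"),
])
```

With these toy relations, `mero("Berlin", "Europe")` holds through the intermediate step via Germany, which is what transitivity adds over a plain lookup of direct pairs.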
  • 11. Experimental Setup
      • GeoCLEF documents: 275,000 German newspaper articles from Frankfurter Rundschau, Schweizerische Depeschenagentur, and Der Spiegel from the years 1994 and 1995
      • GeoCLEF topics: 25 topics from 2006 with a title, a short description, and a narrative part
      • GIRSA system: setup similar to previous GIR experiments on GeoCLEF data (4; 3)
  • 12. Experimental Settings for Retrieval Experiments (D=5)
      L: only location names are selected from the top ranked documents as blind feedback terms
      M: location names are filtered utilizing the mero? predicate, keeping meronyms of a search term in the original query as BF terms
      H: a location name is filtered from the BF terms if there is an inverse meronymy relation to a search term in the original query (holonym)
      B1: (Baseline) no blind feedback; query terms are associated with static weights
      B2: (Baseline) no blind feedback; bag-of-words query; query terms are not weighted
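The M and H settings can be read as two filters over candidate location names. The sketch below is one interpretation, not the GIRSA implementation: M keeps a candidate only if it is a meronym of some query term, and H drops a candidate if it is a holonym of some query term. The toy relation set stands in for the GKB and already contains the transitive pairs.

```python
# Hypothetical, already transitively closed toy relations (part, whole).
TOY_MERONYMS = {
    ("Berlin", "Germany"),
    ("Munich", "Germany"),
    ("Germany", "Europe"),
    ("Berlin", "Europe"),
    ("Munich", "Europe"),
}


def mero(l1, l2):
    """Toy stand-in for the mero? predicate."""
    return (l1, l2) in TOY_MERONYMS


def select_m(candidates, query_terms):
    """Setting M: keep candidates that are meronyms of a query term."""
    return [c for c in candidates if any(mero(c, q) for q in query_terms)]


def select_h(candidates, query_terms):
    """Setting H: drop candidates that are holonyms of a query term."""
    return [c for c in candidates if not any(mero(q, c) for q in query_terms)]


candidates = ["Berlin", "Munich", "Europe", "Paris"]  # setting L: all location names
m_terms = select_m(candidates, ["Germany"])
h_terms = select_h(candidates, ["Germany"])
```

Under this reading M is the stricter filter: an unrelated name such as Paris survives H but not M.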
  • 13. Results for Retrieval Experiments (1/2)
      [Performance plot: MAP (y-axis, 0.22 to 0.25) against the number of terms T (x-axis, 5 to 40) for the runs B1, L, H, and M]
  • 14. Results for Retrieval Experiments (2/2)
      Topic      B1      L       H       M       B2
      GC028      0.38    0.24    0.22    0.41    0.28
      GC030 ∗    0.81    0.65    0.66    0.63    0.71
      GC032      0.60    0.62    0.62    0.70    0.49
      GC039      0.00    0.03    0.03    0.01    0.00
      GC044      0.33    0.33    0.33    0.33    0.33
      GC048      0.87    0.89    0.89    0.66    0.85
      MAP        0.24    0.23    0.23    0.24    0.19
      P@5        0.31    0.32    0.31    0.34    0.24
      P@10       0.27    0.24    0.24    0.29    0.21
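For readers unfamiliar with the measures in the table, the sketch below shows the standard textbook definitions of precision at k, average precision, and MAP. It is not the evaluation tool used for the experiments, and the ranked list and relevance judgments are invented.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top k ranked documents that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at the rank of each relevant document found."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average the per-topic AP over (ranked list, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Invented example: relevant documents appear at ranks 1 and 3.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
ap = average_precision(ranked, relevant)
p_at_2 = precision_at_k(ranked, relevant, 2)
```

Here the average precision is (1/1 + 2/3) / 2, and MAP is simply this value averaged over all topics of a run.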
  • 15. Discussion of Results
      • MAP did not change considerably when using BF compared to the upper baseline B1 (0.24)
      • The BF strategy M (selecting meronyms) clearly outperforms the second baseline B2 (0.24 vs. 0.19)
      • Precision at five documents was increased (from 0.31/0.24 in the baseline experiments to 0.34 in the M-run)
      • Per-topic comparison of MAP between B1 and M: MAP was increased for nine, decreased for three topics in M-run
  • 16. Discussion
      • The geographic semantic relation "in" is not used in all topics: seven topics use "near", "in a distance of", "alongside", or "around", and five of these have a MAP of less than 0.03
      • GKB mostly covers cities and does not include information on rivers, seas, lakes, etc.
      • The initial result set may be difficult to improve. Highest MAP for official monolingual German experiments in GeoCLEF 2006: 0.22 (see (3)); baseline experiment B1: 0.24 MAP
  • 17. Outlook
      • Focus on finding even more geographically oriented term and document selection criteria
      • Investigate setting the parameters T and D in a flexible way
      • Consider more geographic semantic relations (other than meronymy) in term selection for blind feedback
  • 18. Selected References
      [1] Fredric C. Gey and Vivien Petras. Berkeley2 at GeoCLEF: Cross-language geographic information retrieval of English and German documents. In Carol Peters, editor, Results of the CLEF 2005 Cross-Language System Evaluation Campaign, Vienna, Austria, 2005.
      [2] Ray Larson and Fredric C. Gey. GeoCLEF text retrieval and manual expansion approaches. In Alessandro Nardi, Carol Peters, and José Luis Vicedo, editors, Results of the CLEF 2006 Cross-Language System Evaluation Campaign, Alicante, Spain, 2006.
      [3] Johannes Leveling and Dirk Veiel. Experiments on the exclusion of metonymic location names from GIR. In Carol Peters, et al., editors, Evaluation of Multilingual and Multi-modal Information Retrieval: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, volume 4730 of LNCS, pages 901–904. Springer, Berlin, 2007.
      [4] Johannes Leveling, Sven Hartrumpf, and Dirk Veiel. Using semantic networks for geographic information retrieval. In Carol Peters, et al., editors, Accessing Multilingual Information Repositories: 6th Workshop of the Cross-Language Evaluation Forum, CLEF 2005, volume 4022 of LNCS, pages 977–986. Springer, Berlin, 2006.