SlideShare ist ein Scribd-Unternehmen logo
1 von 21
Scenario-Driven Selection and Exploitation of
  Semantic Data for Optimal Named Entity
              Disambiguation


 Panos Alexopoulos, Carlos Ruiz, Jose Manuel Gomez Perez

  1st Semantic Web and Information Extraction Workshop,

            Galway, Ireland, October 9th, 2012
Agenda

   Introduction
       Problem Definition and Paper
         Focus
       Approach Overview and Rationale

   Proposed Disambiguation Framework
      Disambiguation Evidence Model
      Entity Disambiguation Process

   Framework Evaluation
      Evaluation Process
      Evaluation Results

   Conclusions and Future Work

                                          2
Introduction
                                                                      Problem Definition
Entity Resolution & Disambiguation

 ● Named entity resolution involves detecting mentions of named entities (e.g. people,
   organizations or locations) within texts and mapping them to their corresponding
   entities in a given knowledge source.

 ● One important challenge in this task is the correct disambiguation of the detected
   entities.

 ● For example:

      ● “Siege of Tripolitsa took place in Tripoli with Theodoros Kolokotronis being the
        leader of the Greeks. This event marked an early victory for the fight for
        independence from Turkey but it was also a massacre against the Muslim and
        Jewish population of the city”.

      ● The term “Tripoli” here refers to http://dbpedia.org/resource/Tripoli,_Greece
        but it can be mistaken, for example, with Tripoli in Libya or that in Lebanon.


                                                                                           3
Introduction
                                                           Disambiguation Approaches

● The majority of disambiguation approaches rely on the strong contextual
  hypothesis that terms with similar meanings are often used in similar contexts.

● The role of these contexts is typically played by already annotated documents (e.g.
  wikipedia articles) which are used to train term classifiers.

● These classifiers link a term to its correct meaning entity, based on the similarity
  between the term’s textual context and the contexts of its potential entities.

● Some more recent approaches utilize semantic structures in order to determine this
  similarity in a semantic way.

● The effectiveness of these latter approaches is highly dependent on:

    ● The availability of comprehensive semantic information.

    ● The degree of alignment between the content of the texts to be
      disambiguated and the semantic data to be used.
                                                                                             4
Introduction
                                                        Alignment and its Importance

● Alignment means that the ontology’s elements should cover the domain(s) of the
  texts to be disambiguated but should not contain other additional elements that:
   ● Do not belong to the domain.
   ● Do belong to it but do not appear in the texts.

● For example assume the text “Ronaldo scored two goals for Real Madrid“ from a
  contemporary football match description.

● To disambiguate the term “Ronaldo” using an ontology, the only contextual
  evidence that can be used is the entity “Real Madrid”.

● Yet there are two players with that name that are semantically related to Real:
    ● Cristiano Ronaldo (current player)
    ● Ronaldo Luis Nazario de Lima (former player).

● This means that if both relations are considered then the term will not be
  disambiguated.

                                                                                          5
Introduction
                                                             Towards Better Alignment

● In the previous example the fact that the text describes a contemporary football
  match suggests that, in general, the relation between a team and its former players
  is not expected to appear in it.

● Thus, for such texts, it would make sense to ignore this relation in order to achieve
  more accurate disambiguation.

● Based on this observation, we make two claims:

    ● That there are certain scenarios where there is available a priori knowledge
      about what entities and relations are expected to be present in the text.

    ● That this knowledge can be exploited for better alignment between semantic
      information and content leading to more effective disambiguation.

● To verify these claims we define an entity disambiguation framework that can
  perform better disambiguation in such scenarios.

                                                                                           6
Proposed Framework
                                                                                   Approach

● We target the task of entity disambiguation based on the intuition that a given
  ontological entity is more likely to represent the meaning of an ambiguous term
  when there are many ontologically related to it entities in the text.

● E.g. in the example text the entities “Siege of Tripolitsa” and “Theodoros
  Kolokotronis” indicate that the term “Tripoli” refers to the city of Greece.

● These evidential entities are derived from one or more domain ontologies.

● However, which entities and to what extent may serve as evidence in a given
  application scenario depends on the domain and expected content of the texts.

● For that, the key ability our framework provides to its users is to construct, in a
  semi-automatic manner, semantic evidence models for specific disambiguation
  scenarios and use them to perform entity disambiguation within them.




                                                                                          7
Proposed Framework
                                                             Framework Components

● A Disambiguation Evidence Model that contains the semantic entities that may
  serve as disambiguation evidence for the scenario’s target entities in the given
  scenario.

    ● Each pair of a target entity and an evidential one is accompanied by a degree
      that quantifies the latter’s evidential power for the given target entity.

● A Disambiguation Evidence Model Construction Process that builds, in a semi-
  automatic manner, a disambiguation evidence model for a given scenario.

● An Entity Disambiguation Process that uses the evidence model to detect and
  extract from a given text terms that refer to the scenario’s target entities.

    ● Each term is linked to one or more possible entity uris along with a confidence
      score calculated for each of them.

    ● The entity with the highest confidence should be the one the term actually refers
      to.
                                                                                          8
Proposed Framework
                                                       Disambiguation Evidence Model

● Defines for each ontology entity which other instances and to what extent should be
  used as evidence towards its correct meaning interpretation.

● It consists of entity pairs where a particular entity provides quantified evidence for a
  another one.




                                                                                             9
Proposed Framework
                                                         Evidence Model Construction

● Construction of the evidence model depends on the characteristics of the domain and
  the texts.

● The first step of the construction is manual and involves:

    ● The identification of the concepts whose instances we wish to disambiguate
      (e.g. locations)

    ● The determination, for each of these concepts, of the related to them concepts
      whose instances may serve as contextual disambiguation evidence/

              ● For example, in texts that describe historical events, some concepts
                whose instances may act as location evidence are related locations,
                historical events, and historical groups and persons.

    ● The identification, for each pair of evidence and target concept, of the relation
      paths that links them.

                                                                                          10
Proposed Framework
                                                             Evidence Model Construction

● The result of this first step is a table like the following ones:




                                                                                        11
Proposed Framework
                                                        Evidence Model Construction

● Based on these tables, the second step of the construction is automatic and involves
  the generation of the target-evidence entity pairs along with a disambiguation
  evidential strength.

● This strength is inversely proportional to the number of different same-name target
  entities a given evidential entity provides evidence for.

● For example, “Getafe” provides evidence for “Pedro Leon” to a strength of 0.5
  because it has another player called Pedro.




                                                                                         12
Proposed Framework
                                                              Entity Resolution Process

● Step 1: We extract from the text the terms that possibly refer to the target entities as
  well as those that refer to their respective evidential entities.

    ● Extraction is performed with Knowledge Tagger, an in-house tool based on
      GATE.

● Step 2: Using the evidential entities we compute for each extracted target entity term
  the confidence that it refers to a particular target entity.

    ● The target entity with the highest confidence is expected to be the correct one.




                                                                                             13
Proposed Framework
       Disambiguation Example




Correct disambiguation
 of the term Atletico




                                    14
Framework Evaluation
                                                                    Evaluation Process
Description

 ● Two disambiguation scenarios:
     ● Football match descriptions.
     ● Texts describing military conflicts.

 ● DBPedia as a source of semantic information in both cases.

 ● Disambiguation effectiveness measured through precision and recall.

 ● Evaluation results were compared to those achieved by two publicly available
   semantic annotation and disambiguation systems;
     ● DBPedia Spotlight
     ● AIDA

 ● The two systems:
     ● Use also DBPedia as a knowledge source.
     ● Provide the users the capability to select the classes whose instances are to
        be included in the process.
                                                                                       15
Framework Evaluation
                                                                      Evaluation Results
Football Match Descriptions Scenario

 ● 50 texts describing football matches.

 ● E.g. “It's the 70th minute of the game and after a magnificent pass by Pedro, Messi
   managed to beat Claudio Bravo. Barcelona now leads 1-0 against Real."




Disambiguation Results




                                                                                         16
Framework Evaluation
                                                                       Evaluation Results
Military Conflict Texts Scenario

 ● 50 historical texts describing military conflicts.

 ● E.g. “The Siege of Augusta was a significant battle of the American Revolution.
   Fought for control of Fort Cornwallis, a British fort near Augusta, the battle was a
   major victory for the Patriot forces of Lighthorse Harry Lee and a stunning reverse to
   the British and Loyalist forces in the South”.




Disambiguation Results




                                                                                            17
Conclusions and Future Work
                                                                                 Key Points

● We proposed a novel framework for optimizing named entity disambiguation in well-
  defined and adequately constrained scenarios through the customized selection
  and exploitation of semantic data.

● Our purpose was not to build another generic disambiguation system but rather a
  reusable framework that can:

    ● Be relatively easily adapted to the particular characteristics of the domain and
      application scenario at hand.

    ● Exploit these characteristics to increase the effectiveness of the
      disambiguation process.

● The key aspect of the framework is the semi-automatic process it defines for
  selecting the optimal evidence model for the scenario at hand.




                                                                                         18
Conclusions and Future Work
                                                                                 Key Points

● Comparative evaluation in two specific scenarios verified the framework’s
  superiority over existing approaches that are designed to work in open domains and
  unconstrained scenarios.

● This verified our hypothesis that the scenario adaptation capabilities of such generic
  disambiguation systems can be inadequate in certain scenarios.

● Of course, the framework’s usability and effectiveness is directly proportional to the
  content specificity of the texts to be disambiguated and the availability and
  quality of a priori semantic knowledge about their content.

    ● The greater these two parameters are, the more applicable is our approach
      and the more effective the disambiguation is expected to be.

    ● The opposite is true as the texts become more generic and the information we
      have out about them more scarce.



                                                                                           19
Conclusions and Future Work
                                                                Framework Extensions


● Fully automated construction of the disambiguation evidence model.

    ● Challenge here is how to automatically identify the text’s domain/topic.

● Combination with statistical methods for cases where available domain semantic
  information is incomplete.

    ● Challenge here is how to select the optimal ratio of ontological evidence v.s.
      statistical one.

● Development of tool to enable users to dynamically build such models out of existing
  semantic data and use them for disambiguation purposes




                                                                                         20
Thank you!
                                                                                       Contact iSOCO


                                            Dr. Panos Alexopoulos
                                               Senior Researcher
                                           palexopoulos@isoco.com




                                                Questions?


Barcelona                   Madrid                          Pamplona              Valencia
Tel +34 935 677 200         Tel +34 913 349 797             Tel +34 948 102 408   Tel +34 963 467 143
Edificio Testa A            Av. del Partenón, 16-18, 1º7ª   Parque Tomás          Oficina 107
C/ Alcalde Barnils, 64-68   Campo de las Naciones           Caballero, 2, 6º-4ª   C/ Prof. Beltrán Báguena, 4
St. Cugat del Vallès        28042 Madrid                    31006 Pamplona        46009 Valencia
08174 Barcelona




                                                                                                                21

Weitere ähnliche Inhalte

Ähnlich wie Scenario-Driven Selection and Exploitation of Semantic Data for Optimal Named Entity Disambiguation

Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...
Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...
Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...Panos Alexopoulos
 
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentProposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentJorge Barreto
 
From Multiple Data Models To One Modeling Language V1
From Multiple Data Models To One Modeling Language  V1From Multiple Data Models To One Modeling Language  V1
From Multiple Data Models To One Modeling Language V1Andries_vanRenssen
 
Domain Driven Design Quickly
Domain Driven Design QuicklyDomain Driven Design Quickly
Domain Driven Design QuicklyMariam Hakobyan
 
A Survey of Object Oriented Programming LanguagesMaya Hris.docx
A Survey of Object Oriented Programming LanguagesMaya Hris.docxA Survey of Object Oriented Programming LanguagesMaya Hris.docx
A Survey of Object Oriented Programming LanguagesMaya Hris.docxdaniahendric
 
Conceptual Interoperability and Biomedical Data
Conceptual Interoperability and Biomedical DataConceptual Interoperability and Biomedical Data
Conceptual Interoperability and Biomedical DataJim McCusker
 
Towards Vagueness-Aware Semantic Data
Towards Vagueness-Aware Semantic DataTowards Vagueness-Aware Semantic Data
Towards Vagueness-Aware Semantic DataPanos Alexopoulos
 
Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction  Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction dannyijwest
 
Jarrar: Stepwise Methodologies for Developing Ontologies
Jarrar: Stepwise Methodologies for Developing OntologiesJarrar: Stepwise Methodologies for Developing Ontologies
Jarrar: Stepwise Methodologies for Developing OntologiesMustafa Jarrar
 
SWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professionalSWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professionalgowthamnaidu0986
 
Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction                Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction dannyijwest
 
Jarrar.lecture notes.aai.2011s.ontology part4_methodologies
Jarrar.lecture notes.aai.2011s.ontology part4_methodologiesJarrar.lecture notes.aai.2011s.ontology part4_methodologies
Jarrar.lecture notes.aai.2011s.ontology part4_methodologiesPalGov
 
Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts Saeedeh Shekarpour
 
Semantic Modeling for Information Federation
Semantic Modeling for Information FederationSemantic Modeling for Information Federation
Semantic Modeling for Information FederationCory Casanave
 
Trust evaluation using an improved context
Trust evaluation using an improved contextTrust evaluation using an improved context
Trust evaluation using an improved contextijbiss
 
Trust Evaluation Using an Improved Context Similarity Measurement
Trust Evaluation Using an Improved Context Similarity MeasurementTrust Evaluation Using an Improved Context Similarity Measurement
Trust Evaluation Using an Improved Context Similarity Measurementijbiss
 
Document Summarization
Document SummarizationDocument Summarization
Document SummarizationPratik Kumar
 

Ähnlich wie Scenario-Driven Selection and Exploitation of Semantic Data for Optimal Named Entity Disambiguation (20)

Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...
Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...
Learning Vague Knowledge From Socially Generated Content in an Enterprise Fra...
 
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL DevelopmentProposal of an Ontology Applied to Technical Debt on PL/SQL Development
Proposal of an Ontology Applied to Technical Debt on PL/SQL Development
 
Ontologies
OntologiesOntologies
Ontologies
 
From Multiple Data Models To One Modeling Language V1
From Multiple Data Models To One Modeling Language  V1From Multiple Data Models To One Modeling Language  V1
From Multiple Data Models To One Modeling Language V1
 
Domain Driven Design Quickly
Domain Driven Design QuicklyDomain Driven Design Quickly
Domain Driven Design Quickly
 
A Survey of Object Oriented Programming LanguagesMaya Hris.docx
A Survey of Object Oriented Programming LanguagesMaya Hris.docxA Survey of Object Oriented Programming LanguagesMaya Hris.docx
A Survey of Object Oriented Programming LanguagesMaya Hris.docx
 
Conceptual Interoperability and Biomedical Data
Conceptual Interoperability and Biomedical DataConceptual Interoperability and Biomedical Data
Conceptual Interoperability and Biomedical Data
 
Towards Vagueness-Aware Semantic Data
Towards Vagueness-Aware Semantic DataTowards Vagueness-Aware Semantic Data
Towards Vagueness-Aware Semantic Data
 
Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction  Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction
 
Jarrar: Stepwise Methodologies for Developing Ontologies
Jarrar: Stepwise Methodologies for Developing OntologiesJarrar: Stepwise Methodologies for Developing Ontologies
Jarrar: Stepwise Methodologies for Developing Ontologies
 
SWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professionalSWSN UNIT-3.pptx we can information about swsn professional
SWSN UNIT-3.pptx we can information about swsn professional
 
Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction                Microposts Ontology Construction Via Concept Extraction
Microposts Ontology Construction Via Concept Extraction
 
Jarrar.lecture notes.aai.2011s.ontology part4_methodologies
Jarrar.lecture notes.aai.2011s.ontology part4_methodologiesJarrar.lecture notes.aai.2011s.ontology part4_methodologies
Jarrar.lecture notes.aai.2011s.ontology part4_methodologies
 
Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts Metrics for Evaluating Quality of Embeddings for Ontological Concepts
Metrics for Evaluating Quality of Embeddings for Ontological Concepts
 
Semantic Modeling for Information Federation
Semantic Modeling for Information FederationSemantic Modeling for Information Federation
Semantic Modeling for Information Federation
 
Trust evaluation using an improved context
Trust evaluation using an improved contextTrust evaluation using an improved context
Trust evaluation using an improved context
 
Trust Evaluation Using an Improved Context Similarity Measurement
Trust Evaluation Using an Improved Context Similarity MeasurementTrust Evaluation Using an Improved Context Similarity Measurement
Trust Evaluation Using an Improved Context Similarity Measurement
 
Document Summarization
Document SummarizationDocument Summarization
Document Summarization
 
Ma
MaMa
Ma
 
Ontology
OntologyOntology
Ontology
 

Mehr von Panos Alexopoulos

How many truths can you handle?
How many truths can you handle?How many truths can you handle?
How many truths can you handle?Panos Alexopoulos
 
Troubleshooting and Optimizing Named Entity Resolution Systems in the Industry
Troubleshooting and Optimizing Named Entity Resolution Systems in the IndustryTroubleshooting and Optimizing Named Entity Resolution Systems in the Industry
Troubleshooting and Optimizing Named Entity Resolution Systems in the IndustryPanos Alexopoulos
 
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesA Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesPanos Alexopoulos
 
Towards Purposeful Reuse of Semantic Datasets Through Goal-Driven Summarization
Towards Purposeful Reuse of Semantic Datasets Through Goal-Driven SummarizationTowards Purposeful Reuse of Semantic Datasets Through Goal-Driven Summarization
Towards Purposeful Reuse of Semantic Datasets Through Goal-Driven SummarizationPanos Alexopoulos
 
Vagueness in Semantic Information Management
Vagueness in Semantic Information ManagementVagueness in Semantic Information Management
Vagueness in Semantic Information ManagementPanos Alexopoulos
 

Mehr von Panos Alexopoulos (7)

How many truths can you handle?
How many truths can you handle?How many truths can you handle?
How many truths can you handle?
 
Mind the Semantic Gap
Mind the Semantic GapMind the Semantic Gap
Mind the Semantic Gap
 
Troubleshooting and Optimizing Named Entity Resolution Systems in the Industry
Troubleshooting and Optimizing Named Entity Resolution Systems in the IndustryTroubleshooting and Optimizing Named Entity Resolution Systems in the Industry
Troubleshooting and Optimizing Named Entity Resolution Systems in the Industry
 
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesA Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
 
Towards Purposeful Reuse of Semantic Datasets Through Goal-Driven Summarization
Towards Purposeful Reuse of Semantic Datasets Through Goal-Driven SummarizationTowards Purposeful Reuse of Semantic Datasets Through Goal-Driven Summarization
Towards Purposeful Reuse of Semantic Datasets Through Goal-Driven Summarization
 
Vagueness in Semantic Information Management
Vagueness in Semantic Information ManagementVagueness in Semantic Information Management
Vagueness in Semantic Information Management
 
Semantic Tag Recommendation
Semantic Tag RecommendationSemantic Tag Recommendation
Semantic Tag Recommendation
 

Kürzlich hochgeladen

VIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call Girladitipandeya
 
CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service 🧦
CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service  🧦CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service  🧦
CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service 🧦anilsa9823
 
Collective Mining | Corporate Presentation - May 2024
Collective Mining | Corporate Presentation - May 2024Collective Mining | Corporate Presentation - May 2024
Collective Mining | Corporate Presentation - May 2024CollectiveMining1
 
BDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
SME IPO and sme ipo listing consultants .pptx
SME IPO and sme ipo listing consultants .pptxSME IPO and sme ipo listing consultants .pptx
SME IPO and sme ipo listing consultants .pptxindia IPO
 
Q3 FY24 Earnings Conference Call Presentation
Q3 FY24 Earnings Conference Call PresentationQ3 FY24 Earnings Conference Call Presentation
Q3 FY24 Earnings Conference Call PresentationSysco_Investors
 
✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663Call Girls Mumbai
 
VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...
VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...
VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...SUHANI PANDEY
 
Pakistani Call girls in Ajman +971563133746 Ajman Call girls
Pakistani Call girls in Ajman +971563133746 Ajman Call girlsPakistani Call girls in Ajman +971563133746 Ajman Call girls
Pakistani Call girls in Ajman +971563133746 Ajman Call girlsgwenoracqe6
 
Teck Investor Presentation, April 24, 2024
Teck Investor Presentation, April 24, 2024Teck Investor Presentation, April 24, 2024
Teck Investor Presentation, April 24, 2024TeckResourcesLtd
 

Kürzlich hochgeladen (20)

Vip Call Girls South Ex ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls South Ex ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls South Ex ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls South Ex ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Russian Call Girls Rohini Sector 3 💓 Delhi 9999965857 @Sabina Modi VVIP MODEL...
Russian Call Girls Rohini Sector 3 💓 Delhi 9999965857 @Sabina Modi VVIP MODEL...Russian Call Girls Rohini Sector 3 💓 Delhi 9999965857 @Sabina Modi VVIP MODEL...
Russian Call Girls Rohini Sector 3 💓 Delhi 9999965857 @Sabina Modi VVIP MODEL...
 
Sensual Moments: +91 9999965857 Independent Call Girls Noida Delhi {{ Monika}...
Sensual Moments: +91 9999965857 Independent Call Girls Noida Delhi {{ Monika}...Sensual Moments: +91 9999965857 Independent Call Girls Noida Delhi {{ Monika}...
Sensual Moments: +91 9999965857 Independent Call Girls Noida Delhi {{ Monika}...
 
Rohini Sector 15 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 15 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 15 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 15 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls Abids high-profile Call Girl
 
CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service 🧦
CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service  🧦CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service  🧦
CALL ON ➥8923113531 🔝Call Girls Vineet Khand Lucknow best Night Fun service 🧦
 
Collective Mining | Corporate Presentation - May 2024
Collective Mining | Corporate Presentation - May 2024Collective Mining | Corporate Presentation - May 2024
Collective Mining | Corporate Presentation - May 2024
 
BDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Hari Nagar Delhi >༒8448380779 Escort Service
 
SME IPO and sme ipo listing consultants .pptx
SME IPO and sme ipo listing consultants .pptxSME IPO and sme ipo listing consultants .pptx
SME IPO and sme ipo listing consultants .pptx
 
(‿ˠ‿) Independent Call Girls Laxmi Nagar 👉 9999965857 👈 Delhi : 9999 Cash Pa...
(‿ˠ‿) Independent Call Girls Laxmi Nagar 👉 9999965857 👈 Delhi  : 9999 Cash Pa...(‿ˠ‿) Independent Call Girls Laxmi Nagar 👉 9999965857 👈 Delhi  : 9999 Cash Pa...
(‿ˠ‿) Independent Call Girls Laxmi Nagar 👉 9999965857 👈 Delhi : 9999 Cash Pa...
 
Call Girls 🫤 East Of Kailash ➡️ 9999965857 ➡️ Delhi 🫦 Russian Escorts FULL ...
Call Girls 🫤 East Of Kailash ➡️ 9999965857  ➡️ Delhi 🫦  Russian Escorts FULL ...Call Girls 🫤 East Of Kailash ➡️ 9999965857  ➡️ Delhi 🫦  Russian Escorts FULL ...
Call Girls 🫤 East Of Kailash ➡️ 9999965857 ➡️ Delhi 🫦 Russian Escorts FULL ...
 
Vip Call Girls Vasant Kunj ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Vasant Kunj ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Vasant Kunj ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Vasant Kunj ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
(👉゚9999965857 ゚)👉 Russian Call Girls Aerocity 👉 Delhi 👈 : 9999 Cash Payment F...
(👉゚9999965857 ゚)👉 Russian Call Girls Aerocity 👉 Delhi 👈 : 9999 Cash Payment F...(👉゚9999965857 ゚)👉 Russian Call Girls Aerocity 👉 Delhi 👈 : 9999 Cash Payment F...
(👉゚9999965857 ゚)👉 Russian Call Girls Aerocity 👉 Delhi 👈 : 9999 Cash Payment F...
 
(INDIRA) Call Girl Kashmir Call Now 8617697112 Kashmir Escorts 24x7
(INDIRA) Call Girl Kashmir Call Now 8617697112 Kashmir Escorts 24x7(INDIRA) Call Girl Kashmir Call Now 8617697112 Kashmir Escorts 24x7
(INDIRA) Call Girl Kashmir Call Now 8617697112 Kashmir Escorts 24x7
 
Call Girls 🫤 Hauz Khas ➡️ 9999965857 ➡️ Delhi 🫦 Russian Escorts FULL ENJOY
Call Girls 🫤 Hauz Khas ➡️ 9999965857  ➡️ Delhi 🫦  Russian Escorts FULL ENJOYCall Girls 🫤 Hauz Khas ➡️ 9999965857  ➡️ Delhi 🫦  Russian Escorts FULL ENJOY
Call Girls 🫤 Hauz Khas ➡️ 9999965857 ➡️ Delhi 🫦 Russian Escorts FULL ENJOY
 
Q3 FY24 Earnings Conference Call Presentation
Q3 FY24 Earnings Conference Call PresentationQ3 FY24 Earnings Conference Call Presentation
Q3 FY24 Earnings Conference Call Presentation
 
✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663
✂️ 👅 Independent Goregaon Escorts With Room Vashi Call Girls 💃 9004004663
 
VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...
VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...
VVIP Pune Call Girls Handewadi WhatSapp Number 8005736733 With Elite Staff An...
 
Pakistani Call girls in Ajman +971563133746 Ajman Call girls
Pakistani Call girls in Ajman +971563133746 Ajman Call girlsPakistani Call girls in Ajman +971563133746 Ajman Call girls
Pakistani Call girls in Ajman +971563133746 Ajman Call girls
 
Teck Investor Presentation, April 24, 2024
Teck Investor Presentation, April 24, 2024Teck Investor Presentation, April 24, 2024
Teck Investor Presentation, April 24, 2024
 

Scenario-Driven Selection and Exploitation of Semantic Data for Optimal Named Entity Disambiguation

  • 1. Scenario-Driven Selection and Exploitation of Semantic Data for Optimal Named Entity Disambiguation Panos Alexopoulos, Carlos Ruiz, Jose Manuel Gomez Perez 1st Semantic Web and Information Extraction Workshop, Galway, Ireland, October 9th, 2012
  • 2. Agenda  Introduction  Problem Definition and Paper Focus  Approach Overview and Rationale  Proposed Disambiguation Framework  Disambiguation Evidence Model  Entity Disambiguation Process  Framework Evaluation  Evaluation Process  Evaluation Results  Conclusions and Future Work 2
  • 3. Introduction Problem Definition Entity Resolution & Disambiguation ● Named entity resolution involves detecting mentions of named entities (e.g. people, organizations or locations) within texts and mapping them to their corresponding entities in a given knowledge source. ● One important challenge in this task is the correct disambiguation of the detected entities. ● For example: ● “Siege of Tripolitsa took place in Tripoli with Theodoros Kolokotronis being the leader of the Greeks. This event marked an early victory for the fight for independence from Turkey but it was also a massacre against the Muslim and Jewish population of the city”. ● The term “Tripoli” here refers to http://dbpedia.org/resource/Tripoli,_Greece but it can be mistaken, for example, with Tripoli in Libya or that in Lebanon. 3
  • 4. Introduction Disambiguation Approaches ● The majority of disambiguation approaches rely on the strong contextual hypothesis that terms with similar meanings are often used in similar contexts. ● The role of these contexts is typically played by already annotated documents (e.g. wikipedia articles) which are used to train term classifiers. ● These classifiers link a term to its correct meaning entity, based on the similarity between the term’s textual context and the contexts of its potential entities. ● Some more recent approaches utilize semantic structures in order to determine this similarity in a semantic way. ● The effectiveness of these latter approaches is highly dependent on: ● The availability of comprehensive semantic information. ● The degree of alignment between the content of the texts to be disambiguated and the semantic data to be used. 4
  • 5. Introduction Alignment and its Importance ● Alignment means that the ontology’s elements should cover the domain(s) of the texts to be disambiguated but should not contain other additional elements that: ● Do not belong to the domain. ● Do belong to it but do not appear in the texts. ● For example assume the text “Ronaldo scored two goals for Real Madrid“ from a contemporary football match description. ● To disambiguate the term “Ronaldo” using an ontology, the only contextual evidence that can be used is the entity “Real Madrid”. ● Yet there are two players with that name that are semantically related to Real: ● Cristiano Ronaldo (current player) ● Ronaldo Luis Nazario de Lima (former player). ● This means that if both relations are considered then the term will not be disambiguated. 5
  • 6. Introduction Towards Better Alignment ● In the previous example the fact that the text describes a contemporary football match suggests that, in general, the relation between a team and its former players is not expected to appear in it. ● Thus, for such texts, it would make sense to ignore this relation in order to achieve more accurate disambiguation. ● Based on this observation, we make two claims: ● That there are certain scenarios where there is available a priori knowledge about what entities and relations are expected to be present in the text. ● That this knowledge can be exploited for better alignment between semantic information and content leading to more effective disambiguation. ● To verify these claims we define an entity disambiguation framework that can perform better disambiguation in such scenarios. 6
  • 7. Proposed Framework Approach ● We target the task of entity disambiguation based on the intuition that a given ontological entity is more likely to represent the meaning of an ambiguous term when there are many ontologically related to it entities in the text. ● E.g. in the example text the entities “Siege of Tripolitsa” and “Theodoros Kolokotronis” indicate that the term “Tripoli” refers to the city of Greece. ● These evidential entities are derived from one or more domain ontologies. ● However, which entities and to what extent may serve as evidence in a given application scenario depends on the domain and expected content of the texts. ● For that, the key ability our framework provides to its users is to construct, in a semi-automatic manner, semantic evidence models for specific disambiguation scenarios and use them to perform entity disambiguation within them. 7
  • 8. Proposed Framework Framework Components ● A Disambiguation Evidence Model that contains the semantic entities that may serve as disambiguation evidence for the scenario’s target entities in the given scenario. ● Each pair of a target entity and an evidential one is accompanied by a degree that quantifies the latter’s evidential power for the given target entity. ● A Disambiguation Evidence Model Construction Process that builds, in a semi- automatic manner, a disambiguation evidence model for a given scenario. ● An Entity Disambiguation Process that uses the evidence model to detect and extract from a given text terms that refer to the scenario’s target entities. ● Each term is linked to one or more possible entity uris along with a confidence score calculated for each of them. ● The entity with the highest confidence should be the one the term actually refers to. 8
  • 9. Proposed Framework Disambiguation Evidence Model ● Defines for each ontology entity which other instances and to what extent should be used as evidence towards its correct meaning interpretation. ● It consists of entity pairs where a particular entity provides quantified evidence for a another one. 9
  • 10. Proposed Framework Evidence Model Construction ● Construction of the evidence model depends on the characteristics of the domain and the texts. ● The first step of the construction is manual and involves: ● The identification of the concepts whose instances we wish to disambiguate (e.g. locations) ● The determination, for each of these concepts, of the related to them concepts whose instances may serve as contextual disambiguation evidence/ ● For example, in texts that describe historical events, some concepts whose instances may act as location evidence are related locations, historical events, and historical groups and persons. ● The identification, for each pair of evidence and target concept, of the relation paths that links them. 10
  • 11. Proposed Framework Evidence Model Construction ● The result of this first step is a table like the following ones: 11
  • 12. Proposed Framework Evidence Model Construction ● Based on these tables, the second step of the construction is automatic and involves the generation of the target-evidence entity pairs along with a disambiguation evidential strength. ● This strength is inversely proportional to the number of different same-name target entities a given evidential entity provides evidence for. ● For example, “Getafe” provides evidence for “Pedro Leon” to a strength of 0.5 because it has another player called Pedro. 12
  • 13. Proposed Framework Entity Resolution Process ● Step 1: We extract from the text the terms that possibly refer to the target entities as well as those that refer to their respective evidential entities. ● Extraction is performed with Knowledge Tagger, an in-house tool based on GATE. ● Step 2: Using the evidential entities we compute for each extracted target entity term the confidence that it refers to a particular target entity. ● The target entity with the highest confidence is expected to be the correct one. 13
  • 14. Proposed Framework Disambiguation Example Correct disambiguation of the term Atletico 14
  • 15. Framework Evaluation Evaluation Process Description ● Two disambiguation scenarios: ● Football match descriptions. ● Texts describing military conflicts. ● DBPedia as a source of semantic information in both cases. ● Disambiguation effectiveness measured through precision and recall. ● Evaluation results were compared to those achieved by two publicly available semantic annotation and disambiguation systems; ● DBPedia Spotlight ● AIDA ● The two systems: ● Use also DBPedia as a knowledge source. ● Provide the users the capability to select the classes whose instances are to be included in the process. 15
  • 16. Framework Evaluation Evaluation Results Football Match Descriptions Scenario ● 50 texts describing football matches. ● E.g. “It's the 70th minute of the game and after a magnificent pass by Pedro, Messi managed to beat Claudio Bravo. Barcelona now leads 1-0 against Real." Disambiguation Results 16
  • 17. Framework Evaluation Evaluation Results Military Conflict Texts Scenario ● 50 historical texts describing military conflicts. ● E.g. “The Siege of Augusta was a significant battle of the American Revolution. Fought for control of Fort Cornwallis, a British fort near Augusta, the battle was a major victory for the Patriot forces of Lighthorse Harry Lee and a stunning reverse to the British and Loyalist forces in the South”. Disambiguation Results 17
  • 18. Conclusions and Future Work Key Points ● We proposed a novel framework for optimizing named entity disambiguation in well- defined and adequately constrained scenarios through the customized selection and exploitation of semantic data. ● Our purpose was not to build another generic disambiguation system but rather a reusable framework that can: ● Be relatively easily adapted to the particular characteristics of the domain and application scenario at hand. ● Exploit these characteristics to increase the effectiveness of the disambiguation process. ● The key aspect of the framework is the semi-automatic process it defines for selecting the optimal evidence model for the scenario at hand. 18
  • 19. Conclusions and Future Work Key Points ● Comparative evaluation in two specific scenarios verified the framework’s superiority over existing approaches that are designed to work in open domains and unconstrained scenarios. ● This verified our hypothesis that the scenario adaptation capabilities of such generic disambiguation systems can be inadequate in certain scenarios. ● Of course, the framework’s usability and effectiveness is directly proportional to the content specificity of the texts to be disambiguated and the availability and quality of a priori semantic knowledge about their content. ● The greater these two parameters are, the more applicable is our approach and the more effective the disambiguation is expected to be. ● The opposite is true as the texts become more generic and the information we have out about them more scarce. 19
  • 20. Conclusions and Future Work Framework Extensions ● Fully automated construction of the disambiguation evidence model. ● Challenge here is how to automatically identify the text’s domain/topic. ● Combination with statistical methods for cases where available domain semantic information is incomplete. ● Challenge here is how to select the optimal ratio of ontological evidence v.s. statistical one. ● Development of tool to enable users to dynamically build such models out of existing semantic data and use them for disambiguation purposes 20
  • 21. Thank you! Contact iSOCO Dr. Panos Alexopoulos Senior Researcher palexopoulos@isoco.com Questions? Barcelona Madrid Pamplona Valencia Tel +34 935 677 200 Tel +34 913 349 797 Tel +34 948 102 408 Tel +34 963 467 143 Edificio Testa A Av. del Partenón, 16-18, 1º7ª Parque Tomás Oficina 107 C/ Alcalde Barnils, 64-68 Campo de las Naciones Caballero, 2, 6º-4ª C/ Prof. Beltrán Báguena, 4 St. Cugat del Vallès 28042 Madrid 31006 Pamplona 46009 Valencia 08174 Barcelona 21