Keyword-driven SPARQL Query Generation Leveraging Background Knowledge
Authors: Saeedeh Shekarpour, Sören Auer, Axel-Cyrille Ngonga Ngomo, Daniel Gerber, Sebastian Hellmann, Claus Stadler
AKSW group - Universität Leipzig
WI-IAT conference
24 August 2011
Outline
Querying web of documents: text retrieval
Web of Data
Motivations
Bird's-eye view of the envisioned search approach
Overview of the proposed method
Outline
Mapping keywords to IRIs
Ranking and Selecting Anchor Points
Ranking and Selecting Anchor Points
Outline
Graph pattern template
Categorization of all graph pattern templates
Category                 Pattern   Pattern schema
Instance-Property (IP)   IP.P1     (s, p, ?o)
                         IP.P2     (?s, p, o)
                         IP.P3     (?s1, ?p1, o1)(?s1, p2, ?o2)
                         IP.P4     (?s1, ?p1, o1)(?o2, p2, ?s1)
                         IP.P5     (s1, ?p1, ?o1)(?s2, p2, ?o1)
                         IP.P6     (s1, ?p1, ?o1)(?o1, p2, ?o2)
Class-Instance (CI)      CI.P7     (?s1, a, c)(?s1, ?p1, o1)
                         CI.P8     (?s1, a, c)(s2, ?p1, ?s1)
Instance-Instance (II)   II.P9     (s, ?p, o)
                         II.P10    (s, ?p1, ?x)(?x, ?p2, o)
                         II.P11    (s1, ?p1, ?x)(s2, ?p2, ?x)
                         II.P12    (?s, ?p1, o1)(?s, ?p2, o2)
Class-Property (CP)      CP.P13    (?s, a, c)(?s, p, ?o)
                         CP.P14    (?s, a, c)(?x, p, ?s)
Property-Property (PP)   PP.P15    (?s, p1, ?x)(?x, p2, ?o)
                         PP.P16    (?s1, p1, ?o)(?s2, p2, ?o)
                         PP.P17    (?s, p1, ?o1)(?s, p2, ?o2)
Appropriate identified graph pattern templates
Category                 Pattern   Pattern schema
Instance-Property (IP)   IP.P1     (s, p, ?o)
                         IP.P4     (?s1, ?p1, o1)(?o2, p2, ?s1)
                         IP.P6     (s1, ?p1, ?o1)(?o1, p2, ?o2)
Class-Instance (CI)      CI.P7     (?s1, a, c)(?s1, ?p1, o1)
                         CI.P8     (?s1, a, c)(s2, ?p1, ?s1)
Instance-Instance (II)   II.P9     (s, ?p, o)
                         II.P10    (s, ?p1, ?x)(?x, ?p2, o)
Class-Property (CP)      CP.P14    (?s, a, c)(?x, p, ?s)
Property-Property (PP)   -         -
Query generation algorithm
Example
Example
Online interface: lod-query.aksw.org
Outline
Accuracy metrics
Accuracy metrics
Accuracy metrics
Accuracy metrics
Accuracy of each categorized graph pattern
Categorization based on the matter of information
Samples of keywords and results
Accuracy results for different categories
Category                         Recall   Fuzzy precision   F-score
Similar instances                0.700    0.735             0.717
Characteristics of an instance   0.625    0.700             0.660
Associations between instances   0.500    0.710             0.580
General accuracy                 0.625    0.724             0.670
Outline
Conclusion and future work
Thank you for your attention. Thanks to my colleagues from the AKSW research group. Any questions?

Editor's notes

  1. The search for information on the Web of Data is becoming increasingly difficult due to its dramatic growth. Novice users in particular need to acquire both knowledge about the underlying ontology structure and proficiency in formulating formal queries (e.g. SPARQL queries) to retrieve information from Linked Data sources. To simplify and automate the querying and retrieval of information from such sources, we propose in this paper a novel approach for generating SPARQL queries based on user-supplied keywords.
  2. With RDF as a new way of representing data on the Web, the nature of the Web shifts from a web of documents to a web of data. Here all datasets are visualized along with their interlinking relationships. The cloud covers different topical domains and currently contains more than 30 billion triples. Each node in this cloud diagram represents a distinct dataset published as Linked Data. The arcs indicate that RDF links exist between items in the two connected datasets. SPARQL (pronounced "sparkle") is an RDF query language.
  3. Users need knowledge about the underlying ontology structure and proficiency in formulating formal queries (e.g. SPARQL queries). We aim to simplify access by providing search interfaces that resemble those commonly used on the document-oriented Web. Nowadays, keyword-based search is the most popular and convenient way of finding information on the Web. The successful experience of keyword-based search in document retrieval and the satisfactory research results on the usability of this paradigm [21] are convincing reasons for applying the keyword search paradigm to the Semantic Web.
  4. This slide shows a bird's-eye view of the envisioned research. Based on a set of user-supplied keywords, candidate IRIs (Internationalized Resource Identifiers) are first computed for each keyword issued by the user. Then, using an inference mechanism, a subgraph based on the identified IRIs is extracted and presented to the user as the answer.
  5. We encounter two issues. First, we need to find a set of IRIs corresponding to each keyword. Second, we have to construct suitable triple patterns based on the previously extracted anchor points in order to retrieve appropriate data. Figure 1 shows an overview of our approach. It first retrieves the IRIs relevant to each user-supplied keyword from the underlying knowledge base and then injects them into a series of graph pattern templates to construct formal queries. To find these relevant IRIs, two steps are carried out: mapping keywords to IRIs, then ranking and selecting anchor points.
  6. The goal is to retrieve entities that match the user-supplied keywords. Matching is carried out by applying a string similarity function to the keywords and the label properties of all entities in the knowledge base. This similarity evaluation covers all types of entities (i.e., classes, properties and instances). As a result, for each keyword we retrieve a list of candidate IRIs, i.e. anchor points.
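
The mapping step described in note 6 could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the notes do not name the string similarity function, so Python's `difflib.SequenceMatcher` is used as a stand-in, and the toy entity list, the function names and the 0.6 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical toy knowledge base: (IRI, label, entity type).
# The real approach compares keywords against the labels of all entities.
ENTITIES = [
    ("http://dbpedia.org/resource/Chile", "Chile", "instance"),
    ("http://dbpedia.org/resource/China", "China", "instance"),
    ("http://dbpedia.org/ontology/populationTotal", "population total", "property"),
    ("http://dbpedia.org/ontology/Country", "country", "class"),
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for the unnamed measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def anchor_points(keyword, entities=ENTITIES, threshold=0.6):
    """Return (IRI, type, score) candidates for one keyword, best match first.

    The threshold is an assumed cut-off, not a value taken from the notes.
    """
    scored = [(iri, etype, similarity(keyword, label)) for iri, label, etype in entities]
    hits = [hit for hit in scored if hit[2] >= threshold]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for kw in ("Chile", "population"):
        print(kw, anchor_points(kw))
```

Note that a loose threshold also lets near misses such as "China" through for the keyword "Chile", which is why the subsequent ranking and selection step matters.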
  7. This step aims to exclude anchor points that are probably unrelated to any interpretation of the user keyword, thereby reducing the potentially high number of anchor points to a minimum. The reduction is carried out by applying a ranking method over the string similarity score and the connectivity degree of the previously detected IRIs in each AP_Ki. In DBpedia, for example, classes have an average connectivity degree of 14,022, while properties have on average 1,243 and instances 37.
  8. The ranking and selection function RS maps AP_Ki to the set U_Ki, defined as the top 10 IRIs contained in AP_Ki sorted in descending order by S(u), where u ∈ AP_Ki.
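
A rough sketch of the ranking and selection function RS from notes 7 and 8 is shown below. The notes say only that S(u) is based on the string similarity score and the connectivity degree; the concrete formula used here (similarity multiplied by a logarithm of the connectivity degree) is an assumption for illustration, as are the function names.

```python
import math


def score(similarity: float, connectivity_degree: int) -> float:
    """ASSUMED scoring function S(u): the notes do not give the formula.

    The logarithm damps the very large degrees of classes (about 14,022 on
    average in DBpedia) relative to instances (about 37).
    """
    return similarity * math.log1p(connectivity_degree)


def rank_and_select(anchor_points, k=10):
    """Map the anchor points of one keyword to its top-k IRIs (the set U_Ki).

    anchor_points: iterable of (iri, similarity, connectivity_degree) tuples.
    """
    ranked = sorted(anchor_points, key=lambda ap: score(ap[1], ap[2]), reverse=True)
    return [iri for iri, _, _ in ranked[:k]]


if __name__ == "__main__":
    candidates = [("a", 0.9, 37), ("b", 0.9, 1243), ("c", 0.2, 14022)]
    print(rank_and_select(candidates, k=2))
```

With k=10 this matches the top-10 cut-off stated in note 8; any other monotone combination of the two signals could be substituted for `score`.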
  9. The SPARQL queries generated with our approach are a restricted kind of SPARQL query, since they use only basic graph patterns without blank nodes. We analysed 1,000 distinct queries from the query log of the public DBpedia endpoint and learned that the number of IRIs is usually larger than the number of triple patterns occurring in a query. As a consequence of this finding, we assume that graph patterns for generating SPARQL queries from two user-supplied keywords consist of either one or two triple patterns. For generating SPARQL queries we define the concept of a graph pattern template: a set of triple patterns that are connected to each other via a common subject or object. Each element of a triple pattern in a graph pattern template is either a variable or a placeholder, and the number of placeholders is exactly 2. After detecting IRIs in the previous phase, we substitute them for the placeholders, which converts the graph pattern template into a graph pattern. The graph pattern templates we use therefore have a maximum length of 2 triple patterns.
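
To make the notion of a graph pattern template from note 9 concrete, here is a small sketch that represents two of the templates (IP.P1 and II.P10) as lists of triple patterns and substitutes IRIs for the two placeholders. The data layout and the names `TEMPLATES`, `placeholders` and `instantiate` are hypothetical; templates that use the `a` (rdf:type) keyword are left out to keep the sketch short.

```python
# Each template is a list of triple patterns. Terms starting with "?" are
# variables; all other terms are placeholders to be replaced by IRIs.
TEMPLATES = {
    "IP.P1": [("s", "p", "?o")],
    "II.P10": [("s", "?p1", "?x"), ("?x", "?p2", "o")],
}


def placeholders(template):
    """Return the set of placeholder names used in a template."""
    return {t for triple in template for t in triple if not t.startswith("?")}


def instantiate(template, binding):
    """Replace placeholders by IRIs and wrap the graph pattern in a SELECT query.

    binding: dict mapping each placeholder name to an IRI string.
    """
    def term(t):
        return t if t.startswith("?") else f"<{binding[t]}>"

    triples = [" ".join(term(t) for t in triple) + " ." for triple in template]
    variables = sorted({t for triple in template for t in triple if t.startswith("?")})
    return "SELECT " + " ".join(variables) + " WHERE { " + " ".join(triples) + " }"


if __name__ == "__main__":
    print(instantiate(TEMPLATES["IP.P1"], {
        "s": "http://dbpedia.org/resource/Chile",
        "p": "http://dbpedia.org/ontology/populationTotal",
    }))
```

Both templates shown contain exactly two placeholders and at most two triple patterns, in line with the restrictions stated in the note.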
  10. Table I: Categorization of all possible graph pattern templates for each typed pair of placeholders. This definition of a graph pattern template leads to the 17 possible graph pattern templates shown in Table I. These templates are divided into categories based on the types of the input IRIs that replace the placeholders.
  11. As detailed in Section V, we performed an accuracy study on all combinatorially possible graph pattern templates. This study, discussed in the evaluation section, showed that the patterns contained in Table II limit the search space (thus leading to more efficiency) without significantly reducing the accuracy of our approach. Consequently, we reduced the graph pattern templates to these 8 and considered only them during the SPARQL query generation process described below.
  12. The results of this algorithm are the output of our approach. To validate the approach, we implemented it as a Java Web application, which is publicly available at http://lod-query.aksw.org. A screenshot of the search results is shown in Figure 2. With DBpedia as the knowledge base, the whole query interpretation and processing takes 15 seconds on average, while the first results are already obtained after one second. The function query constructs the query from the query pattern given as its first argument and the entity-identifier-to-placeholder mapping supplied as its second and third arguments.
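
The query generation algorithm itself is shown only as a figure on the slide, so the following is a loose sketch of how such a loop might look, under stated assumptions: candidate IRIs for the two keywords arrive as (IRI, type) pairs, templates are selected by the types of the pair, and a `query` helper takes the pattern as its first argument and the two IRIs as its second and third. Only three of the Table II templates are included, and all names are hypothetical.

```python
from itertools import product

# Subset of Table II keyed by the types of the two placeholders.
# {0} and {1} mark where the first and second IRI of the pair are injected.
TEMPLATES = {
    ("instance", "property"): ["<{0}> <{1}> ?o ."],            # IP.P1
    ("instance", "instance"): ["<{0}> ?p <{1}> .",              # II.P9
                               "<{0}> ?p1 ?x . ?x ?p2 <{1}> ."],  # II.P10
}


def query(pattern, iri_a, iri_b):
    """Build one SPARQL query from a pattern and the two IRIs to inject."""
    return "SELECT * WHERE { " + pattern.format(iri_a, iri_b) + " }"


def generate_queries(candidates_1, candidates_2):
    """Combine the candidate IRIs of two keywords with all matching templates.

    Each candidates list holds (iri, type) pairs for one keyword.
    """
    queries = []
    for first, second in product(candidates_1, candidates_2):
        # Stable sort by type so the pair matches the key order in TEMPLATES;
        # pairs of equal type keep the keyword order.
        (iri_a, type_a), (iri_b, type_b) = sorted([first, second], key=lambda c: c[1])
        for pattern in TEMPLATES.get((type_a, type_b), []):
            queries.append(query(pattern, iri_a, iri_b))
    return queries


if __name__ == "__main__":
    chile = [("http://dbpedia.org/resource/Chile", "instance")]
    population = [("http://dbpedia.org/ontology/populationTotal", "property")]
    for q in generate_queries(chile, population):
        print(q)
```

Keying the templates by type pair mirrors the categories of Table II (IP, CI, II, CP): a pair of IRIs whose types have no entry simply produces no query.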
  13. We measure the preference of an answer based on the RDF terms occurring in it. RDF terms (terms for short) comprise each individual subject, object or predicate in the triples of the answer. Therefore, besides distinguishing between answers related to different interpretations, we should also differentiate between pure answers (containing only preferred terms) and those that contain some impurity. In fact, the correctness of an answer is not a bivalent value but depends on the user's perception. As such, it may vary between completely irrelevant and exactly correct. In essence, the evaluation investigates two questions.
  14. Since we are interested in using those graph pattern templates that typically result in precise answers with respect to the user's intention behind the keywords, we evaluated the accuracy of each graph pattern template introduced in Table I by injecting a series of IRI pairs into it and running the resulting SPARQL query. We selected 53 natural language queries from TREC 9, from each of which we extracted the two main keywords conveying the general meaning. For example, the query "How many people live in Chile?" can be expressed by the keywords Chile and population. Thereafter, the mapping function was applied to these keywords; from the retrieved IRIs, the most suitable ones were manually selected and assigned to the related dataset according to their type. We used DBpedia 3.5.1 [12] as the underlying knowledge base. After preparing the datasets, we performed a series of SPARQL queries for each graph pattern template over the corresponding dataset. The results of the SPARQL queries, along with the keywords, were shown to two evaluators, who scored the CR metric for each individual answer. After CR had been rated for all retrieved answers related to a graph pattern template, fuzzy precision, recall and F-score were computed. Figure 3 shows the accuracy of each graph pattern template based on these three metrics. In the category Property-Property, the number of retrieved answers for all graph pattern templates was zero. Our results show that some pattern templates, such as P1 in the Instance-Property category as well as P7 and P8 in the Class-Instance category, have a high fuzzy precision while their recall is low. In the case of P11 from the Instance-Instance category, recall is high while fuzzy precision is low; hence, this graph pattern template generates a large number of irrelevant answers.
We discarded all templates with a fuzzy precision of less than 0.5, resulting in an increase in overall precision and only a small loss in recall. We monitored the ACR for a set of queries before and after the reduction of graph pattern templates in the categories IP and II, because most reductions occurred there. In the category IP, all queries with an ACR higher than 0.4, and in the category II all queries with an ACR higher than 0.6, were properly answered with the same accuracy. So, this reduction maintained precise results (i.e. a high ACR value).
  15. As an interpretation of the graph pattern templates, we categorize them by the kind of information that is retrieved from the knowledge base.
     Finding special characteristics of an instance: Datatype properties, which lead from instances or classes to literals or simple types, as well as some kinds of object properties, state characteristics of an entity and information around it. So, in the simplest case of a query, a user intends to retrieve specific information about an entity, such as "Population of Canada" or "Language of Malaysia". Since this information is explicit, the simple graph patterns IP.P1, IP.P4 and IP.P6 can be used for retrieving it.
     Finding similar instances: In this case, the user asks for a list of instances that have a specific characteristic in common. Examples of this type of query are "Germany Island" or "Countries with English as official language". A possible graph structure capturing potential answers for this query type is depicted in Figure 4 (similar instances with an instance in common). It shows a set of instances from the same class that have a certain property in common. Graph pattern templates CI.P7, CI.P8 and CP.P14 retrieve this kind of information.
     Finding associations between instances: Associations between instances in knowledge bases are defined as a sequence of properties and instances connecting two given instances (cf. Figure 5). Each association therefore contains a set of instances and the object properties connecting them, which is what the user query asks for. As an example, the query "Volkswagen Porsche" can be used to find associations between the two car makers. The graph pattern templates II.P9 and II.P10 extract these associations.
  16. The experimental setup consisted of giving a novice user 40 queries from TREC 9 and asking him to run each query against DBpedia using our prototype implementation. Then, for each single answer to a query, he assigned CR according to his own intention. Subsequently, fuzzy precision and recall were computed based on the user's ratings. Note that, since hyperlinks among pages are inserted as wikilink in DBpedia and do not convey any special meaning between resources, we removed all triples containing the IRI http://dbpedia.org/property/wikilink. Table IV shows the evaluation results after running the 40 queries against DBpedia. The overall precision of our system is 0.72. Essentially, the accuracy of this method, specifically recall, does not depend on using suitable graph pattern templates, because on the one hand the mapping approach for choosing relevant IRIs significantly influences the results, and on the other hand the quality of the data in DBpedia severely affects the accuracy. For example, the query "Greece population" returns the correct answer, while the similar query "Canada population" led to no results. In addition to the overall evaluation, in order to compare how the approach performs for different types of queries (i.e. finding special characteristics of an instance, finding similar instances and finding associations between instances), the queries were categorized by type and a separate evaluation was computed for each type. Our evaluation in Table IV shows that precision does not differ significantly across query types, while recall is type-dependent.
For instance, in the category "similar instances" recall is significantly higher than in the category "associations between instances".
     Table IV: Accuracy results.
     Category                         Recall   Fuzzy precision   F-score
     General accuracy                 0.625    0.724             0.670
     Similar instances                0.700    0.735             0.717
     Characteristics of an instance   0.625    0.700             0.660
     Associations between instances   0.500    0.710             0.580
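
As a quick sanity check on the reported accuracy figures, the sketch below recomputes the F-score for each row. It assumes the F-score is the usual harmonic mean of fuzzy precision and recall, which the notes do not state explicitly; the dictionary name and layout are made up for the example.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of the F-score)."""
    return 2 * precision * recall / (precision + recall)


# (recall, fuzzy precision, reported F-score) as given in the accuracy results.
REPORTED = {
    "Similar instances": (0.700, 0.735, 0.717),
    "Characteristics of an instance": (0.625, 0.700, 0.660),
    "Associations between instances": (0.500, 0.710, 0.580),
    "General accuracy": (0.625, 0.724, 0.670),
}

if __name__ == "__main__":
    for category, (recall, precision, reported) in REPORTED.items():
        computed = f_score(precision, recall)
        print(f"{category}: computed {computed:.3f}, reported {reported:.3f}")
```

Under this assumption the recomputed values agree with the reported ones to within 0.01; the small remaining differences may come from rounding in the source or from a slightly different definition.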
  17. First step towards the user-friendly querying of the Data Web