Creating Knowledge out of Interlinked Data
            JIST 2012 – Page 1                                                       http://lod2.eu




Navigation-induced Knowledge Engineering
             by Example (NKE)
         Sebastian Hellmann, Jens Lehmann, Jörg Unbehauen, Claus Stadler,
                       Thanh Nghia Lam, Markus Strohmaier

                           http://slideshare.net/kurzum



                                                          http://aksw.org/Projects/NKE
                                                                  http://lod2.eu
                                                                 AKSW, Universität Leipzig
  LOD2 Presentation . 02.09.2010 . Page                                        http://lod2.eu




          Problem description

Why is there a Knowledge Acquisition Bottleneck?

Questions you might ask an Ontology Engineer:

• What is the purpose of my Ontology?
• For which application is it created?
• What are sensible categories?
• How do I design the concept hierarchy to be useful for browsing?
• How do I use my resources efficiently, yet still produce a reasonably
  good result?
• How many domain experts do I have to consult to reach
  consensus?




                How many Ontology Engineers are
              necessary to structure 31 Billion Facts?




               Who will guard the guards?
           Does their schema fit my use case?
What kind of schemas do we need to effectively query and
                   browse this data?




            NKE




Navigation-induced Knowledge Engineering by Example




          NKE Methodology

Based on the idea that each information need of a user might be a
potential ontological concept (set of instances)

                              Search <=> Ontological Concept

There are three steps involved:

I. Navigation: NKE starts by interpreting navigational behavior of users to
    infer an initial (seed) set of positive and negative examples.
II. Iterative Feedback: NKE supports users in interactively refining the seed
    set of examples such that the final set of objects satisfies the users’
    intent.
III. Retention: NKE allows users to retain previously explored sets of objects
    by grouping them and saving them for later retrieval.
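
The three steps can be sketched as a small session object. This is only an illustration: the class `NkeSession`, its method names, and the item identifiers are hypothetical and not taken from the NKE prototype. In particular, treating opened items as positives and passed-over items as negatives is an assumption about how navigation could be interpreted.

```python
from dataclasses import dataclass, field


@dataclass
class NkeSession:
    """Hypothetical container for one user's NKE session."""
    positives: set = field(default_factory=set)
    negatives: set = field(default_factory=set)
    saved_concepts: dict = field(default_factory=dict)

    def navigate(self, viewed, skipped):
        # Step I (Navigation): infer seed examples from navigation.
        # Assumption: opened items are positive, skipped items negative.
        self.positives |= set(viewed)
        self.negatives |= set(skipped)

    def feedback(self, accepted, rejected):
        # Step II (Iterative Feedback): the user refines both sets.
        self.positives |= set(accepted)
        self.negatives |= set(rejected)
        # A newer judgment overrides an older one for the same item.
        self.positives -= set(rejected)
        self.negatives -= set(accepted)

    def retain(self, name):
        # Step III (Retention): save the current positive set by name.
        self.saved_concepts[name] = frozenset(self.positives)
        return self.saved_concepts[name]


session = NkeSession()
session.navigate(viewed=["item_a", "item_b"], skipped=["item_x"])
session.feedback(accepted=["item_c"], rejected=["item_b"])
concept = session.retain("my_saved_set")
```

In a real system the feedback step would call a learner between rounds to propose new candidate instances; that part is left out here.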




 Future Work: DRUNKE = Drupal + NKE




          Overview



● Current prototype for NKE
● Introduction to DL-Learner
● Show more GUIs and Mockups
● Evaluation




 Current NKE prototype




 HANNE – http://hanne.aksw.org




 HANNE – http://hanne.aksw.org




 GUIs




      Start Learning with DL-Learner




              DL-Learner
DL-Learner is a tool for learning concepts in Description Logics (DLs) from
user-provided examples.




 Introduction DL-Learner




               Introduction DL-Learner




                                          Good properties for active learning:
                                          - Biased towards high recall
                                          - Scales well: the number of training examples
                                          matters more for performance than the size of
                                          the background knowledge




Didier Cherix, Sebastian Hellmann and Jens Lehmann:
Improving the Performance of a SPARQL Component for Semantic Web Applications
In: JIST 2012




 Introduction DL-Learner




 GUIs




                      Northeast football league south




 HANNE – http://hanne.aksw.org




 GUIs




                      With only 2 positives and 4 negatives,
                      it is possible to find 13 more instances, which are
                      football clubs situated close to Saxony, Germany



                      Possible to add more positives and complete the
                      list




 Vision



Integrate NKE processes seamlessly into existing applications




             GUIs




dbo:President and dbo:geoRelated value United_States
     and dbo:spouse some Thing
Retrieves 42 of 44 instances → acceptable intensional definition
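
A learned class expression of this shape can be read as a conjunction of triple patterns. The sketch below builds a SPARQL query for it; the helper `concept_to_sparql` is hypothetical, the `dbr:` prefix for United_States is an assumption, and a plain SPARQL endpoint only matches asserted triples, so results may differ from what a reasoner would return.

```python
# Common DBpedia prefixes; the dbr: namespace for United_States is assumed.
PREFIXES = """PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>"""


def concept_to_sparql(class_name, value_restrictions, existential_properties):
    """Build a SELECT query for a conjunction of simple restrictions.

    class_name: the named class (e.g. "dbo:President")
    value_restrictions: (property, individual) pairs ("value" restrictions)
    existential_properties: properties that need any filler ("some Thing")
    """
    patterns = [f"?x a {class_name} ."]
    for prop, value in value_restrictions:
        patterns.append(f"?x {prop} {value} .")
    for i, prop in enumerate(existential_properties):
        # "some Thing" only requires that at least one object exists.
        patterns.append(f"?x {prop} ?o{i} .")
    body = "\n  ".join(patterns)
    return f"{PREFIXES}\nSELECT DISTINCT ?x WHERE {{\n  {body}\n}}"


query = concept_to_sparql(
    "dbo:President",
    [("dbo:geoRelated", "dbr:United_States")],
    ["dbo:spouse"],
)

# Share of the intended instances that the expression retrieves (42 of 44).
coverage = 42 / 44
```

The coverage works out to roughly 95%, which is the sense in which the definition is called acceptable above.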




 GUIs




 GUIs




                Geizhals

Softer criteria: Retention / “Saving” is replaced by a hit count on the concept, which is a
navigation suggestion (popularity)




           Evaluation

•   Based on Wikipedia Categories
(1) the categories can be considered a hierarchical structure to more effectively
    group and browse Wikipedia articles
(2) the categories are maintained manually (which is very tedious and
    time-consuming)
(3) they do not enforce a strict is-a relation to their member articles, which
    means that the data contains errors from a supervised learning point of
    view.


•   Test set: a list of 98 categories from DBpedia, each of which contained
    exactly 100 members that had an infobox as well as an abstract property




              Keyword search vs. DL-Learner


Keyword search
 •   Find all “Wrestlers at the 1938 British Empire Games”
     {
         {Wrestler, 1938, British, Empire, Game},
         {Wrestler, 1938, British, Empire},
         {Wrestler, 1938, British, Game},
         {Wrestler, 1938, Empire, Game},
         …
     }
 •   Total of 31 searches for five words (power set minus the empty set: 2^5 - 1 = 31)
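
The search sets above can be enumerated mechanically. A minimal sketch, assuming the five keywords shown on this slide; the function name `keyword_subsets` is made up for illustration, and the largest-first ordering is a choice made here, not something the slide specifies.

```python
from itertools import combinations


def keyword_subsets(keywords):
    """Return every non-empty subset of the keywords, largest first."""
    subsets = []
    for size in range(len(keywords), 0, -1):
        subsets.extend(combinations(keywords, size))
    return subsets


keywords = ["Wrestler", "1938", "British", "Empire", "Game"]
searches = keyword_subsets(keywords)
print(len(searches))  # 2**5 - 1 non-empty subsets
```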




             Keyword search vs. DL-Learner


Keyword search




   Limit: based on the assumption that a user
          only looks at the first 20, 100, or 200 results




              Keyword search vs. DL-Learner


DL-Learner
 •   Used same metrics
 •   5 randomly selected positive seed instances from the category (navigation
     history, string search or facet-based browsing)
 •   5 negatives from parallel sister categories (with same predecessor)
 •   5 iterations (with a total of 25 positives and negatives)
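
One possible reading of this setup, written as a simulation loop. Everything here is an assumption made for illustration: that each of the 5 iterations adds 5 new positives and 5 new negatives, the function `simulate_feedback`, the placeholder learner `memorize` (which stands in for DL-Learner and is not its API), and the synthetic member names.

```python
import random


def memorize(positives, negatives):
    # Placeholder learner: the "concept" is just the positive set itself.
    return frozenset(positives)


def simulate_feedback(category, sisters, learn, rounds=5, batch=5, seed=0):
    """Simulate a user who supplies `batch` examples of each kind per round."""
    rng = random.Random(seed)
    positives, negatives = set(), set()
    concept = None
    for _ in range(rounds):
        # Draw fresh positives from the target category and fresh
        # negatives from its sister categories.
        pos_pool = sorted(category - positives)
        neg_pool = sorted(sisters - negatives)
        positives |= set(rng.sample(pos_pool, min(batch, len(pos_pool))))
        negatives |= set(rng.sample(neg_pool, min(batch, len(neg_pool))))
        # Relearn after every round, as an interactive tool would.
        concept = learn(positives, negatives)
    return concept, positives, negatives


category = {f"member_{i}" for i in range(100)}   # synthetic category members
sisters = {f"sister_{i}" for i in range(100)}    # synthetic sister members
concept, positives, negatives = simulate_feedback(category, sisters, memorize)
```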




             Keyword search vs. DL-Learner


Quantitative results




             Qualitative Results




Detailed results are available at http://aksw.org/Projects/NKE




 Qualitative Results - Examples




            Qualitative Results - Examples



•   Single feature concepts
•   Easy to learn
•   If added as an intensional definition, e.g. by an admin, they can
     •   help to identify errors and missing values in the database
     •   automatically classify new instances




            Qualitative Results - Examples




•   Overly specific concepts
•   Partially correct, Defoe is in Bay City, Michigan
•   53 of 100 matched
•   Data inspection showed URIs as well as literals as objects




            Qualitative Results - Examples




•   Indirect solution concepts
•   Read like paraphrases
•   no feature (e.g. champion value US_Open)
•   SubdivisionName is more frequently used by US cities in DBpedia




            Qualitative Results - Examples




•   Zero member concepts
•   Northland region is not a clear is-a relation, but rather a tag
•   Second one does not have any good features in the data




           Conclusions

•   Definition of the NKE paradigm
•   Proof of concept implementation
     • Technical feasibility
     • Web Demo: http://hanne.aksw.org


•   We have made progress toward bridging the gap between user interaction and
    knowledge engineering




           Future Work & Open Questions

•   For which purpose can concepts created by users be exploited:
     • Improve navigation via suggestions or hierarchical browsing
     • Create domain ontologies
•   Create a GUI for different target groups:
     • End-users
     • Domain experts with some technical skill
•   Further evaluation is necessary; please contact us about collaborations
•   Project page is http://aksw.org/Projects/NKE




                                http://slideshare.net/kurzum
