Pay-as-you-go Query Answering with PAGOdA 
BERNARDO CUENCA GRAU
Ontology-mediated Query Answering 
[Diagram: query Q, an ontology graph with nodes A, B, C, D and T, and RDF data with individuals a and b]
• (Meta)-data published in RDF 
• RDF resources reference an OWL 2 ontology 
• The ontology describes the meaning of data 
RDF and OWL 2 well-established 
• Thousands of available OWL 2 ontologies 
• RDF ubiquitous on the Web 
Ontology-mediated Query Answering 
Ontology languages offer a wide range of modeling constructs 
High expressive power → high worst-case complexity of reasoning 
How can we provide scalable query answering? 
• Restrict our ontology to a lightweight fragment of OWL: the EL, QL or RL profiles 
• Tolerate incompleteness 
• Rely on highly optimised pay-as-you-go systems 
  • Worst-case optimal for lightweight fragments 
  • Rapidly compute easy answers 
  • Performance degrades gracefully with harder instances 
Datalog and the OWL 2 Profiles 
Datalog is the quintessential rule-based KR language 
• Reasoning typically implemented via materialisation 
• Our in-house system RDFox shows excellent performance 
Query answering within the OWL 2 profiles 
• RL ontologies equivalent to Datalog programs 
• EL and QL ontologies can be strengthened using Datalog 
Query answering requires an additional filtration step 
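
Materialisation can be illustrated with a minimal sketch of naive forward chaining. This is not how RDFox is implemented; the rule encoding (tuples with `?`-prefixed variables) and the predicate names `GradStudent`, `Student`, `Person`, `takes` and `Taught` are assumptions made only for this example.

```python
def match(atom, fact, binding):
    """Extend `binding` so that `atom` equals `fact`; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):              # variable: bind it or compare
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:                   # constant: must agree
            return None
    return extended


def materialise(rules, facts):
    """Naive forward chaining: apply every rule until a fixpoint is reached."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            bindings = [{}]
            for atom in body:
                bindings = [b2 for b in bindings for f in facts
                            if (b2 := match(atom, f, b)) is not None]
            for b in bindings:
                derived = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


# Hypothetical rules: each is (head, body), read "body implies head".
rules = [
    (("Student", "?x"), [("GradStudent", "?x")]),
    (("Person", "?x"), [("Student", "?x")]),
    (("Taught", "?y"), [("Student", "?x"), ("takes", "?x", "?y")]),
]
data = {("GradStudent", "a"), ("takes", "a", "c1")}
closure = materialise(rules, data)
print(sorted(closure))
```

Once the closure is stored, a query is answered by matching it against the materialised facts, with no further reasoning at query time.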
Incomplete Reasoning 
• RL / EL reasoning w.r.t. an arbitrary OWL ontology O, dataset D and query q gives (in general) an incomplete answer L: 
  $L = \mathsf{cert}(q, \langle O', D \rangle) \subseteq \mathsf{cert}(q, \langle O, D \rangle)$ with $O \models O'$ 
✓ Profile-specific reasoning via Datalog is (relatively) scalable 
✗ Answers may be incomplete 
✗ Degree of incompleteness is unknown 
✗ Incompleteness may be pathological (empty answers) 
The idea behind PAGOdA 
Redistribute the reasoning workload between a Datalog reasoner and a fully-fledged OWL 2 reasoner 
Resort to expensive OWL 2 reasoning as little as possible (if at all) 
Ensure sound and complete answers 
Do not restrict the ontology language 
Step 1: Lower and Upper Bounds 
[Diagram: Ontology, Query and Data feed two Datalog Engine boxes, one producing the lower bound (labelled ELHO Lower) and one producing the upper bound]
Profile-specific reasoning via Datalog gives a lower bound L, a subset of $\mathsf{cert}(q, \langle O, D \rangle)$ 
We transform O into a strictly stronger Datalog ontology $O^u$ 
• Normalise the ontology into Datalog$^{\pm,\vee}$ rules 
• Eliminate ∨ by transforming it to ∧ 
• Replace existential variables with Skolem constants 
Datalog reasoning w.r.t. $O^u$ gives the upper bound answer U: 
$\mathsf{cert}(q, \langle O, D \rangle) \subseteq \mathsf{cert}(q, \langle O^u, D \rangle) = U$
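
The two bounds can be sketched on a toy ontology. This is a simplified illustration, not PAGOdA's implementation: the lower bound here simply keeps the axioms that are already Datalog rules (the real system uses profile-specific reasoning), and the axiom encoding, the `sk_` Skolem naming and all predicate names are assumptions made for the example.

```python
def match(atom, fact, binding):
    """Extend `binding` so that `atom` equals `fact`; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def materialise(rules, facts):
    """Naive forward chaining until no new fact is derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            bindings = [{}]
            for atom in body:
                bindings = [b2 for b in bindings for f in facts
                            if (b2 := match(atom, f, b)) is not None]
            for b in bindings:
                derived = (head[0],) + tuple(b.get(t, t) for t in head[1:])
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return facts


def lower_rules(axioms):
    """Simplification: keep only axioms that are already plain Datalog."""
    return [(heads[0], body) for body, heads, exvars in axioms
            if len(heads) == 1 and not exvars]


def upper_rules(axioms):
    """Strengthen axioms: every disjunct becomes a head of its own (∨ to ∧),
    and each existential variable becomes a fresh Skolem constant."""
    rules = []
    for i, (body, heads, exvars) in enumerate(axioms):
        skolem = {v: f"sk_{i}_{v[1:]}" for v in exvars}
        for head in heads:
            head = (head[0],) + tuple(skolem.get(t, t) for t in head[1:])
            rules.append((head, body))
    return rules


def answers(query, facts):
    """Answers to a single-atom query, ignoring Skolem constants."""
    found = set()
    for fact in facts:
        b = match(query, fact, {})
        if b is not None:
            values = tuple(b[t] for t in query[1:] if t.startswith("?"))
            if not any(v.startswith("sk_") for v in values):
                found.add(values)
    return found


# Each axiom is (body, head disjuncts, existential variables).
axioms = [
    ([("Student", "?x")], [("Undergrad", "?x"), ("Grad", "?x")], []),
    ([("Grad", "?x")], [("Person", "?x")], []),
    ([("Undergrad", "?x")], [("Person", "?x")], []),
    ([("Person", "?x")], [("hasParent", "?x", "?y")], ["?y"]),
]
data = {("Student", "a"), ("Grad", "b")}
query = ("Person", "?x")

L = answers(query, materialise(lower_rules(axioms), data))
U = answers(query, materialise(upper_rules(axioms), data))
gap = U - L
print(L, U, gap)
```

In this example `b` is already in the lower bound, while `a` only appears in the upper bound. It is in fact a certain answer (by case analysis on the disjunction), which is exactly the kind of candidate that the later steps must verify.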
Step 2: Module extraction 
Checking possible answers in $U \setminus L$ is expensive 
Compute a fragment of ontology + data sufficient to check each answer in $U \setminus L$ 
Fragment computation involves proof tracing in $O^u$ 
Achieved also using Datalog materialisation 
Relevant fragments are typically much smaller 
Size of the problem substantially reduced 
[Diagram: the Datalog Engine takes U and D and produces the Fragment]
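
Proof tracing can be sketched as a backward walk over recorded derivations. The sketch assumes a derivation log is available from the upper-bound materialisation, mapping each derived fact to the (axiom id, premises) pairs that produced it; this log format, the axiom ids `ax0` to `ax2` and the example facts are hypothetical.

```python
def extract_fragment(goals, derivations, data):
    """Walk derivations backwards from the goal facts and collect the axioms
    and data facts that take part in some proof of a goal."""
    axioms_used, data_used = set(), set()
    seen, stack = set(), list(goals)
    while stack:
        fact = stack.pop()
        if fact in seen:
            continue
        seen.add(fact)
        if fact in data:
            data_used.add(fact)
        for axiom_id, premises in derivations.get(fact, []):
            axioms_used.add(axiom_id)
            stack.extend(premises)
    return axioms_used, data_used


data = {("Student", "a"), ("Grad", "b"), ("Course", "c")}
derivations = {
    ("Undergrad", "a"): [("ax0", [("Student", "a")])],
    ("Grad", "a"): [("ax0", [("Student", "a")])],
    ("Person", "a"): [("ax1", [("Grad", "a")]), ("ax2", [("Undergrad", "a")])],
    ("Person", "b"): [("ax1", [("Grad", "b")])],
}

# Only the candidate answer in the gap needs to be checked.
axioms_used, data_used = extract_fragment([("Person", "a")], derivations, data)
print(sorted(axioms_used), sorted(data_used))
```

Facts such as `Grad(b)` and `Course(c)` play no part in any proof of the candidate, so they stay out of the fragment handed to the full reasoner.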
Step 3: Summarisation 
[Diagram: Fragment → Summarisation → Summary → Full Reasoner, with query Q]
Further reduce problem size by summarising the fragment 
• Technique introduced by the SHER team at IBM 
• “Merge” constants that are instances of same concepts 
• Check answers against summary using OWL 2 reasoner 
• The summary of the fragment is typically very small 
This is an orthogonal over-approximation to the previous ones 
We further reduce the size of $U \setminus L$ 
Sometimes we even make it empty!
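
A minimal sketch of the merging idea follows, assuming that individuals are merged when they are asserted to belong to exactly the same set of concepts. The function names, the example facts and the `entailed_on_summary` callback (standing in for a call to the OWL 2 reasoner on the summary) are hypothetical.

```python
def summarise(concept_facts, role_facts):
    """Merge individuals that have the same set of asserted concepts."""
    types = {}
    for concept, individual in concept_facts:
        types.setdefault(individual, set()).add(concept)
    for _, subject, obj in role_facts:
        types.setdefault(subject, set())
        types.setdefault(obj, set())
    representative, mapping = {}, {}
    for individual in sorted(types):
        key = frozenset(types[individual])
        mapping[individual] = representative.setdefault(key, individual)
    summary_concepts = {(c, mapping[i]) for c, i in concept_facts}
    summary_roles = {(r, mapping[s], mapping[o]) for r, s, o in role_facts}
    return mapping, summary_concepts, summary_roles


def prune(candidates, mapping, entailed_on_summary):
    """Drop candidates whose image is not an answer over the summary.
    Survivors are still only candidates, since the summary over-approximates."""
    verdict, kept = {}, set()
    for candidate in candidates:
        rep = mapping[candidate]
        if rep not in verdict:               # one reasoner call per representative
            verdict[rep] = entailed_on_summary(rep)
        if verdict[rep]:
            kept.add(candidate)
    return kept


concept_facts = {("Student", "a1"), ("Student", "a2"), ("Professor", "p")}
role_facts = {("advisor", "a1", "p"), ("advisor", "a2", "p")}
mapping, summary_concepts, summary_roles = summarise(concept_facts, role_facts)

# Stand-in for the reasoner: pretend "a1" is not an answer over the summary.
kept = prune({"a1", "a2", "p"}, mapping, lambda rep: rep != "a1")
print(mapping, summary_roles, kept)
```

Because every assertion in the fragment is mapped to an assertion in the summary, a candidate whose representative is not an answer over the summary can be discarded without checking it against the fragment.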
Step 4: Dependency analysis 
[Diagram: fragment F → Dependency Analysis → Full Reasoner (with query Q) → Output]
Group remaining candidate answers 
• If a and b are in the same group then a is an answer iff b is 
• We can also establish dependencies between groups 
Check group representatives against fragment using the 
fully-fledged reasoner.
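
The effect of grouping can be sketched as follows, assuming the groups have already been computed and that `is_answer` stands in for a call to the fully-fledged reasoner on the fragment. Both are hypothetical inputs, and dependencies between groups are not modelled here.

```python
def check_by_groups(groups, is_answer):
    """Check one representative per group; all members share its verdict."""
    confirmed, calls = set(), 0
    for group in groups:
        representative = min(group)          # any member would do
        calls += 1
        if is_answer(representative):
            confirmed |= set(group)
    return confirmed, calls


groups = [{"a1", "a2", "a3"}, {"b1", "b2"}]
# Stand-in oracle: pretend exactly the "a" candidates are answers.
confirmed, calls = check_by_groups(groups, lambda c: c.startswith("a"))
print(confirmed, calls)
```

Five candidates are settled with two reasoner calls instead of five.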
Features of PAGOdA 
PAGOdA provides pay-as-you-go (PAYG) query answering for OWL 2: 
• Uses a Datalog reasoner “out of the box” 
• Efficiently computes sound partial answers 
• In “easy” cases, efficiently computes complete answers 
• In “harder” cases, applies increasingly powerful but less scalable reasoning techniques as needed to completely answer the query 
• The last step, involving the full reasoner, is rarely needed in practice 
• Recent improvements 
  • Better and better upper bounds 
  • Smaller and smaller modules 
Queries answered by each technique 

          LUBM   UOBM   FLY   DBPedia   NPD
Total       24     15     6       441   329
Bounds      22     12     5       439   326
Sum         22     14     5       440   329
Full        24     15     6       441   329

Scalability for lower and upper bound computation 

            Importing   Lower Mat   Upper Mat   Ave QA
LUBM1000         313s        190s        269s      12s
UOBM500          356s        346s        734s       4s
Queries that require full reasoning 

               Lower   Upper   Gap   Sum   Groups
LUBM100_q20        0      26    26    26        1
LUBM100_q22        0      14    14    14        1
UOBM1_q14       6271    6535   264   264        1
FLY_q5             0     344   344   344        1
DBPedia_q404       0       2     2     2        1
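
As a quick consistency check on the table, the Gap column should equal Upper minus Lower, i.e. the size of $U \setminus L$ given that L is a subset of U. The snippet below only re-uses the figures reported above; the dictionary layout is an assumption made for the example.

```python
# (Lower, Upper, Gap) as reported in the table above.
rows = {
    "LUBM100_q20": (0, 26, 26),
    "LUBM100_q22": (0, 14, 14),
    "UOBM1_q14": (6271, 6535, 264),
    "FLY_q5": (0, 344, 344),
    "DBPedia_q404": (0, 2, 2),
}

computed = {name: upper - lower for name, (lower, upper, _) in rows.items()}
consistent = all(computed[name] == gap for name, (_, _, gap) in rows.items())
print(computed, consistent)
```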
Time distribution and fragment size

              Lower   Upper    Frag   Size (%)    Sum     Full
LUBM100_q20    0.2s    0.3s   14.5s   .005/.04   1.2s   190.1s
LUBM100_q22    0.3s    0.2s   10.0s   .005/.04   0.8s    46.1s
UOBM1_q14      0.1s    0.1s    0.7s   .17/.076   0.5s     5.4s
FLY_q5         0.0s    0.0s   16.0s   .34/.01    0.1s     0.2s
PAGOdA Team 
• Yujiao Zhou 
• Yavor Nenov 
• Bernardo Cuenca Grau 
• Ian Horrocks 
