Politecnico di Milano, DEIB
Portoroz - 2017 - Riccardo Tommasini (Politecnico di Milano)
Riccardo Tommasini, Marco Balduini, Emanuele Della Valle
{name.surname}@polimi.it
Towards a Benchmark for Expressive Stream Reasoning
ESWC
Portoroz - 2017 - Riccardo Tommasini - @rictomm - Politecnico di Milano
SR State-of-the-art Timeline (Qualitative)
[Figure: timeline with marks at 2008, 2010, 2011, 2012, 2013, 2015 and 2016; legend: RSP Engine, Benchmark, Stream Reasoner. Benchmarks shown: LSBench, SRBench, CSRBench, CityBench, YABench. Systems shown: EP-SPARQL, C-SPARQL, SparkWave, MorphStream, CQELS, SKB, INSTANS, RDFox, trOWL, DynamiTE.]
Benchmark   Ontology          Streams         Queries     Metrics                                          Reasoning
SR Bench    IoT               RDF/Historical  ✔           QL Feature                                       X
LS Bench    Social Media/IoT  Generated       ✔           Max Throughput                                   SubClassOf
CSRBench    IoT               x               Parametric  Correctness                                      X
CityBench   IoT               CSV/Real        ✔           Query latency, memory consumption, completeness  X
YABench     IoT               Generated       ✔           Correctness                                      X
Stream Reasoner   Entailment   Berlin SPARQL   LUBM   UOBM   DBPedia   Spire   Galen
SKB RDFS ✔
EP-SPARQL RDFS ✔ ✔
trOWL EL+
DynamiTE RDFS ✔
SparkWave RDFS ✔
RDFox OWL 2 RL ✔ ✔ ✔
ESR Benchmarking
Design Principles
[P.1] TBox of moderate size yet of scalable complexity.
[P.2] Continuous reasoning tasks.
[P.3] Arbitrary scaling of static and streaming data.
[P.4] Usage of continuous queries.
SR Experiment
An SR experiment is a tuple <R, E, T, D, S, Q, K>, where
R is a stream reasoner;
E is an entailment regime to test;
T is a static TBox;
S is a streaming ABox;
Q is a set of continuous reasoning tasks under E; and
K is a set of KPIs to measure.
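The experiment tuple can be written down as a small record type. The following is only a sketch under stated assumptions: the field names, the string-based representation and the example values (file names, stream identifier, chosen KPIs) are illustrative and do not come from the slides. The slide does not define D; it is assumed here to stand for static data, in line with principle [P.3].

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SRExperiment:
    """One stream reasoning experiment <R, E, T, D, S, Q, K>."""
    reasoner: str          # R: the stream reasoner under test
    entailment: str        # E: the entailment regime to test
    tbox: str              # T: the static TBox (here: a file name)
    static_data: str       # D: not defined on the slide; ASSUMED to be static data
    stream: str            # S: the streaming ABox (here: a stream identifier)
    tasks: FrozenSet[str]  # Q: continuous reasoning tasks under E
    kpis: FrozenSet[str]   # K: KPIs to measure

    def validate(self) -> None:
        """Reject experiments with nothing to run or nothing to measure."""
        if not self.tasks:
            raise ValueError("Q must contain at least one reasoning task")
        if not self.kpis:
            raise ValueError("K must contain at least one KPI")


# Illustrative values only; file names and stream id are hypothetical.
experiment = SRExperiment(
    reasoner="RDFox",
    entailment="OWL 2 RL",
    tbox="l1o.owl",
    static_data="static.ttl",
    stream="lass-stream",
    tasks=frozenset({"Tag Containment", "User Activity"}),
    kpis=frozenset({"query latency", "memory consumption"}),
)
experiment.validate()
```

Keeping the record immutable means one experiment description can be reused unchanged across runs, which makes it easier to compare reasoners on the same configuration.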
LASS 1.0
Building Blocks
LASS is a first attempt at a benchmark for Expressive Stream Reasoning (ESR). It comprises:
L1O, an OWL 2 RL ontology about social media influence.
L1C, a set of reasoning tasks to test the engine capabilities.
L1G, a data generation algorithm and its implementation.
L1O 1.0
The Ontology
L1O extends the SIOC Core Vocabulary for online communities with the following modules:
Actions/Reaction models users’ interactions within a community.
Influence models users’ influence roles within a community.
Content models what characterises posts and discussions.
L1O’s modules have different velocities, i.e. their class instances change at different rates.
L1C 1.0
The Reasoning Tasks
Reasoning Task   Class Subsumption   Role Subsumption   Transitive   Inverse   Realization
Tag Containment ✔
Post Popularity ✔
User Activity ✔ ✔
User Participation ✔ ✔
L1G 1.0
The data generation algorithm
L1G exploits these ordering relations to instantiate class individuals.
It generates instances starting from the “slowest” classes (e.g. Discussion) and proceeding to the “fastest” ones (e.g. MicroPost).
L1G does not generate instances of classes from the Influence Module, but it ensures that they can be deduced.
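The slowest-to-fastest ordering can be illustrated with a toy generator. This is a minimal sketch, not the L1G implementation: the `ClassSpec` type, the `period` parameter (one new individual every `period` ticks) and the concrete periods are hypothetical, and `Post` is used only as an intermediate example class.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class ClassSpec:
    name: str
    period: int  # hypothetical: one new individual every `period` ticks

    def __post_init__(self) -> None:
        if self.period < 1:
            raise ValueError("period must be at least 1")


def generate(specs: List[ClassSpec], ticks: int) -> Iterator[Tuple[int, str, str]]:
    """Yield (tick, class name, individual id), slowest classes first in each tick."""
    # The largest period is the slowest class, so sort by period descending.
    ordered = sorted(specs, key=lambda s: s.period, reverse=True)
    counters = {s.name: 0 for s in ordered}
    for tick in range(ticks):
        for spec in ordered:
            if tick % spec.period == 0:
                counters[spec.name] += 1
                yield tick, spec.name, f"{spec.name}_{counters[spec.name]}"


specs = [ClassSpec("MicroPost", 1), ClassSpec("Discussion", 4), ClassSpec("Post", 2)]
events = list(generate(specs, ticks=4))
for event in events:
    print(event)
```

Emitting slow classes first means that, within a tick, an individual of a fast class can always refer to an already generated individual of a slower class, for example a MicroPost that belongs to an existing Discussion.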
Evaluation
What is the goal of a benchmark?
“The goal of a domain specific benchmark is to foster technological progress by guaranteeing a fair assessment.”
- Jim Gray, The Benchmark Handbook for Database and Transaction Systems, 1993
[Figure: a benchmark measures Approach X; measure: 1.5m.]
[Figure: a benchmark measures Approach Y; measure: 2m.]
• accelerate progress, make technology viable
Ying Zhang, Peter Boncz - Benchmarking Linked Open Data Technology
© Jim Gray, 2005
Observations
The more a benchmark challenges an approach, the more effective it is.
The benchmark characterises the problem space.
The benchmark provides a measurable view over the related solution space.
Role of Baselines
A baseline defines the lower bound of the solution space.
Baselines show the feasibility of the problem.
They avoid one-to-one competition by defining a common reference for the comparison.
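One way to read the last point is that each system is reported relative to the baseline instead of against another system. The sketch below assumes a single numeric KPI where the baseline score is non-zero; the function name and all scores are hypothetical and are not results from the talk.

```python
from typing import Dict


def relative_to_baseline(scores: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Express every score as a ratio to the baseline's score."""
    if baseline not in scores:
        raise KeyError(f"no score recorded for baseline {baseline!r}")
    reference = scores[baseline]
    if reference == 0:
        raise ValueError("baseline score must be non-zero")
    return {name: value / reference for name, value in scores.items()}


# Hypothetical scores for illustration only.
scores = {"baseline": 1.0, "approach_x": 1.5, "approach_y": 2.0}
ratios = relative_to_baseline(scores, "baseline")
print(ratios)
```

Because every ratio refers to the same reference, a new system can be added later without rerunning pairwise comparisons.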
Benchmarking is a research problem
How to explore the solution space?
Benchmark Design
- What are the guiding principles / requirements?
- What constitutes a benchmark?
- What is the experimental methodology?
- What are the baselines?
Benchmark Quality
- Is the benchmark compliant with the requirements/principles?
- Is the benchmark used?
- What are the benchmark limitations? (KPIs, TestDriver)
- Does the benchmark distinguish the compared systems (blind)?
Conclusion & LASS Evaluation
Atomic LASS Evaluation
- We evaluate L1O by showing that it complies with Gruber’s ontology design principles.
- We formulate some continuous queries involving the L1C reasoning tasks.
- We implemented L1G by extending the LUBM data generator.
Holistic LASS Evaluation
We evaluate LASS against the following principles by Jim Gray (G) and Karl Huppler (H).
A benchmark must be [G.1] Simple, [G.2] Portable, [G.3] Scalable, [G.4] Relevant and [H.1] formally Verifiable.
Questions?
Email: riccardo.tommasini@polimi.it
Twitter: @rictomm
Github: riccardotommasini
Web: streamreasoning.org
Joseph Wright of Derby, An Experiment on a Bird in the Air Pump, 1768. The National Gallery, London
