Towards a Benchmark for Expressive Stream Reasoning
Riccardo Tommasini, Marco Balduini, Emanuele Della Valle
Politecnico di Milano, DEIB
{name.surname}@polimi.it
ESWC 2017, Portoroz
| Benchmark | Ontology | Streams | Queries | Metrics | Reasoning |
|-----------|----------|---------|---------|---------|-----------|
| SRBench | IoT | RDF/Historical | ✔ | QL feature | X |
| LSBench | Social Media/IoT | Generated | ✔ | Max throughput | SubClassOf |
| CSRBench | IoT | x | Parametric | Correctness | X |
| CityBench | IoT | CSV/Real | ✔ | Query latency, memory consumption, completeness | X |
| YABench | IoT | Generated | ✔ | Correctness | X |
ESR Benchmarking
Design Principles
[P.1] TBox of moderate size yet of scalable complexity.
[P.2] Continuous reasoning tasks.
[P.3] Arbitrary scaling of static and streaming data.
[P.4] Usage of continuous queries.
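P.2 and P.4 call for queries that are evaluated continuously over windows of the stream, rather than once over a fixed dataset. A minimal stdlib-only sketch of that evaluation model (the window size, the query, and the stream items are illustrative, not part of the benchmark):

```python
from collections import deque

def continuous_count(stream, window_size):
    """Evaluate a trivial continuous query (item count) over a
    count-based sliding window: one answer per stream arrival."""
    window = deque(maxlen=window_size)   # old items fall out automatically
    for item in stream:
        window.append(item)
        yield len(window)                # re-evaluate on every arrival

# One answer per arrival, never more than window_size items counted.
results = list(continuous_count(["p1", "p2", "p3", "p4"], window_size=2))
# results == [1, 2, 2, 2]
```

A real continuous reasoning task would replace the count with query answering under an entailment regime, but the push-based, window-at-a-time evaluation loop stays the same.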
SR Experiment
An SR Experiment is a tuple ⟨R, E, T, D, S, Q, K⟩, where
R is a stream reasoner;
E is an entailment regime to test;
T is a static TBox;
D is a static ABox;
S is a streaming ABox;
Q is a set of continuous reasoning tasks under E; and
K is a set of KPIs to measure.
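The tuple can be written down directly as a record type. A sketch, using string and tuple placeholders for every field (the types, example values, and the reading of the otherwise-undefined D as the static ABox are my assumptions, not part of the definition):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SRExperiment:
    """An SR Experiment <R, E, T, D, S, Q, K> as a record.
    All field types and example values are illustrative placeholders."""
    R: str          # stream reasoner under test
    E: str          # entailment regime to test
    T: str          # static TBox (e.g. path to an ontology file)
    D: str          # static ABox (assumed reading of D)
    S: str          # streaming ABox (e.g. a stream endpoint)
    Q: tuple = ()   # continuous reasoning tasks under E
    K: tuple = ()   # KPIs to measure

exp = SRExperiment(R="reasoner-x", E="OWL 2 RL", T="lass.owl",
                   D="static.nt", S="stream://posts",
                   Q=("tag-containment",), K=("latency",))
```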
LASS 1.0
LASS is a first attempt at a benchmark for ESR. It comprises:
L1O, an OWL 2 RL ontology about social media influence.
L1C, a set of reasoning tasks to test the engines' capabilities.
L1G, a data generation algorithm and its implementation.
Building Blocks
L1O 1.0
It extends the SIOC Core ontology for online communities with three modules:
The Actions/Reactions module models users' interactions within a community.
The Influence module models users' influence roles within a community.
The Content module models what characterises posts and discussions.
L1O's modules have different velocities, i.e., class instances change at different rates.
The Ontology
L1C 1.0
The Reasoning Tasks

| Reasoning Task | Features exercised |
|----------------|--------------------|
| Tag Containment | ✔ |
| Post Popularity | ✔ |
| User Activity | ✔ ✔ |
| User Participation | ✔ ✔ |

Features drawn from: class subsumption, role subsumption, transitive roles, inverse roles, realization.
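Class subsumption, one of the features above, amounts to a transitive closure over subClassOf plus type propagation. A stdlib-only sketch (the class hierarchy is illustrative, not taken from L1O):

```python
def subsumption_closure(subclass_of):
    """Transitive closure of subClassOf: maps each class to all of its
    direct and inferred superclasses."""
    closure = {c: set(supers) for c, supers in subclass_of.items()}
    changed = True
    while changed:                       # naive fixpoint iteration
        changed = False
        for c, supers in closure.items():
            for s in list(supers):
                new = closure.get(s, set()) - supers
                if new:
                    supers |= new
                    changed = True
    return closure

# Illustrative hierarchy: MicroPost subClassOf Post subClassOf Content.
hierarchy = {"MicroPost": {"Post"}, "Post": {"Content"}}
closure = subsumption_closure(hierarchy)
# An instance typed MicroPost is therefore also a Post and a Content:
# closure["MicroPost"] == {"Post", "Content"}
```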
L1G 1.0
L1G exploits these ordering relations to instantiate class individuals.
It generates instances starting from the "slowest" classes (e.g. Discussion) down to the "fastest" ones (e.g. MicroPost).
L1G does not generate instances of the Influence module's classes, but it ensures that they can be deduced.
The data generation algorithm
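A minimal sketch of that ordering, assuming illustrative per-class rates and a hypothetical `:in` containment property (this is not L1G's actual implementation):

```python
import itertools
import random

# Illustrative velocities (instances per run); not L1G's actual rates.
VELOCITY = {"Discussion": 1, "Post": 5, "MicroPost": 20}

def generate(seed=42):
    """Generate individuals slowest-first, so faster instances can
    reference already-existing slower ones (a simplification)."""
    rng = random.Random(seed)
    ids = {c: itertools.count() for c in VELOCITY}
    pool = {c: [] for c in VELOCITY}          # generated IRIs per class
    triples = []
    for cls in sorted(VELOCITY, key=VELOCITY.get):   # slowest first
        for _ in range(VELOCITY[cls]):
            iri = f":{cls}_{next(ids[cls])}"
            pool[cls].append(iri)
            triples.append((iri, "rdf:type", f":{cls}"))
            # link to an instance of the slowest already-populated class;
            # ':in' is a hypothetical containment property
            slower = [c for c in VELOCITY
                      if VELOCITY[c] < VELOCITY[cls] and pool[c]]
            if slower:
                parent = rng.choice(pool[min(slower, key=VELOCITY.get)])
                triples.append((iri, ":in", parent))
    return triples

triples = generate()
# 26 type triples plus one containment triple per non-Discussion instance.
```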
Evaluation
What is the goal of a benchmark?
"The goal of a domain-specific benchmark is to foster technological progress by guaranteeing a fair assessment."
- Jim Gray, The Benchmark Handbook for Database and Transaction Systems, 1993
[Figure: a benchmark as a measuring instrument: against the same benchmark, Approach X measures 1.5 m and Approach Y measures 2 m]
Observations
The more a benchmark challenges an approach, the more effective it is.
The benchmark characterises the problem space.
The benchmark provides a measurable view over the related solution space.
Role of Baselines
A baseline defines the lower bound of the solution space.
Baselines show the feasibility of the problem.
They avoid one-to-one competition by providing a common reference for comparison.
Benchmarking is a research problem
How to explore the solution space?
Benchmark Design
- What are the guiding principles/requirements?
- What constitutes a benchmark?
- What is the experimental methodology?
- What are the baselines?
Benchmark Quality
- Is the benchmark compliant with the requirements/principles?
- Is the benchmark used?
- What are the benchmark's limitations (KPIs, TestDriver)?
- Does the benchmark distinguish the compared systems (blind)?
Conclusion & LASS Evaluation
Atomic LASS Evaluation
- We evaluate L1O, showing it complies with Gruber's ontology design principles.
- We formulate continuous queries involving the L1C reasoning tasks.
- We implement L1G by extending the LUBM data generator.
Holistic LASS Evaluation
We evaluate LASS against the following principles by Jim Gray (G) and
Karl Huppler (H).
A benchmark must be
[G.1] Simple, [G.2] Portable, [G.3] Scalable, [G.4] Relevant and
[H.1] formally Verifiable.
Questions?
Email: riccardo.tommasini@polimi.it
Twitter: @rictomm
Github: riccardotommasini
Web: streamreasoning.org
Joseph Wright of Derby, An Experiment on a Bird in the Air Pump, 1768.
The National Gallery, London