Towards a Benchmark for Expressive Stream Reasoning
Riccardo Tommasini, Marco Balduini, Emanuele Della Valle
Politecnico di Milano, DEIB
{name.surname}@polimi.it
ESWC 2017, Portoroz
| Benchmark | Ontology | Streams | Queries | Metrics | Reasoning |
|-----------|----------|---------|---------|---------|-----------|
| SRBench | IoT | RDF/Historical | ✔ | QL feature | X |
| LSBench | Social Media/IoT | Generated | ✔ | Max throughput | SubClassOf |
| CSRBench | IoT | x | Parametric | Correctness | X |
| CityBench | IoT | CSV/Real | ✔ | Query latency, memory consumption, completeness | X |
| YABench | IoT | Generated | ✔ | Correctness | X |
ESR Benchmarking
Design Principles
[P.1] TBox of moderate size yet of scalable complexity.
[P.2] Continuous reasoning tasks.
[P.3] Arbitrary scaling of static and streaming data.
[P.4] Usage of continuous queries.
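P.2 and P.4 call for queries that are evaluated continuously over windows of the stream, rather than once over a fixed dataset. A minimal stdlib-only sketch of that evaluation model (the window size, the query, and the stream items are illustrative, not part of the benchmark):

```python
from collections import deque

def continuous_count(stream, window_size):
    """Evaluate a trivial continuous query (item count) over a
    count-based sliding window: one answer per stream arrival."""
    window = deque(maxlen=window_size)   # old items fall out automatically
    for item in stream:
        window.append(item)
        yield len(window)                # re-evaluate on every arrival

# One answer per arrival, never more than window_size items counted.
results = list(continuous_count(["p1", "p2", "p3", "p4"], window_size=2))
# results == [1, 2, 2, 2]
```

A real continuous reasoning task would replace the count with query answering under an entailment regime, but the push-based, window-at-a-time evaluation loop stays the same.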
SR Experiment
An SR Experiment is a tuple ⟨R, E, T, D, S, Q, K⟩, where
R is a stream reasoner;
E is an entailment regime to test;
T is a static TBox;
D is a static ABox;
S is a streaming ABox;
Q is a set of continuous reasoning tasks under E; and
K is a set of KPIs to measure.
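The tuple can be written down directly as a record type. A sketch, using string and tuple placeholders for every field (the types, example values, and the reading of the otherwise-undefined D as the static ABox are my assumptions, not part of the definition):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SRExperiment:
    """An SR Experiment <R, E, T, D, S, Q, K> as a record.
    All field types and example values are illustrative placeholders."""
    R: str          # stream reasoner under test
    E: str          # entailment regime to test
    T: str          # static TBox (e.g. path to an ontology file)
    D: str          # static ABox (assumed reading of D)
    S: str          # streaming ABox (e.g. a stream endpoint)
    Q: tuple = ()   # continuous reasoning tasks under E
    K: tuple = ()   # KPIs to measure

exp = SRExperiment(R="reasoner-x", E="OWL 2 RL", T="lass.owl",
                   D="static.nt", S="stream://posts",
                   Q=("tag-containment",), K=("latency",))
```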
LASS 1.0
LASS is a first attempt at a benchmark for ESR. It comprises:
L1O, an OWL 2 RL ontology about social media influence.
L1C, a set of reasoning tasks to test the engines' capabilities.
L1G, a data generation algorithm and its implementation.
Building Blocks
L1O 1.0
It extends the SIOC Core ontology for online communities with three modules:
The Actions/Reactions module models users' interactions within a community.
The Influence module models users' influence roles within a community.
The Content module models what characterises posts and discussions.
L1O's modules have different velocities, i.e., class instances change at different rates.
The Ontology
L1C 1.0
The Reasoning Tasks

| Reasoning Task | Features exercised |
|----------------|--------------------|
| Tag Containment | ✔ |
| Post Popularity | ✔ |
| User Activity | ✔ ✔ |
| User Participation | ✔ ✔ |

Features drawn from: class subsumption, role subsumption, transitive roles, inverse roles, realization.
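Class subsumption, one of the features above, amounts to a transitive closure over subClassOf plus type propagation. A stdlib-only sketch (the class hierarchy is illustrative, not taken from L1O):

```python
def subsumption_closure(subclass_of):
    """Transitive closure of subClassOf: maps each class to all of its
    direct and inferred superclasses."""
    closure = {c: set(supers) for c, supers in subclass_of.items()}
    changed = True
    while changed:                       # naive fixpoint iteration
        changed = False
        for c, supers in closure.items():
            for s in list(supers):
                new = closure.get(s, set()) - supers
                if new:
                    supers |= new
                    changed = True
    return closure

# Illustrative hierarchy: MicroPost subClassOf Post subClassOf Content.
hierarchy = {"MicroPost": {"Post"}, "Post": {"Content"}}
closure = subsumption_closure(hierarchy)
# An instance typed MicroPost is therefore also a Post and a Content:
# closure["MicroPost"] == {"Post", "Content"}
```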
L1G 1.0
L1G exploits these ordering relations to instantiate class individuals.
It generates instances starting from the "slowest" classes (e.g. Discussion) down to the "fastest" ones (e.g. MicroPost).
L1G does not generate instances of the Influence module's classes, but it ensures that they can be deduced.
The data generation algorithm
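A minimal sketch of that ordering, assuming illustrative per-class rates and a hypothetical `:in` containment property (this is not L1G's actual implementation):

```python
import itertools
import random

# Illustrative velocities (instances per run); not L1G's actual rates.
VELOCITY = {"Discussion": 1, "Post": 5, "MicroPost": 20}

def generate(seed=42):
    """Generate individuals slowest-first, so faster instances can
    reference already-existing slower ones (a simplification)."""
    rng = random.Random(seed)
    ids = {c: itertools.count() for c in VELOCITY}
    pool = {c: [] for c in VELOCITY}          # generated IRIs per class
    triples = []
    for cls in sorted(VELOCITY, key=VELOCITY.get):   # slowest first
        for _ in range(VELOCITY[cls]):
            iri = f":{cls}_{next(ids[cls])}"
            pool[cls].append(iri)
            triples.append((iri, "rdf:type", f":{cls}"))
            # link to an instance of the slowest already-populated class;
            # ':in' is a hypothetical containment property
            slower = [c for c in VELOCITY
                      if VELOCITY[c] < VELOCITY[cls] and pool[c]]
            if slower:
                parent = rng.choice(pool[min(slower, key=VELOCITY.get)])
                triples.append((iri, ":in", parent))
    return triples

triples = generate()
# 26 type triples plus one containment triple per non-Discussion instance.
```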
Evaluation
What is the goal of a benchmark?
"The goal of a domain-specific benchmark is to foster technological progress by guaranteeing a fair assessment."
- Jim Gray, The Benchmark Handbook for Database and Transaction Systems, 1993
[Figure: a benchmark as a measuring instrument: against the same benchmark, Approach X measures 1.5 m and Approach Y measures 2 m]
Observations
The more a benchmark challenges an approach, the more effective it is.
The benchmark characterises the problem space.
The benchmark provides a measurable view over the related solution space.
Role of Baselines
A baseline defines the lower bound of the solution space.
Baselines show the feasibility of the problem.
They avoid one-to-one competition by providing a common reference for comparison.
Benchmarking is a research problem
How to explore the solution space?
Benchmark Design
- What are the guiding principles/requirements?
- What constitutes a benchmark?
- What is the experimental methodology?
- What are the baselines?
Benchmark Quality
- Is the benchmark compliant with the requirements/principles?
- Is the benchmark used?
- What are the benchmark's limitations (KPIs, TestDriver)?
- Does the benchmark distinguish the compared systems (blind)?
Conclusion & LASS Evaluation
Atomic LASS Evaluation
- We evaluate L1O, showing it complies with Gruber's ontology design principles.
- We formulate continuous queries involving the L1C reasoning tasks.
- We implement L1G by extending the LUBM data generator.
Holistic LASS Evaluation
We evaluate LASS against the following principles by Jim Gray (G) and
Karl Huppler (H).
A benchmark must be
[G.1] Simple, [G.2] Portable, [G.3] Scalable, [G.4] Relevant and
[H.1] formally Verifiable.
Questions?
Email: riccardo.tommasini@polimi.it
Twitter: @rictomm
Github: riccardotommasini
Web: streamreasoning.org
Joseph Wright of Derby, An Experiment on a Bird in the Air Pump, 1768.
The National Gallery, London