BowlognaBench
Benchmarking RDF Analytics

Gianluca Demartini, Iliya Enchev, Joël Gapany, and Philippe Cudré-Mauroux
eXascale Infolab & Faculty of Humanities
University of Fribourg, Switzerland

30-Jun-11, Gianluca Demartini
  
Motivation
•  Semantic data keeps increasing on the Web
•  The need to run OLAP-type queries is becoming more common
– How did university student performance evolve over the last 5 years?
•  A novel benchmark for knowledge bases focusing on complex analytics queries
  
Why do we need a new RDF benchmark?
•  Existing RDF benchmarks (e.g., LUBM)
– Don't deal with complex analytic queries
– Don't look at the temporal dimension
– Don't model a realistic setting
•  Analytic benchmarks exist for relational systems (e.g., TPC-H)
  
The Bologna Reform
•  Started in June 1999
•  Framework for higher education systems
•  47 countries
•  Common academic degrees
•  Common study structure
•  Common terminology
  
The university setting after Bologna
•  A lot of data is available
–  Not following standard schemas
–  Comprehensive and available data is a success factor
•  Shared data
–  Erasmus exchanges
–  Courses in a given language
•  Analytic tools may help monitor university performance
  
An ontology about Bologna
•  A Lexicon for the Bologna Reform
– Basic set of terms for the new system
– Stable across time and institutions
– Developed by a professional terminologist
  
The ontology creation process
•  The Bowlogna Ontology
– 29 top classes (67 in total)
– Classes: student, professor, evaluation, teaching unit, ECTS credit, semester, etc.
– Concept definitions in English, French, German
  
Bowlogna Ontology
[Figure: Bowlogna Ontology (image not preserved in this transcript)]
  
Bowlogna Ontology
•  Private / Public parts
– Public data can be shared with other universities (e.g., course descriptions)
– Private data is sensitive (e.g., evaluation results)
•  Private data might contain more instances
•  Aggregations over private data may be shared (e.g., number of enrolled students)
  
The Benchmark
•  Bowlogna Ontology
– 67 classes
•  12 analytics queries
– Natural language and SPARQL translation
•  Automatic instance generator
– Populates the ontology with a given number of instances
•  Tests over varying numbers of instances and universities
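
To picture how an instance generator can scale a dataset over the number of instances and universities, here is a minimal Python sketch. It is not the BowlognaBench generator: the function `generate_instances`, the class names (`University`, `Student`), the property names (`enrolledAt`, `followsTrack`), and the track labels are all hypothetical placeholders rather than terms taken from the Bowlogna Ontology.

```python
import random


def generate_instances(num_universities, students_per_university, seed=0):
    """Return a list of (subject, predicate, object) triples.

    All identifiers are illustrative placeholders, not terms taken
    from the actual Bowlogna Ontology.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    tracks = ["TrackA", "TrackB", "TrackC"]  # hypothetical study tracks
    triples = []
    for u in range(num_universities):
        uni = f"University{u}"
        triples.append((uni, "rdf:type", "University"))
        for s in range(students_per_university):
            student = f"{uni}/Student{s}"
            triples.append((student, "rdf:type", "Student"))
            triples.append((student, "enrolledAt", uni))
            triples.append((student, "followsTrack", rng.choice(tracks)))
    return triples


# Scale the dataset along the two dimensions named on the slide.
small = generate_instances(num_universities=1, students_per_university=10)
large = generate_instances(num_universities=5, students_per_university=10)
print(len(small), len(large))
```

Each university contributes one triple and each student three, so the dataset size grows linearly in both parameters, which makes it easy to run the same queries at several scales.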
  
Analytic Queries
•  Count
•  Molecule
–  Query 4. Return all information about Student0 within a scope of two
•  Max Min
•  Ranking and TopK
•  Temporal
–  Query 8. What is the average completion time of Bachelor studies for each Study Track?
•  Path
•  Multiple Universities
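
The two example queries can be made concrete with a small Python sketch over in-memory data. This is an illustration under assumptions, not the benchmark's SPARQL: "a scope of two" is read here as every triple reachable by following at most two outgoing edges from Student0, completion time is assumed to be measured in semesters, and all property names and values are invented.

```python
from collections import defaultdict

# Toy triples; every identifier here is invented for illustration.
TRIPLES = [
    ("Student0", "enrolledIn", "TrackA"),
    ("Student0", "supervisedBy", "Professor1"),
    ("Professor1", "worksFor", "Department1"),
    ("Department1", "partOf", "University0"),
    ("Student1", "enrolledIn", "TrackA"),
]

# (student, study track, semesters needed to complete the Bachelor); made-up values.
COMPLETIONS = [
    ("Student0", "TrackA", 6),
    ("Student1", "TrackA", 8),
    ("Student2", "TrackB", 7),
]


def molecule(triples, root, scope=2):
    """Collect triples reachable from `root` via at most `scope` outgoing edges."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((s, p, o))
    result, frontier, seen = [], {root}, {root}
    for _ in range(scope):
        next_frontier = set()
        for node in frontier:
            for triple in by_subject.get(node, []):
                result.append(triple)
                if triple[2] not in seen:  # expand each node only once
                    seen.add(triple[2])
                    next_frontier.add(triple[2])
        frontier = next_frontier
    return result


def average_completion_time(completions):
    """Group by study track and average the completion time (Query 8 style)."""
    totals = defaultdict(lambda: [0, 0])  # track -> [sum, count]
    for _student, track, semesters in completions:
        totals[track][0] += semesters
        totals[track][1] += 1
    return {track: total / count for track, (total, count) in totals.items()}


print(molecule(TRIPLES, "Student0"))
print(average_completion_time(COMPLETIONS))
```

The molecule query touches few instances but follows links, whereas the temporal query scans and aggregates many records, which is the kind of contrast the query categories are meant to expose.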
  
Query Classification
We classify a query as having a large input size if it involves more than 5%
of instances, and small otherwise. Selectivity measures the number of instances
that match the query: we classify a query as having high selectivity if less than
10% of instances match the query, and low otherwise. Complexity measures the
number of classes and properties involved in the query: queries are classified as
having high or low complexity according to the RDF schema we have defined.
Table 1. Classification of queries according to their need to access private and public
data, input size, selectivity, and complexity.

Query types (in column order): Count, Molecule, MaxMin, TopK, Temp, Path, MultiUniv

Query        1      2      3      4      5      6      7      8      9      10     11     12
Input Size   Small  Large  Small  Small  Large  Small  Large  Large  Large  Large  Large  Large
Selectivity  High   Low    Low    Low    Low    Low    High   Low    Low    Low    Low    Low
Complexity   Low    Low    Low    High   High   Low    High   Low    High   High   Low    High

Public       x x x x x x x x   (column positions lost in extraction)
Private      x x x x x x x x   (column positions lost in extraction)
As we can see, the majority of queries have low selectivity, which reflects our
intent of performing analytic queries, that is, queries for which a lot of data is
retrieved and aggregated. For the same reason, most of the queries have a large
input. Finally, queries are equally divided between high and low complexity.
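
The input-size and selectivity thresholds quoted above (5% and 10% of instances) can be written down as a short Python sketch. The function name `classify_query` and its parameters are hypothetical; complexity is left out because the text gives no numeric threshold for it.

```python
def classify_query(total_instances, instances_involved, instances_matched):
    """Label a query using the 5% input-size and 10% selectivity thresholds.

    Integer arithmetic is used so that the boundary cases (exactly 5% or
    exactly 10%) are not affected by floating-point rounding.
    """
    if total_instances <= 0:
        raise ValueError("total_instances must be positive")
    # Large input: the query involves more than 5% of all instances.
    large_input = instances_involved * 100 > 5 * total_instances
    # High selectivity: less than 10% of all instances match the query.
    high_selectivity = instances_matched * 100 < 10 * total_instances
    return {
        "input_size": "Large" if large_input else "Small",
        "selectivity": "High" if high_selectivity else "Low",
    }


print(classify_query(1000, 20, 5))     # involves 2%, matches 0.5%
print(classify_query(1000, 400, 300))  # involves 40%, matches 30%
```

Note that "high selectivity" here means few matching instances, so an analytic query that retrieves and aggregates a lot of data is labelled low selectivity.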
  
From the process analyst's point of view
•  Which system should I pick for my specific problem?
– Not looking for the best system
– Look at problem-specific query sets
  
Which system to use?

            Count     Path      Rank      Temporal
            Queries   Queries   Queries   Queries
System A    0.5s      5s        0.1s      2s
System B    3s        0.4s      2s        1s
System C    0.5s      0.5s      0.5s      0.5s
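
One way to read the table is to weight each system's timings by a problem-specific query mix. The Python sketch below does this with the timings shown on the slide; the weighted-sum cost and the function name `pick_system` are assumptions made for illustration, not a selection method stated in the talk.

```python
# Timings in seconds, copied from the table on the slide.
TIMINGS = {
    "System A": {"count": 0.5, "path": 5.0, "rank": 0.1, "temporal": 2.0},
    "System B": {"count": 3.0, "path": 0.4, "rank": 2.0, "temporal": 1.0},
    "System C": {"count": 0.5, "path": 0.5, "rank": 0.5, "temporal": 0.5},
}


def pick_system(workload, timings=TIMINGS):
    """Return (system, total_seconds) minimising the workload-weighted time.

    `workload` maps a query category to how many such queries are expected.
    """
    def cost(system):
        return sum(n * timings[system][category] for category, n in workload.items())

    best = min(timings, key=cost)
    return best, cost(best)


print(pick_system({"count": 10, "rank": 10}))  # mostly count and rank queries
print(pick_system({"path": 10}))               # path-heavy workload
print(pick_system({"count": 1, "path": 1, "rank": 1, "temporal": 1}))  # even mix
```

With these numbers a different system wins for each of the three workloads, which illustrates why the talk asks for problem-specific query sets rather than a single overall winner.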
  
Conclusions
•  BowlognaBench for analytic queries
•  OWL ontology for higher education systems
•  Next steps
– Run a comparative evaluation of RDF systems
– Set up a wiki-like space where groups can upload experimental results
  
http://diuf.unifr.ch/xi/bowlognabench/
  


RDF Analytics Benchmarking Framework

  • 1. BowlognaBench   Benchmarking  RDF  Analy5cs   Gianluca  Demar5ni,  Iliya  Enchev,   Joël  Gapany,  and  Philippe  Cudré-­‐Mauroux    eXascale  Infolab  &  Faculty  of  Humani5es   University  of  Fribourg,  Switzerland   30-­‐Jun-­‐11   Gianluca  Demar5ni   1  
  • 2. Mo5va5on   •  Seman5c  Data  keeps  increasing  on  the  Web   •  More  common  is  the  need  to  run  OLAP-­‐type   queries   – How  did  university  student  performance  evolve   over  last  5  years?   •  A  novel  benchmark  for  Knowledge  Bases   focusing  on  complex  Analy5cs  queries   30-­‐Jun-­‐11   Gianluca  Demar5ni   2  
  • 3. Why  do  we  need  a  new  RDF   benchmark?   •  Exis5ng  RDF  benchmarks  (e.g.,  LUBM)   – Don’t  deal  with  complex  analy5c  queries   – Don’t  look  at  the  temporal  dimension   – Don’t  model  a  realis5c  se_ng   •  Analy5c  benchmarks  exist  for  rela5onal   systems  (e.g.  TPC-­‐H)   30-­‐Jun-­‐11   Gianluca  Demar5ni   3  
  • 4. The Bologna Reform
      •  Started in June 1999
      •  Framework for higher education systems
      •  47 countries
      •  Common academic degrees
      •  Common study structure
      •  Common terminology
  • 5. The university setting after Bologna
      •  A lot of data is available
          – Not following standard schemas
          – Comprehensive and available data is a success factor
      •  Shared data
          – Erasmus exchanges
          – Courses in a given language
      •  Analytic tools may help monitor university performance
  • 6. An ontology about Bologna
      •  A lexicon for the Bologna Reform
          – Basic set of terms for the new system
          – Stable across time and institutions
          – Developed by a professional terminologist
  • 7. The ontology creation process
      •  The Bowlogna Ontology
          – 29 top classes (67 in total)
          – Classes: student, professor, evaluation, teaching unit, ECTS credit, semester, etc.
          – Concept definitions in English, French, German
  • 8. Bowlogna Ontology
  • 9. Bowlogna Ontology
      •  Private / public parts
          – Public data can be shared with other universities (e.g., course descriptions)
          – Private data is sensitive (e.g., evaluation results)
      •  Private data might contain more instances
      •  Aggregations over private data may be shared (e.g., number of enrolled students)
  • 10. The Benchmark
      •  Bowlogna Ontology
          – 67 classes
      •  12 analytics queries
          – Natural language and SPARQL translation
      •  Automatic instance generator
          – Populates the ontology with a given number of instances
      •  Tests over the number of instances and the number of universities
  • 11. Analytic Queries
      •  Count
      •  Molecule
          – Query 4. Return all information about Student0 within a scope of two
      •  Max Min
      •  Ranking and TopK
      •  Temporal
          – Query 8. What is the average completion time of Bachelor studies for each Study Track?
      •  Path
      •  Multiple Universities
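To make the temporal category more concrete, Query 8 can be read as a group-by-and-average over completion times. The following is only a minimal sketch in plain Python, not the benchmark's SPARQL formulation: the record layout, the field order, and the sample values are all assumptions made up for illustration, and completion time is measured in days between two hypothetical dates.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (student, study_track, enrolment_date, graduation_date).
# Illustrative values only; this is not BowlognaBench data.
BACHELOR_RECORDS = [
    ("Student0", "Computer Science", date(2006, 9, 18), date(2009, 7, 3)),
    ("Student1", "Computer Science", date(2006, 9, 18), date(2010, 2, 12)),
    ("Student2", "History", date(2007, 9, 17), date(2010, 7, 2)),
]


def average_completion_days(records):
    """Group records by study track and average (graduation - enrolment) in days."""
    totals = defaultdict(lambda: [0, 0])  # track -> [sum of days, record count]
    for _student, track, start, end in records:
        totals[track][0] += (end - start).days
        totals[track][1] += 1
    return {track: days / count for track, (days, count) in totals.items()}


if __name__ == "__main__":
    for track, avg_days in sorted(average_completion_days(BACHELOR_RECORDS).items()):
        print(f"{track}: {avg_days:.1f} days on average")
```

The query touches every Bachelor record and aggregates them, which is the retrieve-and-aggregate pattern the benchmark targets.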
  • 12. Query Classification
      We classify a query as having a large input size if it involves more than 5% of instances, and small otherwise. Selectivity measures the number of instances that match the query: we classify a query as having high selectivity if less than 10% of instances match the query, and low otherwise. Complexity measures the number of classes and properties involved in the query: queries are classified as having high or low complexity according to the RDF schema we have defined.

      Table 1. Classification of queries according to their need to access private and public data, input size, selectivity, and complexity.

      Query types (column groups): Count, Molecule, MaxMin, TopK, Temp, Path, MultiUniv

      Query        1      2      3      4      5      6      7      8      9      10     11     12
      Input Size   Small  Large  Small  Small  Large  Small  Large  Large  Large  Large  Large  Large
      Selectivity  High   Low    Low    Low    Low    Low    High   Low    Low    Low    Low    Low
      Complexity   Low    Low    Low    High   High   Low    High   Low    High   High   Low    High

      Public       marked (x) for 8 of the 12 queries; column positions lost in extraction
      Private      marked (x) for 8 of the 12 queries; column positions lost in extraction

      As we can see, the majority of queries have low selectivity, which reflects our intent of performing analytic queries, that is, queries for which a lot of data is retrieved and aggregated. For the same reason, most of the queries have a large input. Finally, queries are equally divided between high and low complexity.
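The two numeric thresholds in the classification (more than 5% of instances for a large input, less than 10% of instances matched for high selectivity) can be written as a small helper. This is a sketch under assumptions: the function name and its three count parameters are hypothetical, and complexity is left out because the slide defines it relative to the RDF schema rather than by a threshold.

```python
def classify_query(instances_involved, instances_matched, total_instances):
    """Return (input_size, selectivity) labels using the slide's thresholds.

    instances_involved: instances the query has to look at (assumed input).
    instances_matched:  instances that match the query (assumed input).
    total_instances:    size of the data set (assumed input).
    """
    if total_instances <= 0:
        raise ValueError("total_instances must be positive")
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    # Large input: strictly more than 5% of all instances are involved.
    input_size = "Large" if instances_involved * 100 > 5 * total_instances else "Small"
    # High selectivity: strictly less than 10% of all instances match.
    selectivity = "High" if instances_matched * 100 < 10 * total_instances else "Low"
    return input_size, selectivity


if __name__ == "__main__":
    print(classify_query(instances_involved=60, instances_matched=5, total_instances=1000))
```

Note that both comparisons are strict, so a query involving exactly 5% of instances is still "Small" and one matching exactly 10% is "Low".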
  • 13. From the process analyst's point of view
      •  Which system should I pick for my specific problem?
          – Not looking for the best system overall
          – Look at problem-specific query sets
  • 14. Which system to use?

                  Count     Path      Rank      Temporal
                  Queries   Queries   Queries   Queries
      System A    0.5s      5s        0.1s      2s
      System B    3s        0.4s      2s        1s
      System C    0.5s      0.5s      0.5s      0.5s
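One way to read this table from the analyst's side is to weight each query type by how often it occurs in the workload and pick the system with the lowest weighted time. The sketch below assumes that reading; the `pick_system` function and the idea of per-type weights are illustrative additions, while the timings are the ones shown on the slide.

```python
# Timings in seconds, copied from the slide's table.
TIMINGS = {
    "System A": {"count": 0.5, "path": 5.0, "rank": 0.1, "temporal": 2.0},
    "System B": {"count": 3.0, "path": 0.4, "rank": 2.0, "temporal": 1.0},
    "System C": {"count": 0.5, "path": 0.5, "rank": 0.5, "temporal": 0.5},
}


def pick_system(workload, timings=TIMINGS):
    """Return the system with the lowest weighted total time.

    workload maps a query type to its relative weight (hypothetical input),
    e.g. {"count": 3, "rank": 1} for a workload dominated by count queries.
    """
    def weighted_time(system):
        return sum(weight * timings[system][qtype] for qtype, weight in workload.items())

    return min(timings, key=weighted_time)


if __name__ == "__main__":
    print(pick_system({"count": 1, "rank": 1}))
    print(pick_system({"path": 1}))
    print(pick_system({"count": 1, "path": 1, "rank": 1, "temporal": 1}))
```

Different workloads select different systems here, which is the slide's point: the choice depends on the problem-specific query set, not on a single overall winner.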
  • 15. Conclusions
      •  BowlognaBench for analytic queries
      •  OWL ontology for higher education systems
      •  Next steps
          – Run a comparative evaluation of RDF systems
          – Set up a wiki-like space where groups can upload experimental results