SlideShare a Scribd company logo
1 of 35
Web Data Management
Advanced Database Presentation
By:
Navid Sedighpour
Professor :
Dr. Alireza Bagheri
Nevember 2015
1
Interest
Lack of schema
Data is unstructured or at best “semi-structured”
Missing data, additional attributes, similar data but not identical
Volatility
May confirm to one schema now, but not later
Scale
How to capture everything?
Querying Difficulty
What is the user language?
What are the primitives?
Aren’t Search Engines sufficient?
2
Fusion Tables
Users contribute data in spreadsheet
Possible joins between multiple data sets
Extensive visualization
3
More Recent Approaches to Web Querying
More Recent Approaches to Web Querying
XML
Data exchange language
Tree based structure
4
More Recent Approaches to Web Querying
RDF
W3C Recommendation
Simple, self-descriptive model
5
RDF Data Volumes
90% of world's data generated over last two years
Data are growing fast
Size almost doubling every year
6
RDF Data Volumes
March 2009 – 89 Datasets
7
RDF Data Volumes
September 2010 – 203 datasets
8
RDF Data Volumes
September 2011 – 295 Datasets
9
RDF Data Volumes
10
April 2014 – 1091 Datasets
RDF Introduction
Everything is an uniquely named resource
Prefixes can be used to shorten names
Properties of resources can be defined
Relationships with other resources can be defined
Resource description can be contributed by different people/groups and can be located anywhere
in the web
Integrated web “database”
11
RDF Data Model
Triple : Subject, Predicate (Property) , Object
Subject : The entity that is described (URI or Blank Node)
Predicate : a feature of the entity
Object : value of the feature
Set of RDF Triples is called “RDF Graph”
12
RDF Example Instance
13
RDF Graph
14
SPARQL Queries
15
Naïve Triple Store Design
16
17
Naïve Triple Store Design
Easy to Implement
But
Too Many self-joins
Property Tables
Grouping by Entities
Types :
Clustered Property Tables
Property Class Tables
18
Clustered Property Tables
Group together the properties that tend to occur in the same (or similar) subjects
19
Property Class Tables
Cluster the subjects with the same type of property into one property table
20
Property Tables
Advantages :
Fewer Joins
Disadvantages :
Lots of NULLs
Clustering is not trivial
Multi-valued properties are complicated
21
Binary Tables
Grouping by Properties: for each property build a two column table containing both subject and
object, ordered by subjects
Also called “Vertically Partitioned Approach”
N two column tables (n is the number of unique properties in the data)
22
Binary Tables
Advantages :
Support multi-valued Properties
No NULLs
No Clustering
Good performance for subject-subject joins
Disadvantages:
Not useful for subject-subject joins
Expensive inserts
23
Graph-Based Approach
Answering SPARQL query = Subgraph Matching
gStore
24
Two steps need to be done :
1. For each node of Q* get the lists of nodes in G* that include that node
2. Do a multi-way join to get the candidate list
Alternatives :
Sequential scan of G*
 Both steps are inefficient
S-Tree
 Height Balanced Tree over signatures
 Run an inclusion query for each node of Q* and get lists of nodes in G* that include that node (q & s = q)
VS-Tree
 Support both steps efficiently
 Grouping by vertices
25
Graph-Based Approach
S-Tree
26
Pruning
S-Tree
27
S-Tree
28
S-Tree
29
S-Tree
30
VS-Tree
31
VS-Tree
32
Conclusion
RDF Data seem to have considerable promise for web data management
We talked about four approaches to web data management including Naïve triple store design,
Property Tables, Binary Tables and Graph-Based approach
VS-Tree has the best performance in Graph-Base approaches
gStore is more efficient than other approaches
33
References
34
[1] D. J. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach, "Scalable semantic web data
management using vertical partitioning," in Proceedings of the 33rd international conference on Very large
data bases, 2007, pp. 411-422.
[2] L. Zou, J. Mo, L. Chen, M. T. Özsu, and D. Zhao, "gStore: answering SPARQL queries via
subgraph matching," Proceedings of the VLDB Endowment, vol. 4, pp. 482-493, 2011.
[3] L. Zou, M. T. Özsu, L. Chen, X. Shen, R. Huang, and D. Zhao, "gStore: a graph-based SPARQL
query engine," The VLDB Journal—The International Journal on Very Large Data Bases, vol. 23, pp. 565-
590, 2014.
[4] X. Shen, L. Zou, M. T. Ozsu, L. Chen, Y. Li, S. Han, et al., "A Graph-based RDF Triple Store."
35

More Related Content

What's hot

Semantic Web related top conference review
Semantic Web related top conference reviewSemantic Web related top conference review
Semantic Web related top conference reviewGong Cheng
 
Ephedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federationEphedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federationPeter Haase
 
Analytics and Access to the UK web archive
Analytics and Access to the UK web archiveAnalytics and Access to the UK web archive
Analytics and Access to the UK web archiveLewis Crawford
 
QB'er demonstration
QB'er demonstrationQB'er demonstration
QB'er demonstrationCLARIAH
 
Talis Platform: A Linked Data Engine
Talis Platform: A Linked Data EngineTalis Platform: A Linked Data Engine
Talis Platform: A Linked Data EngineLeigh Dodds
 
Mining a Large Web Corpus
Mining a Large Web CorpusMining a Large Web Corpus
Mining a Large Web CorpusRobert Meusel
 
COOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for WikidataCOOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for WikidataFariz Darari
 
Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data Asuncion Gomez-Perez
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Fabrizio Orlandi
 
Rdf and open linked data a first approach
Rdf and open linked data a first approach Rdf and open linked data a first approach
Rdf and open linked data a first approach @CULT Srl
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsdgarijo
 
DataTables view CKAN monthly live
DataTables view   CKAN monthly liveDataTables view   CKAN monthly live
DataTables view CKAN monthly liveJoel Natividad
 
Do it on your own - From 3 to 5 Star Linked Open Data with RMLio
Do it on your own - From 3 to 5 Star Linked Open Data with RMLioDo it on your own - From 3 to 5 Star Linked Open Data with RMLio
Do it on your own - From 3 to 5 Star Linked Open Data with RMLioOpen Knowledge Belgium
 
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...Robert Meusel
 
Let your data shine... with OpenRefine
Let your data shine... with OpenRefineLet your data shine... with OpenRefine
Let your data shine... with OpenRefineOpen Knowledge Belgium
 
ODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer NatureODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer NatureMichele Pasin
 
Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Fabrizio Orlandi
 

What's hot (20)

Semantic Web related top conference review
Semantic Web related top conference reviewSemantic Web related top conference review
Semantic Web related top conference review
 
Ephedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federationEphedra: efficiently combining RDF data and services using SPARQL federation
Ephedra: efficiently combining RDF data and services using SPARQL federation
 
Analytics and Access to the UK web archive
Analytics and Access to the UK web archiveAnalytics and Access to the UK web archive
Analytics and Access to the UK web archive
 
QB'er demonstration
QB'er demonstrationQB'er demonstration
QB'er demonstration
 
Wikidata
WikidataWikidata
Wikidata
 
Talis Platform: A Linked Data Engine
Talis Platform: A Linked Data EngineTalis Platform: A Linked Data Engine
Talis Platform: A Linked Data Engine
 
Mining a Large Web Corpus
Mining a Large Web CorpusMining a Large Web Corpus
Mining a Large Web Corpus
 
COOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for WikidataCOOL-WD: A Completeness Tool for Wikidata
COOL-WD: A Completeness Tool for Wikidata
 
Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data
 
Pandas
PandasPandas
Pandas
 
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
Benchmarking RDF Metadata Representations: Reification, Singleton Property an...
 
Rdf and open linked data a first approach
Rdf and open linked data a first approach Rdf and open linked data a first approach
Rdf and open linked data a first approach
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologists
 
DataTables view CKAN monthly live
DataTables view   CKAN monthly liveDataTables view   CKAN monthly live
DataTables view CKAN monthly live
 
Linked Data
Linked DataLinked Data
Linked Data
 
Do it on your own - From 3 to 5 Star Linked Open Data with RMLio
Do it on your own - From 3 to 5 Star Linked Open Data with RMLioDo it on your own - From 3 to 5 Star Linked Open Data with RMLio
Do it on your own - From 3 to 5 Star Linked Open Data with RMLio
 
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
A Web-scale Study of the Adoption and Evolution of the schema.org Vocabulary ...
 
Let your data shine... with OpenRefine
Let your data shine... with OpenRefineLet your data shine... with OpenRefine
Let your data shine... with OpenRefine
 
ODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer NatureODI Summit 2016 - Linked Open Data at Springer Nature
ODI Summit 2016 - Linked Open Data at Springer Nature
 
Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021Beyond 2022 project presentation 2021
Beyond 2022 project presentation 2021
 

Similar to Scalable Web Data Management using RDF

No sql – rise of the clusters
No sql – rise of the clustersNo sql – rise of the clusters
No sql – rise of the clustersresponseteam
 
Selecting the right database type for your knowledge management needs.
Selecting the right database type for your knowledge management needs.Selecting the right database type for your knowledge management needs.
Selecting the right database type for your knowledge management needs.Synaptica, LLC
 
NO SQL Databases, Big Data and the cloud
NO SQL Databases, Big Data and the cloudNO SQL Databases, Big Data and the cloud
NO SQL Databases, Big Data and the cloudManu Cohen-Yashar
 
The web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedThe web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedSören Auer
 
NoSQL_Databases
NoSQL_DatabasesNoSQL_Databases
NoSQL_DatabasesRick Perry
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and UsesSuvradeep Rudra
 
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4JOUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4Jijcsity
 
Introducción a NoSQL
Introducción a NoSQLIntroducción a NoSQL
Introducción a NoSQLMongoDB
 
Graph database in sv meetup
Graph database in sv meetupGraph database in sv meetup
Graph database in sv meetupJoshua Bae
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011jexp
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sqlRam kumar
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGijiert bestjournal
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 

Similar to Scalable Web Data Management using RDF (20)

NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
No sql – rise of the clusters
No sql – rise of the clustersNo sql – rise of the clusters
No sql – rise of the clusters
 
Selecting the right database type for your knowledge management needs.
Selecting the right database type for your knowledge management needs.Selecting the right database type for your knowledge management needs.
Selecting the right database type for your knowledge management needs.
 
NO SQL Databases, Big Data and the cloud
NO SQL Databases, Big Data and the cloudNO SQL Databases, Big Data and the cloud
NO SQL Databases, Big Data and the cloud
 
The web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedThe web of interlinked data and knowledge stripped
The web of interlinked data and knowledge stripped
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
NoSQL_Databases
NoSQL_DatabasesNoSQL_Databases
NoSQL_Databases
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4JOUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
OUTCOME ANALYSIS IN ACADEMIC INSTITUTIONS USING NEO4J
 
Introducción a NoSQL
Introducción a NoSQLIntroducción a NoSQL
Introducción a NoSQL
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
Oslo baksia2014
Oslo baksia2014Oslo baksia2014
Oslo baksia2014
 
Graph database in sv meetup
Graph database in sv meetupGraph database in sv meetup
Graph database in sv meetup
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011
 
Non relational databases-no sql
Non relational databases-no sqlNon relational databases-no sql
Non relational databases-no sql
 
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMINGEVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING
 
Nosql
NosqlNosql
Nosql
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
 
Nosql
NosqlNosql
Nosql
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 

Recently uploaded

Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 

Recently uploaded (20)

Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 

Scalable Web Data Management using RDF

  • 1. Web Data Management Advanced Database Presentation By: Navid Sedighpour Professor : Dr. Alireza Bagheri Nevember 2015 1
  • 2. Interest Lack of schema Data is unstructured or at best “semi-structured” Missing data, additional attributes, similar data but not identical Volatility May confirm to one schema now, but not later Scale How to capture everything? Querying Difficulty What is the user language? What are the primitives? Aren’t Search Engines sufficient? 2
  • 3. Fusion Tables Users contribute data in spreadsheet Possible joins between multiple data sets Extensive visualization 3 More Recent Approaches to Web Querying
  • 4. More Recent Approaches to Web Querying XML Data exchange language Tree based structure 4
  • 5. More Recent Approaches to Web Querying RDF W3C Recommendation Simple, self-descriptive model 5
  • 6. RDF Data Volumes 90% of world's data generated over last two years Data are growing fast Size almost doubling every year 6
  • 7. RDF Data Volumes March 2009 – 89 Datasets 7
  • 8. RDF Data Volumes September 2010 – 203 datasets 8
  • 9. RDF Data Volumes September 2011 – 295 Datasets 9
  • 10. RDF Data Volumes 10 April 2014 – 1091 Datasets
  • 11. RDF Introduction Everything is an uniquely named resource Prefixes can be used to shorten names Properties of resources can be defined Relationships with other resources can be defined Resource description can be contributed by different people/groups and can be located anywhere in the web Integrated web “database” 11
  • 12. RDF Data Model Triple : Subject, Predicate (Property) , Object Subject : The entity that is described (URI or Blank Node) Predicate : a feature of the entity Object : value of the feature Set of RDF Triples is called “RDF Graph” 12
  • 16. Naïve Triple Store Design 16
  • 17. 17 Naïve Triple Store Design Easy to Implement But Too Many self-joins
  • 18. Property Tables Grouping by Entities Types : Clustered Property Tables Property Class Tables 18
  • 19. Clustered Property Tables Group together the properties that tend to occur in the same (or similar) subjects 19
  • 20. Property Class Tables Cluster the subjects with the same type of property into one property table 20
  • 21. Property Tables Advantages : Fewer Joins Disadvantages : Lots of NULLs Clustering is not trivial Multi-valued properties are complicated 21
  • 22. Binary Tables Grouping by Properties: for each property build a two column table containing both subject and object, ordered by subjects Also called “Vertically Partitioned Approach” N two column tables (n is the number of unique properties in the data) 22
  • 23. Binary Tables Advantages : Support multi-valued Properties No NULLs No Clustering Good performance for subject-subject joins Disadvantages: Not useful for subject-subject joins Expensive inserts 23
  • 24. Graph-Based Approach Answering SPARQL query = Subgraph Matching gStore 24
  • 25. Two steps need to be done : 1. For each node of Q* get the lists of nodes in G* that include that node 2. Do a multi-way join to get the candidate list Alternatives : Sequential scan of G*  Both steps are inefficient S-Tree  Height Balanced Tree over signatures  Run an inclusion query for each node of Q* and get lists of nodes in G* that include that node (q & s = q) VS-Tree  Support both steps efficiently  Grouping by vertices 25 Graph-Based Approach
  • 33. Conclusion RDF Data seem to have considerable promise for web data management We talked about four approaches to web data management including Naïve triple store design, Property Tables, Binary Tables and Graph-Based approach VS-Tree has the best performance in Graph-Base approaches gStore is more efficient than other approaches 33
  • 34. References 34 [1] D. J. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach, "Scalable semantic web data management using vertical partitioning," in Proceedings of the 33rd international conference on Very large data bases, 2007, pp. 411-422. [2] L. Zou, J. Mo, L. Chen, M. T. Özsu, and D. Zhao, "gStore: answering SPARQL queries via subgraph matching," Proceedings of the VLDB Endowment, vol. 4, pp. 482-493, 2011. [3] L. Zou, M. T. Özsu, L. Chen, X. Shen, R. Huang, and D. Zhao, "gStore: a graph-based SPARQL query engine," The VLDB Journal—The International Journal on Very Large Data Bases, vol. 23, pp. 565- 590, 2014. [4] X. Shen, L. Zou, M. T. Ozsu, L. Chen, Y. Li, S. Han, et al., "A Graph-based RDF Triple Store."
  • 35. 35