An Approach for the Incremental Export of Relational Databases into RDF Graphs
N. Konstantinou, D.-E. Spanos, D. Kouis, N. Mitrou
National Technical University of Athens
13th Hellenic Data Management Symposium (HDMS'15)
About this work
 Started as a project in 2012 in the National Documentation Centre, to
offer Linked Data views over its contents
 Evolved as a standards-compliant open-source tool
 First results presented in IC-ININFO’13 and MTSR’13
 The journal paper presenting the software used was awarded an Outstanding Paper Award
 Latest results presented in WIMS’14
 Revised and extended version in a special issue in IJAIT (2015)
13th Hellenic Data Management Symposium (HDMS'15) 2
Outline
 Introduction
 Background
 Proposed Approach
 Measurements
 Conclusions
13th Hellenic Data Management Symposium (HDMS'15) 3
Introduction
 Information collection, maintenance, and updates do not always take place directly at a triplestore, but at an RDBMS
 It can be difficult to change established methodologies and systems
 Especially in less frequently changing environments, e.g. libraries
 Triplestores are often kept as an alternative content delivery channel
 Newer technologies need to operate side by side with existing ones before migration
13th Hellenic Data Management Symposium (HDMS'15) 4
Mapping Relational Data to RDF
 Synchronous or Asynchronous RDF Views
 Real-time SPARQL-to-SQL or Querying the RDF dump using SPARQL
 Queries on the RDF dump are, under certain conditions, faster than round-trips to the database
 The performance difference is more visible when SPARQL queries involve numerous triple patterns (which translate to expensive JOIN statements)
 In this paper, we focus on the asynchronous approach
 Exporting (dumping) relational database contents into an RDF graph
13th Hellenic Data Management Symposium (HDMS'15) 5
Incremental Export into RDF (1/2)
 Problem
 Avoid dumping the whole database contents every time
 When only a small part of the data changes in the source database, it is not necessary to dump the entire database
 Approach
 Every time the RDF export is materialized
 Detect the changes in the source database or the mapping definition
 Insert/delete/update only the necessary triples, in order to reflect these changes in the
resulting RDF graph
13th Hellenic Data Management Symposium (HDMS'15) 6
Incremental Export into RDF (2/2)
 Incremental transformation
 Each time the transformation is executed, only the part of the database that changed should be transformed into RDF
 Incremental storage
 Storing (persisting) to the destination RDF graph only the triples that were modified
and not the whole graph
 Possible only when the resulting RDF graph is stored in a relational database or
using Jena TDB
13th Hellenic Data Management Symposium (HDMS'15) 7
Outline
 Introduction
 Background
 Proposed Approach
 Measurements
 Conclusions
13th Hellenic Data Management Symposium (HDMS'15) 8
Reification in RDF
13th Hellenic Data Management Symposium (HDMS'15) 9
[Figure: the triple <http://data.example.org/repository/person/1> foaf:name "John Smith" drawn as a graph, next to its reified form: a blank node typed rdf:Statement whose rdf:subject, rdf:predicate, and rdf:object point to the original subject, predicate, and object]
<http://data.example.org/repository/person/1>
foaf:name "John Smith" .
becomes
[] a rdf:Statement ;
rdf:subject
<http://data.example.org/repository/person/1> ;
rdf:predicate foaf:name ;
rdf:object "John Smith" ;
dc:source map:persons .
Reification in RDF
13th Hellenic Data Management Symposium (HDMS'15) 10
[Figure: the same triple and its reified form as on the previous slide, with the blank node additionally linked via dc:source to map:persons]
<http://data.example.org/repository/person/1>
foaf:name "John Smith" .
becomes
[] a rdf:Statement ;
rdf:subject
<http://data.example.org/repository/person/1> ;
rdf:predicate foaf:name ;
rdf:object "John Smith" ;
dc:source map:persons .
 Ability to annotate every triple
 E.g. the mapping definition that produced it
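As a minimal sketch (not part of the original slides), the annotated reified statement above could be produced with Apache Jena, the RDF framework the R2RML Parser builds on; the full mapping URI used below for map:persons is an illustrative assumption.

import org.apache.jena.rdf.model.*;
import org.apache.jena.vocabulary.DC;

public class ReificationSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        Resource person = model.createResource("http://data.example.org/repository/person/1");
        Property foafName = model.createProperty("http://xmlns.com/foaf/0.1/name");

        // The plain triple: <.../person/1> foaf:name "John Smith" .
        Statement stmt = model.createStatement(person, foafName, "John Smith");
        model.add(stmt);

        // Reify the triple and annotate it with the Triples Map that produced it
        // (assumed URI standing in for map:persons).
        ReifiedStatement reified = stmt.createReifiedStatement();
        reified.addProperty(DC.source, model.createResource("http://data.example.org/mapping#persons"));

        model.write(System.out, "TURTLE");
    }
}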
R2RML
 RDB to RDF Mapping Language
 A W3C Recommendation, as of 2012
 Mapping documents contain sets of Triples Maps
13th Hellenic Data Management Symposium (HDMS'15) 11
[Figure: the R2RML vocabulary: a TriplesMap combines a LogicalTable, a SubjectMap, and PredicateObjectMaps (each holding a PredicateMap and an ObjectMap or a RefObjectMap with a Join), optionally GraphMaps; executing the Triples Maps yields the Generated Triples of the Generated Output Dataset]
Triples Maps in R2RML (1)
 Reusable mapping definitions
 Specify a rule for translating each row of a
logical table to zero or more RDF triples
 A logical table is a tabular SQL query result
set that is to be mapped to RDF triples
 Execution of a triples map generates the
triples that originate from the specific
result set
13th Hellenic Data Management Symposium (HDMS'15) 12
Triples Maps in R2RML (2)
 An example
13th Hellenic Data Management Symposium (HDMS'15)
map:persons
rr:logicalTable [ rr:tableName '"eperson"'; ];
rr:subjectMap [
rr:template 'http://data.example.org/repository/person/{"eperson_id"}';
rr:class foaf:Person; ];
rr:predicateObjectMap [
rr:predicate foaf:name;
rr:objectMap [ rr:template '{"firstname"} {"lastname"}' ;
rr:termType rr:Literal; ] ].
13
An R2RML Mapping Example
@prefix map: <#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
map:persons-groups
rr:logicalTable [ rr:tableName '"epersongroup2eperson"'; ];
rr:subjectMap [
rr:template 'http://data.example.org/repository/group/{"eperson_group_id"}';
];
rr:predicateObjectMap [
rr:predicate foaf:member;
rr:objectMap [ rr:template 'http://data.example.org/repository/person/{"eperson_id"}';
rr:termType rr:IRI; ] ].
14
<http://data.example.org/repository/group/1> foaf:member
<http://data.example.org/repository/person/1> ,
<http://data.example.org/repository/person/2> ,
<http://data.example.org/repository/person/3> ,
<http://data.example.org/repository/person/4> ,
<http://data.example.org/repository/person/5> ,
<http://data.example.org/repository/person/6> .
Table: epersongroup2eperson
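Since an R2RML mapping document is itself RDF, it can be loaded and inspected with plain Apache Jena. The following sketch is an illustration, not the tool's code, and the file name is assumed; it lists each Triples Map together with the table name of its logical table.

import org.apache.jena.rdf.model.*;
import org.apache.jena.riot.RDFDataMgr;

public class ListTriplesMapsSketch {
    public static void main(String[] args) {
        // Load the mapping document, e.g. the Turtle shown on this slide.
        Model mapping = RDFDataMgr.loadModel("mapping.ttl");
        Property logicalTable = mapping.createProperty("http://www.w3.org/ns/r2rml#logicalTable");
        Property tableName = mapping.createProperty("http://www.w3.org/ns/r2rml#tableName");

        StmtIterator it = mapping.listStatements(null, logicalTable, (RDFNode) null);
        while (it.hasNext()) {
            Statement s = it.next();
            Statement tn = s.getResource().getProperty(tableName);
            System.out.println(s.getSubject() + " -> "
                    + (tn != null ? tn.getString() : "(SQL query via rr:sqlQuery)"));
        }
    }
}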
Outline
 Introduction
 Background
 Proposed Approach
 Measurements
 Conclusions
13th Hellenic Data Management Symposium (HDMS'15) 15
The R2RML Parser tool
 An R2RML implementation
 Command-line tool that can export relational database contents as RDF
graphs, based on an R2RML mapping document
 Open-source (CC BY-NC), written in Java
 Publicly available at https://github.com/nkons/r2rml-parser
 Worldwide interest (Ontotext, Abbvie, Financial Times)
 Tested against MySQL, PostgreSQL, and Oracle
 Output can be written in RDF/OWL
 N3, Turtle, N-Triple, TTL, RDF/XML(-ABBREV) notation, or Jena TDB backend
 Covers most (not all) of the R2RML constructs (see the wiki)
 Does not offer SPARQL-to-SQL translations 16
Information Flow (1)
 Parse the source database contents into result sets
 According to the R2RML Mapping File, the Parser generates a set of
instructions to the Generator
 The Generator instantiates in-memory the resulting RDF graph
 Persist the generated RDF graph into
 An RDF file in the Hard Disk, or
 In Jena’s relational database (eventually rendered obsolete), or
 In Jena’s TDB (Tuple Data Base, a custom implementation of B+ trees)
 Log the results
[Figure: R2RML Parser information flow: the mapping file and the source database feed the Parser, which instructs the Generator; the resulting RDF graph is persisted to an RDF file on the hard disk, a target database, or Jena TDB]
13th Hellenic Data Management Symposium (HDMS'15) 17
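As a minimal sketch of the "persist to an RDF file" step, assuming the Generator has already populated a Jena model in memory (the model is left empty here and the file name is illustrative):

import java.io.FileOutputStream;
import java.io.OutputStream;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class DumpToFileSketch {
    public static void main(String[] args) throws Exception {
        // In the tool, this model is filled by the Generator.
        Model graph = ModelFactory.createDefaultModel();
        try (OutputStream out = new FileOutputStream("dump.nt")) {
            // N-TRIPLES avoids the pretty-printing overhead of RDF/XML and Turtle
            // noted later in the conclusions.
            graph.write(out, "N-TRIPLES");
        }
    }
}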
Information Flow (2)
 Overall generation time is the sum of the following:
 t1: Parse mapping document
 t2: Generate Jena model in memory
 t3: Dump model to the destination medium
 t4: Log the results
 In incremental transformation, the log file contains the reified model
 A model that contains only reified statements
 Statements are annotated with the Triples Map URI that produced them
13th Hellenic Data Management Symposium (HDMS'15) 18
Incremental RDF Triple Transformation
 Basic challenge
 Discover, since the last time the incremental RDF
generation took place
 Which database tuples were modified
 Which Triples Maps were modified
 Then, perform the mapping only for this altered subset
 Ideally, we should detect the exact database cells that changed and modify only the elements generated from them in the RDF graph
 However, using R2RML, the atomic unit of the mapping definition is the Triples Map
[Figure: a Triples Map translates a database result set (columns a to e) into a set of (s, p, o) triples]
13th Hellenic Data Management Symposium (HDMS'15)
Incremental transformation
 Possible when the resulting RDF graph is persisted on the hard disk
 The algorithm does not run the entire set of triples maps
 Consult the log file with the output of the last run of the algorithm
 MD5 hashes of the Triples Map definitions, SELECT queries, and the respective query result sets (see the hashing sketch after this slide)
 Perform transformations only on the changed data subset
 I.e. triples maps for which a change was detected
 The resulting RDF graph file is erased and rewritten on the hard disk
 Retrieve unchanged triples from the log file
 The log file contains a set of reified statements, annotated with the source Triples Map definition that produced them
13th Hellenic Data Management Symposium (HDMS'15) 20
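A minimal sketch of the hashing step referenced above, assuming a JDBC connection and a query with a stable row order (hence the ORDER BY mentioned under Future Work); the class and method names are illustrative, not the tool's actual API.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class ResultSetHashSketch {
    // Computes an MD5 hash over a query's result set: if it differs from the hash
    // logged during the previous run, the corresponding Triples Map is re-executed,
    // otherwise its triples are retrieved from the reified log.
    public static String md5OfQuery(Connection conn, String sql) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                for (int i = 1; i <= columns; i++) {
                    String value = rs.getString(i);
                    md.update((value == null ? "" : value).getBytes(StandardCharsets.UTF_8));
                }
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}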
Incremental storage
 Store changes without rewriting the whole graph
 Possible when the resulting graph is persisted in an RDF store
 Jena’s TDB in our case
 The output medium must allow additions/deletions/modifications at the triple level (see the sketch after this slide)
13th Hellenic Data Management Symposium (HDMS'15) 21
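A minimal sketch of triple-level incremental storage with Jena TDB, assuming the caller has already computed which triples were added and which were removed since the last export; the directory path and method name are illustrative.

import org.apache.jena.query.Dataset;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.tdb.TDBFactory;

public class TdbIncrementalStoreSketch {
    public static void applyChanges(String tdbDirectory, Model added, Model removed) {
        Dataset dataset = TDBFactory.createDataset(tdbDirectory);
        try {
            Model graph = dataset.getDefaultModel();
            graph.remove(removed); // drop triples whose source rows changed or disappeared
            graph.add(added);      // insert triples produced by the re-executed Triples Maps
        } finally {
            dataset.close();
        }
    }
}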
Proposed Approach
 For each Triples Map in the Mapping Document
 Decide whether the resulting triples have to be (re)generated, based on the logged MD5 hashes (see the decision sketch after this slide)
 Dumping to the Hard Disk
 Initially, generate the full set of RDF triples that corresponds to the source database
 RDF triples are logged and annotated as reified statements
 Incremental generation
 In subsequent executions, modify the existing reified model, by reflecting only the
changes in the source database
 Dumping to a database or to TDB
 No log is needed, storage is incremental by default
22
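A minimal sketch of the per-Triples-Map decision when dumping to the hard disk; the hash maps and the regenerate/restoreFromLog functions are illustrative assumptions standing in for the tool's internals.

import java.util.Map;
import java.util.function.Function;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class IncrementalExportSketch {
    public static Model export(Map<String, String> previousHashes,
                               Map<String, String> currentHashes,
                               Function<String, Model> regenerate,
                               Function<String, Model> restoreFromLog) {
        Model result = ModelFactory.createDefaultModel();
        for (Map.Entry<String, String> e : currentHashes.entrySet()) {
            String triplesMap = e.getKey();
            boolean unchanged = e.getValue().equals(previousHashes.get(triplesMap));
            // Unchanged Triples Maps are served from the reified log,
            // changed (or new) ones are transformed again from the database.
            result.add(unchanged ? restoreFromLog.apply(triplesMap)
                                 : regenerate.apply(triplesMap));
        }
        return result;
    }
}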
Outline
 Introduction
 Background
 Proposed Approach
 Measurements
 Conclusions
13th Hellenic Data Management Symposium (HDMS'15) 23
Measurements Setup
 An Ubuntu server, 2GHz dual-core, 4GB RAM
 Oracle Java 1.7, Postgresql 9.1, Mysql 5.5.32
 7 DSpace (dspace.org) repositories
 1k, 5k, 10k, 50k, 100k, 500k, 1m items, respectively
 Random data text values (2-50 chars) populating a random number (5-30) of Dublin
Core metadata fields
 A set of SQL queries: complicated, simplified, and simple
 In order to deal with database caching effects, the queries were run several times,
prior to performing the measurements
13th Hellenic Data Management Symposium (HDMS'15) 24
Query Sets
 Complicated
 3 expensive JOIN conditions
among 4 tables
 4 WHERE clauses
 Simplified
 2 JOIN conditions among 3 tables
 2 WHERE clauses
 Simple
 No JOIN or WHERE conditions
Q1 (complicated):
SELECT i.item_id AS item_id, mv.text_value AS text_value
FROM item AS i, metadatavalue AS mv, metadataschemaregistry AS msr, metadatafieldregistry AS mfr
WHERE msr.metadata_schema_id=mfr.metadata_schema_id AND
mfr.metadata_field_id=mv.metadata_field_id AND
mv.text_value is not null AND
i.item_id=mv.item_id AND
msr.namespace='http://dublincore.org/documents/dcmi-terms/' AND
mfr.element='coverage' AND
mfr.qualifier='spatial'

Q2 (simplified):
SELECT i.item_id AS item_id, mv.text_value AS text_value
FROM item AS i, metadatavalue AS mv, metadatafieldregistry AS mfr
WHERE mfr.metadata_field_id=mv.metadata_field_id AND
i.item_id=mv.item_id AND
mfr.element='coverage' AND
mfr.qualifier='spatial'

Q3 (simple):
SELECT "language", "netid", "phone", "sub_frequency", "last_active", "self_registered",
"require_certificate", "can_log_in", "lastname", "firstname", "digest_algorithm",
"salt", "password", "email", "eperson_id"
FROM "eperson" ORDER BY "language"

Q1: 28.32 *
Q2: 21.29 *
Q3: 12.52 *
* Score obtained using PostgreSQL’s EXPLAIN
Measurements Results
 Exporting to an RDF File
 Exporting to a Relational Database
 Exporting to Jena TDB
13th Hellenic Data Management Symposium (HDMS'15) 26
Exporting to an RDF File (1)
 Export to an RDF file
 Simple and complicated queries,
initial export
 Initial incremental dumps take more
time than non-incremental, as the
reified model also has to be created
[Charts: export duration of non-incremental vs. incremental dumps, plotted against source database rows (1,000 to 10,000) and against result model triples (3,000 to 300,000)]
13th Hellenic Data Management Symposium (HDMS'15) 27
Exporting to an RDF File (2)
 12 Triples Maps
Data change scenarios:
n: non-incremental mapping transformation
a: incremental, for the initial time
b: 0/12 (no changes)
c: 1/12
d: 3/12
e: 6/12
f: 9/12
g: 12/12
[Chart: export duration for scenarios n to g, for repositories of 1,000, 5,000, and 10,000 items]
28
[Chart: export duration for scenarios n to g, complicated vs. simpler mappings]
Exporting to a Database and to Jena TDB
 Jena TDB is the optimal approach regarding scalability
[Charts: export duration when exporting to an RDF file, a database, and Jena TDB (scenarios n to g, ~180k triples), and to a database vs. Jena TDB (scenarios a to g, ~1.8m triples)]
29
13th Hellenic Data Management Symposium (HDMS'15)
Outline
 Introduction
 Background
 Proposed Approach
 Measurements
 Conclusions
13th Hellenic Data Management Symposium (HDMS'15) 30
Conclusions (1)
 The approach is efficient when data freshness is not crucial and/or
selection queries over the contents are more frequent than the updates
 The task of exposing database contents as RDF could be considered similar
to the task of maintaining search indexes next to text content
 Third party software systems can operate completely based on the
exported graph
 E.g. using Fuseki, Sesame, Virtuoso
 TDB is the optimal solution regarding scalability
 Caution is still needed in producing de-referenceable URIs
31
Conclusions (2)
 On the efficiency of the approach for storing RDF on the Hard Disk
 Good results for mappings (or queries) that include (or lead to) expensive SQL queries
 E.g. with numerous JOIN statements
 For changes that can affect as much as ¾ of the source data
 Limitations
 By physical memory
 Scales up to several millions of triples, does not qualify as “Big Data”
 Formatting of the logged model did affect performance
 RDF/XML and TTL try to pretty-print the result, consuming extra resources
 N-TRIPLES is optimal
13th Hellenic Data Management Symposium (HDMS'15) 32
Future Work
 Hashing result sets is expensive
 Requires re-running the query and adds an “expensive” ORDER BY clause
 Further study the impact of SQL complexity on the performance
 Investigation of two-way updates
 Send changes from the triplestore back to the database
13th Hellenic Data Management Symposium (HDMS'15) 33
Questions?
13th Hellenic Data Management Symposium (HDMS'15) 34
Weitere ähnliche Inhalte

Was ist angesagt?

Tools for Next Generation of CMS: XML, RDF, & GRDDL
Tools for Next Generation of CMS: XML, RDF, & GRDDLTools for Next Generation of CMS: XML, RDF, & GRDDL
Tools for Next Generation of CMS: XML, RDF, & GRDDLChimezie Ogbuji
 
Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...
Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...
Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...SisInfLab-SWoT @Politecnico di Bari
 
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...Blake Regalia
 
co-Hadoop: Data co-location on Hadoop.
co-Hadoop: Data co-location on Hadoop.co-Hadoop: Data co-location on Hadoop.
co-Hadoop: Data co-location on Hadoop.Yousef Fadila
 
An unsupervised framework for effective indexing of BigData
An unsupervised framework for effective indexing of BigDataAn unsupervised framework for effective indexing of BigData
An unsupervised framework for effective indexing of BigDataRamakrishna Prasad Sakhamuri
 
Topic 6: MapReduce Applications
Topic 6: MapReduce ApplicationsTopic 6: MapReduce Applications
Topic 6: MapReduce ApplicationsZubair Nabi
 
Environment Canada's Data Management Service
Environment Canada's Data Management ServiceEnvironment Canada's Data Management Service
Environment Canada's Data Management ServiceSafe Software
 
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...Kyong-Ha Lee
 
Stratosphere with big_data_analytics
Stratosphere with big_data_analyticsStratosphere with big_data_analytics
Stratosphere with big_data_analyticsAvinash Pandu
 
Bitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query ProcessingBitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query ProcessingKyong-Ha Lee
 
Managing large datasets in R – ff examples and concepts
Managing large datasets in R – ff examples and conceptsManaging large datasets in R – ff examples and concepts
Managing large datasets in R – ff examples and conceptsAjay Ohri
 
Poster GraphQL-LD: Linked Data Querying with GraphQL
Poster GraphQL-LD: Linked Data Querying with GraphQLPoster GraphQL-LD: Linked Data Querying with GraphQL
Poster GraphQL-LD: Linked Data Querying with GraphQLRuben Taelman
 
Applying stratosphere for big data analytics
Applying stratosphere for big data analyticsApplying stratosphere for big data analytics
Applying stratosphere for big data analyticsAvinash Pandu
 
Linked Media and Data Using Apache Marmotta
Linked Media and Data Using Apache MarmottaLinked Media and Data Using Apache Marmotta
Linked Media and Data Using Apache MarmottaSebastian Schaffert
 
Virtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data Sources
Virtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data SourcesVirtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data Sources
Virtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data Sourcesrumito
 
Presentation data collection and gtfs
Presentation data collection and gtfsPresentation data collection and gtfs
Presentation data collection and gtfsLIFE GreenYourMove
 

Was ist angesagt? (20)

Tools for Next Generation of CMS: XML, RDF, & GRDDL
Tools for Next Generation of CMS: XML, RDF, & GRDDLTools for Next Generation of CMS: XML, RDF, & GRDDL
Tools for Next Generation of CMS: XML, RDF, & GRDDL
 
Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...
Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...
Linked Data (in low-resource) Platforms: a mapping for Constrained Applicatio...
 
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
Towards a More Efficient Paradigm of Storing and Querying Spatial Data on the...
 
co-Hadoop: Data co-location on Hadoop.
co-Hadoop: Data co-location on Hadoop.co-Hadoop: Data co-location on Hadoop.
co-Hadoop: Data co-location on Hadoop.
 
An unsupervised framework for effective indexing of BigData
An unsupervised framework for effective indexing of BigDataAn unsupervised framework for effective indexing of BigData
An unsupervised framework for effective indexing of BigData
 
PhD Defense
PhD DefensePhD Defense
PhD Defense
 
Topic 6: MapReduce Applications
Topic 6: MapReduce ApplicationsTopic 6: MapReduce Applications
Topic 6: MapReduce Applications
 
Environment Canada's Data Management Service
Environment Canada's Data Management ServiceEnvironment Canada's Data Management Service
Environment Canada's Data Management Service
 
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
HadoopXML: A Suite for Parallel Processing of Massive XML Data with Multiple ...
 
Stratosphere with big_data_analytics
Stratosphere with big_data_analyticsStratosphere with big_data_analytics
Stratosphere with big_data_analytics
 
Bitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query ProcessingBitmap Indexes for Relational XML Twig Query Processing
Bitmap Indexes for Relational XML Twig Query Processing
 
Summary of HDF-EOS5 Files, Data Model and File Format
Summary of HDF-EOS5 Files, Data Model and File FormatSummary of HDF-EOS5 Files, Data Model and File Format
Summary of HDF-EOS5 Files, Data Model and File Format
 
Managing large datasets in R – ff examples and concepts
Managing large datasets in R – ff examples and conceptsManaging large datasets in R – ff examples and concepts
Managing large datasets in R – ff examples and concepts
 
Poster GraphQL-LD: Linked Data Querying with GraphQL
Poster GraphQL-LD: Linked Data Querying with GraphQLPoster GraphQL-LD: Linked Data Querying with GraphQL
Poster GraphQL-LD: Linked Data Querying with GraphQL
 
Introduction to W3C Linked Data Platform
Introduction to W3C Linked Data PlatformIntroduction to W3C Linked Data Platform
Introduction to W3C Linked Data Platform
 
Applying stratosphere for big data analytics
Applying stratosphere for big data analyticsApplying stratosphere for big data analytics
Applying stratosphere for big data analytics
 
Linked Media and Data Using Apache Marmotta
Linked Media and Data Using Apache MarmottaLinked Media and Data Using Apache Marmotta
Linked Media and Data Using Apache Marmotta
 
Introduction to HDF5 Data Model, Programming Model and Library APIs
Introduction to HDF5 Data Model, Programming Model and Library APIsIntroduction to HDF5 Data Model, Programming Model and Library APIs
Introduction to HDF5 Data Model, Programming Model and Library APIs
 
Virtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data Sources
Virtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data SourcesVirtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data Sources
Virtuoso Sponger - RDFizer Middleware for creating RDF from non RDF Data Sources
 
Presentation data collection and gtfs
Presentation data collection and gtfsPresentation data collection and gtfs
Presentation data collection and gtfs
 

Andere mochten auch

Materializing the Web of Linked Data
Materializing the Web of Linked DataMaterializing the Web of Linked Data
Materializing the Web of Linked DataNikolaos Konstantinou
 
Deploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software ToolsDeploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software ToolsNikolaos Konstantinou
 
Introduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebIntroduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebNikolaos Konstantinou
 
Creating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesCreating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesNikolaos Konstantinou
 
Learning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic ProgrammingLearning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic ProgrammingVrije Universiteit Amsterdam
 
Publishing and Using Linked Data
Publishing and Using Linked DataPublishing and Using Linked Data
Publishing and Using Linked Dataostephens
 
Linked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentLinked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentMartin Kaltenböck
 
Generating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data StreamsGenerating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data StreamsNikolaos Konstantinou
 
Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]Marcia Zeng
 
Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationFaegheh Hasibi
 
From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...Ig Bittencourt
 
Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012Juan Sequeda
 
Introduction to Linked Data 1/5
Introduction to Linked Data 1/5Introduction to Linked Data 1/5
Introduction to Linked Data 1/5Juan Sequeda
 
Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Juan Sequeda
 
Méthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des donnéesMéthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des donnéesFrançois Scharffe
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031kwangsub kim
 
Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010Juan Sequeda
 

Andere mochten auch (20)

Conclusions: Summary and Outlook
Conclusions: Summary and OutlookConclusions: Summary and Outlook
Conclusions: Summary and Outlook
 
Materializing the Web of Linked Data
Materializing the Web of Linked DataMaterializing the Web of Linked Data
Materializing the Web of Linked Data
 
Deploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software ToolsDeploying Linked Open Data: Methodologies and Software Tools
Deploying Linked Open Data: Methodologies and Software Tools
 
Introduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic WebIntroduction: Linked Data and the Semantic Web
Introduction: Linked Data and the Semantic Web
 
Creating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesCreating Linked Data from Relational Databases
Creating Linked Data from Relational Databases
 
Learning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic ProgrammingLearning to assess Linked Data relationships using Genetic Programming
Learning to assess Linked Data relationships using Genetic Programming
 
Publishing and Using Linked Data
Publishing and Using Linked DataPublishing and Using Linked Data
Publishing and Using Linked Data
 
Linked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable developmentLinked Open Data Principles, benefits of LOD for sustainable development
Linked Open Data Principles, benefits of LOD for sustainable development
 
Generating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data StreamsGenerating Linked Data in Real-time from Sensor Data Streams
Generating Linked Data in Real-time from Sensor Data Streams
 
RDF Streams and Continuous SPARQL (C-SPARQL)
RDF Streams and Continuous SPARQL (C-SPARQL)RDF Streams and Continuous SPARQL (C-SPARQL)
RDF Streams and Continuous SPARQL (C-SPARQL)
 
Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]Linking KOS Data [using SKOS and OWL2]
Linking KOS Data [using SKOS and OWL2]
 
Publishing Linked Data from RDB
Publishing Linked Data from RDBPublishing Linked Data from RDB
Publishing Linked Data from RDB
 
Entity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and EvaluationEntity Linking in Queries: Tasks and Evaluation
Entity Linking in Queries: Tasks and Evaluation
 
From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...From Research to Innovation: Linked Open Data and Gamification to Design Inte...
From Research to Innovation: Linked Open Data and Gamification to Design Inte...
 
Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012Linked Data tutorial at Semtech 2012
Linked Data tutorial at Semtech 2012
 
Introduction to Linked Data 1/5
Introduction to Linked Data 1/5Introduction to Linked Data 1/5
Introduction to Linked Data 1/5
 
Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011Consuming Linked Data 4/5 Semtech2011
Consuming Linked Data 4/5 Semtech2011
 
Méthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des donnéesMéthodes et outils pour interrelier le web des données
Méthodes et outils pour interrelier le web des données
 
RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031RDF Tutorial - SPARQL 20091031
RDF Tutorial - SPARQL 20091031
 
Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010Consuming Linked Data SemTech2010
Consuming Linked Data SemTech2010
 

Ähnlich wie An Approach for the Incremental Export of Relational Databases into RDF Graphs

Large scale ETL with Hadoop
Large scale ETL with HadoopLarge scale ETL with Hadoop
Large scale ETL with HadoopOReillyStrata
 
Large Scale ETL with Hadoop
Large Scale ETL with HadoopLarge Scale ETL with Hadoop
Large Scale ETL with HadoopEric Sammer
 
Quadrupling your elephants - RDF and the Hadoop ecosystem
Quadrupling your elephants - RDF and the Hadoop ecosystemQuadrupling your elephants - RDF and the Hadoop ecosystem
Quadrupling your elephants - RDF and the Hadoop ecosystemRob Vesse
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsRobert Grossman
 
dmapply: A functional primitive to express distributed machine learning algor...
dmapply: A functional primitive to express distributed machine learning algor...dmapply: A functional primitive to express distributed machine learning algor...
dmapply: A functional primitive to express distributed machine learning algor...Bikash Chandra Karmokar
 
MAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxMAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxHARIKRISHNANU13
 
Hadoop trainting-in-hyderabad@kelly technologies
Hadoop trainting-in-hyderabad@kelly technologiesHadoop trainting-in-hyderabad@kelly technologies
Hadoop trainting-in-hyderabad@kelly technologiesKelly Technologies
 
Translation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RMLTranslation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RMLFranck Michel
 
Stanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache SparkStanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache SparkReynold Xin
 
Generating Executable Mappings from RDF Data Cube Data Structure Definitions
Generating Executable Mappings from RDF Data Cube Data Structure DefinitionsGenerating Executable Mappings from RDF Data Cube Data Structure Definitions
Generating Executable Mappings from RDF Data Cube Data Structure DefinitionsChristophe Debruyne
 
Hadoop institutes-in-bangalore
Hadoop institutes-in-bangaloreHadoop institutes-in-bangalore
Hadoop institutes-in-bangaloreKelly Technologies
 
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersHDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersXiao Qin
 

Ähnlich wie An Approach for the Incremental Export of Relational Databases into RDF Graphs (20)

Large scale ETL with Hadoop
Large scale ETL with HadoopLarge scale ETL with Hadoop
Large scale ETL with Hadoop
 
Large Scale ETL with Hadoop
Large Scale ETL with HadoopLarge Scale ETL with Hadoop
Large Scale ETL with Hadoop
 
Map reducefunnyslide
Map reducefunnyslideMap reducefunnyslide
Map reducefunnyslide
 
RDD
RDDRDD
RDD
 
Quadrupling your elephants - RDF and the Hadoop ecosystem
Quadrupling your elephants - RDF and the Hadoop ecosystemQuadrupling your elephants - RDF and the Hadoop ecosystem
Quadrupling your elephants - RDF and the Hadoop ecosystem
 
ECSA 2013 (Cuesta)
ECSA 2013 (Cuesta)ECSA 2013 (Cuesta)
ECSA 2013 (Cuesta)
 
Data Science
Data ScienceData Science
Data Science
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data Clouds
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
dmapply: A functional primitive to express distributed machine learning algor...
dmapply: A functional primitive to express distributed machine learning algor...dmapply: A functional primitive to express distributed machine learning algor...
dmapply: A functional primitive to express distributed machine learning algor...
 
MAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptxMAP REDUCE IN DATA SCIENCE.pptx
MAP REDUCE IN DATA SCIENCE.pptx
 
Hadoop trainting-in-hyderabad@kelly technologies
Hadoop trainting-in-hyderabad@kelly technologiesHadoop trainting-in-hyderabad@kelly technologies
Hadoop trainting-in-hyderabad@kelly technologies
 
Translation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RMLTranslation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RML
 
Stanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache SparkStanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache Spark
 
Generating Executable Mappings from RDF Data Cube Data Structure Definitions
Generating Executable Mappings from RDF Data Cube Data Structure DefinitionsGenerating Executable Mappings from RDF Data Cube Data Structure Definitions
Generating Executable Mappings from RDF Data Cube Data Structure Definitions
 
SparkNotes
SparkNotesSparkNotes
SparkNotes
 
Hadoop institutes-in-bangalore
Hadoop institutes-in-bangaloreHadoop institutes-in-bangalore
Hadoop institutes-in-bangalore
 
HDF5 Advanced Topics
HDF5 Advanced TopicsHDF5 Advanced Topics
HDF5 Advanced Topics
 
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop ClustersHDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
HDFS-HC: A Data Placement Module for Heterogeneous Hadoop Clusters
 
Data Analytics using MATLAB and HDF5
Data Analytics using MATLAB and HDF5Data Analytics using MATLAB and HDF5
Data Analytics using MATLAB and HDF5
 

Mehr von Nikolaos Konstantinou

Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Nikolaos Konstantinou
 
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...Nikolaos Konstantinou
 
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματαΔιαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματαNikolaos Konstantinou
 
Publishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web TechnologiesPublishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web TechnologiesNikolaos Konstantinou
 
From Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor NetworksFrom Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor NetworksNikolaos Konstantinou
 
A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...Nikolaos Konstantinou
 
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...Nikolaos Konstantinou
 

Mehr von Nikolaos Konstantinou (8)

Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Exposing Bibliographic Information as Linked Open Data using Standards-based ...
Exposing Bibliographic Information as Linked Open Data using Standards-based ...
 
OR2012 Biblio-transformation-engine
OR2012 Biblio-transformation-engineOR2012 Biblio-transformation-engine
OR2012 Biblio-transformation-engine
 
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
Priamos: A Middleware Architecture for Real-Time Semantic Annotation of Conte...
 
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματαΔιαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
Διαχείριση Ψηφιακού Περιεχομένου με το DSpace: Λειτουργία και τεχνικά ζητήματα
 
Publishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web TechnologiesPublishing Data Using Semantic Web Technologies
Publishing Data Using Semantic Web Technologies
 
From Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor NetworksFrom Sensor Data to Triples: Information Flow in Semantic Sensor Networks
From Sensor Data to Triples: Information Flow in Semantic Sensor Networks
 
A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...A rule-based approach for the real-time semantic annotation in context-aware ...
A rule-based approach for the real-time semantic annotation in context-aware ...
 
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
VisAVis: An Approach to an Intermediate Layer between Ontologies and Relation...
 

Kürzlich hochgeladen

CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 

Kürzlich hochgeladen (20)

CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

An Approach for the Incremental Export of Relational Databases into RDF Graphs

  • 1. An Approach for the Incremental Export of Relational Databases into RDF Graphs N. Konstantinou, D.-E. Spanos, D. Kouis, N. Mitrou National Technical University of Athens 13th Hellenic Data Management Symposium (HDMS'15)
  • 2. About this work  Started as a project in 2012 in the National Documentation Centre, to offer Linked Data views over its contents  Evolved as a standards-compliant open-source tool  First results presented in IC-ININFO’13 and MTSR’13  Journal paper of the presentation of the software used was awarded an Outstanding Paper Award  Latest results presented in WIMS’14  Revised and extended version in a special issue in IJAIT (2015) 13th Hellenic Data Management Symposium (HDMS'15) 2
  • 3. Outline  Introduction  Background  Proposed Approach  Measurements  Conclusions 13th Hellenic Data Management Symposium (HDMS'15) 3
  • 4. Introduction  Information collection, maintenance and update is not always taking place directly at a triplestore, but at a RDBMS  It can be difficult to change established methodologies and systems  Especially in less frequently changing environments, e.g. libraries  Triplestores are often kept as an alternative content delivery channel  Newer technologies need to operate side-by-side to existing ones before migration 13th Hellenic Data Management Symposium (HDMS'15) 4
  • 5. Mapping Relational Data to RDF  Synchronous or Asynchronous RDF Views  Real-time SPARQL-to-SQL or Querying the RDF dump using SPARQL  Queries on the RDF dump are faster in certain conditions, compared to round-trips to the database  Difference in the performance more visible when SPARQL queries involve numerous triple patterns (which translate to expensive JOIN statements)  In this paper, we focus on the asynchronous approach  Exporting (dumping) relational database contents into an RDF graph 13th Hellenic Data Management Symposium (HDMS'15) 5
  • 6. Incremental Export into RDF (1/2)  Problem  Avoid dumping the whole database contents every time  In cases when few data change in the source database, it is not necessary to dump the entire database  Approach  Every time the RDF export is materialized  Detect the changes in the source database or the mapping definition  Insert/delete/update only the necessary triples, in order to reflect these changes in the resulting RDF graph 613th Hellenic Data Management Symposium (HDMS'15)
  • 7. Incremental Export into RDF (2/2)  Incremental transformation  Each time the transformation is executed, only the part in the database that changed should be transformed into RDF  Incremental storage  Storing (persisting) to the destination RDF graph only the triples that were modified and not the whole graph  Possible only when the resulting RDF graph is stored in a relational database or using Jena TDB 713th Hellenic Data Management Symposium (HDMS'15)
  • 8. Outline  Introduction  Background  Proposed Approach  Measurements  Conclusions 13th Hellenic Data Management Symposium (HDMS'15) 8
  • 9. Reification in RDF 13th Hellenic Data Management Symposium (HDMS'15) 9 <http://data.example.or g/repository/person/1> "John Smith" <http://data.example.or g/repository/person/1> foaf:name "John Smith" blank node rdf:Statement foaf:name rdf:type rdf:subject rdf:predicate rdf:object <http://data.example.org/repository/person/1> foaf:name "John Smith" . becomes [] a rdf:Statement ; rdf:subject <http://data.example.org/repository/person/1> ; rdf:predicate foaf:name ; rdf:object "John Smith" ; dc:source map:persons .
  • 10. Reification in RDF 13th Hellenic Data Management Symposium (HDMS'15) 10 <http://data.example.or g/repository/person/1> "John Smith" <http://data.example.or g/repository/person/1> foaf:name "John Smith" blank node rdf:Statement map:persons foaf:name rdf:type rdf:subject rdf:predicate rdf:object dc:source <http://data.example.org/repository/person/1> foaf:name "John Smith" . becomes [] a rdf:Statement ; rdf:subject <http://data.example.org/repository/person/1> ; rdf:predicate foaf:name ; rdf:object "John Smith" ; dc:source map:persons .  Ability to annotate every triple  E.g. the mapping definition that produced it
  • 11. R2RML  RDB to RDF Mapping Language  A W3C Recommendation, as of 2012  Mapping documents contain sets of Triples Maps 13th Hellenic Data Management Symposium (HDMS'15) 11 RefObjectMap TriplesMap PredicateObjectMap SubjectMap Generated Triples Generated Output Dataset GraphMap Join ObjectMap PredicateMap LogicalTable
  • 12. Triples Maps in R2RML (1)  Reusable mapping definitions  Specify a rule for translating each row of a logical table to zero or more RDF triples  A logical table is a tabular SQL query result set that is to be mapped to RDF triples  Execution of a triples map generates the triples that originate from the specific result set 1213th Hellenic Data Management Symposium (HDMS'15)
  • 13. Triples Maps in R2RML (2)  An example 13th Hellenic Data Management Symposium (HDMS'15) map:persons rr:logicalTable [ rr:tableName '"eperson"'; ]; rr:subjectMap [ rr:template 'http://data.example.org/repository/person/{"eperson_id"}'; rr:class foaf:Person; ]; rr:predicateObjectMap [ rr:predicate foaf:name; rr:objectMap [ rr:template '{"firstname"} {"lastname"}' ; rr:termType rr:Literal; ] ]. 13
  • 14. An R2RML Mapping Example 14
@prefix map: <#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
map:persons-groups
    rr:logicalTable [ rr:tableName '"epersongroup2eperson"'; ];
    rr:subjectMap [
        rr:template 'http://data.example.org/repository/group/{"eperson_group_id"}';
    ];
    rr:predicateObjectMap [
        rr:predicate foaf:member;
        rr:objectMap [
            rr:template 'http://data.example.org/repository/person/{"eperson_id"}';
            rr:termType rr:IRI;
        ]
    ].
Applied to table epersongroup2eperson, this produces:
<http://data.example.org/repository/group/1> foaf:member
    <http://data.example.org/repository/person/1> , <http://data.example.org/repository/person/2> ,
    <http://data.example.org/repository/person/3> , <http://data.example.org/repository/person/4> ,
    <http://data.example.org/repository/person/5> , <http://data.example.org/repository/person/6> .
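As a rough illustration of what executing such a Triples Map amounts to (this is not the R2RML Parser's actual code; the JDBC URL and credentials are placeholders), each row of the logical table fills the rr:template placeholders to yield one foaf:member triple:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

public class TriplesMapSketch {
    public static void main(String[] args) throws Exception {
        Model model = ModelFactory.createDefaultModel();
        Property foafMember = model.createProperty("http://xmlns.com/foaf/0.1/", "member");

        // Placeholder connection details; the table name comes from rr:logicalTable above
        try (Connection con = DriverManager.getConnection("jdbc:postgresql://localhost/dspace", "user", "pass");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT eperson_group_id, eperson_id FROM epersongroup2eperson")) {
            while (rs.next()) {
                // Fill the rr:template placeholders with the row's column values
                Resource group = model.createResource(
                        "http://data.example.org/repository/group/" + rs.getInt("eperson_group_id"));
                Resource person = model.createResource(
                        "http://data.example.org/repository/person/" + rs.getInt("eperson_id"));
                model.add(group, foafMember, person);
            }
        }
        model.write(System.out, "TURTLE");
    }
}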
  • 15. Outline  Introduction  Background  Proposed Approach  Measurements  Conclusions 13th Hellenic Data Management Symposium (HDMS'15) 15
  • 16. The R2RML Parser tool  An R2RML implementation  Command-line tool that can export relational database contents as RDF graphs, based on an R2RML mapping document  Open-source (CC BY-NC), written in Java  Publicly available at https://github.com/nkons/r2rml-parser  Worldwide interest (Ontotext, Abbvie, Financial Times)  Tested against MySQL, PostgreSQL, and Oracle  Output can be written in RDF/OWL  N3, Turtle, N-Triple, TTL, RDF/XML(-ABBREV) notation, or Jena TDB backend  Covers most (not all) of the R2RML constructs (see the wiki)  Does not offer SPARQL-to-SQL translations 16
  • 17. Information Flow (1)  Parse the source database contents into result sets  According to the R2RML Mapping File, the Parser generates a set of instructions to the Generator  The Generator instantiates in-memory the resulting RDF graph  Persist the generated RDF graph into  An RDF file in the Hard Disk, or  In Jena’s relational database (eventually rendered obsolete), or  In Jena’s TDB (Tuple Data Base, a custom implementation of B+ trees)  Log the results (diagram: the Source database and the Mapping file feed the Parser, which instructs the Generator; the resulting RDF graph is persisted by the R2RML Parser to the Hard Disk, a Target database, or TDB) 17
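Persisting the in-memory model to an RDF file on the Hard Disk boils down, roughly, to the following sketch (a current Apache Jena release is assumed; the output file name is arbitrary):

import java.io.FileOutputStream;
import java.io.OutputStream;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class DumpToFileSketch {
    public static void main(String[] args) throws Exception {
        Model model = ModelFactory.createDefaultModel();
        // ... the Generator would have populated the model at this point ...

        // Write the model to disk; N-TRIPLES avoids pretty-printing overhead
        try (OutputStream out = new FileOutputStream("dump.nt")) {
            model.write(out, "N-TRIPLES");
        }
    }
}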
  • 18. Information Flow (2)  Overall generation time is the sum of the following:  t1: Parse mapping document  t2: Generate Jena model in memory  t3: Dump model to the destination medium  t4: Log the results  In incremental transformation, the log file contains the reified model  A model that contains only reified statements  Statements are annotated with the Triples Map URI that produced them 18 13th Hellenic Data Management Symposium (HDMS'15)
  • 19. Incremental RDF Triple Transformation  Basic challenge  Discover, since the last time the incremental RDF generation took place  Which database tuples were modified  Which Triples Maps were modified  Then, perform the mapping only for this altered subset  Ideally, we would detect the exact database cells that changed and modify only the RDF graph elements generated from them  However, with R2RML, the smallest unit of the mapping definition is the Triples Map (diagram: a Triples Map transforms a result set with columns a–e into triples of the form s p o) 13th Hellenic Data Management Symposium (HDMS'15)
  • 20. Incremental transformation  Possible when the resulting RDF graph is persisted on the hard disk  The algorithm does not run the entire set of triples maps  Consult the log file with the output of the last run of the algorithm  MD5 hashes of triples map definitions, SELECT queries, and the respective query result sets  Perform transformations only on the changed data subset  I.e. triples maps for which a change was detected  The resulting RDF graph file is erased and rewritten on the hard disk  Unchanged triples are retrieved from the log file  The log file contains a set of reified statements, each annotated with the source Triples Map definition that produced it (see the sketch below) 13th Hellenic Data Management Symposium (HDMS'15) 20
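A minimal sketch of the change-detection step, under the assumption that Triples Map definitions and query result sets are available as strings (the helper names are hypothetical and not part of the tool's API): hash the current inputs, compare against the hashes logged at the previous run, and re-run the transformation only on a mismatch.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ChangeDetectionSketch {

    // MD5 hash of an arbitrary string (a Triples Map definition, a SELECT query, or a serialized result set)
    static String md5(String input) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // A Triples Map must be re-run if either its definition or its query result has changed since the last run
    static boolean needsRerun(String mapDefinition, String resultSetDump,
                              String loggedMapHash, String loggedResultHash) throws Exception {
        return !md5(mapDefinition).equals(loggedMapHash)
            || !md5(resultSetDump).equals(loggedResultHash);
    }
}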
  • 21. Incremental storage  Store changes without rewriting the whole graph  Possible when the resulting graph is persisted in an RDF store  Jena’s TDB in our case  The output medium must allow additions/deletions/modifications at the triples level 13th Hellenic Data Management Symposium (HDMS'15) 21
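For the incremental storage case, here is a sketch against Jena TDB (a current Jena release is assumed; the TDB directory path is arbitrary) showing that individual triples can be added or removed inside a write transaction, without rewriting the whole graph:

import org.apache.jena.query.Dataset;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.tdb.TDBFactory;

public class TdbIncrementalSketch {
    public static void main(String[] args) {
        Dataset dataset = TDBFactory.createDataset("/tmp/r2rml-tdb"); // arbitrary TDB directory
        dataset.begin(ReadWrite.WRITE);
        try {
            Model model = dataset.getDefaultModel();
            Resource group = model.createResource("http://data.example.org/repository/group/1");
            Property foafMember = model.createProperty("http://xmlns.com/foaf/0.1/", "member");
            Resource person = model.createResource("http://data.example.org/repository/person/7");

            // Only the changed triples are touched; the rest of the stored graph stays as it is
            model.add(group, foafMember, person);       // reflect an inserted database row
            // model.remove(group, foafMember, person); // reflect a deleted row

            dataset.commit();
        } finally {
            dataset.end();
        }
    }
}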
  • 22. Proposed Approach  For each Triples Map in the Mapping Document  Decide whether we have to produce the resulting triples, based on the logged MD5 hashes  Dumping to the Hard Disk  Initially, generate the full set of RDF triples that corresponds to the source database  RDF triples are logged and annotated as reified statements  Incremental generation  In subsequent executions, modify the existing reified model, reflecting only the changes in the source database  Dumping to a database or to TDB  No log is needed, storage is incremental by default 22
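Putting the above together, a hedged sketch of the per-Triples-Map decision (hypothetical types and helpers, reusing the md5 helper from the change-detection sketch; these are not the R2RML Parser's actual classes): changed maps are re-transformed, unchanged ones are reassembled from the triples kept in the log.

import java.util.List;
import java.util.Map;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class IncrementalExportSketch {

    // Hypothetical stand-in for one Triples Map: its URI, its definition text, and the triples it produced last time
    static class TriplesMapEntry {
        final String uri;
        final String definition;
        final Model triplesFromLog;
        TriplesMapEntry(String uri, String definition, Model triplesFromLog) {
            this.uri = uri; this.definition = definition; this.triplesFromLog = triplesFromLog;
        }
    }

    // Re-transform only the Triples Maps whose logged hash no longer matches; reuse the rest from the log
    static Model export(List<TriplesMapEntry> triplesMaps, Map<String, String> loggedHashes) throws Exception {
        Model result = ModelFactory.createDefaultModel();
        for (TriplesMapEntry tm : triplesMaps) {
            String currentHash = ChangeDetectionSketch.md5(tm.definition); // helper from the earlier sketch
            if (!currentHash.equals(loggedHashes.get(tm.uri))) {
                result.add(transform(tm));        // changed (or new) Triples Map: run the transformation again
            } else {
                result.add(tm.triplesFromLog);    // unchanged: no database round-trip needed
            }
        }
        return result;
    }

    // Placeholder for the actual row-to-triples transformation (sketched after the mapping example above)
    static Model transform(TriplesMapEntry tm) {
        return ModelFactory.createDefaultModel();
    }
}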
  • 23. Outline  Introduction  Background  Proposed Approach  Measurements  Conclusions 13th Hellenic Data Management Symposium (HDMS'15) 23
  • 24. Measurements Setup  An Ubuntu server, 2GHz dual-core, 4GB RAM  Oracle Java 1.7, PostgreSQL 9.1, MySQL 5.5.32  7 DSpace (dspace.org) repositories  1k, 5k, 10k, 50k, 100k, 500k, 1m items, respectively  Random text values (2-50 chars) populating a random number (5-30) of Dublin Core metadata fields  A set of SQL queries: complicated, simplified, and simple  In order to deal with database caching effects, the queries were run several times prior to performing the measurements 24 13th Hellenic Data Management Symposium (HDMS'15)
  • 25. Query Sets  Complicated  3 expensive JOIN conditions among 4 tables  4 WHERE clauses  Simplified  2 JOIN conditions among 3 tables  2 WHERE clauses  Simple  No JOIN or WHERE conditions
SELECT i.item_id AS item_id, mv.text_value AS text_value
FROM item AS i, metadatavalue AS mv, metadataschemaregistry AS msr, metadatafieldregistry AS mfr
WHERE msr.metadata_schema_id=mfr.metadata_schema_id
  AND mfr.metadata_field_id=mv.metadata_field_id
  AND mv.text_value is not null
  AND i.item_id=mv.item_id
  AND msr.namespace='http://dublincore.org/documents/dcmi-terms/'
  AND mfr.element='coverage'
  AND mfr.qualifier='spatial'
SELECT i.item_id AS item_id, mv.text_value AS text_value
FROM item AS i, metadatavalue AS mv, metadatafieldregistry AS mfr
WHERE mfr.metadata_field_id=mv.metadata_field_id
  AND i.item_id=mv.item_id
  AND mfr.element='coverage'
  AND mfr.qualifier='spatial'
SELECT "language", "netid", "phone", "sub_frequency", "last_active", "self_registered", "require_certificate", "can_log_in", "lastname", "firstname", "digest_algorithm", "salt", "password", "email", "eperson_id"
FROM "eperson" ORDER BY "language"
Q1: 28.32*  Q2: 21.29*  Q3: 12.52*  * Score obtained using PostgreSQL’s EXPLAIN
  • 26. Measurements Results  Exporting to an RDF File  Exporting to a Relational Database  Exporting to Jena TDB 13th Hellenic Data Management Symposium (HDMS'15) 26
  • 27. Exporting to an RDF File (1)  Export to an RDF file  Simple and complicated queries, initial export  Initial incremental dumps take more time than non-incremental, as the reified model also has to be created (charts: export time for non-incremental vs. incremental runs, over 1000–10000 source database rows and over 3000–300000 result model triples) 27 13th Hellenic Data Management Symposium (HDMS'15)
  • 28. Exporting to an RDF File (2)  12 Triples Maps  Data change scenarios: n = non-incremental mapping transformation, a = incremental, for the initial time, b = 0/12 Triples Maps changed (no changes), c = 1/12, d = 3/12, e = 6/12, f = 9/12, g = 12/12 (charts: export time per scenario n–g for 1000, 5000 and 10000 items, and for complicated vs. simpler mappings) 28
  • 29. Exporting to a Database and to Jena TDB  Jena TDB is the optimal approach regarding scalability (charts: export time per scenario for an RDF file, a database and Jena TDB at ~180k triples, and for a database vs. Jena TDB at ~1.8m triples) 29 13th Hellenic Data Management Symposium (HDMS'15)
  • 30. Outline  Introduction  Background  Proposed Approach  Measurements  Conclusions 13th Hellenic Data Management Symposium (HDMS'15) 30
  • 31. Conclusions (1)  The approach is efficient when data freshness is not crucial and/or selection queries over the contents are more frequent than the updates  The task of exposing database contents as RDF could be considered similar to the task of maintaining search indexes next to text content  Third party software systems can operate completely based on the exported graph  E.g. using Fuseki, Sesame, Virtuoso  TDB is the optimal solution regarding scalability  Caution is still needed in producing de-referenceable URIs 31
  • 32. Conclusions (2)  On the efficiency of the approach for storing RDF on the Hard Disk  Good results for mappings (or queries) that include (or lead to) expensive SQL queries  E.g. with numerous JOIN statements  For changes that can affect as much as ¾ of the source data  Limitations  Bounded by physical memory  Scales up to several million triples, does not qualify as “Big Data”  Formatting of the logged model did affect performance  RDF/XML and TTL try to pretty-print the result, consuming extra resources  N-TRIPLES is optimal 32 13th Hellenic Data Management Symposium (HDMS'15)
  • 33. Future Work  Hashing Result sets is expensive  Requires re-run of the query, adds an “expensive” ORDER BY clause  Further study the impact of SQL complexity on the performance  Investigation of two-way updates  Send changes from the triplestore back to the database 33 13th Hellenic Data Management Symposium (HDMS'15)
  • 34. Questions? 34 13th Hellenic Data Management Symposium (HDMS'15)