> LOP – Capturing and Linking
Open Provenance on LOD Cycle
Rogers R. de Mendonça, Jonas F. S. M. De La Cerda, Kelli F. de Cordeiro
Sérgio M. S. da Cruz, Maria Cláudia Cavalcanti, Maria Luiza M. Campos
5th International Workshop on
Semantic Web Information Management
SWIM 2013
New York, USA – June 23, 2013
>Outline
Introduction
– Provenance
– Linked Open Data Lifecycle
An Approach for Linked Open Provenance Capture
– Data Preparation and Transformation Process
– Data Interlinking Process
– Linked Open Provenance Architecture
– Usage Scenario
Conclusion
– Contributions
– Future Works
>Increase of the Web of Data
What about data reliability and quality?
>Provenance
Information about the history of the data:
– Where did the data come from?
– Who designed the publishing process?
– Who executed the publishing process?
– Which operations were applied to the data?
Importance to the Web of Data:
– Support quality and reliability assessment of the published data
>Semantic Web Stack
[Figure: W3C® Semantic Web Stack, annotated with Provenance]
>Linked Open Provenance (LOP)
Provenance data available according to LOD principles:
1. Use URIs as names for things
2. Use HTTP URIs, so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
4. Include links to other URIs, so that they can discover more things
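As a rough illustration of these principles applied to provenance, the sketch below names a data artifact and the process that produced it with HTTP URIs and links them with one RDF statement, serialized as N-Triples. The `opmv:wasGeneratedBy` property appears in the query slides of this deck; the `http://example.org/provenance/` base URI and the identifiers `row-42` and `run-7` are assumptions made up for the example.

```python
# OPMV is a published vocabulary; BASE is an assumed, hypothetical base URI.
OPMV = "http://purl.org/net/opmv/ns#"
BASE = "http://example.org/provenance/"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement whose three terms are URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def describe_artifact(artifact_id: str, process_id: str) -> list[str]:
    """Name the artifact and the process with HTTP URIs and link them."""
    artifact = BASE + "artifact/" + artifact_id
    process = BASE + "process/" + process_id
    return [triple(artifact, OPMV + "wasGeneratedBy", process)]


lines = describe_artifact("row-42", "run-7")
print("\n".join(lines))
```

Publishing such statements behind dereferenceable URIs and a SPARQL endpoint is what would make them "linked open" provenance; the snippet only covers the naming and linking part.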
>Related Works
Ontologies / Vocabularies
– PROV-O (PROV-DM)
http://www.w3.org/TR/prov-o/
– OPMV (OPM)
http://purl.org/net/opmv/ns
– Cogs (ETL)
http://vocab.deri.ie/cogs
– Dublin Core Metadata Terms, FOAF
>Related Works
Use of provenance to support quality and reliability
assessment of published data
– Provenance Information in the Web of Data (HARTIG,
2009)
– Managing the life-cycle of linked data with the LOD2
stack (AUER et al, 2012)
– Linked Data Quality Assessment and Fusion
(MENDES et al, 2012)
Focus on metadata about the source and access of the
data
>Linked Open Data Lifecycle
[Figure: LOD2 lifecycle diagram with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
>Quality Phase
[Figure: LOD2 lifecycle diagram with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Quality: quality assessment
>Interlinking Phase
[Figure: LOD2 lifecycle diagram with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Interlinking: create and maintain links
– Quality: quality assessment
>Extraction Phase
[Figure: LOD2 lifecycle diagram with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Extraction: extract and triplify data
– Interlinking: create and maintain links
– Quality: quality assessment
>Extension of Extraction Phase
[Figure: LOD2 lifecycle diagram with a Preparation phase added to the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Interlinking: create and maintain links
– Quality: quality assessment
>Extension: Preparation Before Triplification
[Figure: LOD2 lifecycle diagram with a Preparation phase added to the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
– Extraction and Preparation: extract, prepare and triplify data
– Interlinking: create and maintain links
– Quality: quality assessment
>Data Publishing and Interlinking Process
Extraction Phase
>Data Preparation and Transformation Process
[Figure: Heterogeneous Data Sources feed the Data Preparation and Transformation Process: Extract, Clean, Conform, Pre-Integrate, Triplify]
ETL (Extraction-Transformation-Loading) approach:
– Foundation of DW systems
– Its techniques and tools have been developed and refined over many years in challenging BI scenarios
– It is very advantageous to inherit the potential of these techniques and tools to publish LOD and LOP
>Data Preparation and Transformation Process
[Figure: Heterogeneous Data Sources feed the Data Preparation and Transformation Process: Extract, Clean, Conform, Pre-Integrate, Triplify]
Use of a workflow to have:
– Systematization of the publishing process
– Monitoring and management of the several tasks
– Facilities for reusing the process
Pentaho Data Integration (a.k.a. Kettle)
– Open source, large community of users, extensible
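The Extract, Clean, Conform, Pre-Integrate and Triplify steps can be pictured as a chain of small functions. The sketch below is plain Python, not Kettle, and makes several assumptions that are not in the slides: the sample rows, the meaning given to each step (for instance, Pre-Integrate is taken to mean assigning the URI a row will be published under), and the `http://example.org/resource/` URI pattern.

```python
Row = dict[str, str]


def extract() -> list[Row]:
    # Stand-in for reading heterogeneous sources; these rows are made up.
    return [{"id": "1", "name": "  alice "}, {"id": "2", "name": ""}]


def clean(rows: list[Row]) -> list[Row]:
    # Strip stray whitespace and drop rows that have no name.
    stripped = [{key: value.strip() for key, value in row.items()} for row in rows]
    return [row for row in stripped if row["name"]]


def conform(rows: list[Row]) -> list[Row]:
    # Bring names to one shared convention (title case, as an assumption).
    return [{**row, "name": row["name"].title()} for row in rows]


def pre_integrate(rows: list[Row]) -> list[Row]:
    # Attach the URI each row will be published under (hypothetical pattern).
    return [{**row, "uri": f"http://example.org/resource/{row['id']}"} for row in rows]


def triplify(rows: list[Row]) -> list[str]:
    # Emit one foaf:name statement per row as an N-Triples line.
    return [f'<{row["uri"]}> <http://xmlns.com/foaf/0.1/name> "{row["name"]}" .' for row in rows]


def run_pipeline() -> list[str]:
    rows = extract()
    for step in (clean, conform, pre_integrate):
        rows = step(rows)
    return triplify(rows)


triples = run_pipeline()
print(triples)
```

A workflow tool adds what this chain lacks: monitoring, management of the individual tasks, and reuse of the process definition.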
>Data Publishing and Interlinking Process
Extraction Phase
Interlinking Phase
>Data Interlinking Process
[Figure: Data Interlinking Process with the steps Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator]
– Web Data Access: extracts data from its original sources
– Schema Mappings: matches corresponding terms of multiple vocabularies
– Identity Resolution: finds and links similar resources on different datasets
– Quality Evaluator: evaluates data quality based on a set of rules
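To make the identity resolution step more concrete, here is a minimal sketch that links resources from two datasets when their labels are similar enough. It is an illustration only: the slides do not say which matching technique or link property is used, so the string similarity from Python's `difflib`, the `owl:sameAs` link, the 0.9 threshold and the example URIs are all assumptions.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two labels, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_resources(
    left: dict[str, str], right: dict[str, str], threshold: float = 0.9
) -> list[tuple[str, str, str]]:
    """Return (subject, predicate, object) links for label pairs above the threshold."""
    links = []
    for left_uri, left_label in left.items():
        for right_uri, right_label in right.items():
            if similarity(left_label, right_label) >= threshold:
                links.append((left_uri, OWL_SAME_AS, right_uri))
    return links


# Hypothetical resources from two datasets, keyed by URI.
dataset_a = {"http://example.org/a/1": "Linked Data BR"}
dataset_b = {
    "http://example.org/b/9": "linked data br",
    "http://example.org/b/10": "Social Networks",
}
links = link_resources(dataset_a, dataset_b)
print(links)
```

The threshold is exactly the kind of parameter whose value is worth recording as provenance, since changing it changes which links get published.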
>Provenance Opportunity
[Figure: Heterogeneous Data Sources feed the Data Preparation and Transformation Process (Extract, Clean, Conform, Pre-Integrate, Triplify), followed by the Data Interlinking Process (Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator)]
All steps need heavy parameterization and produce a lot of results
– The parameter values and techniques employed, as well as the results obtained, are all provenance data
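One way to picture this opportunity is a wrapper that records, for every step that runs, which parameters were passed and how many rows went in and came out. The sketch below is an assumption-laden illustration in plain Python, not the mechanism used by the authors: the `capture_provenance` decorator, the `provenance_log` list and the `drop_short_names` step are hypothetical names.

```python
import functools
from datetime import datetime, timezone

# In-memory provenance log; a real setup would persist these records.
provenance_log: list[dict] = []


def capture_provenance(step):
    """Wrap a step so each call appends a provenance record to the log."""

    @functools.wraps(step)
    def wrapper(rows, **params):
        started_at = datetime.now(timezone.utc).isoformat()
        result = step(rows, **params)
        provenance_log.append(
            {
                "step": step.__name__,
                # Only explicitly passed keyword parameters are recorded.
                "parameters": dict(params),
                "rows_in": len(rows),
                "rows_out": len(result),
                "started_at": started_at,
            }
        )
        return result

    return wrapper


@capture_provenance
def drop_short_names(rows, min_length=1):
    # Hypothetical cleaning step: keep rows whose name is long enough.
    return [row for row in rows if len(row["name"]) >= min_length]


kept = drop_short_names([{"name": "Ana"}, {"name": "Al"}], min_length=3)
print(provenance_log[0]["step"], provenance_log[0]["parameters"])
```

The step itself stays unaware of provenance; everything is captured around it, which is what keeps the capture reusable across steps.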
>Linked Open Provenance Architecture
>Data Interlinking Scenarios
>Implementation of PGA
[Figure labels: Provenance Gathering Agent, Provenance Data, Staging Database, RDF Triple, Triple Store]
>Implementation of PGA
The PGA wraps the ETL process and stores provenance in data staging tables, to be further extracted, triplified and loaded into the triple store by other specific steps, developed through the Kettle API and Linked Open Data frameworks
[Figure labels: Provenance Data, Staging Database, RDF Triple, Triple Store]
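The two-stage idea on this slide (stage provenance in relational tables first, triplify it later) can be sketched with an in-memory SQLite database standing in for the staging database. This is not the Kettle-based implementation: the `prov_staging` table, its columns, and the `http://example.org/provenance/` base URI are assumptions, and the triple store load is reduced to producing N-Triples lines.

```python
import sqlite3

OPMV = "http://purl.org/net/opmv/ns#"
BASE = "http://example.org/provenance/"  # assumed base URI


def create_staging(conn: sqlite3.Connection) -> None:
    # Hypothetical staging table: which process generated which artifact.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prov_staging (artifact_id TEXT, process_id TEXT)"
    )


def stage(conn: sqlite3.Connection, artifact_id: str, process_id: str) -> None:
    # Called while the ETL process runs: cheap relational insert, no RDF yet.
    conn.execute("INSERT INTO prov_staging VALUES (?, ?)", (artifact_id, process_id))


def triplify_staging(conn: sqlite3.Connection) -> list[str]:
    # Separate later step: read the staged rows and turn them into triples.
    rows = conn.execute(
        "SELECT artifact_id, process_id FROM prov_staging ORDER BY artifact_id"
    )
    return [
        f"<{BASE}artifact/{artifact}> <{OPMV}wasGeneratedBy> <{BASE}process/{process}> ."
        for artifact, process in rows
    ]


conn = sqlite3.connect(":memory:")
create_staging(conn)
stage(conn, "row-1", "run-7")
stage(conn, "row-2", "run-7")
staged_triples = triplify_staging(conn)
print("\n".join(staged_triples))
```

Separating capture from triplification keeps the overhead inside the ETL run low and lets the RDF modelling change without touching the capture code.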
>Implementation of PGA
[Figure labels: Web Data Access, Schema Mappings, Identity Resolution]
Provenance Gathering Agent was implemented as a web service written in Scala (www.scala-lang.org)
>Use Case Scenario
CNPq = Brazilian governmental organization
responsible for fostering scientific research
RNP = Brazilian governmental organization
that finances research projects
>Use Case Scenario – First Part
>Querying Linked Open Provenance
SELECT ?group_name ?project_name ?researcher_uri ?process_name
FROM NAMED <http://linkgraph.provenance.br>
FROM NAMED <http://datagraph.provenance.br>
FROM NAMED <http://www.cnpq.br>
FROM NAMED <http://lattes.cnpq.br>
WHERE
{
  GRAPH <http://linkgraph.provenance.br> {
    ?row_uri provprop:cnpqResearchGroup ?group_uri .
    ?row_uri provprop:lattesProject ?project_uri .
    ?row_uri provprop:lattesResearcher ?researcher_uri . }
  GRAPH <http://datagraph.provenance.br> {
    ?row_uri opmv:wasGeneratedBy ?process_uri .
    ?process_uri provprop:composition ?process_def_uri .
    ?process_def_uri dcterms:title ?process_name . }
  GRAPH <http://www.cnpq.br> {
    ?group_uri cnpq:project ?project_uri .
    ?group_uri foaf:name ?group_name . }
  GRAPH <http://lattes.cnpq.br> {
    ?project_uri foaf:name ?project_name .
    ?researcher_uri foaf:name ?researcher_name . }
}
Callout: gets research groups, projects and researchers from the data graphs of the domain dataset
Data that were in different data sources of the CNPq organization are now integrated in the Web of Data.
>Querying Linked Open Provenance
SELECT ?group_name ?project_name ?researcher_uri ?process_name
FROM NAMED <http://linkgraph.provenance.br>
FROM NAMED <http://datagraph.provenance.br>
FROM NAMED <http://www.cnpq.br>
FROM NAMED <http://lattes.cnpq.br>
WHERE
{
  GRAPH <http://linkgraph.provenance.br> {
    ?row_uri provprop:cnpqResearchGroup ?group_uri .
    ?row_uri provprop:lattesProject ?project_uri .
    ?row_uri provprop:lattesResearcher ?researcher_uri . }
  GRAPH <http://datagraph.provenance.br> {
    ?row_uri opmv:wasGeneratedBy ?process_uri .
    ?process_uri provprop:composition ?process_def_uri .
    ?process_def_uri dcterms:title ?process_name . }
  GRAPH <http://www.cnpq.br> {
    ?group_uri cnpq:project ?project_uri .
    ?group_uri foaf:name ?group_name . }
  GRAPH <http://lattes.cnpq.br> {
    ?project_uri foaf:name ?project_name .
    ?researcher_uri foaf:name ?researcher_name . }
}
Callout: also gets the integration process from the provenance graphs of the Linked Open Provenance dataset
>Querying Linked Open Provenance
Query result:
Row 1
  group_name: "GRECO - Grupo Engenharia do Conhecimento"@pt
  project_name: "LinkedDataBR - Exposição, compartilhamento e conexão de recursos de dados abertos na Web (Linked Open Data)"@pt
  researcher_uri: http://lattes.cnpq.br/resource/Researcher/K4781460T3
  process_name: "Merge CNPq Research Groups x Lattes Projects"
Row 2
  group_name: "GRECO - Grupo Engenharia do Conhecimento"@pt
  project_name: "Núcleo de Pesquisa de Sistemas Computacionais Complexos para a Gestão de Emergências"@pt
  researcher_uri: http://lattes.cnpq.br/resource/Researcher/K4717449A7
  process_name: "Merge CNPq Research Groups x Lattes Projects"
Row 3
  group_name: "GRECO - Grupo Engenharia do Conhecimento"@pt
  project_name: "Identificação e Análise de Redes Sociais Complexas"@pt
  researcher_uri: http://lattes.cnpq.br/resource/Researcher/K4761314U5
  process_name: "Merge CNPq Research Groups x Lattes Projects"
>Use Case Scenario – Second Part
>Use Case Scenario – Provenance Evaluation
At the end of the execution of both processes, a
SPARQL query could be used to ask: “On which
projects does a researcher work?”
The result would include projects declared in the CNPq
dataset and in the RNP dataset
If the projects returned by CNPq diverge from those returned by RNP, it
is possible to investigate the cause by querying and
evaluating LOP data
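The first half of that investigation, spotting where the two datasets diverge, amounts to a set comparison per researcher. The sketch below assumes the query results have already been loaded into dictionaries that map a researcher to a set of project names; the function name `diverging_projects` and all sample data are made up for the example.

```python
def diverging_projects(
    cnpq: dict[str, set[str]], rnp: dict[str, set[str]]
) -> dict[str, dict[str, set[str]]]:
    """For each researcher, list projects declared in only one of the two datasets."""
    report = {}
    for researcher in sorted(set(cnpq) | set(rnp)):
        cnpq_projects = cnpq.get(researcher, set())
        rnp_projects = rnp.get(researcher, set())
        only_cnpq = cnpq_projects - rnp_projects
        only_rnp = rnp_projects - cnpq_projects
        if only_cnpq or only_rnp:
            report[researcher] = {"only_cnpq": only_cnpq, "only_rnp": only_rnp}
    return report


# Hypothetical query results: researcher identifier -> project names.
cnpq_result = {"researcher-1": {"Project A", "Project B"}, "researcher-2": {"Project C"}}
rnp_result = {"researcher-1": {"Project A"}, "researcher-2": {"Project C"}}

report = diverging_projects(cnpq_result, rnp_result)
print(report)
```

Each divergent project would then be the starting point for a provenance query, for example asking which process generated it and with which parameters.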
>Conclusion - Contributions
New strategy to provide provenance for data and links
of the Web of Data
LOD cycle is extended with a systematic data
preparation and transformation process, supported by
an ETL workflow framework
Provenance data is available according to LOD
principles (Linked Open Provenance)
>Conclusion – Future works
Development of provenance query interface
– Take advantage of LOP and support its exploration
Development / evolution of a provenance ontology
– Today, we are using a combination of vocabularies
Investigation in the area of Big Data
– Fine-grained provenance generates large volumes of
data
>Thank You !
LOP – Capturing and Linking Open
Provenance on LOD Cycle
Rogers R. de Mendonça 1
rogers@ufrj.br
Jonas F. S. M. De La Cerda 2
jonas.ferreira@uniriotec.br
Kelli F. de Cordeiro 1
kelli@ufrj.br
Sérgio M. S. da Cruz 3
serra@ufrrj.br
Maria Cláudia Cavalcanti 2
yoko@ime.eb.br
Maria Luiza M. Campos 1
mluiza@ppgi.ufrj.br
1 Federal University of
Rio de Janeiro - UFRJ
2 Military Institute of
Engineering - IME
3 Federal Rural University
of Rio de Janeiro - UFRRJ

Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 

LOP – Capturing and Linking Open Provenance on LOD Cycle

  • 1. > LOP – Capturing and Linking Open Provenance on LOD Cycle. Rogers R. de Mendonça, Jonas F. S. M. De La Cerda, Kelli F. de Cordeiro, Sérgio M. S. da Cruz, Maria Cláudia Cavalcanti, Maria Luiza M. Campos. 5th International Workshop on Semantic Web Information Management (SWIM 2013), New York, USA – June 23, 2013
  • 2. >Outline Introduction – Provenance – Linked Open Data Lifecycle An Approach for Linked Open Provenance Capture – Data Preparation and Transformation Process – Data Interlinking Process – Linked Open Provenance Architecture – Usage Scenario Conclusion – Contributions – Future Works
  • 3. >Increase of the Web of Data What about data reliability and quality?
  • 4. >Provenance Information about the history of the data: – Where did the data come from? – Who designed the publishing process? – Who executed the publishing process? – Which operations were applied to the data? Importance to the Web of Data: – Support quality and reliability assessment of the published data
  • 6. >Linked Open Provenance (LOP) Provenance data available according to LOD principles: 1. Use URIs as names for things 2. Use HTTP URIs, so that people can look up those names 3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL) 4. Include links to other URIs, so that they can discover more things
  • 7. >Related Works Ontologies / Vocabularies – PROV-O (PROV-DM) http://www.w3.org/TR/prov-o/ – OPMV (OPM) http://purl.org/net/opmv/ns – Cogs (ETL) http://vocab.deri.ie/cogs – Dublin Core Metadata Terms, FOAF
  • 8. >Related Works Use of provenance to support quality and reliability assessment of published data – Provenance Information in the Web of Data (HARTIG, 2009) – Managing the life-cycle of linked data with the LOD2 stack (AUER et al., 2012) – Linked Data Quality Assessment and Fusion (MENDES et al., 2012) Focus on metadata about the source and access of the data
  • 9. >Linked Open Data Lifecycle [Diagram: LOD2 lifecycle with the phases Extraction, Storage, Authoring, Interlinking, Enrichment, Quality, Evolution, Exploration]
  • 11. >Interlinking Phase [Diagram: LOD2 lifecycle] Callouts: create and maintain links; quality assessment
  • 12. >Extraction Phase [Diagram: LOD2 lifecycle] Callouts: create and maintain links; quality assessment; extract and triplify data
  • 13. >Extension of Extraction Phase [Diagram: LOD2 lifecycle with an added Preparation phase] Callouts: create and maintain links; quality assessment
  • 14. >Extension: Preparation Before Triplification [Diagram: LOD2 lifecycle with an added Preparation phase] Callouts: create and maintain links; quality assessment; extract, prepare and triplify data
  • 15. >Data Publishing and Interlinking Process
  • 16. >Data Publishing and Interlinking Process Extraction Phase
  • 17. >Data Preparation and Transformation Process [Diagram: Heterogeneous Data Sources feeding the steps Extract, Clean, Conform, Pre-Integrate, Triplify] ETL (Extraction-Transformation-Loading) approach: – Foundation of DW systems – Its techniques and tools have been developed and refined over many years in challenging BI scenarios – It is very advantageous to inherit the potential of these techniques and tools to publish LOD and LOP
  • 18. >Data Preparation and Transformation Process [Diagram: Heterogeneous Data Sources feeding the steps Extract, Clean, Conform, Pre-Integrate, Triplify] Use of a workflow to have: – Systematization of the publishing process – Monitoring and management of the several tasks – Facilities for reusing the process Pentaho Data Integration (a.k.a. Kettle) – Open source, large community of users, extensible
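The preparation steps named on slides 17 and 18 can be illustrated with a small sketch. This is not the authors' Kettle workflow: it is a minimal Python stand-in, assuming a CSV source, a hypothetical `http://example.org/resource/` namespace and FOAF names as the target vocabulary. The Pre-Integrate step is left out for brevity, and literal escaping is ignored.

```python
import csv
import io

# Hypothetical namespace and sample data; neither comes from the slides.
BASE = "http://example.org/resource/"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
RAW = "id,name\n1,  Alice Silva \n2,bob souza\n2,bob souza\n"


def extract(text):
    """Extract: read rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Clean: trim whitespace and drop exact duplicate rows."""
    seen, out = set(), []
    for row in rows:
        cleaned = {key: value.strip() for key, value in row.items()}
        fingerprint = tuple(sorted(cleaned.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            out.append(cleaned)
    return out


def conform(rows):
    """Conform: bring values to one shared convention (title-case names)."""
    return [{**row, "name": row["name"].title()} for row in rows]


def triplify(rows):
    """Triplify: emit one N-Triples statement per row."""
    return [
        f'<{BASE}person/{row["id"]}> <{FOAF_NAME}> "{row["name"]}" .'
        for row in rows
    ]


def run_pipeline(text):
    """Chain the steps in the order Extract, Clean, Conform, Triplify."""
    return triplify(conform(clean(extract(text))))


triples = run_pipeline(RAW)
for line in triples:
    print(line)
```

Keeping each step as a separate function mirrors the workflow idea on slide 18: every step can be monitored, reused, and later wrapped for provenance capture.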
  • 19. >Data Publishing and Interlinking Process Extraction Phase Interlinking Phase
  • 20. >Data Interlinking Process Components: Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator
  • 21. >Data Interlinking Process Components: Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator. Web Data Access: extracts data from its original sources
  • 22. >Data Interlinking Process Components: Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator. Schema Mappings: matches corresponding terms of multiple vocabularies
  • 23. >Data Interlinking Process Components: Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator. Identity Resolution: finds and links similar resources on different datasets
  • 24. >Data Interlinking Process Components: Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator. Quality Evaluator: evaluates data quality based on a set of rules
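To make the identity resolution and quality evaluation components more concrete, here is a minimal sketch. It assumes label similarity (via Python's `difflib`) as the matching technique and two invented datasets; the slides do not say which matching technique or rules the authors used, so the threshold, URIs and rules below are all hypothetical.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical resources from two datasets, as URI -> label.
DATASET_A = {
    "http://example.org/a/p1": "Linked Data BR",
    "http://example.org/a/p2": "Social Network Analysis",
}
DATASET_B = {
    "http://example.org/b/x9": "LinkedData BR",
    "http://example.org/b/x7": "Emergency Management",
}


def similarity(left, right):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, left.lower(), right.lower()).ratio()


def resolve_identities(a, b, threshold=0.9):
    """Identity resolution: link resources whose labels are similar enough."""
    links = []
    for uri_a, label_a in a.items():
        for uri_b, label_b in b.items():
            score = similarity(label_a, label_b)
            if score >= threshold:
                links.append((uri_a, OWL_SAME_AS, uri_b, score))
    return links


def evaluate_quality(links, rules):
    """Quality evaluation: keep only the links that satisfy every rule."""
    return [link for link in links if all(rule(link) for rule in rules)]


# Hypothetical rules: a link must join two distinct URIs and score at least 0.95.
RULES = [
    lambda link: link[0] != link[2],
    lambda link: link[3] >= 0.95,
]

links = resolve_identities(DATASET_A, DATASET_B)
accepted = evaluate_quality(links, RULES)
```

The `threshold` argument and the rule set are the kind of parameter values that the presentation treats as provenance data, since they determine which links end up published.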
  • 25. >Provenance Opportunity [Diagram: Data Preparation and Transformation Process (Heterogeneous Data Sources, Extract, Clean, Conform, Pre-Integrate, Triplify) and Data Interlinking Process (Web Data Access, Schema Mappings, Identity Resolution, Quality Evaluator)] All steps need heavy parameterization and produce a lot of results – Employed parameter values and techniques as well as results obtained are all provenance data
  • 26. >Linked Open Provenance Architecture
  • 28. >Implementation of PGA [Diagram: Provenance Gathering Agent, Provenance Data, Staging Database, RDF Triple, Triple Store]
  • 29. >Implementation of PGA [Diagram: Provenance Gathering Agent, Provenance Data, Staging Database, RDF Triple, Triple Store] The PGA wraps the ETL process and stores provenance in data staging tables to be further extracted, triplified and loaded to the triple store by other specific steps, developed through the Kettle API and Linked Open Data frameworks
  • 30. >Implementation of PGA [Diagram: Web Data Access, Schema Mappings, Identity Resolution] For the interlinking process, the Provenance Gathering Agent was implemented as a web service written in Scala (www.scala-lang.org)
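The staging idea described on slide 29 can be sketched as follows. This is not the PGA itself, which the slides say was built with the Kettle API and in Scala: it is a Python illustration that assumes an in-memory SQLite table as the staging database and a hypothetical `http://example.org/provenance/` namespace. Only `opmv:Process`, `opmv:wasGeneratedBy` and `dcterms:title` are used from real vocabularies; the table layout and function names are invented.

```python
import sqlite3
from datetime import datetime, timezone

OPMV = "http://purl.org/net/opmv/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"
BASE = "http://example.org/provenance/"  # hypothetical namespace


def open_staging():
    """Create an in-memory staging database with one table of step runs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE step_run ("
        "id INTEGER PRIMARY KEY, step TEXT, params TEXT, "
        "rows_out INTEGER, ended_at TEXT)"
    )
    return conn


def run_with_provenance(conn, step_name, func, rows, **params):
    """Wrap one step: run it, then stage its parameters and result size."""
    result = func(rows, **params)
    conn.execute(
        "INSERT INTO step_run (step, params, rows_out, ended_at) "
        "VALUES (?, ?, ?, ?)",
        (
            step_name,
            repr(sorted(params.items())),
            len(result),
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    return result


def triplify_staging(conn):
    """Turn staged step runs into N-Triples using OPMV terms."""
    triples = []
    query = "SELECT id, step FROM step_run ORDER BY id"
    for run_id, step in conn.execute(query):
        process = f"<{BASE}process/{run_id}>"
        artifact = f"<{BASE}artifact/{run_id}>"
        triples.append(f"{process} <{RDF_TYPE}> <{OPMV}Process> .")
        triples.append(f"{artifact} <{OPMV}wasGeneratedBy> {process} .")
        triples.append(f'{process} <{DCTERMS_TITLE}> "{step}" .')
    return triples


def drop_short(rows, min_len):
    """Example step: keep only values with at least min_len characters."""
    return [row for row in rows if len(row) >= min_len]


conn = open_staging()
kept = run_with_provenance(conn, "Clean", drop_short, ["ab", "abcd", "abcde"], min_len=3)
provenance_triples = triplify_staging(conn)
```

Separating capture (the staging insert) from publication (the triplification) follows the two-stage design on the slide: the wrapped step only pays for a cheap table write, and the RDF generation can run later as its own step.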
  • 32. >Use Case Scenario CNPq = Brazilian governmental organization responsible for fostering scientific research RNP = Brazilian governmental organization that finances research projects
  • 33. >Use Case Scenario – First Part
  • 34. >Use Case Scenario – First Part
  • 35. >Querying Linked Open Provenance
      SELECT ?group_name ?project_name ?researcher_uri ?process_name
      FROM NAMED <http://linkgraph.provenance.br>
      FROM NAMED <http://datagraph.provenance.br>
      FROM NAMED <http://www.cnpq.br>
      FROM NAMED <http://lattes.cnpq.br>
      WHERE {
        GRAPH <http://linkgraph.provenance.br> {
          ?row_uri provprop:cnpqResearchGroup ?group_uri .
          ?row_uri provprop:lattesProject ?project_uri .
          ?row_uri provprop:lattesResearcher ?researcher_uri .
        }
        GRAPH <http://datagraph.provenance.br> {
          ?row_uri opmv:wasGeneratedBy ?process_uri .
          ?process_uri provprop:composition ?process_def_uri .
          ?process_def_uri dcterms:title ?process_name .
        }
        GRAPH <http://www.cnpq.br> {
          ?group_uri cnpq:project ?project_uri .
          ?group_uri foaf:name ?group_name .
        }
        GRAPH <http://lattes.cnpq.br> {
          ?project_uri foaf:name ?project_name .
          ?researcher_uri foaf:name ?researcher_name .
        }
      }
      Callout: gets research groups, projects and researchers from the data graphs of the domain datasets. Data that were in different data sources of the CNPq organization are now integrated in the Web of Data.
  • 36. >Querying Linked Open Provenance Same query as on slide 35. Callout: it also gets the integration process from the provenance graphs of the Linked Open Provenance dataset.
  • 37. >Querying Linked Open Provenance (query results)
      group_name | project_name | researcher_uri | process_name
      "GRECO - Grupo Engenharia do Conhecimento"@pt | "LinkedDataBR - Exposição, compartilhamento e conexão de recursos de dados abertos na Web (Linked Open Data)"@pt | http://lattes.cnpq.br/resource/Researcher/K4781460T3 | "Merge CNPq Research Groups x Lattes Projects"
      "GRECO - Grupo Engenharia do Conhecimento"@pt | "Núcleo de Pesquisa de Sistemas Computacionais Complexos para a Gestão de Emergências"@pt | http://lattes.cnpq.br/resource/Researcher/K4717449A7 | "Merge CNPq Research Groups x Lattes Projects"
      "GRECO - Grupo Engenharia do Conhecimento"@pt | "Identificação e Análise de Redes Sociais Complexas"@pt | http://lattes.cnpq.br/resource/Researcher/K4761314U5 | "Merge CNPq Research Groups x Lattes Projects"
  • 38. >Use Case Scenario – Second Part
  • 39. >Use Case Scenario – Second Part
  • 40. >Use Case Scenario – Provenance Evaluation At the end of the execution of both processes, a SPARQL query could be used to ask: “On which projects does a researcher work?” The result would include projects declared in the CNPq dataset and in the RNP dataset. If the projects returned by CNPq diverge from those returned by RNP, it is possible to investigate the cause by querying and evaluating LOP data
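The divergence check described on slide 40 can be sketched in a few lines. The project lists, the provenance lookup and the process title "Load RNP Projects" are hypothetical stand-ins for SPARQL query results; only the title "Merge CNPq Research Groups x Lattes Projects" appears in the presentation.

```python
# Hypothetical query results: project titles per source dataset.
cnpq_projects = ["LinkedDataBR", "Complex Social Networks"]
rnp_projects = ["LinkedDataBR", "Emergency Management"]

# Hypothetical provenance lookup: project title -> title of the generating process.
generated_by = {
    "Complex Social Networks": "Merge CNPq Research Groups x Lattes Projects",
    "Emergency Management": "Load RNP Projects",
}


def diverging_projects(cnpq, rnp):
    """Return the projects that appear in only one of the two datasets."""
    cnpq_set, rnp_set = set(cnpq), set(rnp)
    return {
        "only_cnpq": sorted(cnpq_set - rnp_set),
        "only_rnp": sorted(rnp_set - cnpq_set),
    }


def processes_to_inspect(diff, provenance):
    """List the processes that generated the diverging records."""
    titles = diff["only_cnpq"] + diff["only_rnp"]
    return sorted({provenance[title] for title in titles if title in provenance})


diff = diverging_projects(cnpq_projects, rnp_projects)
suspects = processes_to_inspect(diff, generated_by)
```

In this sketch the provenance lookup narrows the investigation to the processes that produced the mismatching records, which is the role the slide assigns to LOP data.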
  • 41. >Conclusion – Contributions New strategy to provide provenance for data and links of the Web of Data. The LOD cycle is extended with a systematic data preparation and transformation process, supported by an ETL workflow framework. Provenance data is available according to LOD principles (Linked Open Provenance)
  • 42. >Conclusion – Future works Development of provenance query interface – Take advantage of LOP and support its exploration Development / evolution of a provenance ontology – Today, we are using a combination of vocabularies Investigation in the area of Big Data – Fine-grained provenance generates large volumes of data
  • 43. >Thank You! LOP – Capturing and Linking Open Provenance on LOD Cycle. Rogers R. de Mendonça (1) rogers@ufrj.br; Jonas F. S. M. De La Cerda (2) jonas.ferreira@uniriotec.br; Kelli F. de Cordeiro (1) kelli@ufrj.br; Sérgio M. S. da Cruz (3) serra@ufrrj.br; Maria Cláudia Cavalcanti (2) yoko@ime.eb.br; Maria Luiza M. Campos (1) mluiza@ppgi.ufrj.br. Affiliations: (1) Federal University of Rio de Janeiro - UFRJ; (2) Military Institute of Engineering - IME; (3) Federal Rural University of Rio de Janeiro - UFRRJ