Enriching scholarly data with metadata enhances the publications’ meaning. Unfortunately, different publishers of overlapping or complementary scholarly data neglect general-purpose solutions for metadata and instead use their own ad-hoc solutions. This leads to duplicate efforts and entails non-negligible implementation and maintenance costs. In this paper, we propose a reusable Linked Data publishing workflow that can be easily adjusted by different data owners to (i) generate and publish Linked Data, and (ii) align scholarly data repositories with enrichments over the publications’ content. As a proof-of-concept, the proposed workflow was applied to the iMinds research institute data warehouse, which was aligned with publications’ content derived from Ghent University’s digital repository. Moreover, we developed a user interface to help lay users with the exploration of the iLastic Linked Data set. Our proposed approach relies on a general-purpose workflow. This way, we manage to reduce the development and maintenance costs and increase the quality of the resulting Linked Data.
iLastic: Linked Data Generation Workflow and User Interface for iMinds Scholarly Data
1. iLastic:
Linked Data Generation
Workflow & User Interface
for iMinds Scholarly Data
SAVE-SD 2017
Anastasia Dimou, Gerald Haesendonck, Martin Vanbrabant,
Laurens De Vocht, Ruben Verborgh, Steven Latré, Erik Mannens
Anastasia.Dimou@ugent.be ● @natadimou
Ghent University – IDLab – imec
2. Publication is archived & published by the
event organizers where it was presented
publisher who publishes the proceedings
authors who co-edited it
organization(s) the authors are affiliated with
3. Publication
Dimou A. et al. (2015)
Assessing & Refining Mappings to RDF to Improve Dataset Quality
In: Arenas M. et al. (eds) The Semantic Web - ISWC 2015
Lecture Notes in Computer Science, vol 9367. Springer, Cham
4. Publication is archived & published by the
event ISWC2015
http://iswc2015.semanticweb.org/sites/iswc2015.semanticweb.org/files/93670111.pdf
5. Publication is archived & published by the
event ISWC2015
publisher LNCS, Springer
https://link.springer.com/chapter/10.1007/978-3-319-25010-6_8
6. Publication is archived & published by the
event ISWC2015
publisher LNCS, Springer
authors (×8)
https://ruben.verborgh.org/publications/dimou_iswc_2015a/
http://jens-lehmann.org/files/2015/iswc_rml_rdfunit.pdf
7. Publication is archived & published by the
event ISWC2015
publisher LNCS, Springer
authors (×8)
organization(s) (×5)
https://biblio.ugent.be/publication/8030828
8. Publication is archived & published 15 times!!
Dimou A. et al. (2015)
Assessing & Refining Mappings to RDF to Improve Dataset Quality
In: Arenas M. et al. (eds) The Semantic Web - ISWC 2015
Lecture Notes in Computer Science, vol 9367. Springer, Cham
9. Publication is published 15 times...
… if all agents publish their scholarly data as Linked (Open) Data
10. Publication is published N times...
… if N agents publish their scholarly data as Linked (Open) Data
Linked (Open) Data is generated in N different ways
12. Semantic Publishing: ad-hoc solutions
different agents own
overlapping or complementary scholarly data
use their own ad-hoc solutions
to generate and publish their own Linked (Open) Data
13. Semantic Publishing: fragmented datasets
different agents own
overlapping or complementary scholarly data
focus on metadata or content, rarely on both
content annotations are rarely published as datasets
14. Semantic Publishing: currently leading to...
duplicate efforts for Linked (Open) Data generation:
(re-)implementing from scratch
non-negligible implementation & maintenance costs
17. How can we
reduce implementation costs
increase Linked Data quality?
18. Semantic Publishing: our approach
general-purpose Linked (Open) Data
generation and publication workflow
adjusted to each agent’s scholarly data
integrates metadata & content annotations
19. Semantic Publishing: iLastic
general-purpose Linked (Open) Data
generation and publication workflow
based on our modular RML tool chain
adjusted to iMinds & Ghent University repository
overlapping and complementary scholarly data
integrates metadata & content annotations
based on the RML tool chain & text enricher alignment
27. iLastic Workflow
RDF generation & publication service
general purpose tool:
distinct mapping rules definition & execution
execution: RML Processor
Enrichment service
https://github.com/RMLio/RML-Processor
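The separation between defining mapping rules and executing them can be sketched in a few lines. This is only an illustration of the idea, not the RML Processor's implementation: the `MappingRule` structure, the `execute` function, and the `example.org` subject template are hypothetical names, and the record reuses the publication title and repository identifier shown earlier in these slides.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class MappingRule:
    """Declarative rule (hypothetical structure): how one record becomes triples."""
    subject_template: str                            # e.g. "http://example.org/publication/{id}"
    predicate_objects: Tuple[Tuple[str, str], ...]   # (predicate IRI, source field) pairs


def literal(value: str) -> str:
    """Serialize a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def execute(rule: MappingRule, records: Iterable[Dict[str, str]]) -> Iterator[Triple]:
    """Generic executor: it knows nothing about the dataset, only about rules."""
    for record in records:
        subject = "<" + rule.subject_template.format(**record) + ">"
        for predicate, source_field in rule.predicate_objects:
            value = record.get(source_field)
            if value:  # skip missing or empty values
                yield (subject, "<" + predicate + ">", literal(value))


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Join triples into an N-Triples document, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Rules are data: a different data owner would swap in their own rule
# and leave the executor untouched.
publication_rule = MappingRule(
    subject_template="http://example.org/publication/{id}",  # hypothetical IRI pattern
    predicate_objects=(("http://purl.org/dc/terms/title", "title"),),
)
records = [{
    "id": "8030828",
    "title": "Assessing & Refining Mappings to RDF to Improve Dataset Quality",
}]
print(to_ntriples(execute(publication_rule, records)))
```

Because the rule is plain data, adjusting the workflow to another repository means editing rules rather than code, which is where the reduction in implementation and maintenance cost is expected to come from.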
29. iLastic Workflow
RDF generation & publication service
general purpose tool:
distinct mapping rules definition & execution
execution: RML Processor
definition
Enrichment service
30. iLastic Workflow
RDF generation & publication service
general purpose tool:
distinct mapping rules definition & execution
execution: RML Processor
definition: RML language
Enrichment service
A. Dimou et al. (2014) RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data. In
Proceedings of the 7th Workshop on Linked Data on the Web (LDOW2014), Seoul, Korea.
http://rml.io
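RML's central idea is that the same kind of mapping rule applies to heterogeneous sources, with a reference formulation (for instance CSV column names or JSONPath expressions) stating how references into the source are interpreted. The sketch below imitates that idea in plain Python under simplifying assumptions: the `ITERATORS` registry, `map_source`, and the `example.org` template are hypothetical, and field access by key stands in for real reference formulations.

```python
import csv
import io
import json
from typing import Callable, Dict, Iterator, Tuple

Record = Dict[str, str]
Triple = Tuple[str, str, str]


def iterate_csv(text: str) -> Iterator[Record]:
    """Yield one record per CSV row, keyed by column name."""
    yield from csv.DictReader(io.StringIO(text))


def iterate_json(text: str) -> Iterator[Record]:
    """Yield one record per object of a top-level JSON array."""
    for item in json.loads(text):
        yield {key: str(value) for key, value in item.items()}


# Hypothetical registry playing the role of a reference formulation:
# it selects how a source is iterated and how fields are looked up.
ITERATORS: Dict[str, Callable[[str], Iterator[Record]]] = {
    "csv": iterate_csv,
    "json": iterate_json,
}


def map_source(fmt: str, text: str, subject_template: str,
               predicate: str, source_field: str) -> Iterator[Triple]:
    """Apply one format-independent rule to a source in the given format."""
    for record in ITERATORS[fmt](text):
        yield (subject_template.format(**record), predicate, record[source_field])


csv_source = "id,name\n1,Anastasia Dimou\n"
json_source = '[{"id": 1, "name": "Anastasia Dimou"}]'
rule = ("http://example.org/person/{id}",   # hypothetical IRI pattern
        "http://xmlns.com/foaf/0.1/name",
        "name")

# Both sources yield the same triples from the same rule.
print(list(map_source("csv", csv_source, *rule)))
print(list(map_source("json", json_source, *rule)))
```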
31. iLastic Workflow
RDF generation & publication service
general purpose tool:
distinct mapping rules definition & execution
execution: RML Processor
definition: RML Editor
Enrichment service
Heyvaert P. et al. (2016) RMLEditor: A Graph-Based Mapping Editor for Linked Data Mappings. In The
Semantic Web. Latest Advances and New Domains. ESWC 2016. LNCS, vol 9678. Springer, Cham
https://www.youtube.com/watch?v=0lPDaghlZoQ
33. iLastic Workflow
RDF generation & publication service
general purpose tool:
execution: RML Processor
definition: RML Editor
validation
Enrichment service
34. iLastic Workflow
RDF generation & publication service
general purpose tool:
execution: RML Processor
definition: RML Editor
validation: RML Validator
Enrichment service
Dimou A. et al. (2015) Assessing and Refining Mappings to RDF to Improve Dataset Quality. In: Arenas M. et
al. (eds) The Semantic Web - ISWC 2015. Lecture Notes in Computer Science, vol 9367. Springer, Cham
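Validating the mapping rules, rather than only the generated dataset, means a mistake is caught once in the rule instead of in every triple it produces. The following is a minimal sketch of that idea and does not reproduce the RML Validator's actual checks: the `Rule` structure, the `validate` function, and the `EXPECTED_RANGES` table are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Rule(NamedTuple):
    """One predicate-object rule (hypothetical structure)."""
    predicate: str
    source_field: str
    term_type: str  # "literal" or "iri": what the rule will generate


# Assumed, simplified schema knowledge: the kind of object each predicate expects.
EXPECTED_RANGES: Dict[str, str] = {
    "http://purl.org/dc/terms/title": "literal",
    "http://purl.org/dc/terms/creator": "iri",
}


def validate(rules: Iterable[Rule], source_fields: Set[str],
             expected_ranges: Dict[str, str]) -> List[str]:
    """Return human-readable problems found in the rules, before any data is mapped."""
    problems: List[str] = []
    for rule in rules:
        if rule.source_field not in source_fields:
            problems.append(
                f"{rule.predicate}: unknown source field '{rule.source_field}'")
        expected = expected_ranges.get(rule.predicate)
        if expected is not None and expected != rule.term_type:
            problems.append(
                f"{rule.predicate}: generates {rule.term_type}, schema expects {expected}")
    return problems


rules = [
    Rule("http://purl.org/dc/terms/title", "title", "literal"),    # fine
    Rule("http://purl.org/dc/terms/creator", "author", "literal"),  # range mismatch
    Rule("http://purl.org/dc/terms/title", "titel", "literal"),    # misspelled field
]
for problem in validate(rules, {"id", "title", "author"}, EXPECTED_RANGES):
    print(problem)
```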
45. iLastic Workflow
RDF generation & publication service
data dumps
Linked Data Fragments
Enrichment service
http://linkeddatafragments.org/
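A Triple Pattern Fragments interface, one kind of Linked Data Fragment, answers only single triple patterns and returns each result as a page of matching triples plus metadata such as a total count and a link to the next page. The in-memory sketch below shows that contract. It is not the Linked Data Fragments server code: the `Fragment` type and `fragment` function are hypothetical, and the count here is exact whereas a real server may return an estimate.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]


class Fragment(NamedTuple):
    """One page of a triple pattern fragment (hypothetical structure)."""
    triples: List[Triple]      # matching triples on this page
    total_count: int           # metadata: number of matches over all pages
    next_page: Optional[int]   # control: following page number, or None on the last page


def fragment(dataset: Sequence[Triple], s: Optional[str] = None,
             p: Optional[str] = None, o: Optional[str] = None,
             page: int = 1, page_size: int = 100) -> Fragment:
    """Answer a single triple pattern; None acts as a variable in that position."""
    matches = [
        t for t in dataset
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]
    start = (page - 1) * page_size
    has_more = start + page_size < len(matches)
    return Fragment(matches[start:start + page_size], len(matches),
                    page + 1 if has_more else None)


# Tiny illustrative dataset with made-up example.org identifiers.
DATASET: List[Triple] = [
    ("ex:pub1", "dcterms:creator", "ex:alice"),
    ("ex:pub2", "dcterms:creator", "ex:alice"),
    ("ex:pub3", "dcterms:creator", "ex:alice"),
    ("ex:pub1", "dcterms:title", "A title"),
]

first = fragment(DATASET, p="dcterms:creator", o="ex:alice", page_size=2)
print(first.total_count, first.next_page, first.triples)
```

Keeping the server to this simple operation moves the work of joining patterns to the client, which is the trade-off this interface makes against a full SPARQL endpoint.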
46. iLastic Workflow
RDF generation & publication service
data dumps
Linked Data Fragments
SPARQL endpoint - Virtuoso
Enrichment service
https://github.com/openlink/virtuoso-opensource
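A SPARQL endpoint accepts queries over HTTP; in the SPARQL 1.1 Protocol a query can be sent with GET by passing it in the `query` parameter. The sketch below only builds and inspects such a request URL and performs no network call. The endpoint address `http://example.org/sparql` is a placeholder, not the iLastic endpoint, and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint address (assumption): replace with a real SPARQL endpoint.
ENDPOINT = "http://example.org/sparql"

QUERY = """\
SELECT ?publication ?title WHERE {
  ?publication <http://purl.org/dc/terms/title> ?title .
}
LIMIT 10"""


def build_query_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def extract_query(url: str) -> str:
    """Recover the SPARQL query from a URL built by build_query_url."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

A client would then issue an HTTP GET on this URL, typically with an `Accept` header such as `application/sparql-results+json` to select the result format.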
48. iLastic Workflow
RDF generation & publication service
data dumps
Linked Data Fragments
SPARQL endpoint - Virtuoso
The DataTank
Enrichment service
http://thedatatank.com/