SlideShare ist ein Scribd-Unternehmen logo
1 von 16
Downloaden Sie, um offline zu lesen
Weaving repository contents into the Semantic Web 
Pascal-Nicolas Becker | Technische Universität Berlin | SWIB14 | Bonn, December 1-3, 2014
Digital Repositories 
Source: The Directory of Open Access Repositories, 
http://www.opendoar.org, retrieved June 06, 2014. 
Repositories are systems to safely store 
and publish digital objects and their 
descriptive metadata. 
Not in the meaning of software repositories. 
Examples: 
• Digital archives 
• Institutional repositories (preprints, 
postprints, open access publications, …) 
• Digital image libraries 
• Research data repositories 
• … 
More than 2500 Open Access repositories 
worldwide. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 2
Repository contents are particularly suited 
The data stored in repositories are 
particularly suited to be used in the 
Semantic Web: 
• Metadata already exist in a structured 
form. 
• They do not have to be generated or 
entered manually for publication as 
Linked Data. 
• “Just” convert the data in RDF, add links 
and publish them respecting the Linked 
Data Principles. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 3
xxx.lanl.org / ArXiv.org 
Source: Paul Ginsparg, First Steps Towards Electronic Research Communication. In: Computer in Physics, Vol. 8, No. 4, 1994, pp. 390-396. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 4 
“Although the WorldWideWeb still 
represents only a small fraction of the 
overall usage, this access mode is expected 
to become dominant in the near future.” 
Paul Ginsparg 1994
Current data exchange with Repositories 
• OAI-PMH (Open Archive Initiative – Protocol for Metadata Harvesting): 
de facto standard in the context of repositories 
• But: limited to that context 
• Google retired support for OAI-PMH in 2008 
(used before as alternative to the sitemap protocol) 
• “Just” an interface, not a format 
 Linked Data is a generic, native way of data exchange, 
not only in the field of repositories 
 Data published following the Linked Data Principles is self-descriptive 
 Linked Data simplifies data exchange with repositories 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 5
Characteristics of repositories 
• Different repositories may use different metadata schemas. 
 Conversion must be highly configurable and extendable. 
• Metadata may use already existing vocabularies (e.g. Dublin Core, LCSH, …). 
 Convert metadata values to URIs / links. 
• Repository contents change rarely (to be citable and reliable). 
Conversion may be time intensive. 
 Convert data and store converted data in a cache. 
• Repositories generate URIs that shall be used to address their content. 
 Reuse those URIs, add content negotiation to them. 
• Persistent Identifiers (handle, DOI, …) violate the Linked Data Principles. 
 Use Persistent Identifiers in form of HTTP(S) URIs (http://dx.doi.org/...). 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 6
Extending Repositories 
• Add a Triple Store. 
• Use it as cache for converted data. 
• Use it to provide a SPARQL endpoint. 
• Add methods to convert data into RDF and to add links. 
• Add a module to serve data as RDF serializations. 
• Add content negotiation. 
RDF Conversion 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 7 
File System 
Relational 
Database 
Triple Store 
Authorization 
System 
Browse and 
Search 
Persistent 
Identifier Mgt. 
Event System 
User 
Administration 
... 
Web UI 
OAI-PMH 
Interface 
REST 
SWORD ... 
RDF 
Serialization 
Interfaces 
Business Logic 
Storage Layer
What do Repositories store? 
“Repositories are systems to safely store and publish digital objects and 
• Digital objects 
their descriptive metadata.” 
 We can‘t convert the files (technical problems, far too much work). 
 But we can convert the metadata and link to the files! 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 8 
 One or several files: 
Documents (PDF, Text, …), Tables (CSV, …), 
Images (PNG, Tiff …), Audio (Wave, …), 
Video, File Archives, … 
• Descriptive metadata 
 Structured metadata as key – value: 
dc.title, dc.contributor.author, dc.description, 
dc.date.available, dc.subject.lcsh, 
dc.subject.ddc, …
Convert existing metadata to RDF 
• Repository software can be extended to support more or other metadata fields. 
• Dublin Core is used often, but there are other metadata schemas as well. 
 Make the conversion highly configurable! 
 Use RDF for the configuration (so all features of RDF can be used in the configuration 
easily). 
 Use Reification to describe the results. 
 Use Placeholders where necessary, e.g. URIs used by the repository. 
 Use Regular Expressions to generate Literals and/or URIs from a metadata value. 
 Create a vocabulary to write such configurations. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 9
Example: DSpace Metadata RDF Mapping Vocabulary 
http://digital-repositories.org/ontologies/dspace-metadata-mapping/ 
• One Mapping describes how to convert one metadata field in RDF. 
• Can detect the metadata field by its name (key) and a regular expression used on its 
value. 
• Creates one or several triples. 
• Can use a placeholder for the URI of the object being converted currently. 
• Can create Literals or Resources as needed. 
• Can specify value types and language tags. 
• Can use the language tag DSpace stores for some metadata fields. 
• Can reuse the metadata value, of course. 
• May use regular expressions to modify metadata values used as Literals or Resource 
URIs. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 10
Example: DSpace Metadata RDF Mapping Vocabulary 
@prefix dc: <http://purl.org/dc/elements/1.1/> . 
@prefix dm: <http://digital-repositories.org/ontologies/dspace-metadata-mapping/0.2.0#> . 
@prefix : <#> . 
:title 
dm:metadataName "dc.title" ; 
dm:creates [ 
dm:subject dm:DSpaceObjectIRI ; 
dm:predicate dcterms:title ; 
dm:object dm:DSpaceValue ; 
] ; 
. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 11
Example: DSpace Metadata RDF Mapping Vocabulary 
:doi 
dm:metadataName „dc.identifier.doi" ; 
dm:condition „^doi:“ ; 
dm:creates [ 
dm:subject dm:DSpaceObjectIRI ; 
dm:predicate dc:identifier; 
dm:object [ 
a dm:ResourceGenerator ; 
dm:modifier [ 
dm:matcher „^doi:(.*)$“ ; 
dm:replacement „http://dx.doi.org/$1“ ; 
] ; 
dm:pattern „$DSpaceValue“ ; 
] ; 
] ; 
. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 12
Describing Repositories 
• Beside converting metadata it is worth describing the repository itself. 
• Who is running the repository? Does it have an OAI-PMH interface? Where can I find 
a SPARQL endpoint? How is the content structured? … 
• A vocabulary to link to the digital objects (files) is needed as well. 
• For DSpace, I created the DSpace Repository Ontology: 
http://digital-repositories.org/ontologies/dspace/ 
• A Digital Repositories Ontology would be great, describing repositories independent 
from the software used to create them. 
• A mapping between such an ontology and the DSpace Repository Ontology, the 
EPrints Ontology or any other would be great! 
• If you are interested in creating such an Ontology as well: please contact me. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 13
Things to mention, even if they should be clear 
• Reuse existing URIs wherever possible, don’t create your own URI if there already 
exists one. 
• E.g.: For classifications like the Library of Congress Subject Headings URIs 
already exists. 
• Create URIs only for you own entities or if you have enough information. 
• Do not create URIs for authors unless you can distinguish different authors with 
the same name! 
• Think about whether the author should create his or her own URI or if it is really 
up to you to create one. 
• But create URIs for the objects in your repository. 
• Create links wherever possible. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 14
DSpace 5 
• DSpace is the most often used software for Open Access Repositories worldwide 
• Release of DSpace version 5.0 planned for December 2014 
(release candidates are out, testathon is running) 
• Will contain support for Linked Data (RDF/XML, Turtle, N-Triples, SPARQL) 
• Will support content negotiation 
• Highly configurable, good default configuration included 
• Test it yourself: 
http://demo.dspace.org/data/handle/10673/5/ttl 
http://demo.dspace.org/data/handle/10673/5/ttl?text 
wget -O - --header=‘Accept: text/turtle’ http://demo.dspace.org/jspui/handle/10673/5 
or download and install a release candidate 
If you’re about to use DSpace 5.0 or above 
please consider switching Linked Data Support on. 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 15
Technische Universität Berlin 
Universitätsbibliothek 
Pascal-Nicolas Becker 
p.becker@tu-berlin.de 
Servicezentrum Forschungsdaten und –publikationen 
http://www.szf.tu-berlin.de 
Repository DepositOnce 
http://depositonce.tu-berlin.de 
Thesis „Repositorien und das Semantic Web“ (in German) 
http://www.pnjb.de/uni/diplomarbeit/ 
Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 
Slide 16

Más contenido relacionado

Was ist angesagt?

Learning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examplesLearning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examplesNandana Mihindukulasooriya
 
Repeatable Semantic Queries for the Linked Data Agnostic
Repeatable Semantic Queries for the Linked Data AgnosticRepeatable Semantic Queries for the Linked Data Agnostic
Repeatable Semantic Queries for the Linked Data AgnosticAlbert Meroño-Peñuela
 
Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)Hector Correa
 
IFLA LIDASIG Open Session 2017: Introduction to Linked Data
IFLA LIDASIG Open Session 2017: Introduction to Linked DataIFLA LIDASIG Open Session 2017: Introduction to Linked Data
IFLA LIDASIG Open Session 2017: Introduction to Linked DataLars G. Svensson
 
Describing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyDescribing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyNandana Mihindukulasooriya
 
d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...
d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...
d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...Jens Mittelbach
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Herbert Van de Sompel
 
Matthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings FundMatthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings FundTracy Kent
 
LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...Nandana Mihindukulasooriya
 
Nobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataNobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataMetaSolutions AB
 
ORDS, research data network
ORDS, research data networkORDS, research data network
ORDS, research data networkJisc RDM
 
3.7.17 DSpace for Data: issues, solutions and challenges Webinar Slides
3.7.17 DSpace for Data: issues, solutions and challenges Webinar Slides3.7.17 DSpace for Data: issues, solutions and challenges Webinar Slides
3.7.17 DSpace for Data: issues, solutions and challenges Webinar SlidesDuraSpace
 
The state of the art in Linked Data
The state of the art in Linked DataThe state of the art in Linked Data
The state of the art in Linked DataJoshua Shinavier
 
Linked Open Data and DANS
Linked Open Data and DANSLinked Open Data and DANS
Linked Open Data and DANSvty
 
DataverseNL as structured data hub
DataverseNL as structured data hubDataverseNL as structured data hub
DataverseNL as structured data hubvty
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsJon Voss
 
Open data easy, explicit and fast
Open data easy, explicit and fastOpen data easy, explicit and fast
Open data easy, explicit and fastMetaSolutions AB
 

Was ist angesagt? (20)

Reminiscing about interoperability
Reminiscing about interoperabilityReminiscing about interoperability
Reminiscing about interoperability
 
Learning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examplesLearning W3C Linked Data Platform with examples
Learning W3C Linked Data Platform with examples
 
Repeatable Semantic Queries for the Linked Data Agnostic
Repeatable Semantic Queries for the Linked Data AgnosticRepeatable Semantic Queries for the Linked Data Agnostic
Repeatable Semantic Queries for the Linked Data Agnostic
 
Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)Introduction to Linked Data Platform (LDP)
Introduction to Linked Data Platform (LDP)
 
IFLA LIDASIG Open Session 2017: Introduction to Linked Data
IFLA LIDASIG Open Session 2017: Introduction to Linked DataIFLA LIDASIG Open Session 2017: Introduction to Linked Data
IFLA LIDASIG Open Session 2017: Introduction to Linked Data
 
Linked Data
Linked DataLinked Data
Linked Data
 
Describing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core VocabularyDescribing LDP Applications with the Hydra Core Vocabulary
Describing LDP Applications with the Hydra Core Vocabulary
 
Introduction to W3C Linked Data Platform
Introduction to W3C Linked Data PlatformIntroduction to W3C Linked Data Platform
Introduction to W3C Linked Data Platform
 
d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...
d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...
d:swarm - A Library Data Management Platform Based on a Linked Open Data Appr...
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 
Matthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings FundMatthew Hale - Open Source at the Kings Fund
Matthew Hale - Open Source at the Kings Fund
 
LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...LDP4j: A framework for the development of interoperable read-write Linked Da...
LDP4j: A framework for the development of interoperable read-write Linked Da...
 
Nobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataNobel Prizes as Linked Open Data
Nobel Prizes as Linked Open Data
 
ORDS, research data network
ORDS, research data networkORDS, research data network
ORDS, research data network
 
3.7.17 DSpace for Data: issues, solutions and challenges Webinar Slides
3.7.17 DSpace for Data: issues, solutions and challenges Webinar Slides3.7.17 DSpace for Data: issues, solutions and challenges Webinar Slides
3.7.17 DSpace for Data: issues, solutions and challenges Webinar Slides
 
The state of the art in Linked Data
The state of the art in Linked DataThe state of the art in Linked Data
The state of the art in Linked Data
 
Linked Open Data and DANS
Linked Open Data and DANSLinked Open Data and DANS
Linked Open Data and DANS
 
DataverseNL as structured data hub
DataverseNL as structured data hubDataverseNL as structured data hub
DataverseNL as structured data hub
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & Museums
 
Open data easy, explicit and fast
Open data easy, explicit and fastOpen data easy, explicit and fast
Open data easy, explicit and fast
 

Andere mochten auch

Strategic Development - Future Plans for DSpace
Strategic Development - Future Plans for DSpaceStrategic Development - Future Plans for DSpace
Strategic Development - Future Plans for DSpacePascal-Nicolas Becker
 
16. DINI-Jahrestagung: Linked Data und Repositorien
16. DINI-Jahrestagung: Linked Data und Repositorien16. DINI-Jahrestagung: Linked Data und Repositorien
16. DINI-Jahrestagung: Linked Data und RepositorienPascal-Nicolas Becker
 
Repositorieninhalte als Linked Data bereitstellen
Repositorieninhalte als Linked Data bereitstellenRepositorieninhalte als Linked Data bereitstellen
Repositorieninhalte als Linked Data bereitstellenPascal-Nicolas Becker
 
DepositOnce - Das Repositorium für Forschungsergebnisse der TU Berlin
DepositOnce - Das Repositorium für Forschungsergebnisse der TU BerlinDepositOnce - Das Repositorium für Forschungsergebnisse der TU Berlin
DepositOnce - Das Repositorium für Forschungsergebnisse der TU BerlinPascal-Nicolas Becker
 

Andere mochten auch (9)

Basic aspects of Open Access
Basic aspects of Open AccessBasic aspects of Open Access
Basic aspects of Open Access
 
Strategic Development - Future Plans for DSpace
Strategic Development - Future Plans for DSpaceStrategic Development - Future Plans for DSpace
Strategic Development - Future Plans for DSpace
 
Infotreff: Persistent Identifier
Infotreff: Persistent IdentifierInfotreff: Persistent Identifier
Infotreff: Persistent Identifier
 
16. DINI-Jahrestagung: Linked Data und Repositorien
16. DINI-Jahrestagung: Linked Data und Repositorien16. DINI-Jahrestagung: Linked Data und Repositorien
16. DINI-Jahrestagung: Linked Data und Repositorien
 
Repositorieninhalte als Linked Data bereitstellen
Repositorieninhalte als Linked Data bereitstellenRepositorieninhalte als Linked Data bereitstellen
Repositorieninhalte als Linked Data bereitstellen
 
DSpace 5 und Linked (Open) Data
DSpace 5 und Linked (Open) DataDSpace 5 und Linked (Open) Data
DSpace 5 und Linked (Open) Data
 
DepositOnce - Das Repositorium für Forschungsergebnisse der TU Berlin
DepositOnce - Das Repositorium für Forschungsergebnisse der TU BerlinDepositOnce - Das Repositorium für Forschungsergebnisse der TU Berlin
DepositOnce - Das Repositorium für Forschungsergebnisse der TU Berlin
 
Zustand und Entwicklung von DSpace
Zustand und Entwicklung von DSpaceZustand und Entwicklung von DSpace
Zustand und Entwicklung von DSpace
 
DSpace und das Semantic Web
DSpace und das Semantic WebDSpace und das Semantic Web
DSpace und das Semantic Web
 

Ähnlich wie SWIB14 Weaving repository contents into the Semantic Web

Drupal and Apache Stanbol
Drupal and Apache StanbolDrupal and Apache Stanbol
Drupal and Apache StanbolAlkuvoima
 
Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Gautier Poupeau
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?gvernik
 
Hadoop and object stores can we do it better
Hadoop and object stores  can we do it betterHadoop and object stores  can we do it better
Hadoop and object stores can we do it bettergvernik
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosEUCLID project
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Oscar Corcho
 
From Box to Hydra via Archivematica
From Box to Hydra via ArchivematicaFrom Box to Hydra via Archivematica
From Box to Hydra via ArchivematicaJisc RDM
 
Produce and consume_linked_data_with_drupal
Produce and consume_linked_data_with_drupalProduce and consume_linked_data_with_drupal
Produce and consume_linked_data_with_drupalSTIinnsbruck
 
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...Cory Lampert
 
05 darwino db
05   darwino db05   darwino db
05 darwino dbdarwinodb
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?Ivan Herman
 
High and Lows of Library Linked Data
High and Lows of Library Linked DataHigh and Lows of Library Linked Data
High and Lows of Library Linked DataAdrian Stevenson
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudOntotext
 
Refactoring HUBzero for Linked Data
Refactoring HUBzero for Linked DataRefactoring HUBzero for Linked Data
Refactoring HUBzero for Linked DataYongyang Yu
 
Intro to-technologies-Green-City-Hackathon-Athens
Intro to-technologies-Green-City-Hackathon-AthensIntro to-technologies-Green-City-Hackathon-Athens
Intro to-technologies-Green-City-Hackathon-AthensStoitsis Giannis
 
Integrating an electronic lab notebook with a data repository; American Chemi...
Integrating an electronic lab notebook with a data repository; American Chemi...Integrating an electronic lab notebook with a data repository; American Chemi...
Integrating an electronic lab notebook with a data repository; American Chemi...rmacneil88
 
Elns and repositories, American Chemical Society, Dallas, March 2014
Elns and repositories, American Chemical Society, Dallas, March 2014Elns and repositories, American Chemical Society, Dallas, March 2014
Elns and repositories, American Chemical Society, Dallas, March 2014ResearchSpace
 
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”DuraSpace
 
NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)Christine Stohn
 
ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2Martin Hepp
 

Ähnlich wie SWIB14 Weaving repository contents into the Semantic Web (20)

Drupal and Apache Stanbol
Drupal and Apache StanbolDrupal and Apache Stanbol
Drupal and Apache Stanbol
 
Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...Why I don't use Semantic Web technologies anymore, event if they still influe...
Why I don't use Semantic Web technologies anymore, event if they still influe...
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?
 
Hadoop and object stores can we do it better
Hadoop and object stores  can we do it betterHadoop and object stores  can we do it better
Hadoop and object stores can we do it better
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?Why do they call it Linked Data when they want to say...?
Why do they call it Linked Data when they want to say...?
 
From Box to Hydra via Archivematica
From Box to Hydra via ArchivematicaFrom Box to Hydra via Archivematica
From Box to Hydra via Archivematica
 
Produce and consume_linked_data_with_drupal
Produce and consume_linked_data_with_drupalProduce and consume_linked_data_with_drupal
Produce and consume_linked_data_with_drupal
 
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
Linked data demystified:Practical efforts to transform CONTENTDM metadata int...
 
05 darwino db
05   darwino db05   darwino db
05 darwino db
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
High and Lows of Library Linked Data
High and Lows of Library Linked DataHigh and Lows of Library Linked Data
High and Lows of Library Linked Data
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
 
Refactoring HUBzero for Linked Data
Refactoring HUBzero for Linked DataRefactoring HUBzero for Linked Data
Refactoring HUBzero for Linked Data
 
Intro to-technologies-Green-City-Hackathon-Athens
Intro to-technologies-Green-City-Hackathon-AthensIntro to-technologies-Green-City-Hackathon-Athens
Intro to-technologies-Green-City-Hackathon-Athens
 
Integrating an electronic lab notebook with a data repository; American Chemi...
Integrating an electronic lab notebook with a data repository; American Chemi...Integrating an electronic lab notebook with a data repository; American Chemi...
Integrating an electronic lab notebook with a data repository; American Chemi...
 
Elns and repositories, American Chemical Society, Dallas, March 2014
Elns and repositories, American Chemical Society, Dallas, March 2014Elns and repositories, American Chemical Society, Dallas, March 2014
Elns and repositories, American Chemical Society, Dallas, March 2014
 
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
11.5.14 Presentation Slides, “Fedora 4.0 in Action at Penn State and Stanford”
 
NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)NISO access related projects (presented at the Charleston conference 2016)
NISO access related projects (presented at the Charleston conference 2016)
 
ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2
 

Mehr von Pascal-Nicolas Becker

Integration der gemeinsamen Normdatei in DSpace auf Basis von lobid
Integration der gemeinsamen Normdatei in DSpace auf Basis von lobidIntegration der gemeinsamen Normdatei in DSpace auf Basis von lobid
Integration der gemeinsamen Normdatei in DSpace auf Basis von lobidPascal-Nicolas Becker
 
DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...
DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...
DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...Pascal-Nicolas Becker
 
Stewarding national user groups to strengthen international Open Source Softw...
Stewarding national user groups to strengthen international Open Source Softw...Stewarding national user groups to strengthen international Open Source Softw...
Stewarding national user groups to strengthen international Open Source Softw...Pascal-Nicolas Becker
 
OR2019 - The revenge of the repository rodeo - DSpace
OR2019 - The revenge of the repository rodeo - DSpaceOR2019 - The revenge of the repository rodeo - DSpace
OR2019 - The revenge of the repository rodeo - DSpacePascal-Nicolas Becker
 
DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...
DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...
DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...Pascal-Nicolas Becker
 

Mehr von Pascal-Nicolas Becker (8)

Integration der gemeinsamen Normdatei in DSpace auf Basis von lobid
Integration der gemeinsamen Normdatei in DSpace auf Basis von lobidIntegration der gemeinsamen Normdatei in DSpace auf Basis von lobid
Integration der gemeinsamen Normdatei in DSpace auf Basis von lobid
 
DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...
DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...
DSpace - Rolle, Aufgaben und Verhältnis der Open-Source-Community und ihrer S...
 
Stewarding national user groups to strengthen international Open Source Softw...
Stewarding national user groups to strengthen international Open Source Softw...Stewarding national user groups to strengthen international Open Source Softw...
Stewarding national user groups to strengthen international Open Source Softw...
 
OR2019 - The revenge of the repository rodeo - DSpace
OR2019 - The revenge of the repository rodeo - DSpaceOR2019 - The revenge of the repository rodeo - DSpace
OR2019 - The revenge of the repository rodeo - DSpace
 
Workshop Docker for DSpace
Workshop Docker for DSpaceWorkshop Docker for DSpace
Workshop Docker for DSpace
 
DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...
DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...
DepositOnce - Das Repositorium der TU Berlin für Forschungsdaten und Publikat...
 
The State of DSpace
The State of DSpaceThe State of DSpace
The State of DSpace
 
Forschungsdaten und DSpace
Forschungsdaten und DSpaceForschungsdaten und DSpace
Forschungsdaten und DSpace
 

Último

Retail marketing Supply chain management SLIDESHARE.pptx
Retail marketing Supply chain management SLIDESHARE.pptxRetail marketing Supply chain management SLIDESHARE.pptx
Retail marketing Supply chain management SLIDESHARE.pptxBharathBunny10
 
LAUNCH: Intersections between violence against children and violence against ...
LAUNCH: Intersections between violence against children and violence against ...LAUNCH: Intersections between violence against children and violence against ...
LAUNCH: Intersections between violence against children and violence against ...UNICEF Office of Research - Innocenti
 
110 Philippines. quiz bee Power PoInt Presentation
110 Philippines. quiz bee Power PoInt Presentation110 Philippines. quiz bee Power PoInt Presentation
110 Philippines. quiz bee Power PoInt PresentationNorHaiFatun
 
Self Editing Your Novel Part 3: Who's Telling This Story?
Self Editing Your Novel Part 3: Who's Telling This Story?Self Editing Your Novel Part 3: Who's Telling This Story?
Self Editing Your Novel Part 3: Who's Telling This Story?Beth Jusino
 
Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...
Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...
Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...Kayode Fayemi
 
BaruwaRaquella_Retail Store Presentation.pptx
BaruwaRaquella_Retail Store Presentation.pptxBaruwaRaquella_Retail Store Presentation.pptx
BaruwaRaquella_Retail Store Presentation.pptxRaquellaBaruwa
 
DAY 06 A Revelation 03-10-2024 PpPT.pptx
DAY 06 A Revelation 03-10-2024 PpPT.pptxDAY 06 A Revelation 03-10-2024 PpPT.pptx
DAY 06 A Revelation 03-10-2024 PpPT.pptxFamilyWorshipCenterD
 
Evaluating LLM Models for Production Systems Methods and Practices -
Evaluating LLM Models for Production Systems Methods and Practices -Evaluating LLM Models for Production Systems Methods and Practices -
Evaluating LLM Models for Production Systems Methods and Practices -alopatenko
 
wonder woman:quiz on female achievements
wonder woman:quiz on female achievementswonder woman:quiz on female achievements
wonder woman:quiz on female achievementsRemya Roshni
 
2024 QRC PLM Recruitment Praesentation.pdf
2024 QRC PLM Recruitment Praesentation.pdf2024 QRC PLM Recruitment Praesentation.pdf
2024 QRC PLM Recruitment Praesentation.pdfJoerg Speikamp
 

Último (12)

Retail marketing Supply chain management SLIDESHARE.pptx
Retail marketing Supply chain management SLIDESHARE.pptxRetail marketing Supply chain management SLIDESHARE.pptx
Retail marketing Supply chain management SLIDESHARE.pptx
 
LAUNCH: Intersections between violence against children and violence against ...
LAUNCH: Intersections between violence against children and violence against ...LAUNCH: Intersections between violence against children and violence against ...
LAUNCH: Intersections between violence against children and violence against ...
 
110 Philippines. quiz bee Power PoInt Presentation
110 Philippines. quiz bee Power PoInt Presentation110 Philippines. quiz bee Power PoInt Presentation
110 Philippines. quiz bee Power PoInt Presentation
 
Self Editing Your Novel Part 3: Who's Telling This Story?
Self Editing Your Novel Part 3: Who's Telling This Story?Self Editing Your Novel Part 3: Who's Telling This Story?
Self Editing Your Novel Part 3: Who's Telling This Story?
 
Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...
Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...
Leadership in Difficult Times- Strategies for Overcoming Challenges - Reflect...
 
Tethex Cards - complete presentation in English
Tethex Cards - complete presentation in EnglishTethex Cards - complete presentation in English
Tethex Cards - complete presentation in English
 
BaruwaRaquella_Retail Store Presentation.pptx
BaruwaRaquella_Retail Store Presentation.pptxBaruwaRaquella_Retail Store Presentation.pptx
BaruwaRaquella_Retail Store Presentation.pptx
 
DAY 06 A Revelation 03-10-2024 PpPT.pptx
DAY 06 A Revelation 03-10-2024 PpPT.pptxDAY 06 A Revelation 03-10-2024 PpPT.pptx
DAY 06 A Revelation 03-10-2024 PpPT.pptx
 
Evaluating LLM Models for Production Systems Methods and Practices -
Evaluating LLM Models for Production Systems Methods and Practices -Evaluating LLM Models for Production Systems Methods and Practices -
Evaluating LLM Models for Production Systems Methods and Practices -
 
wonder woman:quiz on female achievements
wonder woman:quiz on female achievementswonder woman:quiz on female achievements
wonder woman:quiz on female achievements
 
2024 QRC PLM Recruitment Praesentation.pdf
2024 QRC PLM Recruitment Praesentation.pdf2024 QRC PLM Recruitment Praesentation.pdf
2024 QRC PLM Recruitment Praesentation.pdf
 
NOC_SXSW_Non-ObviousThinking_2024_SLIDES.pptx
NOC_SXSW_Non-ObviousThinking_2024_SLIDES.pptxNOC_SXSW_Non-ObviousThinking_2024_SLIDES.pptx
NOC_SXSW_Non-ObviousThinking_2024_SLIDES.pptx
 

SWIB14 Weaving repository contents into the Semantic Web

  • 1. Weaving repository contents into the Semantic Web Pascal-Nicolas Becker | Technische Universität Berlin | SWIB14 | Bonn, December 1-3, 2014
  • 2. Digital Repositories Source: The Directory of Open Access Repositories, http://www.opendoar.org, retrieved June 06, 2014. Repositories are systems to safely store and publish digital objects and their descriptive metadata. Not in the meaning of software repositories. Examples: • Digital archives • Institutional repositories (preprints, postprints, open access publications, …) • Digital image libraries • Research data repositories • … More than 2500 Open Access repositories worldwide. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 2
  • 3. Repository contents are particularly suited The data stored in repositories are particularly suited to be used in the Semantic Web: • Metadata already exist in a structured form. • They do not have to be generated or entered manually for publication as Linked Data. • “Just” convert the data in RDF, add links and publish them respecting the Linked Data Principles. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 3
  • 4. xxx.lanl.org / ArXiv.org Source: Paul Ginsparg, First Steps Towards Electronic Research Communication. In: Computer in Physics, Vol. 8, No. 4, 1994, pp. 390-396. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 4 “Although the WorldWideWeb still represents only a small fraction of the overall usage, this access mode is expected to become dominant in the near future.” Paul Ginsparg 1994
  • 5. Current data exchange with Repositories • OAI-PMH (Open Archive Initiative – Protocol for Metadata Harvesting): de facto standard in the context of repositories • But: limited to that context • Google retired support for OAI-PMH in 2008 (used before as alternative to the sitemap protocol) • “Just” an interface, not a format  Linked Data is a generic, native way of data exchange, not only in the field of repositories  Data published following the Linked Data Principles is self-descriptive  Linked Data simplifies data exchange with repositories Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 5
  • 6. Characteristics of repositories • Different repositories may use different metadata schemas.  Conversion must be highly configurable and extendable. • Metadata may use already existing vocabularies (e.g. Dublin Core, LCSH, …).  Convert metadata values to URIs / links. • Repository contents change rarely (to be citable and reliable). Conversion may be time intensive.  Convert data and store converted data in a cache. • Repositories generate URIs that shall be used to address their content.  Reuse those URIs, add content negotiation to them. • Persistent Identifiers (handle, DOI, …) violate the Linked Data Principles.  Use Persistent Identifiers in form of HTTP(S) URIs (http://dx.doi.org/...). Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 6
  • 7. Extending Repositories • Add a Triple Store. • Use it as cache for converted data. • Use it to provide a SPARQL endpoint. • Add methods to convert data into RDF and to add links. • Add a module to serve data as RDF serializations. • Add content negotiation. RDF Conversion Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 7 File System Relational Database Triple Store Authorization System Browse and Search Persistent Identifier Mgt. Event System User Administration ... Web UI OAI-PMH Interface REST SWORD ... RDF Serialization Interfaces Business Logic Storage Layer
  • 8. What do Repositories store? “Repositories are systems to safely store and publish digital objects and • Digital objects their descriptive metadata.”  We can‘t convert the files (technical problems, far too much work).  But we can convert the metadata and link to the files! Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 8  One or several files: Documents (PDF, Text, …), Tables (CSV, …), Images (PNG, Tiff …), Audio (Wave, …), Video, File Archives, … • Descriptive metadata  Structured metadata as key – value: dc.title, dc.contributor.author, dc.description, dc.date.available, dc.subject.lcsh, dc.subject.ddc, …
  • 9. Convert existing metadata to RDF • Repository software can be extended to support more or other metadata fields. • Dublin Core is used often, but there are other metadata schemas as well.  Make the conversion highly configurable!  Use RDF for the configuration (so all features of RDF can be used in the configuration easily).  Use Reification to describe the results.  Use Placeholders where necessary, e.g. URIs used by the repository.  Use Regular Expressions to generate Literals and/or URIs from a metadata value.  Create a vocabulary to write such configurations. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 9
  • 10. Example: DSpace Metadata RDF Mapping Vocabulary http://digital-repositories.org/ontologies/dspace-metadata-mapping/ • One Mapping describes how to convert one metadata field in RDF. • Can detect the metadata field by its name (key) and a regular expression used on its value. • Creates one or several triples. • Can use a placeholder for the URI of the object being converted currently. • Can create Literals or Resources as needed. • Can specify value types and language tags. • Can use the language tag DSpace stores for some metadata fields. • Can reuse the metadata value, of course. • May use regular expressions to modify metadata values used as Literals or Resource URIs. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 10
  • 11. Example: DSpace Metadata RDF Mapping Vocabulary @prefix dc: <http://purl.org/dc/elements/1.1/> . @prefix dm: <http://digital-repositories.org/ontologies/dspace-metadata-mapping/0.2.0#> . @prefix : <#> . :title dm:metadataName "dc.title" ; dm:creates [ dm:subject dm:DSpaceObjectIRI ; dm:predicate dcterms:title ; dm:object dm:DSpaceValue ; ] ; . Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 11
  • 12. Example: DSpace Metadata RDF Mapping Vocabulary :doi dm:metadataName „dc.identifier.doi" ; dm:condition „^doi:“ ; dm:creates [ dm:subject dm:DSpaceObjectIRI ; dm:predicate dc:identifier; dm:object [ a dm:ResourceGenerator ; dm:modifier [ dm:matcher „^doi:(.*)$“ ; dm:replacement „http://dx.doi.org/$1“ ; ] ; dm:pattern „$DSpaceValue“ ; ] ; ] ; . Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 12
  • 13. Describing Repositories • Beside converting metadata it is worth describing the repository itself. • Who is running the repository? Does it have an OAI-PMH interface? Where can I find a SPARQL endpoint? How is the content structured? … • A vocabulary to link to the digital objects (files) is needed as well. • For DSpace, I created the DSpace Repository Ontology: http://digital-repositories.org/ontologies/dspace/ • A Digital Repositories Ontology would be great, describing repositories independent from the software used to create them. • A mapping between such an ontology and the DSpace Repository Ontology, the EPrints Ontology or any other would be great! • If you are interested in creating such an Ontology as well: please contact me. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 13
  • 14. Things to mention, even if they should be clear • Reuse existing URIs wherever possible, don’t create your own URI if there already exists one. • E.g.: For classifications like the Library of Congress Subject Headings URIs already exists. • Create URIs only for you own entities or if you have enough information. • Do not create URIs for authors unless you can distinguish different authors with the same name! • Think about whether the author should create his or her own URI or if it is really up to you to create one. • But create URIs for the objects in your repository. • Create links wherever possible. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 14
  • 15. DSpace 5 • DSpace is the most often used software for Open Access Repositories worldwide • Release of DSpace version 5.0 planned for December 2014 (release candidates are out, testathon is running) • Will contain support for Linked Data (RDF/XML, Turtle, N-Triples, SPARQL) • Will support content negotiation • Highly configurable, good default configuration included • Test it yourself: http://demo.dspace.org/data/handle/10673/5/ttl http://demo.dspace.org/data/handle/10673/5/ttl?text wget -O - --header=‘Accept: text/turtle’ http://demo.dspace.org/jspui/handle/10673/5 or download and install a release candidate If you’re about to use DSpace 5.0 or above please consider switching Linked Data Support on. Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 15
  • 16. Technische Universität Berlin Universitätsbibliothek Pascal-Nicolas Becker p.becker@tu-berlin.de Servicezentrum Forschungsdaten und –publikationen http://www.szf.tu-berlin.de Repository DepositOnce http://depositonce.tu-berlin.de Thesis „Repositorien und das Semantic Web“ (in German) http://www.pnjb.de/uni/diplomarbeit/ Weaving repository contents into the Semantic Web | Pascal-Nicolas Becker | SWIB14 | Bonn, December 3, 2014 Slide 16