Lightning talk about generating Linked Data from a digital object management system at the National Library of Latvia. Conference: http://swib.org/swib12/programme.php
Linked Data from a Digital Object Management System
1. Linked Data from Digital Object Management System
@ the National Library of Latvia
Uldis Bojārs - SWIB12 – 28-Nov-2012
2. Uldis Bojārs
• uldis.bojars@gmail.com
• @CaptSolo
• National Library of Latvia
• Semantic Web expert
• PhD in Computer Science, DERI Galway
(National University of Ireland, Galway)
5. • Core functionality – digital object management and preservation
– production system (not a pilot)
• Development: custom, outsourced
• Linked Data functions (added on)
6. Context
• Core functionality
– Must be reliable and with good performance
• Linked Data functions (added on)
– Aim: bootstrap linked data at NLL
– Linked data interface (URIs, HTTP conneg, RDF data)
– SPARQL endpoint
• Developers
– Lack of developers who have experience building production-level systems based on RDF stores
8. Architecture
• Core system (MSSQL, C#, .Net)
– Ingest, object management, …
– (DB allows adding links to other objects and web pages)
– (new Digital Object metadata fields can be added)
• RDF / Linked Data adaptor module
– URIs, HTTP content negotiation
– HTML, RDF, XML
– (for new Digital Object fields, the RDF export mapping can be specified)
• Separate RDF / SPARQL server
– SPARQL endpoint
– (no impact on core system)
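The content negotiation step of the adaptor module can be illustrated with a minimal Python sketch. It is not the NLL implementation (the core system is C# / .Net); the media-type table and the default format are assumptions chosen to match the three formats listed on the slide.

```python
# Hypothetical media-type table: the slide lists HTML, RDF and XML,
# but the exact media types served by the real system are an assumption.
FORMATS = {
    "text/html": "html",
    "application/rdf+xml": "rdf",
    "application/xml": "xml",
}
DEFAULT_FORMAT = "html"  # assumed fallback for browsers and "*/*"


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order
    return sorted(entries, key=lambda e: -e[1])


def choose_format(accept_header):
    """Pick the representation to serve for a digital object URI."""
    for media_type, q in parse_accept(accept_header or ""):
        if q <= 0:
            continue
        if media_type in FORMATS:
            return FORMATS[media_type]
        if media_type == "*/*":
            return DEFAULT_FORMAT
    return DEFAULT_FORMAT
```

For example, `choose_format("text/html;q=0.5, application/rdf+xml")` selects the RDF representation, while an empty or missing header falls back to HTML.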
9. Synchronisation
• Named graphs
• Push-sync
– core system knows when something is updated and sends changes to the RDF store
– updates at object level (named graphs)
• SPARQL CLEAR, INSERT
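The push-sync described above (one named graph per object, replaced with CLEAR followed by INSERT) could look roughly like the following Python sketch. The graph URI pattern and the function names are hypothetical; only the SPARQL 1.1 Update operations themselves come from the slide.

```python
def graph_uri(object_id, base="http://example.org/object/"):
    """Named graph URI for one digital object (hypothetical URI pattern)."""
    return f"{base}{object_id}"


def build_sync_update(object_id, ntriples_lines):
    """Build one SPARQL 1.1 Update request that replaces an object's graph.

    ntriples_lines: the object's current triples, one N-Triples statement
    per string, as exported by the RDF adaptor.
    """
    g = graph_uri(object_id)
    body = "\n".join(f"    {line}" for line in ntriples_lines)
    return (
        # SILENT: do not fail if the graph does not exist yet
        f"CLEAR SILENT GRAPH <{g}> ;\n"
        f"INSERT DATA {{\n  GRAPH <{g}> {{\n{body}\n  }}\n}}"
    )


update = build_sync_update(
    "91",
    ['<http://example.org/object/91> <http://purl.org/dc/terms/title> "Test" .'],
)
print(update)
```

Sending both operations in a single update request keeps the change at object level: the core system only needs the identifier of the updated object and its current RDF export.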
10. Data
• Digital object packages (XML)
– from various sources
– mapped to RDF (using a mix of various vocabs)
• Authority records
– from ALEPH: ~170 k records
– may use DOM2 to expose authority data as RDF
– in RDF: SKOS
• via https://github.com/kefo/marcauth-2-madsrdf
• Classifiers
– digital object types, access rights, languages, …
– in RDF: SKOS
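As a rough illustration of classifiers expressed in SKOS, the Python sketch below serialises a classifier as a `skos:ConceptScheme` with one `skos:Concept` per entry. The base URI, the classifier name and the sample entries are hypothetical; the actual classifier data and URIs used at NLL are not shown on the slides.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def classifier_to_turtle(scheme_id, entries, base="http://example.org/classifier/"):
    """Serialise one classifier as SKOS in Turtle.

    entries: iterable of (code, label, lang) tuples. Codes are assumed
    to be URI-safe; the base URI is a placeholder.
    """
    scheme = f"<{base}{scheme_id}>"
    lines = [SKOS_PREFIX, "", f"{scheme} a skos:ConceptScheme ."]
    for code, label, lang in entries:
        concept = f"<{base}{scheme_id}/{code}>"
        lines += [
            "",
            f"{concept} a skos:Concept ;",
            f"    skos:inScheme {scheme} ;",
            f'    skos:notation "{escape_literal(code)}" ;',
            f'    skos:prefLabel "{escape_literal(label)}"@{lang} .',
        ]
    return "\n".join(lines)


# Illustrative entries only, not taken from the real system
print(classifier_to_turtle("languages", [("lav", "Latvian", "en"),
                                         ("eng", "English", "en")]))
```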
12. <file id="91"
mimeType="image/jpeg" name="junijs15-16_040.jpg" size="2112976" … >
<fileMetadata>
<field name="Type">JPEG image</field>
<field name="Name">91.jpg</field>
<field name="Size">2.01 MB</field>
<field name="Title">OLYMPUS DIGITAL CAMERA</field>
<field name="Subject">OLYMPUS DIGITAL CAMERA</field>
<field name="Content created">30.10.2012 11:37:52</field>
<field name="Date last saved">30.10.2012 11:37:52</field>
<field name="Program name">Version 1.1</field>
<field name="Width">2736 pixels</field>
<field name="Height">3648 pixels</field>
<field name="Horizontal resolution">96 dpi</field>
<field name="Vertical resolution">96 dpi</field>
…
What about modeling file metadata (for various content types)?
The source XML data is not very useful as it stands.
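One possible direction, sketched in Python under stated assumptions: normalise the display-oriented values ("2736 pixels", "30.10.2012 11:37:52") into typed literals before exporting them. The `http://example.org/filemeta#` namespace, the file URI pattern and the field mapping are hypothetical; choosing a real vocabulary is exactly the open modeling question. The sample XML is an abridged, well-formed variant of the fragment above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SAMPLE = """<file id="91" mimeType="image/jpeg" name="junijs15-16_040.jpg" size="2112976">
  <fileMetadata>
    <field name="Width">2736 pixels</field>
    <field name="Height">3648 pixels</field>
    <field name="Content created">30.10.2012 11:37:52</field>
  </fileMetadata>
</file>"""

EX = "http://example.org/filemeta#"  # hypothetical vocabulary namespace
XSD = "http://www.w3.org/2001/XMLSchema#"


def pixels(text):
    """'2736 pixels' -> typed integer literal."""
    return f'"{int(text.split()[0])}"^^<{XSD}integer>'


def timestamp(text):
    """'30.10.2012 11:37:52' -> xsd:dateTime literal (time zone unknown)."""
    dt = datetime.strptime(text, "%d.%m.%Y %H:%M:%S")
    return f'"{dt.isoformat()}"^^<{XSD}dateTime>'


# Source field name -> (predicate URI, value converter); unmapped fields are skipped
FIELD_MAP = {
    "Width": (EX + "width", pixels),
    "Height": (EX + "height", pixels),
    "Content created": ("http://purl.org/dc/terms/created", timestamp),
}


def file_to_ntriples(xml_text, base="http://example.org/file/"):
    """Convert one <file> element into N-Triples statements."""
    root = ET.fromstring(xml_text)
    subject = f"<{base}{root.get('id')}>"
    triples = []
    for field in root.iter("field"):
        mapping = FIELD_MAP.get(field.get("name"))
        if mapping is None:
            continue
        predicate, convert = mapping
        triples.append(f"{subject} <{predicate}> {convert(field.text)} .")
    return triples


for triple in file_to_ntriples(SAMPLE):
    print(triple)
```

Keeping the mapping in a table mirrors the idea that the export of each metadata field can be configured separately.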
13. Issues / Questions
• Technical issues
– how to reliably work with RDF stores
• Modeling
– Digital object metadata
• using a mix of vocabs; can BIBFRAME help?
– File metadata (for various file types)
• https://answers.semanticweb.com/questions/19810/file-metadata-ontology
– Classifiers
• Existing vocabs that can be reused? (for digital object types, …)
• Best practices
– Have you done something similar?
– What choices did you make?
14. Looking for:
Suggestions and feedback:
– modeling, technical decisions, …
– … anything else that comes to mind …
Collaboration ideas, projects:
– to do useful things with this information
• (re digital objects, authority data, …)
– further research and development
uldis.bojars@gmail.com / @CaptSolo
Editor's notes
data about Digital Objects, Authorities, Classifiers