Linked data enhanced publishing for special collections (with Drupal)
1. ZBW is a member of the Leibniz Association
Linked Data Enhanced Publishing
for Special Collections
Joachim Neubert
ZBW – German National Library of Economics
Leibniz Information Centre for Economics
ELAG 2013
Ghent, Belgium
29.5.2013
2. Motivation
• Special collections often consist of specific (sometimes unique)
kinds of objects with special attributes (e.g., type or material,
selected from a list) which require non-standard data structures
• Custom navigation (e.g., by historical period or dynasty) is desirable
• Static pages (e.g., a “news” section, “about” or “help” pages) are
often required, too
Turn-key “standard” systems for these requirements are not
available
3. My own background
• Scientific software developer at ZBW – German National Library of
Economics, mainly concerned with Linked Open Data and
knowledge organization systems and services
• Published 20th Century Press Archives in 2010, with some 100,000
digitized newspaper articles in dossiers (http://zbw.eu/beta/p20,
custom application written in Perl)
• Recently published a repository of ZBW Labs projects – basically
project descriptions and a blog (http://zbw.eu/labs, Drupal based)
4. Further Agenda
1) Linked Data, Content Management Systems, and Drupal
2) Customizing Drupal 7 for special collections
3) Current limitations of RDF/LD support in Drupal 7
4) Outlook
7. Weaving the web of Linked Data
http://www.worldcat.org/oclc/8281462
http://viaf.org/viaf/213505112
http://dbpedia.org/resource/Frida_Kahlo
8. Linking Open Data cloud diagram
By Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
9. So, why linked data enhanced publishing?
• Differentiate the subjects of your web pages and their attributes
• Thus, foster data reuse in 3rd party services and applications
• Mashups
• Search engines
• Create meaningful links, adding value for users
10. Why use a content management system?
• Standard tasks (browser compatibility, page templates, responsive
CSS, site navigation, search, form handling, calendar, WYSIWYG editing,
revisions, translations, permissions, data management, security)
made easy
• Easy-to-add web 2.0 features (blogging and comments, tags, rating,
forums, …)
• Know-how available outside a single development team
11. Why Drupal?
• Open & modular architecture
● Extensible by modules
● Standards-based
● Scalable
• Widely used all over the world
• Vibrant open source community,
and commercial services, too
http://drupal.org/getting-started/before/overview
http://de.slideshare.net/scorlosquet/drupal-as-a-semantic-web-platform
21. Output in RDFa
• Drupal renders RDF mappings as HTML attributes
• No fiddling with HTML-producing code or templates
• Works out of the box for different Drupal themes (screen designs)
• In Drupal 7, by default XHTML/RDFa 1.0
• Themes for HTML5/RDFa 1.1 available (e.g., Zen)
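The idea of rendering a field-to-predicate mapping as HTML attributes can be illustrated outside Drupal. The following is a minimal Python sketch, not Drupal's implementation: the mapping dictionary, the field names and the function `render_rdfa` are assumptions made for illustration only.

```python
from html import escape

# Hypothetical mapping of fields to RDF predicates (an assumption for this
# sketch; in Drupal the mapping is defined per content type).
RDF_MAPPING = {
    "rdftype": ["schema:CreativeWork", "doap:Project"],
    "title": ["dc:title", "doap:name"],
    "body": ["dc:description"],
}


def render_rdfa(uri, fields, mapping=RDF_MAPPING):
    """Render field values as HTML, adding RDFa attributes from the mapping."""
    types = " ".join(mapping["rdftype"])
    lines = [f'<div about="{escape(uri)}" typeof="{types}">']
    for name, value in fields.items():
        predicates = " ".join(mapping.get(name, []))
        # Fields without a mapping are rendered as plain HTML.
        attr = f' property="{predicates}"' if predicates else ""
        lines.append(f"  <div{attr}>{escape(value)}</div>")
    lines.append("</div>")
    return "\n".join(lines)


print(render_rdfa(
    "http://zbw.eu/labs/project/zbw-labs",
    {"title": "ZBW Labs Website"},
))
```

The point of the sketch is that the template code stays generic: changing the mapping changes the emitted attributes, not the markup-producing code.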
22. M
<http://zbw.eu/labs/project/zbw-labs> a schema:CreativeWork,
doap:Project;
dc:description """
ZBW Labs Website is a semantically enriched directory of ZBW Labs
Projects. Labs projects range from small showcases, which may or may
not be part of a larger project, to full-fleged applications in beta
state.
The new ZBW Labs website is based on Drupal 7 and uses RDFa, which is
part of Drupal Core. Used vocabularies are Dublin Core Terms (dc),
Description of a Project (doap) und Schema.org (schema).
"""@en;
dc:subject <http://zbw.eu/labs/en/taxonomy/term/1>,
<http://zbw.eu/labs/en/taxonomy/term/11>,
<http://zbw.eu/labs/en/taxonomy/term/3>,
<http://zbw.eu/labs/en/taxonomy/term/50>;
dc:title "ZBW Labs Website"@en;
schema:name "ZBW Labs Website"@en;
[...]
doap:created "2012-04"^^xsd:gYearMonth,
"2012-04-01T00:00:00+02:00"^^xsd:gYearMonth;
doap:homepage <http://zbw.eu/labs>;
doap:name "ZBW Labs Website"@en;
doap:shortdesc "ZBW Labs projects exposed as Linked Open Data"@en;
xhv:license <http://creativecommons.org/publicdomain/zero/1.0/> .
<http://zbw.eu/labs/en/taxonomy/term/3> a skos:Concept;
rdfs:label "Publishing Technologies"@en;
skos:prefLabel "Publishing Technologies"@en .
23. Additional Linked Data / RDF features
• Serialize Drupal RDF data in RDF/XML, Turtle, N-Triples (Modules: RDFx,
RESTful Web Services) *, JSON-LD (Modules: JSON-LD, restws)
• Expose Drupal RDF data in a SPARQL endpoint (Module: SPARQL)
• Support microdata output (Module: Microdata)
• Consume RDF data from other SPARQL endpoints and display it as
part of a Drupal site (Module: SPARQL Views)
• Add links to other Linked Data entities (Module: Web Taxonomy)
* currently does not work with PostgreSQL – for a workaround, see http://drupal.org/node/1999754#comment-7438562
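Once RDF data is exposed in a SPARQL endpoint, third parties can query it with a plain HTTP GET request carrying a `query` parameter, as defined by the SPARQL protocol. The sketch below only builds such a request URL; the endpoint address `http://example.org/sparql` is a placeholder, not the address of any actual Drupal site, and the query is an illustrative assumption.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption); replace with the endpoint of a real site.
ENDPOINT = "http://example.org/sparql"

# Illustrative query: list projects described with the DOAP vocabulary.
QUERY = """\
PREFIX doap: <http://usefulinc.com/ns/doap#>
SELECT ?project ?name WHERE {
  ?project a doap:Project ;
           doap:name ?name .
}
LIMIT 10
"""


def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL protocol query via HTTP GET."""
    return endpoint + "?" + urlencode({"query": query})


print(sparql_get_url(ENDPOINT, QUERY))
```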
24. Web Taxonomy: Using vocabularies from the web
• Autocomplete widget for Drupal fields, powered by vocabularies
maintained elsewhere
• Prerequisites:
• a web-accessible autosuggest service which delivers terms and
their URIs (may be JSON, SPARQL results or even SOAP)
• a custom coded plugin to access the service
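What such a plugin has to do can be sketched independently of Drupal: take the response of the autosuggest service and turn it into pairs of term label and URI for the autocomplete widget. The Python sketch below assumes a JSON response with a `results` list of `label`/`uri` objects; this response format, the example URIs and the function name are hypothetical, and real services differ.

```python
import json

# Hypothetical JSON response of an autosuggest service (format and URIs are
# assumptions for illustration only).
SAMPLE_RESPONSE = """
{"results": [
  {"label": "Publishing", "uri": "http://example.org/terms/1"},
  {"label": "Publishing technologies", "uri": "http://example.org/terms/2"},
  {"label": "Economics", "uri": "http://example.org/terms/3"}
]}
"""


def parse_suggestions(payload, query):
    """Return (label, uri) pairs whose label contains the typed query."""
    data = json.loads(payload)
    needle = query.casefold()
    return [
        (item["label"], item["uri"])
        for item in data["results"]
        if needle in item["label"].casefold()
    ]


for label, uri in parse_suggestions(SAMPLE_RESPONSE, "publ"):
    print(label, uri)
```

Storing the URI alongside the label is what later allows the field value to be output as a link to the external Linked Data entity.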
25. Plugin example: Economics Taxonomies
Code downloadable and installable from https://drupal.org/sandbox/jneubert/1447918
• Third party thesauri, such as STW Thesaurus for Economics, can be
re-used for indexing a collection
26. Extending Drupal even further
Drupal is not only a CMS, but also a Content Management Framework
• Well defined APIs (database abstraction layer, Field API, RDF
Mapping API, Form API, Entity API, …)
• Entity API allows building custom entities with arbitrary properties
• … even residing in remote databases
requires substantial programming skills
28. Cool URIs require work
1) out-of-the-box default URI
http://zbw.eu/labs/en?q=node/25
2) with the “Clean URLs” feature enabled
http://zbw.eu/labs/en/node/25
3) with the (core) “Path” module enabled and an alias defined
http://zbw.eu/labs/en/project/zbw-labs
4) removing the language path element (en/de) in multilingual sites for
a language-independent resource URI requires custom code
http://zbw.eu/labs/project/zbw-labs
(see code example in http://groups.drupal.org/node/247058#comment-798823)
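The URI rewriting behind step 4 can be shown in isolation. This Python sketch is not the Drupal code referenced above; it only illustrates the mapping from a language-specific page URL to a language-independent resource URI. The base path `/labs` and the language codes `en`/`de` are taken from the example URLs, and the function name is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

LANGUAGES = {"en", "de"}  # language path elements used on the example site


def language_independent_uri(url, base_path="/labs", languages=LANGUAGES):
    """Drop the language path element that directly follows the base path."""
    parts = urlsplit(url)
    path = parts.path
    if path.startswith(base_path + "/"):
        rest = path[len(base_path) + 1:]
        first, sep, tail = rest.partition("/")
        # Only rewrite when a language code is followed by a further path.
        if first in languages and sep:
            path = f"{base_path}/{tail}"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


print(language_independent_uri("http://zbw.eu/labs/en/project/zbw-labs"))
print(language_independent_uri("http://zbw.eu/labs/de/project/zbw-labs"))
```

Both language variants of the page map to the same resource URI, which is the one that should appear as the subject in the RDF output.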
29. Nested RDF structures only with custom code
Workaround example:
The Git repository URI in the DOAP ontology requires a separate node, e.g.,
<> a schema:CreativeWork, doap:Project;
doap:repository [ a doap:GitRepository;
doap:location "http://github.com/some/id.git" ];
• Create field_gitrepository and map to doap:location
• Create a custom template file for the field
(field--field_gitrepository--lproject.tpl.php)
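<?php // Outer wrapper as in Drupal 7's default field.tpl.php (assumed here; the slide omitted the opening tag). ?>
<div class="<?php print $classes; ?>"<?php print $attributes; ?>>
  <div rel="doap:repository" class="field-items"<?php print $content_attributes; ?>>
    <?php // Blank node that carries the repository type and location. ?>
    <div about="[_:repos]" typeof="doap:GitRepository">
      <?php foreach ($items as $delta => $item): ?>
        <div class="field-item"<?php print $item_attributes[$delta]; ?>><?php print render($item); ?></div>
      <?php endforeach; ?>
    </div>
  </div>
</div>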
30. Further limitations in Drupal 7
• RDFa support currently works for single entities – pages with multiple
entities (search results, term pages, views, etc.) are not supported
• RDFa output may break under certain special conditions
(http://drupal.org/node/1778226)
31. Outlook to Drupal 8
• Drupal base functionality refactored to use the Symfony framework
• Aimed at an extended service-oriented architecture – design goal:
“Each piece of content gets its own URL”
(http://www.unleashedmind.com/en/blog/sun/drupal-8-the-path-forward)
• RDF module split up into a plain RDF mapping module and another
module for creating RDFa markup
32. To sum up
• Linked data publishing via a CMS, in particular Drupal, is a valid
option
• If your data can be mapped to an essentially flat RDF data structure,
linked data can be added mostly by site builders, without much
additional effort
• Sometimes research is required on how to solve problems, and at
times glue code has to be written
• But: most of the code for your web application is already there, and is
supported by a large and helpful Open Source community
33. Thank you!
Joachim Neubert
ZBW – Leibniz Information Centre for Economics
j.neubert@zbw.eu
http://zbw.eu/labs