Keynote presentation for CSWS 2013 Conference in Shanghai, China.
Some slides borrowed from Jan Wielemaker, Guus Schreiber, Jacco van Ossenbruggen, Niels Ockeloen, Antske Fokkens, Serge ter Braake.
1. Linked Data for
Digital Heritage
and History
Victor de Boer
VU University Amsterdam
Keynote CSWS 2013 Shanghai
2. About me
Victor de Boer
Assistant professor at VU University Amsterdam
Domain-driven Semantic Technologies, Linked Data
Cultural Heritage
Digital History
Linked Data for Development
3. Linked Data is “a term used to describe a
recommended best practice for exposing,
sharing, and connecting pieces
of data, information, and knowledge on the
Semantic Web using URIs and RDF.” -- Wikipedia
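As a minimal illustration of this definition, the sketch below writes a few RDF-style statements as (subject, predicate, object) tuples in plain Python. Every URI under example.org is hypothetical and chosen only for illustration; a real application would use an RDF library and published vocabularies.

```python
# Hypothetical namespace: none of these URIs come from a published dataset.
EX = "http://example.org/"

# Each statement is a (subject, predicate, object) triple.
# Things are named by URIs; plain strings are literal values.
triples = [
    (EX + "person/victor", EX + "worksAt", EX + "org/vu"),
    (EX + "org/vu", EX + "locatedIn", EX + "place/amsterdam"),
    (EX + "org/vu", EX + "label", "VU University Amsterdam"),
]

def objects(subject, predicate):
    """Return the objects of all statements with this subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

# Because the employer is itself a URI, a second lookup can follow the link.
employer = objects(EX + "person/victor", EX + "worksAt")[0]
print(objects(employer, EX + "label"))
```

The point of the sketch is the "connecting" part of the definition: the object of one statement is the subject of another, so data from different sources can be joined by following URIs.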
4. The evolution of Science
Antonie van Leeuwenhoek’s
microscope (17th C.)
Large Hadron Collider in
Switzerland (21st C.)
5. Why Linked Data for E-science
Large amounts of data
Efficient analysis, data mining
Sharing data, information and knowledge
between scientists
Across continents
Across disciplines
10. MultimediaN E-Culture project
• Museums have increasingly nice websites
• But: most of them are driven by stand-alone collection
databases
• Data is isolated, both syntactically and semantically
• If users can do cross-collection search, the individual
collections become more valuable!
• Semantic Search
12. Search for objects which are linked
via concepts (semantic link)
China
Kanton
PartOf
Query
“China”
Use the type of semantic link to provide
meaningful presentation of the search results
Rijksmuseum: View
of Canton, with
two Dutch ships
Semantic Search
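The Kanton/China example on this slide can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the E-Culture implementation: the dictionaries, the function name `semantic_search` and the way the explanation string is built are all hypothetical.

```python
# Hypothetical mini knowledge graph; identifiers are illustrative only.
PART_OF = {"Kanton": "China"}  # place -> the larger place it is part of

# Object title -> the concept it is annotated with.
ANNOTATIONS = {
    "View of Canton, with two Dutch ships": "Kanton",
}

def semantic_search(query):
    """Return (title, explanation) pairs for objects linked to the query concept."""
    results = []
    for title, concept in ANNOTATIONS.items():
        path = [concept]
        # Follow PartOf links upward until the query is reached or the chain ends.
        # The last condition guards against cycles in the data.
        while (path[-1] != query
               and path[-1] in PART_OF
               and PART_OF[path[-1]] not in path):
            path.append(PART_OF[path[-1]])
        if path[-1] == query:
            # The path of links doubles as an explanation of the match.
            results.append((title, " PartOf ".join(path)))
    return results

print(semantic_search("China"))
```

Returning the path along with the hit is what allows the type of semantic link to drive the presentation of the results, for example by grouping hits under "places that are part of China".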
13. Vocabulary alignment
• In large virtual collections
there are always multiple
vocabularies, each with its own
perspective
– In multiple languages
– You can’t just merge them
• But you can use vocabularies
jointly by defining a limited
set of links
• It is surprising what you can
do with just a few links
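To make the "few links" idea concrete, here is a minimal Python sketch in which a single alignment link lets a query phrased in one vocabulary retrieve objects annotated with another. The vocabulary prefixes, object identifiers and function names are hypothetical, and the link is only meant to resemble an exact-match style mapping.

```python
# Two hypothetical vocabularies, each used by a different collection.
# All identifiers below are made up for illustration.
ALIGNMENT = {
    ("vocabA:Canton", "vocabB:Kanton"),  # one "exact match"-style link
}

COLLECTION_A = {"object-a1": "vocabA:Canton"}
COLLECTION_B = {"object-b1": "vocabB:Kanton", "object-b2": "vocabB:Delft"}

def equivalents(concept):
    """Return the concept plus every concept directly aligned with it."""
    found = {concept}
    for left, right in ALIGNMENT:
        # Alignment links are treated as symmetric.
        if left == concept:
            found.add(right)
        elif right == concept:
            found.add(left)
    return found

def cross_collection_search(concept):
    """Search both collections without merging their vocabularies."""
    wanted = equivalents(concept)
    merged = {**COLLECTION_A, **COLLECTION_B}
    return sorted(obj for obj, c in merged.items() if c in wanted)

print(cross_collection_search("vocabA:Canton"))
```

Both vocabularies stay intact; only the alignment set grows, which is why a limited number of links can already connect large collections.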
18. Amsterdam Museum
• Formerly Amsterdam Historic
Museum
– “The rich collection of works of art,
objects and archaeological finds brings
to life the fortunes of Amsterdammers
of days gone by and today.”
• In March 2010, published its whole
collection online
– 70,000 objects
– CC license
19. Requirements for conversion and linking
• Transparent conversion
and linking of the data
– Use of provenance and
reproducibility
• Keep the original
complexities of the data
while making it
interoperable with other
(Europeana) data
• Retain the relation to
original data
20. Methods
ClioPatria
XMLRDF
1. XML ingestion (OAI)
2. Direct transformation to ‘crude’ RDF
3. Interactive RDF restructuring
4. Create a metadata mapping schema
5. Align vocabularies with external sources
6. Publish as Linked Data
Amalgame
Tools
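Steps 1 and 2 above (XML in, 'crude' RDF out) can be approximated with the Python standard library. This sketch is not the XMLRDF tool named on the slide; it only illustrates the idea of a direct, lossless mapping in which every XML element becomes one triple. The record, element names and base URI are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; element names and the base URI are assumptions.
XML = """<records>
  <record id="obj1">
    <title>Example object</title>
    <maker>Example maker</maker>
  </record>
</records>"""

BASE = "http://example.org/crude/"

def crude_triples(xml_text):
    """Map each child element of a record to one triple.

    Element names are kept unchanged as predicates, so nothing from the
    original data is interpreted or lost at this stage.
    """
    triples = []
    for record in ET.fromstring(xml_text).iter("record"):
        subject = BASE + record.get("id")
        for child in record:
            value = (child.text or "").strip()
            triples.append((subject, BASE + child.tag, value))
    return triples

for triple in crude_triples(XML):
    print(triple)
```

Keeping the first conversion this literal is one way to meet the transparency requirement: restructuring, schema mapping and vocabulary alignment (steps 3 to 5) can then be applied as separate, reproducible steps on top of the crude output.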
28. (Narrative) historical methodology
• Historical facts derived mainly from archival
findings and existing literature
• Historians put them together into a
narrative/synthesis.
– The Narrative: a historical synthesis which cannot be
scientifically proven (only made likely) based on facts
which can be proven or falsified. There is necessarily a
creative element in drawing up a narrative
Slides by BiographyNet team
29. Where do eScience and Biographical History meet?
• Quantitative analyses of a
larger group of people
(prosopography).
Surpassing the anecdotal.
• Finding relations/networks
between people which are
otherwise hard to detect
30. Linked Data for BiographyNet
[Diagram: one person, two biographical descriptions]
• Source text (Dutch): “Johan Rudolph Thorbecke werd in 1798 geboren op 14 januari in Zwolle en komt uit een half-Duit…”
(English: “Johan Rudolph Thorbecke was born on 14 January 1798 in Zwolle and comes from a half-…”, truncated on the slide)
• Thorbecke: Biographical Description
– Provenance Meta Data: NNBW
– Person Meta Data: “Thorbecke”
– Biography Parts: the source text
– Event: Birth, 1798
• Thorbecke: Biographical Description
– Provenance Meta Data: Enrichment, NLP Tool
– Person Meta Data
– Event: Birth, Zwolle, 1798-01-14
31. Prototype under development
The information provided by the first system can
be used to:
1. Identify alternative descriptions of events
(same time, location and/or participants)
2. Identify relations between events
(same locations & time, consequent events,
same participants, etc.)
3. Initial networks of people
http://www.biographynet.nl
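Use 1 in the list above (alternative descriptions of the same event) can be sketched as a simple grouping step. This is only an illustration of the idea, not the BiographyNet prototype: the record layout, field names, source labels and the third record are hypothetical, and real data would need fuzzier matching than exact equality.

```python
from collections import defaultdict

# Hypothetical event records from two sources; field names are assumptions.
# The third record is invented purely to show a non-matching event.
EVENTS = [
    {"id": "e1", "source": "source-1", "who": "Thorbecke",
     "where": "Zwolle", "when": "1798-01-14"},
    {"id": "e2", "source": "source-2", "who": "Thorbecke",
     "where": "Zwolle", "when": "1798-01-14"},
    {"id": "e3", "source": "source-1", "who": "Hypothetical Person",
     "where": "Zwolle", "when": "1800-05-01"},
]

def alternative_descriptions(events):
    """Group events that share participant, location and time."""
    groups = defaultdict(list)
    for event in events:
        key = (event["who"], event["where"], event["when"])
        groups[key].append(event["id"])
    # Only keys described more than once are alternative descriptions.
    return [ids for ids in groups.values() if len(ids) > 1]

print(alternative_descriptions(EVENTS))
```

Relaxing the key, for example to location and time only, would move from use 1 toward use 2: events that are related rather than identical.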
33. History of German-occupied Dutch society
(1940-1945)
Published between 1969 and 1991 in 14
volumes, 30 parts, 18,000 pages
1. Digitization,
2. Open Data,
3. Enriched access with Linked Open Data
Het Koninkrijk der Nederlanden in de Tweede
Wereldoorlog
(The Kingdom of the Netherlands During World War II)