Linked Data for Digital History presentation for VU symposium "Connecting Data for Research". This presentation argues for publishing interconnected research data as linked data, and for publishing tools and visualisations alongside those data. Examples include Dutch Ships and Sailors, DIVE and BiographyNet.
1. Linked Data for Digital History
Connecting Data for Research
Victor de Boer
With input from Christophe Guéret, Serge ter Braake,
Niels Ockeloen, Antske Fokkens, Dirk Roorda, Lora
Aroyo, Johan Oomen, Oana Inel, Jan Wielemaker, Jeroen Entjes
2. Victor de Boer
Web & Media Group, CS, Vrije Universiteit Amsterdam
Netherlands Institute for Sound and Vision
Cultural Heritage
Digital History
Linked Data for Development
3. Digital History
Sub-discipline of digital humanities
Part of the historian's effort moves from
the physical archives to digital ones
Cross-domain collaboration
Img:www.doaks.org, www.dkrz.de
4. Tools and visualisations
http://armstrongdigitalhistory.org/, http://www.vcdh.virginia.edu/courses/fall07/hius401-f/,
http://digitalhistory.unl.edu/essays/thomasessay.php, http://www.philipvickersfithian.com/2013/05/gender-in-stacks-on-managing-small.html
5. “That is great. I would love that…
…but my research questions are slightly different.”
Img:Monty Python
6. Aging
Data Tool
C. Guéret based on http://redmonk.com/jgovernor/2007/04/05/why-applciations-are-like-fish-and-data-is-like0wine/
7. Even better
Do not bake the data into the tool and treat
data as an end product.
Build tools on top of the data.
Make sure others can do so as well.
Fig: C. Guéret
8. Linked Data for Digital History
• Represent heterogeneous datasets with their own
data models in a common format: Resource
Description Framework (RDF)
– Link what can be linked
• re-use and re-usability
• Linked Data is the (technically) best way to
publish and share your (research) data
OBJECT EVENT
PLACE
TIME
PERSON
CONCEPT
PROVENANCE
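The idea of keeping each dataset's own data model while sharing one format, and linking what can be linked, can be sketched in a few lines. This is a toy illustration using plain Python tuples rather than an RDF library; every example.org URI, property name and the depicted media item are hypothetical, and only owl:sameAs is a real vocabulary term.

```python
# Two datasets keep their own vocabularies but share one triple format.
# All example.org URIs and properties below are hypothetical.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
BIO = "http://example.org/bio/"
MEDIA = "http://example.org/media/"

biographies = [
    (BIO + "person/thorbecke", BIO + "vocab/familyName", "Thorbecke"),
    (BIO + "person/thorbecke", BIO + "vocab/birthPlace", "Zwolle"),
]
media = [
    (MEDIA + "item/1", MEDIA + "vocab/depicts", MEDIA + "actor/7"),
]
# One link is enough to connect the two datasets.
links = [
    (BIO + "person/thorbecke", OWL_SAME_AS, MEDIA + "actor/7"),
]
graph = biographies + media + links


def term(value):
    """Serialize a URI as <...> and anything else as a plain literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Render triples in N-Triples style, one statement per line."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


def equivalents(uri, triples):
    """Return uri plus every URI reachable through owl:sameAs, in either direction."""
    seen = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen


def mentions(uri, triples):
    """All non-link triples whose subject or object is uri or an equivalent."""
    same = equivalents(uri, triples)
    return [t for t in triples
            if t[1] != OWL_SAME_AS and (t[0] in same or t[2] in same)]


result = mentions(BIO + "person/thorbecke", graph)
print(to_ntriples(result))
```

Asking about the biography URI also returns the statement from the media dataset, because the single owl:sameAs link makes the two identifiers interchangeable for the lookup.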
17. MEDIA HISTORIANS AND RESEARCHERS
Media researcher Lars Arve Røssland of the University of Bergen. (Photo: Andreas R. Graven)
EXPLORATIVE SEARCH
Digital Hermeneutics: The combination of digital
(Web) technology and theory of interpretation
20. DATA CONNECTED IN KNOWLEDGE GRAPH
DIVE:MEDIA OBJECT SEM:EVENT
SEM:PLACE
SEM:TIME
SEM:ACTOR
SKOS:CONCEPT
OA:ANNOTATION
LINKS TO EUROPEANA
LINKS TO DBPEDIA
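Explorative search over such a knowledge graph amounts to following links outward from one object. The sketch below is an assumption about how that browsing could work, not a description of the DIVE implementation: the sem:hasPlace and sem:hasActor properties come from the Simple Event Model namespace, while the example.org namespace, the "depicts" property and all resources are hypothetical.

```python
from collections import defaultdict, deque

SEM = "http://semanticweb.cs.vu.nl/2009/11/sem/"
EX = "http://example.org/dive/"  # hypothetical namespace for this sketch

triples = [
    (EX + "video1", EX + "depicts", EX + "event1"),   # hypothetical property
    (EX + "photo2", EX + "depicts", EX + "event1"),
    (EX + "event1", SEM + "hasPlace", EX + "place1"),
    (EX + "event1", SEM + "hasActor", EX + "actor1"),
    (EX + "photo3", EX + "depicts", EX + "event2"),
    (EX + "event2", SEM + "hasActor", EX + "actor1"),
]


def explore(start, triples, max_hops):
    """Breadth-first walk over the graph, ignoring link direction.

    Returns a dict mapping each reached resource to its hop distance.
    """
    neighbours = defaultdict(set)
    for s, _, o in triples:
        neighbours[s].add(o)
        neighbours[o].add(s)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the requested radius
        for nxt in neighbours[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance


nearby = explore(EX + "video1", triples, max_hops=2)
wider = explore(EX + "video1", triples, max_hops=4)
```

With two hops the user sees the event, its place, its actor and the other photo of the same event; widening to four hops surfaces a photo of a different event that shares the actor.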
22. BiographyNet
Starting Point: Biography
Portal of the Netherlands;
www.biografischportaal.nl
125,000 short biographical
descriptions with limited metadata
from 23 Dutch biographical
dictionaries (~76,000 individuals)
What kind of historical
questions can be answered
with these data with the help
of computational methods
Biographynet.nl
23. Johan Rudolph Thorbecke werd
in 1798 geboren op 14 januari
in Zwolle en komt uit een half-Duit
(English: Johan Rudolph Thorbecke was born in 1798, on 14 January, in Zwolle and comes from a half-Ger… [truncated on the slide])
Linked Data for BiographyNet
Diagram: the person Thorbecke is linked to two Biographical Descriptions, each with Provenance Meta Data and Person Meta Data.
NNBW description: name “Thorbecke”, Biography Parts, Birth Event (1798).
Enrichment by NLP Tool: Birth Event extracted from the biography text (Zwolle, 1798-01-14).
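The enrichment step, where a tool turns biography text into a structured birth event, can be illustrated with a deliberately small sketch. It is not the BiographyNet NLP pipeline: it assumes a single regular expression that only recognises the exact phrasing of the Thorbecke sentence, and the function and field names are hypothetical.

```python
import re
from datetime import date

# Dutch month names mapped to month numbers.
MONTHS = {
    "januari": 1, "februari": 2, "maart": 3, "april": 4,
    "mei": 5, "juni": 6, "juli": 7, "augustus": 8,
    "september": 9, "oktober": 10, "november": 11, "december": 12,
}

# Matches only "werd in <year> geboren op <day> <month> in <Place>".
BIRTH = re.compile(
    r"werd in (?P<year>\d{4}) geboren op (?P<day>\d{1,2}) "
    r"(?P<month>[a-z]+) in (?P<place>[A-Z]\w+)"
)


def extract_birth(text):
    """Return {'date': ISO date, 'place': name}, or None if nothing matches."""
    found = BIRTH.search(text)
    if found is None or found["month"] not in MONTHS:
        return None
    born = date(int(found["year"]), MONTHS[found["month"]], int(found["day"]))
    return {"date": born.isoformat(), "place": found["place"]}


# The sentence as it appears on the slide, including its truncated ending.
text = ("Johan Rudolph Thorbecke werd in 1798 geboren op 14 januari "
        "in Zwolle en komt uit een half-Duit")
birth = extract_birth(text)
print(birth)
```

A real pipeline would need to cope with many phrasings, spelling variants and ambiguous place names, which is one reason the provenance of each extracted value matters.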
24. Provenance in BiographyNet
Goal: ensure the credibility of the demonstrator, evaluate its
performance and improve the academic status of the tool
Information involved: Sources, but also NER input data, etc.
Processes involved: All steps in enrichment, aggregation…
People involved: Who was responsible for pipeline, tool,
* Daniel Garijo, Yolanda Gil; http://www.opmw.org/model/p-plan
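Recording which information, processes and people were involved makes it possible to walk back from any enriched value to its sources. The sketch below assumes a minimal record per processing step, loosely modelled on the PROV notions of "used", "generated" and "associated with"; it does not reproduce the P-PLAN model, and the step names, agent names and entity identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str        # the process step, e.g. an enrichment run
    agent: str       # who or what was responsible for the step
    used: tuple      # identifiers of the input entities
    generated: str   # identifier of the output entity


# Hypothetical two-step pipeline ending in an extracted birth event.
activities = [
    Activity("digitise", "hypothetical-archive", ("nnbw-print",), "nnbw-text"),
    Activity("ner", "hypothetical-nlp-tool-v1", ("nnbw-text",), "birth-event"),
]


def lineage(entity, activities):
    """List every activity that the entity depends on, most recent first."""
    by_output = {a.generated: a for a in activities}
    steps = []
    seen = set()
    todo = [entity]
    while todo:
        current = todo.pop()
        if current in seen:
            continue  # guard against cyclic records
        seen.add(current)
        activity = by_output.get(current)
        if activity is None:
            continue  # no producing step recorded: an original source
        steps.append(activity)
        todo.extend(activity.used)
    return steps


trail = lineage("birth-event", activities)
for step in trail:
    print(step.name, "by", step.agent, "using", ", ".join(step.used))
```

The trail answers the three questions on the slide for one value: which inputs it came from, which steps produced it, and who was responsible for each step.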
26. Framework: generic solutions with historians
1. Preprocess, Clean, Model, Link, Enrich data in collaboration with
domain experts
2. Access heterogeneous datasets in a convenient way to get an
intuition of the character and anomalies of the (linked) data;
3. Perform arbitrary queries to retrieve results relevant to their
research questions;
4. Verify the veracity of query results, by following provenance links
to original material
5. Retrieve and analyze the data with the tool of their preference.
6. Republish and share results
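Steps 3 and 5 of the framework, arbitrary queries followed by analysis in a tool of choice, can be sketched with a wildcard triple match and a CSV export. This is an illustrative assumption, not the query interface of any of the projects: a real deployment would more likely expose SPARQL, and the short identifiers and the second person are made up for the example.

```python
import csv
import io

# Tiny stand-in for a linked dataset; identifiers are hypothetical.
data = [
    ("thorbecke", "birthPlace", "Zwolle"),
    ("thorbecke", "birthYear", "1798"),
    ("person-x", "birthPlace", "Elsewhere"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def to_csv(rows, header=("subject", "predicate", "object")):
    """Serialize query results as CSV text for use in another tool."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


born_in_zwolle = match(data, p="birthPlace", o="Zwolle")
export = to_csv(born_in_zwolle)
print(export)
```

Exporting to a neutral format keeps the data independent of any one tool, so a historian can continue in a spreadsheet or statistics package.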
27. Historical tool criticism
… willingness from historians to invest the time to
learn about computer processes (at least the basic
principles)
Possibilities for education at universities to bridge
the gap between computer science and humanities
studies and make tool criticism an integral part of
students’ curricula
“Why do we still teach history students to decipher
17th-century handwriting, but not SQL?”
Humanities research foc
As (digital) humanities researchers seek more (international and cross-domain) collaboration, integrating humanities datasets becomes more important to those researchers. One subdomain where this is very much prevalent is in (social) historical research. Often historical researchers collect data from historical archives for their specific research questions. However, these datasets are often not presented in sharable formats to other researchers. If they are shared at all, the datasets are published in a multitude of formats. To further the digital history agenda, it has been recognized that representing and sharing data is key [4, 10].
How can we facilitate building applications on top of Linked Cultural Data?