1. Linked Open Europeana: Semantics for the Digital Humanities Prof. Dr. Stefan Gradmann Humboldt-Universität zu Berlin / School of Library and Information Science [email_address]
17. The Web of Things … Somewhat Mistaken Taken from Ronald Carpentier's Blog at http://carpentier.wordpress.com/2007/08/08/1-2-3/ What's wrong with this picture?
18. … and the Way we extend the Web in scope to make it a 'Web of Things'
23. Little visibility on the WWW (“Where's the Web in the SW?” Frank van Harmelen, 2006)
25. => Linked Open Data extends the Web of documents in syntax and scope without falling back into the mistakes of Artificial Intelligence. Future extensions may well grow into a truly 'semantic' web … (≠ Web 3.0)
26. The Europeana Data Model: Making Europeana Part of Linked Open Data Partially based on Martin Doerr, Stefan Gradmann, Steffen Hennicke, Antoine Isaac, Herbert Van de Sompel: The Europeana Data Model (IFLA 2010)
27. Pre-EDM This made V. Reding promise a „European Digital Library“ in 2005
98. In such a perspective, a strong profile characteristic for Europeana paradoxically results from what may be perceived as a specific European weakness: the scattered, heterogeneous and multilingual nature of our cultural resources requiring semantic foundations for conceptual interoperability!
99. A semantics-aware Europeana delivers what has been asked for in “Our Cultural Commonwealth”, the report on cyberinfrastructure for the social sciences and humanities commissioned by the ACLS
101. provide APIs enabling specialised reasoning for machine based digital heuristics
103. -> Meeting with Digital Humanists and Europeana Developers in Paris on the 4th of April as a starting event for a continuous process!
105. Stefan Gradmann: Knowledge = Information in Context: on the Importance of Semantic Contextualisation in Europeana. Europeana White Paper 1. http://www.scribd.com/doc/32110457/Europeana-White-Paper-1
The current data model of Europeana is the “Europeana Semantic Elements” (ESE). ESE addresses interoperability between data from the different domains represented in Europeana by reducing the data to a “flat”, Dublin Core-like representation. This is a “simple and robust” approach, but it has drawbacks: the original metadata and the provider's information perspective are no longer visible, and we cannot specialize to finer-grained models or connect to external resources such as those of the LOD community. The EDM addresses exactly these shortcomings. It tries to transcend the different information perspectives represented in Europeana, acting as a top-level ontology that makes objects from different domains interoperable while still preserving the original data. The EDM is destined to replace ESE after the 2011 release of Europeana. ESE will then be an “application profile” of EDM, which means that all ESE data in Europeana will remain compatible with the new system.
EDM re-uses three ontologies, all of which are defined as RDFS models.
SKOS: SKOS is an ontology for modeling KOSs (vocabularies) in the Semantic Data Layer of Europeana. It specifically enables cross-vocabulary matching between concepts.
Dublin Core: Dublin Core is used to describe the core features of cultural objects. ESE uses the “old” Dublin Core Element Set. EDM uses the “new” Dublin Core Metadata Terms, which are specializations of the 15 “old” Dublin Core Elements. The use of DC Terms ensures backward compatibility with ESE.
OAI ORE: The typical record about an object provided to Europeana will include several pieces of information, e.g. descriptive metadata, views (thumbnails, video files, audio files, text documents etc.), links to landing pages etc. OAI ORE allows us to group and organize these pieces: the abstract “provided object” (Object), the descriptive metadata (Proxy), and any “view” of the provided object (Digital Representation).
Mona Lisa as described and depicted by the French Ministry of Culture (Direction des musées de France)
This is the metadata record of the French Ministry of Culture modeled in EDM. Each bubble represents a resource. Inside the bubble you see the class of the resource (its type) in italics and, beneath it, the URI that identifies the resource. The arrows are the semantic links (the properties) between the resources. Where two properties are shown, the lower one is a sub-property of the upper one and has a more specific meaning. First we have the Aggregation node, which groups together all pieces of information delivered by the Ministry. It aggregates the node representing the physical object “Mona Lisa”, the digital representations of the Mona Lisa, and the proxy node, which is specific to a given provider and represents the description of the provided object as seen from the perspective of that provider. This is what every metadata record provided to Europeana will look like in its basic form. Why manage central nodes for provided objects? The ORE model says so: an ORE proxy has to be a proxy for some "view-independent" resource. Users are looking for (real-world) objects (the painting Mona Lisa) and not for the specific view of it held by the Louvre or by Joconde (which they normally do not know about anyway). So the approach is: find the object first (PhysicalThing) and then proceed to the specific views of it. This is also the LOD approach.
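The aggregation pattern described above can be sketched with plain subject-predicate-object tuples. This is only an illustration: the `ex:` identifiers and the title value are hypothetical, and the ORE property names (`ore:aggregates`, `ore:proxyFor`, `ore:proxyIn`) are taken from the OAI-ORE vocabulary rather than from the slide itself.

```python
# Hypothetical identifiers for the nodes of one provider record.
OBJECT = "ex:object/MonaLisa"                      # view-independent provided object
PROXY = "ex:proxy/culture.gouv.fr/MonaLisa"        # the provider's description
AGGREGATION = "ex:aggregation/culture.gouv.fr/MonaLisa"
THUMBNAIL = "ex:view/MonaLisa.jpg"                 # one digital representation

# A tiny graph as a set of (subject, predicate, object) triples.
triples = {
    (AGGREGATION, "rdf:type", "ore:Aggregation"),
    (AGGREGATION, "ore:aggregates", OBJECT),
    (AGGREGATION, "ore:aggregates", THUMBNAIL),
    (PROXY, "rdf:type", "ore:Proxy"),
    (PROXY, "ore:proxyFor", OBJECT),       # the proxy describes this object ...
    (PROXY, "ore:proxyIn", AGGREGATION),   # ... in the context of this provider
    (PROXY, "dc:title", "Mona Lisa"),      # illustrative descriptive metadata
}


def proxies_for(obj, graph):
    """Return all provider-specific proxies that describe the given object."""
    return sorted(s for s, p, o in graph if p == "ore:proxyFor" and o == obj)


def description(proxy, graph):
    """Collect the Dublin Core statements attached to one proxy."""
    return {p: o for s, p, o in graph if s == proxy and p.startswith("dc:")}


# "Find the object first, then proceed to the specific views of it."
for proxy in proxies_for(OBJECT, triples):
    print(proxy, description(proxy, triples))
```

Attaching the descriptive statements to the proxy rather than to the object is what keeps several providers' descriptions of the same painting apart.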
The EDM wants to preserve the provider's original information perspective on its data as much as possible. The ability to create sub-classes and sub-properties with RDFS is crucial here. For this purpose EDM provides a range of generic properties and classes as anchors to which providers can connect more specialized properties. This is called mapping. Example: the EDM property “ens:hasMet” relates an object to the various things (persons, places, etc.) which somehow participated in its history. Here the provider mapped its more specific property “formerOwner” to “hasMet”, thereby specifying the actual relation of François I to the Mona Lisa painting. This co-existence of the generic and the specific level makes it possible, for example, to search for the painting using a generic description-based index and to display the information for that painting using the finer-grained distinctions made by the provider. There might be relatively wide semantic gaps between an EDM property and a sub-property supplied by a provider (e.g. ens:hasMet and ex1:schema/formerOwner). Europeana expects communities to agree on application profiles in order to minimize such gaps and to implement functions building on and exploiting such contributions.
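A minimal sketch of how such a mapping could support generic search while keeping the specific property for display. It assumes a simple dictionary for the sub-property declarations; the `ex:` resource identifiers are hypothetical, and only `ens:hasMet` and `ex1:schema/formerOwner` come from the example above.

```python
# Provider mapping: specific property -> generic EDM property
# (what rdfs:subPropertyOf would express in RDFS).
SUB_PROPERTY_OF = {
    "ex1:schema/formerOwner": "ens:hasMet",
}

# The provider's data keeps its own, finer-grained property.
triples = [
    ("ex:MonaLisa", "ex1:schema/formerOwner", "ex:FrancoisI"),
]


def super_properties(prop):
    """Yield prop itself followed by all properties it specializes."""
    seen = set()
    while prop is not None and prop not in seen:
        seen.add(prop)
        yield prop
        prop = SUB_PROPERTY_OF.get(prop)


def match(subject, generic_prop, graph):
    """Find statements about subject whose property is generic_prop or a sub-property of it."""
    return [
        (p, o)
        for s, p, o in graph
        if s == subject and generic_prop in super_properties(p)
    ]


# Search on the generic level, display on the specific level.
print(match("ex:MonaLisa", "ens:hasMet", triples))
```

The query uses the generic property, while the result still carries the provider's specific property for display.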
Europeana wants to contextualize and enrich its objects by linking them to resources which contain additional knowledge, in order to increase the value of its data. This enables richer functions, such as query expansion (e.g., using alternatives for a creator's name), recommendation of objects using semantic relations between them (objects created by connected artists), etc. This is the same Proxy as on the previous slide, but now all the string values are converted to resources and typed. For example, the subject of the painting Mona Lisa, “femme”, is now a resource typed as a concept, with the English and French labels of the concept attached, taken from a KOS in the Semantic Data Layer. In the same KOS we could also find the broader term for this concept. Furthermore, we could semantically align the concept femme with the concept femme in Wikipedia (LOD cloud) and take all the information available there for this specific subject, including the many translations of the term itself.
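The step from string values to typed concept resources, and the query expansion it enables, might look roughly like this. The concept URI, its broader term and the in-memory `CONCEPTS` table are assumptions made for illustration; a real KOS would be expressed in SKOS.

```python
# A one-entry stand-in for a KOS in the Semantic Data Layer (hypothetical URIs).
CONCEPTS = {
    "ex:concept/femme": {
        "labels": {"fr": "femme", "en": "woman"},
        "broader": "ex:concept/personne",
    },
}


def find_concept(label):
    """Return the URI of the concept carrying this label in any language, or None."""
    wanted = label.casefold()
    for uri, concept in CONCEPTS.items():
        if any(wanted == text.casefold() for text in concept["labels"].values()):
            return uri
    return None


def enrich(description):
    """Replace a literal dc:subject by a concept URI when the KOS knows the label."""
    enriched = dict(description)
    uri = find_concept(description.get("dc:subject", ""))
    if uri is not None:
        enriched["dc:subject"] = uri
    return enriched


def expand_query(term):
    """Expand a search term to all labels of the matching concept."""
    uri = find_concept(term)
    return {term} if uri is None else set(CONCEPTS[uri]["labels"].values())


print(enrich({"dc:title": "Mona Lisa", "dc:subject": "femme"}))
print(sorted(expand_query("woman")))
```

With the subject stored as a resource, a query for the English label can also reach records that were described only in French.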
What we have looked at so far can be understood as object-centric modeling. The second general modeling approach is event-centric: it tries to tell a story about the object's history. For this purpose EDM provides a simple “event-centric core” of one class and three properties: ens:Event, a hub for event descriptions; ens:wasPresentAt, holding between any resource and an event it is involved in; ens:happenedAt, holding between an event and a place; and ens:occurredAt, holding between an event and the time span during which it occurred. This is just to give you an impression of what is possible, without going into details.
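As a rough sketch of the event-centric core, the one class and three properties can be used as follows. The event, place and time-span identifiers are placeholders invented for the example, not data from Europeana.

```python
# One hypothetical event with its participants, place and time span.
triples = [
    ("ex:event/acquisition", "rdf:type", "ens:Event"),
    ("ex:MonaLisa", "ens:wasPresentAt", "ex:event/acquisition"),
    ("ex:FrancoisI", "ens:wasPresentAt", "ex:event/acquisition"),
    ("ex:event/acquisition", "ens:happenedAt", "ex:place/P"),
    ("ex:event/acquisition", "ens:occurredAt", "ex:timespan/T"),
]


def events_of(resource, graph):
    """Return the events a resource was present at."""
    return [o for s, p, o in graph if s == resource and p == "ens:wasPresentAt"]


def describe_event(event, graph):
    """Return the place and time span recorded for an event."""
    place = [o for s, p, o in graph if s == event and p == "ens:happenedAt"]
    time = [o for s, p, o in graph if s == event and p == "ens:occurredAt"]
    return {"place": place, "time": time}


def co_participants(resource, graph):
    """Return other resources that share at least one event with the given one."""
    events = set(events_of(resource, graph))
    return sorted(
        s
        for s, p, o in graph
        if p == "ens:wasPresentAt" and o in events and s != resource
    )


for event in events_of("ex:MonaLisa", triples):
    print(event, describe_event(event, triples), co_participants("ex:MonaLisa", triples))
```

The event acts as the hub: the object and the person are not linked directly, but through their shared presence at it.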
This is a (more or less fictional) example of three records about a translation of Edgar Allan Poe's “The Narrative of Arthur Gordon Pym of Nantucket” into French: a record from BNF about an edition from 1868; a record from Gallica about an edition from 1868 (which offers a digital version of the book online: this is the WebResource); and a record from BNF about an edition from 2007. A few things I want to point out: two of the records are about the same thing and both point to the same object of interest, the 1868 edition. The user will look for this edition and not for the specific view that Gallica or BNF has of it. So this node is the point of entry from which a user will proceed to a specific view of the object. It is also apparent now why Proxies for the descriptive metadata are helpful: they let us keep the two views of the 1868 edition distinct. Finally, the link “isDerivativeOf” is an example of an inter-object link. So, for example, a user who has found the 2007 edition will also be pointed to the digital version of the 1868 edition in Gallica. With respect to FRBR, one could now start discussing what and where the work, expression, manifestation, and item are here. Although the development of the EDM has been inspired by FRBR, FRBR is not implemented yet. That will happen after 2011.
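The three records can be sketched as a small graph to show both points: two proxies kept distinct for one object, and the inter-object link leading from the 2007 edition to the digitized 1868 edition. All `ex:` identifiers and the `ex:hasView` property are hypothetical stand-ins; only `isDerivativeOf` and the ORE notion of a proxy are taken from the text.

```python
triples = [
    # Two providers describe the same 1868 edition through separate proxies.
    ("ex:proxy/bnf/1868", "ore:proxyFor", "ex:object/pym-1868"),
    ("ex:proxy/gallica/1868", "ore:proxyFor", "ex:object/pym-1868"),
    # A third record describes the 2007 edition.
    ("ex:proxy/bnf/2007", "ore:proxyFor", "ex:object/pym-2007"),
    # Inter-object link between the two editions.
    ("ex:object/pym-2007", "isDerivativeOf", "ex:object/pym-1868"),
    # Gallica's digital version (the WebResource); ex:hasView is an assumed property.
    ("ex:object/pym-1868", "ex:hasView", "ex:gallica/pym-1868-digital"),
]


def proxies_for(obj, graph):
    """Return the provider-specific descriptions of one object."""
    return sorted(s for s, p, o in graph if p == "ore:proxyFor" and o == obj)


def views_of(obj, graph):
    """Return the digital representations attached to an object."""
    return [o for s, p, o in graph if s == obj and p == "ex:hasView"]


def hinted_views(obj, graph):
    """Return views of the objects that obj is derived from."""
    sources = [o for s, p, o in graph if s == obj and p == "isDerivativeOf"]
    return [view for source in sources for view in views_of(source, graph)]


print(proxies_for("ex:object/pym-1868", triples))
print(hinted_views("ex:object/pym-2007", triples))
```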
EDM is still under development, and will continue to be refined until the end of 2010. It will be implemented during 2011, in the lead-up to the Danube release of Europeana. Before, during and after the implementation of EDM, data that is compliant only with ESE will continue to be accepted. EDM is compatible with ESE and no data will need to be resubmitted. Europeana will make a converter available, and any provider who wishes to resubmit data in order to increase its richness within Europeana will be able to do so, but will be under no obligation. How will EDM data be delivered to Europeana? Providers will have to create a mapping to EDM and deliver it alongside their data, which ideally consists of metadata records properly linked (via IDs) to a vocabulary. The data has to be in XML or RDF. From this, Europeana will create EDM data which includes enrichments and links to external resources (vocabularies in the Semantic Data Layer and/or the LOD cloud). Prototyping? At the end of the year we will start to produce the first EDM data for the production version of Europeana. This data will be taken from existing ESE data and from rich data delivered to Europeana by then.
First a few words about the envisioned information architecture of Europeana. This is how the information space of Europeana will be restructured: at the “bottom” we have the objects which are provided to Europeana. Above them we have the “Semantic Data Layer”, which is new. It contains various kinds of KOSs with knowledge about people, places, concepts, and so on. These concepts are linked to the objects below and thereby contextualize and enrich them.
The data provided to Europeana will come from many different kinds of domains, such as libraries, archives, or museums. They will all provide their specific collections and KOSs. That will naturally result in “isles of information”. In order to make the data interoperable, the concepts of the various KOSs in the Semantic Data Layer will be aligned, which means they will be connected via cross-vocabulary links. Technically, this enables applications to navigate through a semantic layer of concepts from different sources and to use it to access objects which were originally described by different but semantically related concepts.
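A minimal sketch of how cross-vocabulary links could let an application reach objects from several “isles” with one concept. The vocabularies, concepts and objects are invented for the example; `skos:exactMatch` is the SKOS mapping property, which SKOS defines as symmetric and transitive, so following it in both directions is justified here.

```python
from collections import deque

# Alignment links between concepts of three hypothetical vocabularies.
ALIGNMENTS = [
    ("lib:woman", "skos:exactMatch", "mus:femme"),
    ("mus:femme", "skos:exactMatch", "arch:women"),
]

# Each object is described with a concept from its own domain vocabulary.
SUBJECTS = {
    "lib:book1": "lib:woman",
    "mus:painting1": "mus:femme",
    "arch:file1": "arch:women",
    "lib:book2": "lib:ships",
}


def aligned_concepts(start):
    """Return start plus every concept reachable through alignment links."""
    neighbours = {}
    for left, _, right in ALIGNMENTS:
        neighbours.setdefault(left, set()).add(right)
        neighbours.setdefault(right, set()).add(left)
    seen = {start}
    queue = deque([start])
    while queue:
        concept = queue.popleft()
        for other in neighbours.get(concept, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


def objects_about(concept):
    """Return all objects described by the concept or by any aligned concept."""
    concepts = aligned_concepts(concept)
    return sorted(obj for obj, subject in SUBJECTS.items() if subject in concepts)


print(objects_about("lib:woman"))
```

Starting from the library concept, the search also returns the museum and archive objects, although each was described with its own vocabulary.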
Europeana intends to connect to the Linked Open Data community. In the Linked Open Data cloud we find many more knowledge sources, such as DBpedia, GeoNames, or the Library of Congress Subject Headings. Europeana wants to use them to further contextualize and enrich the objects in its information space. At the same time Europeana wants to make its own data available to other communities. The EDM is crucial for realizing this vision. [ LOD cloud July 2009 ]
An excursus on RTP Doc could start here if I had more than 20 minutes.