Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
OAC Technical Summary
1. Open Annotation: Technical Overview
h"p://www.openannota-on.org/
h"p://groups.google.com/group/
oac‐discuss
Robert Sanderson
Herbert Van de Sompel
Prototyping Team
Research Library
Los Alamos Na<onal Laboratory
OAC is funded by the Andrew W. Mellon Founda<on
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel
2. Outline
• Motivation
• OAC’s Perspective on Annotations
• Design Choices
• Alpha 3 Data Model
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 2
3. Motivation (1)
• Annotations are a core ingredient of scholarship:
• Transcribe and annotate medieval manuscripts;
• Annotate maps with maps;
• Annotate video recording of endangered languages with non-
verbal communication events;
• Annotate 3D museum artifacts;
• Your use cases …
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 3
4. Motivation (2)
• Existing solutions for scholarly annotation are repository-centric, tied
to silos:
• Annotation in terms of specific repository and/or collection, no
global scope;
• Only consumable in client/server combination that created the
annotation;
• Annotations not shareable beyond original server – can not create
cross system services based on (enriched & merged) annotations.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 4
5. Open Annotation Collaboration
• Focus on interoperability for annotations in order to allow sharing of
annotations across:
• Annotation clients;
• Content collections;
• Services that leverage annotations.
• Focus on annotation for scholarly purposes. But desire to make the
OAC framework more broadly usable.
• In order to gain adoption => tools, communities, integration of
scholarly communication with other areas of discourse.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 5
6. OAC’s Perspective on Annotations (1)
• The following characterize an annotation:
• There is an author (human, software agent) of an annotation;
• The annotation occurs at some point in time;
• There is a body of an annotation;
• There is a target of an annotation;
• The body annotates the target, i.e. the body is somehow
“about” the target.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 6
7. OAC’s Perspective on Annotations (2)
• Body and target of an annotation can be of any media type.
• A video can annotate a Web page; a Web page can annotate a
video.
• This is contrary to the prevailing view in which the body is
textual.
• Annotation, body, and target can have different authors.
• This is contrary to the prevailing view in which annotation and
body have same authorship.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 7
8. OAC’s Perspective on Annotations (3)
• Most (scholarly) annotations involve parts of resources (image
regions, slices of a video, dimensions of a dataset).
• The annotation framework should provide support for resource
segment identification and description.
• A variety of more complex annotations involve multiple targets (and
maybe multiple bodies?).
• The annotation framework should support this.
• So far, no compelling use cases for multiple bodies have been
identified.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 8
9. Design Choices (1)
• Interoperability specified in terms of the Architecture of the Web (URI,
link, resource, representation, …) , Semantic Technologies, Linked
Data.
• Aligned with desire to more tightly embed scholarly
communication in overall human discourse;
• Aligned with trend towards machine-actionable scholarly
communication system;
• Aligned with approach we followed in other efforts (OAI-ORE for
Aggregations; Memento for versioning).
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 9
10. Design Choices (2)
• Client autonomy.
• Not an annotation protocol (cf. Annotea) but an annotation model
combined with a publish/subscribe mechanism;
• No reliance on a server that helps with generation of
annotations. Only a server that supports publish/discovery of
the annotation created by the client is assumed.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 10
12. Publish/Subscribe
publish subscribe consume
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 12
13. Design Choices (3)
• An annotation is an information resource (aka document).
• Earlier versions of the model regarded annotations as
conceptual, non-information resources. This approach was related
to:
• Ideas to model annotations as events;
• Ideas to model annotations as OAI-ORE Aggregations.
• Community feedback to this approach was negative:
• Added complexity without added value;
• The use of OAI-ORE, and especially its Proxies, was deemed
artificial.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 13
14. Design Choices (4)
• Significant attention to problems related to the ephemeral nature of the
Web.
• Representations of URI-addressable resources change over time,
resulting in ambiguous or incorrect annotations, as well as
annotations that lack (representations of) body and/or target.
• The approach provides hooks to allow:
• Recognizing ambiguous/incorrect annotations, e.g. via timestamp
and fixity for body and target;
• Reconstructing the annotation, e.g. via timestamps, scholarly
archives, and Memento.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 14
15. Design Choices (5)
• A model that gracefully builds from simple to complex
• The simple core of the model supports elementary annotation use
cases.
• The model becomes gradually more complex to accommodate
increasingly complex use cases.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 15
16. Basic Model
• The basic model has three resources:
• Annotation (an RDF document)
• Default: RDF/XML but others via Content Negotiation
• Body (the comment or text of the annotation)
• Target (the resource the body is about)
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 16
17. Basic Model Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 17
18. Additional Relationships and Properties
• Any of the resources can have additional information attached,
such as creator, date of creation, title, etc.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 18
20. Annotation Types
• There can be further types of Annotation, such as a Reply.
• Example: Replies are Annotations on Annotations.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 20
21. Annotation Types Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 21
22. Annotation Types Example
We'll come back to
these …
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 22
23. Inline Information
• It is important to be able to have content contained within the
Annotation document for reasons of Client Autonomy:
• Clients may be unable to mint new URIs for every resource
• Clients may wish to transmit only a single document
• Third parties can generate new URIs if the client does not
• The W3C has a Content in RDF specification that describes how
to do this:
• http://www.w3.org/TR/Content-in-RDF10/
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 23
24. Inline Information: Body
• We introduce a resource identified by a non resolvable URI, such
as a UUID URN, as the Body.
• We then embed the data within the Annotation document using
the 'chars' property from the Content in RDF ontology.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 24
25. Inline Body Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 25
26. Multiple Targets
• There are many use cases for multiple targets for an Annotation:
• Comparison of two or more resources
• Making a statement that applies to all of the resources
• Showing the provenance of resources
• Making a statement about multiple parts of a resource
• The OAC Data Model allows for multiple targets by simply having
more than one hasTarget relationship.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 26
27. Multiple Targets Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 27
28. Segments of Resources
• Most annotations are about part of a resource
• Different segments for different media types:
• Text: paragraph, arbitrary span of words
• Image: rectangular or arbitrary shaped area
• Audio: start and end time points, track name/number
• Video: area and time points
• Other: slice of a data set, volume in a 3d object, …
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 28
29. Segments of Resources (2)
• Web Architecture Segmentation:
• A URI with a Fragment identifies part of the resource
• Media-specific fragment identifiers; eg XPointer for XML
• W3C Media Fragments URI specification for simple
segments of media: http://www.w3.org/TR/media-frags/
• We introduce a method of constraining resources:
• Introduce an approach for arbitrarily complex segments that
cannot be expressed using Fragments
• Can be applied to Body or Target resource
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 29
30. Segments of Resources: Fragment URIs
• URI Fragments are a syntax for creating subsidiary URIs that
identify part of the main resource
• The syntax is defined per media type
• X/HTML: The named anchor or identified element
• http://www.example.net/foo.html#namedSection
• XML: An XPointer to the element(s)
• http://www.example.net/foo.xml#xpointer(/a/b/c)
• PDF: Many options, most relevant two operations:
• http://www.example.net/foo.pdf#page=2&viewrect=20,80,50,60
• Plain Text: Either by character position or line position:
• http://www.example.net/foo.txt#char=0,10
• http://www.example.net/foo.txt#line=1,5
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 30
• :
31. Segments of Resources: Media Fragments
• Media Fragments allow anyone to create URIs that identify part
of an image, audio or video resource.
• The most common case is for rectangular areas of images:
• http://www.example.org/image.jpg#xywh=50,100,640,480
• Link to the full resource as well, for all Fragment URIs
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 31
32. Media Fragments Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 32
33. Media Fragments Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 33
34. Complex Constraints
• If a Fragment URI is not possible, we introduce a Constraint to
describe the segment of interest
• Media Fragments embed the segment description in the URI
• Constraints are entire resources, so can be more expressive
• Constraints may also describe 'contextual' information
• The ConstrainedTarget
(CT-1) identifies the
segment of interest
• The type of description
is dependent on the
media type and nature of
the target resource.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 34
35. Constraint Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 35
36. Inline Information: Constraints
• We can also use inline information in the same way as for the Body
resource to include the Constraint data.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 36
37. Inline Constraint Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 37
38. RDF Constraints
• Instead of having the information in an external document, it could be
within the RDF of the Annotation document.
• We attach information to the
Constraint node
• The Annotation Ontology
models its "Selectors" in this
way
http://code.google.com/p/annotation-
ontology/
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 38
39. RDF Constraint Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 39
40. RDF Constraint plus Media Fragment
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 40
41. Constrained Body
• The body may also be constrained in the same way as Targets
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 41
42. Constrained Body
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 42
43. Annotating a Non-Document
• The Target of an Annotation does not have to be a document:
can be a Non Information Resource
• Non Information Resource as Target
• Annotations about:
• Concepts
• Physical things
• Locations
• or any other non-document
• Example: A video about a real life painting
• Non Information Resource as Body
• We'll come back to this…
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 43
44. Non Information Resource Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 44
45. Annotations and Time
• Web Resources change over time, but keep the same URI
• This is important for linking, but makes Annotation hard. We need
to know when the annotation applies to the resource.
• This is true for Body and Target(s).
• The created time is not sufficient, as the Annotation, Body and
Target could be created at different times. The Body could be
about a previous state of the Target.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 45
46. Web-Centric Annotation: No Persistence
Google Sidewiki Annotation on http://news.bbc.co.uk/ as of 2010-06-14
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 46
47. Web-Centric Annotation: No Annotations
Archived page from:
http://www.dracos.co.uk/work/bbc-news-archive/2010/03/08/07.05.html
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 47
48. Web-Centric Annotation: Desired Cross-Linking
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 48
49. Annotations and Time
• There are three different types of Annotation with respect to
Time:
• Timeless Annotations
• These are always relevant, regardless of the current
state of the resource.
• Uniform Annotations
• There is a single timestamp at which all of the resources
should be considered.
• Varied Annotations
• Each of the resources (Body, Targets) should be
considered at a different time.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 49
50. Timeless Annotations
• The model for a Timeless Annotation is the base model
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 50
51. Uniform Annotations
• If the same time is applicable to all resources, we attach it to the
Annotation using the oac:when predicate.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 51
52. Uniform Annotation Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 52
53. Varied Annotations
• If different timestamps are required for each resource, we use
oac:when from an oac:TimeConstraint.
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 53
54. Varied Annotation Example
Open Annotation: Technical Overview
Robert Sanderson, Herbert Van de Sompel 54