1. Dublin Core and Digital Collections
March 18, 2013
Richard Sapon-White
2. Overview
Dublin Core (DC)
History
Elements, Principles and Qualifiers
Syntax
Uses
Digital Collections and DC
Discussion of DC and Controlled
Vocabularies
3. DC History
2nd International WWW Conference,
Chicago, 1994
1st Dublin Core Workshop, OCLC
headquarters, Dublin, Ohio, USA
Development by Dublin Core Metadata
Initiative – http://dublincore.org
4. DC History
Rapid increase in the number of WWW
resources
Problem: retrieval of relevant documents
Solution: a metadata schema that is
Simple enough for resource creators to use
Flexible enough to allow for more detailed
description
Not restricted to any one exchange syntax
5. Fifteen DC Elements
Identifier Language Source
Relation Rights Coverage
Format Type Publisher
Description Date Contributor
Subject Title Creator
6. Characteristics of the Dublin Core
• All elements are optional
• All elements are repeatable
• Elements may be displayed in any order
• Extensible
• International in scope
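One way to picture "optional" and "repeatable" is a record that maps each element name to a list of values. The sketch below is only an illustration: the `make_record` helper and the parallel-title strings are hypothetical, not part of any Dublin Core specification.

```python
from collections import defaultdict

# The fifteen element names of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def make_record(pairs):
    """Build a record from (element, value) pairs.

    No element is required and any element may occur more than once,
    so every element maps to a list of values.
    """
    record = defaultdict(list)
    for element, value in pairs:
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        record[element].append(value)
    return dict(record)


# A valid record with a single element.
minimal = make_record([("title", "Using Dublin Core")])

# A repeated element, e.g. a title and a parallel title (placeholder values).
parallel = make_record([
    ("title", "Title in English"),
    ("title", "Titre en français"),
])
```

Because values are kept in lists, a title-only record and a record with two titles are both handled without special cases.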
8. Dumb-Down
The fifteen core elements are usable with or
without qualifiers
Qualifiers make elements more specific:
Element Refinements narrow meanings, never
extend
Encoding Schemes give context to element values
If software encounters an unfamiliar qualifier,
look it up, or just ignore it!
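The "just ignore it" rule can be sketched in a few lines. This is a minimal illustration, assuming statements are held as (element, qualifier, value) triples; that triple layout and the `dumb_down` function name are assumptions made for the example, not a DCMI-defined API.

```python
# The fifteen core elements; anything else is not simple Dublin Core.
CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dumb_down(statements):
    """Reduce qualified statements to unqualified (element, value) pairs.

    The qualifier is dropped whether or not it is recognized, so the
    value survives as a statement about the broader element.
    """
    simple = []
    for element, qualifier, value in statements:
        if element not in CORE_ELEMENTS:
            continue  # not a core element, so nothing to fall back to
        simple.append((element, value))
    return simple


qualified = [
    ("title", None, "Using Dublin Core"),
    ("date", "created", "2002-10-04"),
    ("date", "x-unfamiliar", "2003"),  # hypothetical unknown qualifier
]
simple = dumb_down(qualified)
```

The sketch works only because refinements narrow an element's meaning: a creation date is still a date once the qualifier is gone.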
9. The One-to-One Principle
• Describe one manifestation of a resource with
one record
Example: a digital image of the Mona Lisa is not
described as if it were the same as the original
painting
• Separate descriptions of resources from
descriptions of the agents responsible for
those resources
Example: email addresses and affiliations of creators
are attributes of the creator, not the resource
10. Appropriate Values
“Best practice for a particular element
or qualifier may vary by context, but in
general an implementer cannot always
predict that the interpreter of the
metadata will always be a machine. This
may impose certain constraints on how
metadata is constructed, but the
requirement of usefulness for discovery
should be kept in mind.” -- from Using Dublin
Core by Diane Hillmann
11. Qualified Dublin Core
Includes:
Additional element: Audience
Element Refinements
Value Encoding Schemes
12. Element Refinements
Make element meanings narrower, more specific:
a Date Created versus Date Modified
an Is Replaced By versus Replaces Relation
Depending on syntax chosen, refinements may appear as
stand-alone tags instead of with elements:
<dct:created>2002-10-04</dct:created>, instead of:
<dc:date><dct:created>2002-10-04</dct:created></dc:date>
• Requires a schema to dumb down Date Created to Date
Dublin Core is simple enough to support both usages
13. Encoding Metadata Records
Mid-1990s: HTML tags embedded in Web
pages
Simple, easy to deploy, but inflexible, hard to
maintain
Bad tags like DC.Creator.eyecolor imply nonexistent
support for nesting and for entity distinctions
2000+: Better XML/RDF alternatives
RDF metadata supports complex structures without breaking
simple DC grammar
Open Archives Initiative promotes mass adoption of an XML
schema for simple, unqualified Dublin Core records, along
with a protocol to make them available
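The protocol in question is the OAI Protocol for Metadata Harvesting (OAI-PMH), in which a harvester asks a repository for records over HTTP. The sketch below only builds a request URL; the repository address is a made-up placeholder, and `list_records_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    "oai_dc" is the metadata prefix for simple, unqualified Dublin Core.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


# Placeholder base URL, not a real repository.
url = list_records_url("https://repository.example.org/oai")
```

A real harvester would also have to fetch the URL, parse the XML response, and follow resumption tokens for large result sets, none of which is shown here.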
14. HTML-Encoded DC
<link rel="schema.DC"
href="http://purl.org/dc/elements/1.1/"
title="Dublin Core Metadata Element Set, version 1.1">
<meta name="DC.Element_name"
content="element_value">
Example: <meta name="DC.Title"
content="Using Dublin Core">
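Tags of this shape are easy to generate from a record. The sketch below assumes a record is a dictionary mapping lowercase element names to lists of values; that layout and the `dc_meta_tags` helper are illustrative assumptions.

```python
from html import escape

DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"


def dc_meta_tags(record):
    """Render a record as HTML <link> and <meta> tags.

    Values are escaped so that quotes or angle brackets in a value
    cannot break out of the content attribute.
    """
    lines = [f'<link rel="schema.DC" href="{DC_NAMESPACE}">']
    for element, values in record.items():
        for value in values:  # repeated elements become repeated tags
            name = f"DC.{element.capitalize()}"
            content = escape(value, quote=True)
            lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)


print(dc_meta_tags({"title": ["Using Dublin Core"]}))
```

The output is meant to be pasted into the `<head>` of the page being described.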
16. Problems with using HTML for DC
Useful only when web crawlers and
search engine indexers can detect
<meta> tags
HTML not useful for complex
constructions (e.g., when repeated
elements need to be grouped)
HTML only useful with metadata
embedded in documents
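What an indexer sees can be sketched with Python's standard `html.parser` module. The `DCMetaParser` class and the sample page are hypothetical; real crawlers may treat `<meta>` tags quite differently, or ignore them.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect (element, value) pairs from <meta name="DC.*"> tags."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            # Drop the "DC." prefix and normalize the element name.
            self.statements.append((name[3:].lower(), content))


page = """<html><head>
<meta name="DC.Title" content="A Guide to Growing Roses">
<meta name="DC.Creator" content="Rose Bush">
<meta name="keywords" content="roses">
</head><body></body></html>"""

parser = DCMetaParser()
parser.feed(page)
print(parser.statements)
```

The result is a flat list of name and value pairs, which illustrates the grouping problem: nothing in the markup says which repeated values belong together.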
17. RDF/XML-Encoded DC
Allows multiple metadata schemes to
be read by humans and parsed by
machines
Allows multiple objects to be described
All namespaces must first be defined
18. RDF/XML-Encoded DC Example
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description
rdf:about="http://media.example.com/audio/guide.ra">
<dc:creator>Rose Bush</dc:creator>
<dc:title>A Guide to Growing Roses</dc:title>
<dc:description>Describes process for planting
and nurturing different kinds of rose
bushes.</dc:description>
<dc:date>2001-01-20</dc:date>
</rdf:Description>
</rdf:RDF>
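A record like this can be read with a general-purpose XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened, complete copy of the example; the `read_descriptions` helper is a hypothetical name, and a full RDF toolkit would handle many RDF/XML forms that this sketch does not.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

document = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://media.example.com/audio/guide.ra">
    <dc:creator>Rose Bush</dc:creator>
    <dc:title>A Guide to Growing Roses</dc:title>
    <dc:date>2001-01-20</dc:date>
  </rdf:Description>
</rdf:RDF>"""


def read_descriptions(xml_text):
    """Map each described resource URI to its Dublin Core elements."""
    root = ET.fromstring(xml_text)
    records = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        about = description.get(f"{{{RDF_NS}}}about")
        record = {}
        for child in description:
            prefix = f"{{{DC_NS}}}"
            if child.tag.startswith(prefix):
                element = child.tag[len(prefix):]
                value = (child.text or "").strip()
                record.setdefault(element, []).append(value)
        records[about] = record
    return records


records = read_descriptions(document)
```

Keying the result by the `rdf:about` URI is what lets one document describe several objects, which embedded HTML tags cannot do.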
19. Value Encoding Schemes
Indicate that the value is:
a term from a controlled vocabulary (e.g.,
Library of Congress Subject Headings)
a string formatted in a standard way (e.g.,
that "05/02" means May 2nd, not February
5th)
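The date ambiguity is easy to demonstrate. In the sketch below a year is appended to the "05/02" example purely so that the string parses as a full date; the `parse_with_scheme` helper is hypothetical and handles only the complete YYYY-MM-DD form of W3CDTF.

```python
from datetime import date, datetime

# Without a declared scheme, the same string has two readings.
ambiguous = "05/02/2013"  # year added for illustration
month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()
day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()


def parse_with_scheme(value, scheme):
    """Interpret a date value according to its declared encoding scheme."""
    if scheme == "W3CDTF":
        # Only the full-date form is handled in this sketch.
        return date.fromisoformat(value)
    raise ValueError(f"unsupported encoding scheme: {scheme}")


unambiguous = parse_with_scheme("2013-05-02", "W3CDTF")
```

Here `month_first` is May 2nd and `day_first` is February 5th, while the value tagged with a scheme has only one reading.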
20. Uses of Dublin Core
Subject gateways and portals
Description of resources generated from DC
Digitization projects where full cataloging
would be too time consuming or problematic
http://osulibrary.oregonstate.edu/digitalcollections/
Union catalogs, search engine indexes, external
databases
Converted from more detailed metadata in a local
database
Editor's Notes
Really just a brief overview of it, pointing out highlights of the system
These terms are identified in Chapter 8, MAL. Problems: some elements are ambiguous or overlap, such as creator and contributor. Is the creator the one who created the digital object or the original print version? (This is the problem of multiple versions.) This is easily dealt with in the library community under AACR2, but in the web environment there is no authority (other than DC), and the system has been simplified for a purpose.
The first three greatly simplify the metadata creation process (as opposed to AACR2/MARC, which specifies these aspects for many, many fields). One could create metadata with only a title, or with only a title and a creator. Repeatability also enables elements to be used for somewhat different purposes (as when a title and a parallel title are recorded in two title elements). Extensible: the core element set can be applied to various circumstances by extending the meaning of elements through the use of qualifiers. International in scope: translated into several other languages; can be used in other languages because of its simplicity.
Point out: encoding schemes can be identified by use of the attribute “scheme” (see DCMIType); element refinement can be done via “dot” representation (see 2003 for date created).
Last point: metadata is not always embedded in documents; sometimes it exists as separate records. For these purposes XML is the exchange syntax of choice.
Namespace: set of values defined by a metadata scheme
Example of the 2nd: OSU’s digital library collections (which we saw earlier in class). Example of the last: the Open Archives Initiative’s Protocol for Metadata Harvesting.