It's All About the Metadata

Shana L. McDanold
June 10, 2014

WHY DOES METADATA MATTER? – GEORGE
Search: Koran Search: Quran

WHY DOES METADATA MATTER? – ONESEARCH
Search: Koran Search: Quran

WHY DOES METADATA MATTER? – GEORGE
Search: 9/11 Search: 9-11

ONESEARCH
Search: 9/11 Search: 9-11

DIGITALGEORGETOWN
Author Index Subject Index

WHY DOES METADATA MATTER?
 “This town built a memorial to the wrong guy”
 Ottawa, Canada
 “It’s the metadata, stupid: and it’s not just for your
audience” (Joshua Lasky, posted 5/21/2014)
 “To succeed in the digital age is to be able to easily
aggregate all of your articles in the most meaningful
way for each of your visitors. Competitors such as Circa
actively use metadata to surface relevant content during
breaking news events.”

WHY DOES METADATA MATTER?
 What are we trying to identify? OR What are people
trying to find?
 Works
 Individuals
 Places
 Things/objects
 Concepts
 Discovery and discovery enhancement
 Relationships
 “On the fly” collections of resources
 Users start elsewhere

WHAT DO WE DO WHEN WE CURATE [CREATE] METADATA?
 Create and enhance descriptive metadata
 Apply controlled vocabularies
 Disambiguation of works, authors, etc.
 Unique identification of editions, works, etc.
 Collocation of editions, works, etc.
 Use agreed-upon standards for data elements to
ensure consistent application/use
 MARC
 DigitalGeorgetown (Dublin Core)
 RDF (Resource Description Framework)

HOW DO WE EXPOSE “OUR” METADATA?
 Controlled vocabulary and mapping
 Genres
 Subjects/Concepts
 Classification
 Identification:
 People
 Places/Geographic
 Works
 OWL (Web Ontology Language)
 SKOS (Simple Knowledge Organization System)
 Normalization
 Indexing
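
The normalization and controlled-vocabulary mapping steps above can be sketched in a few lines of Python. This is a minimal illustration, not how GEORGE or OneSearch index records: the function names and the PREFERRED_TERMS table are hypothetical, and a real system would draw variant forms from an authority file rather than a hand-written dictionary.

```python
import re
import unicodedata

# Hypothetical variant-to-preferred mapping (an assumption for illustration).
# Keys are already-normalized variant forms; values are the preferred key.
PREFERRED_TERMS = {
    "koran": "quran",
    "quran": "quran",
    "qur an": "quran",
}


def normalize(term: str) -> str:
    """Lowercase, strip diacritics, and collapse punctuation to single spaces."""
    decomposed = unicodedata.normalize("NFKD", term)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"[^a-z0-9]+", " ", without_marks.lower()).strip()


def index_key(term: str) -> str:
    """Normalize a term, then map known variants to one preferred key."""
    normalized = normalize(term)
    return PREFERRED_TERMS.get(normalized, normalized)
```

With this sketch, "9/11" and "9-11" share one index key because punctuation is collapsed, while "Koran" and "Quran" share one key only because the vocabulary mapping says so. Normalization alone cannot equate different spellings; that is the job of curated metadata.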

OWL: WEB ONTOLOGY LANGUAGE
 Utilizes RDF (Resource Description Framework)
 5.2 Individual identity
 Many languages have a so-called "unique names" assumption:
different names refer to different things in the world. On the web,
such an assumption is not possible. For example, the same
person could be referred to in many different ways (i.e. with
different URI references). For this reason OWL does not make
this assumption. Unless an explicit statement is being made that
two URI references refer to the same or to different individuals,
OWL tools should in principle assume either situation is possible.
 OWL provides three constructs for stating facts about the identity
of individuals:
 owl:sameAs is used to state that two URI references refer to the same
individual.
 owl:differentFrom is used to state that two URI references refer to different
individuals
 owl:AllDifferent provides an idiom for stating that a list of individuals are all
different.
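
The open-identity behavior quoted above can be illustrated with a small Python sketch. It is not an OWL reasoner: the IdentityStore class and the example.org URIs are hypothetical, and the code only shows the three-way outcome (same, different, or unknown) that follows from dropping the unique-names assumption.

```python
from itertools import combinations


class IdentityStore:
    """Tracks sameAs / differentFrom statements about URI references."""

    def __init__(self):
        self._parent = {}        # union-find forest for sameAs groups
        self._different = set()  # asserted differentFrom pairs

    def _find(self, uri):
        """Return the representative URI of the sameAs group containing uri."""
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            self._parent[uri] = self._parent[self._parent[uri]]
            uri = self._parent[uri]
        return uri

    def same_as(self, a, b):
        """Record that a and b refer to the same individual."""
        self._parent[self._find(a)] = self._find(b)

    def different_from(self, a, b):
        """Record that a and b refer to different individuals."""
        self._different.add((a, b))

    def all_different(self, uris):
        """Record that every pair in uris refers to different individuals."""
        for a, b in combinations(uris, 2):
            self.different_from(a, b)

    def compare(self, a, b):
        """Return 'same', 'different', or 'unknown' (no statement either way)."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a == root_b:
            return "same"
        for x, y in self._different:
            if {self._find(x), self._find(y)} == {root_a, root_b}:
                return "different"
        return "unknown"


# Hypothetical URIs, used only for illustration.
store = IdentityStore()
store.same_as("http://example.org/catalog/person/1",
              "http://example.org/authority/smith-john")
store.different_from("http://example.org/authority/smith-john",
                     "http://example.org/authority/smith-john-1580")
```

Two URIs with no statement connecting them compare as "unknown" rather than "different", which is the point of the quoted passage: identity has to be asserted, not inferred from the names.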

SKOS: SIMPLE KNOWLEDGE ORGANIZATION SYSTEM
 Utilizes RDF (Resource Description Framework)
 2.3 Semantic Relationships
 In KOSs semantic relations play a crucial role for defining
concepts. The meaning of a concept is defined not just by the
natural-language words in its labels but also by its links to other
concepts in the vocabulary. Mirroring the fundamental categories
of relations that are used in vocabularies such as thesauri
[ISO2788], SKOS supplies three standard properties:
 skos:broader and skos:narrower enable the representation of hierarchical
links, such as the relationship between one genre and its more
specific species, or, depending on interpretations, the relationship between
one whole and its parts;
 skos:related enables the representation of associative (non-hierarchical)
links, such as the relationship between one type of event and a category of
entities which typically participate in it. Another use for skos:related is
between two categories where neither is more general or more specific.
Note that skos:related enables the representation of associative
(non-hierarchical) links, which can also be used to represent part-whole links
that are not meant as hierarchical relationships.
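
The three properties quoted above can be modeled with a short Python sketch. The ConceptScheme class and the sample concepts are hypothetical. The sketch relies on two facts from the SKOS specification: skos:narrower is the inverse of skos:broader, and skos:related is symmetric. The ancestors method walks broader links upward, which corresponds to skos:broaderTransitive rather than to skos:broader itself.

```python
from collections import defaultdict


class ConceptScheme:
    """A toy in-memory vocabulary with broader/narrower/related links."""

    def __init__(self):
        self.broader = defaultdict(set)
        self.narrower = defaultdict(set)
        self.related = defaultdict(set)

    def add_broader(self, concept, broader_concept):
        """Record a hierarchical link and its inverse."""
        self.broader[concept].add(broader_concept)
        self.narrower[broader_concept].add(concept)

    def add_related(self, a, b):
        """Record an associative link in both directions."""
        self.related[a].add(b)
        self.related[b].add(a)

    def ancestors(self, concept):
        """Return every concept reachable by following broader links upward."""
        seen = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            for parent in self.broader.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical sample vocabulary, for illustration only.
scheme = ConceptScheme()
scheme.add_broader("Sonnets", "Poetry")
scheme.add_broader("Poetry", "Literature")
scheme.add_related("Poetry", "Poets")
```

A search for "Literature" could then be expanded to its narrower concepts, and a record tagged "Poetry" could suggest "Poets" as a related, non-hierarchical lead.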

CURATED METADATA IN THE WILD – LIBRARY OF CONGRESS
 Library of Congress data exposed as linked data
 “The Library of Congress Linked Data Service enables
both humans and machines to programmatically access
authority data at the Library of Congress. This service is
influenced by -- and implements -- the Linked Data
movement's approach of exposing and inter-connecting
data on the Web via dereferenceable URIs.”

CURATED METADATA IN THE WILD - WORLDCAT
 Bibliographic records

CURATED METADATA IN THE WILD - WORLDCAT
 Google searches!

CURATED METADATA IN THE WILD - OTHERS
 Wikipedia/DBpedia
 WorldCat: links to WorldCat Identities
 http://www.worldcat.org/identities/lccn-n79-007035/
 LCCN: links to LC National Authority File (NAF)
 http://id.loc.gov/authorities/names/n79007035.html
 VIAF record
 https://viaf.org/viaf/88919448/
 ISNI (International Standard Name Identifier) record
 http://isni-url.oclc.nl/isni/0000000121429031
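
The first two links above use the same LCCN in two spellings: hyphenated (n79-007035) and normalized (n79007035). The Python sketch below shows one way to move between them. It follows the commonly documented LCCN normalization rule (remove blanks, then zero-pad the part after the hyphen to six digits) and ignores other cases such as revision suffixes. The function names are hypothetical, and the URL patterns are inferred only from the two examples on this slide, so treat them as assumptions.

```python
def normalize_lccn(lccn: str) -> str:
    """Remove blanks and the hyphen, zero-padding the serial to six digits."""
    compact = lccn.replace(" ", "")
    if "-" in compact:
        prefix, serial = compact.split("-", 1)
        compact = prefix + serial.zfill(6)
    return compact


def lc_authority_url(lccn: str) -> str:
    """Build an LC name authority URL (pattern assumed from the slide)."""
    return f"http://id.loc.gov/authorities/names/{normalize_lccn(lccn)}.html"


def worldcat_identities_url(hyphenated_lccn: str) -> str:
    """Build a WorldCat Identities URL (pattern assumed from the slide)."""
    return f"http://www.worldcat.org/identities/lccn-{hyphenated_lccn}/"
```

The identifier, not the name string, is what lets the WorldCat, LC, VIAF, and ISNI records point at one another.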

 Wikipedia/DBpedia
 Disambiguation
 http://en.wikipedia.org/w/index.php?title=Category:All_disambiguation_pages
 Identity management:
 John Smith http://en.wikipedia.org/wiki/John_Smith
 St. Mary’s Church
http://en.wikipedia.org/wiki/St._Mary%27s_Church
 Georgetown http://en.wikipedia.org/wiki/Georgetown
 Hamlet http://en.wikipedia.org/wiki/Hamlet_(disambiguation)

 “MARC 21 records for CONSER serials either cataloged or processed by
LC or by CONSER (Cooperative Online Serials Program) participants. Also
includes records with ISSN assignments and U.S. Newspaper Program
cataloging. Records include all languages. Available in MARC 21 and
MARCXML formats.”
eCIP CONSER

BUILDING CURATED METADATA: OTHER OPTIONS
 Crowd sourcing
 Archives and Alumni
 Identification of individuals for identity control
 Penn Provenance project
 “We are trying to identify former owners and virtually reunite
dispersed collections, and we welcome any information you
have about the images posted here.”
 Incorporate data into records; establish identities
 https://www.flickr.com/photos/58558794@N07

CONCLUSION
 It all comes back to the basics of metadata work:
 DESCRIPTION
 COLLOCATION
 DISAMBIGUATION (uniquely identifiable)
 RELATIONSHIPS
