The Other Side of Linked Open Data: Managing Metadata Aggregation
1. The Other Side of Linked Data:
Managing Metadata Aggregation
ALCTS Metadata Interest Group
ALA Midwinter 2014
2. Where Are We Now?
• Major projects so far have focused on exposing selected portions of their data for “experimentation”
– Who’s using this data?
– Can LOD for libraries succeed on that basis?
• LOD is not just outputs; it needs actual use to inform practice
– A more complete view of the environment and workflow should help
3. Outline
• Limitations of the traditional database strategy
– Including records, normalization, de-duplication, etc.
• Components of a fuller view
– Workflow
– Inputs, outputs
– Data cache and services
– Need for automated orchestration
– The maintenance conundrum
4. Substituting a Cache for a Database
• Supports multiple streams of data
• Allows detailed provenance to be carried over time
• Separates services from data storage
• Allows more extensive automation (and orchestration of services)
• Focuses valuable human effort where it’s needed: analysis, design and implementation of improvement services
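A minimal sketch of what a statement cache with provenance might look like, assuming an in-memory store. The names `Statement` and `StatementCache`, the `ex:` and `dc:` identifiers, and the sample values are all hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Statement:
    """One assertion about a resource, plus where and when it came from."""
    subject: str     # identifier of the described resource
    predicate: str   # property name
    value: str       # property value
    source: str      # provenance: who supplied the statement
    asserted: date   # provenance: when it entered the cache


class StatementCache:
    """Append-only store; services read from it rather than owning the data."""

    def __init__(self):
        self._statements = []

    def add(self, statement):
        self._statements.append(statement)

    def about(self, subject):
        return [s for s in self._statements if s.subject == subject]

    def from_source(self, source):
        return [s for s in self._statements if s.source == source]


cache = StatementCache()
cache.add(Statement("ex:book1", "dc:title", "Moby Dick", "library-a", date(2014, 1, 10)))
cache.add(Statement("ex:book1", "dc:title", "Moby-Dick; or, The Whale", "library-b", date(2014, 1, 12)))

# Both streams coexist: neither title replaces the other.
titles = [s.value for s in cache.about("ex:book1")]
```

Because every statement carries its own source and date, two data streams can describe the same resource without either one replacing the other.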
5. Workflow
• Obtain data (possibly as “records”)
• Store data as statements in cache
• Evaluate data by source or collection
• Improve data using specific services, as determined by evaluation
• Publish improved data
• [Rinse, repeat]
9. Yellow=Data we share now
Orange=Data we propose to share
Green=Data categories we can share
10. Developing and Defining Services
• Small, single-purpose services are easier to develop and maintain
– Which services you need is determined by goals, evaluation results, etc.
– “Orchestration” of services applies them to specific kinds of data, in order
– Services can be described, and linked, to expose who, what, when and how to downstream users
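One way to picture small services plus orchestration is a table that maps each kind of data to an ordered list of functions. The sketch below assumes string values keyed by property name; the service functions, `PLAN`, and `orchestrate` are hypothetical names chosen for illustration.

```python
def trim_whitespace(value):
    """Remove leading and trailing whitespace."""
    return value.strip()


def collapse_spaces(value):
    """Replace runs of whitespace with a single space."""
    return " ".join(value.split())


def strip_trailing_period(value):
    """Drop one trailing full stop, if present."""
    return value[:-1] if value.endswith(".") else value


# Orchestration plan: which services run on which property, and in what order.
PLAN = {
    "title": [trim_whitespace, collapse_spaces, strip_trailing_period],
    "creator": [trim_whitespace, collapse_spaces],
}


def orchestrate(prop, value, plan=PLAN):
    """Apply the planned services in order; return the result and the services that changed it."""
    applied = []
    for service in plan.get(prop, []):
        new_value = service(value)
        if new_value != value:
            applied.append(service.__name__)
        value = new_value
    return value, applied


title, title_log = orchestrate("title", "  The   Whale. ")
creator, creator_log = orchestrate("creator", "Melville, Herman.")
```

The order in the plan matters: `strip_trailing_period` would do nothing to the sample title if it ran before `trim_whitespace`. The returned log of service names is one simple way to record the "who, what and how" for downstream users.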
11. Developing Automated Interaction
• Rule: Use humans for things requiring human understanding and decision making
– Use machines for everything else
– A manual process for something a machine can do as well or better is a failure
• Improvement services can be granular, invoked in prescribed order, and report results for later use
– Continuous improvement is necessary to respond to continuous change
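The rule above can be illustrated with a triage step that lets the machine settle unambiguous cases and queues the rest for a person. This is a sketch under assumptions: the date patterns are an invented, deliberately incomplete set, and `triage` is a hypothetical name.

```python
import re

# Patterns a machine can settle on its own (an illustrative, incomplete set):
# a bare year, a year prefixed with "c", or a year in square brackets.
YEAR = re.compile(r"[\[c]?\s*(\d{4})\]?")


def triage(values):
    """Split raw date strings into machine-resolved years and a human review queue."""
    resolved, review = {}, []
    for raw in values:
        match = YEAR.fullmatch(raw.strip())
        if match:
            resolved[raw] = match.group(1)
        else:
            review.append(raw)
    return resolved, review


resolved, review = triage(["1855", "c1862", "[1901]", "185-?", "n.d."])
```

Only the values in `review` need human judgment. The `resolved` mapping doubles as a report of what the service did, which can be kept for later use.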
13. Data Maintenance
• Improved data returns as statements to the data cache, with provenance attached
• The statement strategy prevents new data from overwriting “improved” data
• Each new statement adds to what is known about a described resource
• Statements can be cherry-picked and exposed to others as statements or records, in “flavors” or as “everything we have”
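A small sketch of the maintenance idea, assuming statements are `(subject, property, value, source)` tuples and that a service's output is marked by a `service:` prefix in its source. Both conventions, and the names `reharvest` and `flavor`, are hypothetical.

```python
cache = [
    # (subject, property, value, source)
    ("ex:1", "title", "leaves of grass", "harvest:2013-06"),
    ("ex:1", "title", "Leaves of Grass", "service:title-case"),
]


def reharvest(cache, incoming):
    """Append newly harvested statements; existing ones are never replaced."""
    return cache + [st for st in incoming if st not in cache]


def flavor(cache, keep_source):
    """Select a view of the cache by provenance."""
    return [st for st in cache if keep_source(st[3])]


# A later harvest delivers the unimproved title again.
cache = reharvest(cache, [("ex:1", "title", "leaves of grass", "harvest:2014-01")])

improved_only = flavor(cache, lambda src: src.startswith("service:"))
everything = flavor(cache, lambda src: True)
```

The re-harvested title is added alongside the improved one rather than on top of it, so the improvement survives. Which statements to expose becomes a selection over provenance, not a change to the stored data.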
If LOD exists in multiple versions, and nobody uses it, does it make noise?
Evaluation using statistical analysis tool, from: Analyzing Metadata for Effective Use and Re-Use. Naomi Dushay, Diane I. Hillmann. http://dcpapers.dublincore.org/pubs/article/view/744
Revised diagram from: Orchestrating metadata enhancement services: Introducing Lenny. Jon Phipps, Diane I. Hillmann, Gordon Paynter. http://dcpapers.dublincore.org/pubs/article/view/803
Note that XForms in this context means “Transforms”; this usage dates from well before an XForms standard gave the term a specific meaning.