OCLC Research project on archival approach to managing born-digital materials in research libraries, including how archivists' skills and knowledge can benefit broader digital library development.
Jd born digital to dlf 20111031
1. Born-Digital
An Archival Approach
Jackie Dooley
Program Officer
OCLC Research
Digital Library Federation
Baltimore, 31 Oct 2011
2. Heads up!!
This project is a work in
progress.
We’re eager for your feedback.
Born-Digital “Baby Steps,” Digital Library Federation, 31 October 2011 2
3. Assumptions
1. The average research library has
made limited progress with born
digital materials beyond IRs.
2. Archivists can and should be major
players in digital library
development.
3. Archival approaches to date have
focused on complex solutions.
4. Resources are very limited.
5. Most institutions need a “baby
steps” approach to get started.
4. Project objectives
• Explore where “special collections and
archives” intersect with “born digital” and
“digital library”
• Articulate the relevant skills and expertise
held by archivists
• Describe how these pertain to various types of
born-digital materials
• Outline “baby steps” to begin preserving
physical media
5. Taking Our Pulse: The OCLC Research
Survey of Special Collections and Archives
<http://www.oclc.org/research/publications/library/2010/2010-11.pdf>
6. Among our key findings …
“Your three most challenging issues”
1. Space
2. Born-digital materials
3. Digitization
Tough economy renders “business as usual”
impossible; 75% of library budgets
diminished
Survey population: 275 research libraries
in U.S. and Canada
7. Top education and training needs
1. Born-digital materials: 83%
2. Information technology: 65%
3. Intellectual property: 56%
4. Cataloging and metadata: 51%
8. Born-digital archival materials are …
Undercollected
Undercounted
Undermanaged
Unpreserved
Inaccessible
American Heritage Center
9. Key to percentages:
Red = % of respondents
Black = numerical data
10. Born-digital archival materials
• Digital materials currently held by: 79%
• Holdings reported by: 35%
• Percent held by top two libraries: 51%
• Percent held by top 13 libraries: 93%
• Gigabytes reported
• Entire population: 85,000 GB
• Mean: 1,465 GB
• Median: 90 GB
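The gap between the mean and the median is worth a quick check. The sketch below uses only the three figures on this slide; the variable names are illustrative, and the interpretation in the comments is a reading of the numbers, not a finding stated in the survey.

```python
# Figures as stated on the slide (gigabytes).
total_gb = 85_000
mean_gb = 1_465
median_gb = 90

# total / mean gives the number of libraries that must have reported a size.
implied_reporting_libraries = total_gb / mean_gb

# A mean far above the median signals a heavily skewed distribution:
# a few very large holdings pull the average up.
mean_to_median_ratio = mean_gb / median_gb

print(f"Implied number of reporting libraries: {implied_reporting_libraries:.0f}")
print(f"Mean is about {mean_to_median_ratio:.0f} times the median")
```

A mean this far above the median fits the slide's point that a handful of libraries hold most of the reported gigabytes.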
11. Born-digital archival materials
• Assignment of responsibility for born-digital
management made by: 55%
• Most see multiple fundamental impediments: 89%
We conclude that …
Management of born-digital materials in
research libraries remains in its infancy.
Collecting is generally reactive, sporadic,
limited.
12. Born-digital: Assignment of responsibility
15. Born-digital: Recommended actions
1. Define the characteristics of born-digital
materials that warrant their management as
“special collections.”
2. Define a reasonable set of basic steps for
initiating a program for responsibly managing
born-digital archival materials.
3. Develop use cases and cost models for
selection, management, and preservation of
born-digital archival materials.
16. Our born-digital special collections project
17. Why this project?
1. Majority of research libraries have yet to take
even baby steps in born-digital management.
2. Majority of archivists have yet to take action
because they think they don’t know enough,
don’t have specialized resources, are generally
intimidated, need guidance on how to conquer
fear and take initial steps.
3. Research library directors often don’t know
how/why archivists’ skills and expertise are
broadly relevant to library-wide management of
digital library content.
18. Our target audiences
• Research library directors and higher
administration
• Archivists and special collections librarians
• Other research library specialists
• Collection development
• Digital library
• Information technology
• Institutional repository
• Metadata
• Scholarly communications
• Web development
19. Born-digital archival materials are …
• Audio
• Databases
• Email
• Institutional records
• Manuscripts
• Moving images
• Photographs
• Publications
• Social media
• Static data sets
• Textual documents
• Video games
• Websites
• Works of art
… and more
American Heritage Center
20. There is no
one-size-fits-all solution
for all types of digital content.
21. Archival skills and expertise
• Appraisal
• Authenticity
• Collective metadata
• Collection development
• Context
• Deeds of gift
• Donor relations
• Hierarchical relationships
• Intellectual property
• Legal issues
• Preservation as permanence
• Privacy and confidentiality
• Provenance
… but we need new skills too
22. Know your digital donors
•Primary/core identities?
•Work products?
•Habits?
•Relationship between physical and digital content?
•Equipment?
•Storage locations?
•Restricted information?
•Naming conventions?
•“Deleted” files?
•Cloud content?
23. Know your digital donors
•Digital will
• Which content?
• Who should have access?
• Restrictions?
•New deed of gift provisions
• Copyright still applies
• Sustained donor access
• Right to duplicate files
• Right to make web-accessible
24. Advise your digital donors
The management of your digital materials can
be enhanced if you handle them in groups and
organize them in a logical manner. This
structure should be consistent with the
organization of any paper records you have,
or records in other media, so that all records
related to the same activity or subject, or of
the same type, can be identified as part of one
conceptual grouping.
--Author’s guidelines for digital preservation
Yale University
26. Address fears
Kirschenbaum & Nelson, RBS L-95
27. Manage sensitive personal information
Kirschenbaum & Nelson, RBS L-95
28. Collections management baby steps
•Inventory what you have
• Types of physical media?
• Estimated number of gigabytes?
• Maximum per physical object
•Initial appraisal
• What types of content?
• Level of significance/uniqueness?
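The inventory step can be sketched as a simple tally. This is a minimal illustration, not a tool named in the presentation: the `MediaItem` record, the `summarize` helper, and the sample entries are all hypothetical. Capacity is recorded as the maximum per physical object, as the slide suggests, since actual content size is usually unknown before transfer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MediaItem:
    collection: str      # collection the object belongs to
    media_type: str      # e.g. "3.5-inch floppy", "CD-R"
    capacity_gb: float   # maximum capacity of the object, not its actual contents


def summarize(items):
    """Count objects and total maximum capacity per media type."""
    summary = defaultdict(lambda: {"count": 0, "max_total_gb": 0.0})
    for item in items:
        entry = summary[item.media_type]
        entry["count"] += 1
        entry["max_total_gb"] += item.capacity_gb
    return dict(summary)


# Hypothetical inventory; collection names and counts are made up.
inventory = [
    MediaItem("Example Papers", "3.5-inch floppy", 0.00144),
    MediaItem("Example Papers", "3.5-inch floppy", 0.00144),
    MediaItem("Example Records", "CD-R", 0.7),
]

for media_type, entry in sorted(summarize(inventory).items()):
    print(f"{media_type}: {entry['count']} objects, up to {entry['max_total_gb']:.5f} GB")
```

Summing maximum capacities gives an upper bound on storage needs, which is usually enough for a first planning conversation.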
29. Technical baby steps
• Learn BASIC “do no harm” file management
• Capture metadata
• Identify file formats
• Virus scans
• Bit imaging
• Checksums
• Document all actions
• Who did what?
• Source of metadata
Stanley Fish Papers, Univ. of California, Irvine
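Several of these steps (identify file formats, checksums, document who did what) can be combined in one small manifest script. The following is a minimal sketch using only the Python standard library, not a tool the presentation recommends. The function names, the CSV columns, and the short signature table are assumptions for illustration; a real workflow would use a dedicated format-identification tool and would run on a copy or disk image, not on the original media.

```python
import csv
import getpass
import hashlib
from datetime import datetime, timezone
from pathlib import Path

# Leading "magic" bytes for a few common formats (illustrative, far from complete).
SIGNATURES = {
    b"%PDF": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"\xff\xd8\xff": "JPEG",
    b"PK\x03\x04": "ZIP container (also DOCX, XLSX, EPUB)",
}
FIELDS = ["path", "size_bytes", "format", "sha256", "operator", "recorded_utc"]


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def guess_format(path):
    """Identify a format from its first bytes rather than its file extension."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return "unknown"


def build_manifest(source_dir, manifest_path, operator=None):
    """Write one CSV row per file: what it is, its checksum, who recorded it, when."""
    operator = operator or getpass.getuser()
    source_dir = Path(source_dir)
    rows = []
    for path in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rows.append({
            "path": path.relative_to(source_dir).as_posix(),
            "size_bytes": path.stat().st_size,
            "format": guess_format(path),
            "sha256": sha256_of(path),
            "operator": operator,
            "recorded_utc": datetime.now(timezone.utc).isoformat(),
        })
    with open(manifest_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `build_manifest("transferred_files", "manifest.csv")` (both paths hypothetical) only reads the files, which keeps it within the “do no harm” rule. Re-running it later and comparing the checksums shows whether anything has changed.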
30. Technical baby steps
• Photograph physical
media
• Transfer from physical
media to secure storage
• Make copies; keep
archival copy
Smithsonian Archives
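The transfer-and-copy step can be sketched as a verified file copy. This is a simplified illustration under stated assumptions: it copies a single file rather than taking a bit-level disk image, the directory layout and the `transfer_with_verification` name are hypothetical, and marking the archival copy read-only is one possible safeguard, not a practice prescribed in the slides.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_verification(source, archival_dir, working_dir):
    """Copy a file to an archival and a working location, verifying each copy."""
    source = Path(source)
    original_digest = sha256_of(source)
    copies = []
    for target_dir in (Path(archival_dir), Path(working_dir)):
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps where possible
        if sha256_of(target) != original_digest:
            raise OSError(f"Checksum mismatch after copying to {target}")
        copies.append(target)
    # Keep the archival copy untouched: mark it read-only and work from the other copy.
    copies[0].chmod(0o444)
    return original_digest, copies
```

Recording the returned checksum alongside the archival copy makes it possible to confirm later that the copy is still intact.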
31. Ignore this (for now)!
32. Ignore this (for now)!
Kirschenbaum & Nelson, RBS L-95
33. Organizational baby steps
•Make friends with IT
•Promote your skills
•Keep pursuing educational
opportunities
… and learn by baby steps
34. Identify your low-hanging fruit
•Contemporary physical media & file
formats
•Creator-curated email: convert to PDF
•Photographs: expose on Flickr
•Text documents: convert to PDF
•Web pages: select a harvester and go
for it
… what else?
35. PDF those manuscripts
Rorty Papers, Univ. of California, Irvine
36. Don’t go near these … yet
• Data curation
• Databases
• Email systems
• Dynamic data
• Information management systems
• Obsolete physical media & file formats
• Social media
… what else??
37. Reactions?
Questions?
Advice?
Jackie Dooley
dooleyj@oclc.org
Editor's notes
The Born Digital project is just getting started, with Jackie Dooley in the lead. The focus is on enhancing effective management of born-digital materials as they intersect with special collections and archives in research libraries. First off, we attempted a definition of born digital: items created and managed in digital form. Working with a number of advisors, we intend to identify the skills and practices in the archival tradition that will be of value in the preservation of, and access to, materials that were born digital. We’ll also assemble the minimal steps that need to be taken, while ensuring that no harm is done.
Photo by Merrilee Proffitt. CC BY-NC
A 2009 survey of special collections and archives in the US and Canada shows that digitization of special collections and increasing user access to those collections are of critical importance to research libraries. The survey was a follow-up to the 1998 ARL survey that led directly to many high-profile initiatives to “expose hidden collections.” We updated ARL’s survey instrument and extended the subject population to encompass the 275 libraries in the following five overlapping membership organizations:
• Association of Research Libraries (124 universities and others)
• Canadian Academic and Research Libraries (30 universities and others)
• Independent Research Libraries Association (19 private research libraries)
• Oberlin Group (80 liberal arts colleges)
• RLG Partnership, U.S. and Canadian members (85 research institutions)
The rate of response was 61% (169 responses).
Image: Stethoscope. Public domain. http://commons.wikimedia.org/wiki/File:Stethoscope_1.jpg
The OCLC report reveals that, despite efforts to uncover hidden collections, much rare and unique material remains undiscoverable, and monetary resources are shrinking at the same time that user demand is growing. The balance sheet is both encouraging and sobering:
• The size of ARL collections has grown dramatically, up to 300% for some formats
• Use of all types of material has increased across the board
• Half of archival collections have no online presence
• While many backlogs have decreased, almost as many continue to grow
• User demand for digitized collections remains insatiable
• Management of born-digital archival materials is still in its infancy
• Staffing is generally stable, but has grown for digital services
• 75% of general library budgets have been reduced
• The current tough economy renders “business as usual” impossible
The top three “most challenging issues” in managing special collections were space (105 respondents), born-digital materials, and digitization.
Focused on issues that warrant shared action
Assessment
--Develop and promulgate metrics that enable standardized measurement of key aspects of special collections use and management.
Collections
--Identify barriers that limit collaborative collection development.
--Take collective action to share resources for cost-effective preservation of at-risk audiovisual materials.
User Services
--Develop and liberally implement exemplary policies to facilitate rather than inhibit access to and interlibrary loan of rare and unique materials.
Cataloging and Metadata
--Compile, disseminate, and adopt a slate of replicable, sustainable methodologies for cataloging and processing to facilitate exposure of materials that remain hidden and stop the growth of backlogs.
--Develop shared capacities to create metadata for published
--Convert legacy finding aids using affordable methodologies to
Digitization
--Develop models for large-scale digitization of special collections,
--Determine the scope of the existing corpus of digitized rare books,
Born-Digital Archival Materials
--Define the characteristics of born-digital materials that warrant their management as “special collections.”
--Define a reasonable set of basic steps for initiating an institutional program for responsibly managing born-digital archival materials.
--Develop use cases and cost models for selection, management, and preservation of born-digital archival materials.
Staffing
--Confirm high-priority areas in which education and training
To start out, we recognized that different people may have entirely different things in mind when they use the term born-digital. So we identified nine different types of born-digital material (you could, no doubt, add more). There’s a document, but those of you with short attention spans (or who’d like to see OCLC Research staff make fools of ourselves) may want to view the video on our YouTube channel.