SlideShare ist ein Scribd-Unternehmen logo
1 von 65
Downloaden Sie, um offline zu lesen
The Dark Side of
Digital Preservation:
Distributed Digital
Preservation
Sibyl Schaefer, Mary Molinaro, Katherine Skinner, Chip German, David Pcolar
Sam Meister
Archives Records 2016
Session 606
Saturday August 6, 2016
Atlanta, Georgia
Briefing for SAA, August 6, 2016
Chip German, Program Director, APTrust
and Senior Director, Content Stewardship, at
the University of Virginia Library
chip.german@aptrust.org
APTrust Web Site: aptrust.org
Sustaining Members:
Columbia University
Indiana University
Georgetown University
Johns Hopkins University
North Carolina State University
Pennsylvania State University
Syracuse University
University of Cincinnati
University of Connecticut
University of Maryland
University of Miami
University of Michigan
University of North Carolina
University of Notre Dame
University of Virginia
Virginia Tech
aptrust.org
• Quick description of APTrust:
• Provides simple, collaboratively designed and built preservation
environment (bit-level with fixity checking every 90 days) for the
digital scholarly and cultural record (NOW)
• Will provide collaboratively developed services for that content
(FUTURE), which may include:
• choice of preservation assurance levels/cost plans
• access
• format migration, emulation, software preservation
• analysis tools for research
• Currently uses Amazon Web Services (as back-end platform only)
for metadata
and events
Three content
copies in East Coast
data center (three
separate
availability zones)
S3 service
(preservation)
Three content copies
in West Coast data
center (three
separate availability
zones) + Fedora copy
Money
● Sustaining Members pay $20,000 per year in dues
○ Engage in governance, strategic direction-setting, development, etc.
○ Get allocation of 10 TB of content (with replication, total of 60 TB)
● University of Virginia provides $380,000 per year toward staffing costs
● Institution wishes to deposit more than 10 TB of content?
○ Pass-through of incremental costs in 5 TB blocks of content
■ $2,750 per year
○ Approximately $600,000 cash reserve
○ Serves as DPN ingest/replicating node (full cost-recovery from DPN)
Current metrics in our 2nd year of production
● As of August 2, we have 64,111 content objects from 8 institutions in
production environment
● 17.2 TB of content, 3 million (as of June) PREMIS events
● Additional institutions with content in current testing
Future
● Much more volume in deposits
● Develop collaborations for additional services
● Develop sophisticated reporting for depositors
● Services for smaller cultural heritage organizations, libraries, museums;
new categories of APTrust members
Challenges
● Drive costs downward to encourage preservation
■ Balance between preservation principles and cost (if not, just
storage)
■ Technical architecture enables interchangeable back-end platforms
● Balance between
■ importance of Trusted Digital Repository certification
■ cost and workload of TDR audit (potentially $10-$15K every three
years, although Sibyl is educating me about this)
APTrust as a DPN ingest and replication node
● APTrust members can designate their deposit to DPN (individually sign a
separate agreement with DPN)
● APTrust “unbags” a verified copy of deposit from APTrust bag and
“rebags” it to DPN standard into AWS Glacier service in the AWS VA data
center (where three copies are kept, each in a separate availability zone)
● APTrust then collaboratively replicates the data to two other DPN
replication nodes
Sibyl Schaefer // @archivelle
University of California, San Diego
Digital preservation dark storage network spanning multiple
institutions and geographic regions.
Partners:
University of California, San Diego (UCSD)
National Center for Atmospheric Research (NCAR)
University of Maryland Institute for Advanced
Computer Studies
● Funded by a series of NDIPP grants designed to
develop the technological infrastructure needed for
digital preservation.
● First ingest date: 2008
● Nodes are all non-profit research centers.
● Content-agnostic - depositors form SIPS for ingest,
Chronopolis runs minimal packaging processes on data
● Majority of data is UCSD digital library and research
data
CRL TRAC Certification
2012
Focused on active preservation – constant checking of
items.
Chronopolis
JBOD
Replication service
ACE
NCAR JBOD
Replication service
ACE
UMIACS
Ingest
Ingest service
Postgres
Isilon
Replication service
ACE
UCSD
Intake service
DuraCloud as a pathway to Chronopolis
● Provides an existing hosted user interface, reduced need for new development
● Simplifies the process of moving content in and out of the systems
● Shared institutional values
Chronopolis as a storage provider for DuraCloud
● Extends the DuraCloud network to include a non-commercial, highly distributed,
dark archive option
DuraCloud Vault
● Chronopolis + DuraCloud + DPN
S3 Glacier
DuraCloud
UI
Storage
Mediation
Manifest Store
Original
Content Store
REST API
Sync
Tool
Data Owner
Snapshot
Bridge
Bridge
Storage
Bridge
Application
Maryland
UCSD
NCAR
DPN
Node 2
DPN
Node 3
DPN
Network
Chronopolis and DPN
Same:
● dark storage
● 3 copies at geographically different nodes
Different:
● business model
● stewardship succession
● administration
● membership involvement
SAA August 2016
www.dpn.org
Mary Molinaro
Chief Operating Officer
mary@dpn.org
Dave Pcolar
Chief Technical Officer
dave@dpn.org
DPN is a 50+ member cooperative organization investing in
long-term, scalable digital preservation for the academy.
As a project of Internet2, DPN is able to leverage established Research and
Education Networks to facilitate large scale data movement into preservation
environments across the United States.
Why DPN?
● Despite institutions’ individual efforts only a fraction of
what should be preserved is being preserved.
● How will we ensure that the digital assets created and
collected in our institutions are available to scholars in the
future?
● A solution is needed to save at risk content in a way that
protects against technical failure, natural disaster, and
institutional failure.
What does a solution look like?
1. Geographically distributed
2. Content replicated
3. Audit and repair
4. Technological diversity
5. Meet best practices for repositories
6. Succession agreements
The business model
Currently:
● Members pay an annual fee of $20,000
● Members receive 5TB of preservation
storage per year
● Pay once / stored in perpetuity
● Members can purchase additional TB
Now working on solutions for smaller and
larger institutions - available next year
DPN is in production and
accepting deposits!
What is next?
❖ Address problems we know about
➢ getting started
➢ workflow at the local level
❖ Continuing to examine cost
❖ Increase technical diversity within DPN
➢ evaluating new nodes
❖ Open the door to new members
Meeting the DPN Mission
● To save the most valuable content from the academy in a
dark preservation system that is geographically dispersed
and is supported by replication and audit
● The content is protected by succession agreements made
at the time of deposit
● This content is being preserved for humanity
SAA August 2016
www.dpn.org @DPNdotOrg
Mary Molinaro
Chief Operating Officer
mary@dpn.org
Dave Pcolar
Chief Technical Officer
dave@dpn.org
Isabella Stewart Gardner Museum Orientation
Welcome to the Cooperative!
MetaArchive Cooperative:
Case Study in
Collaboration
Sam Meister
Educopia Institute
www.metaarchive.org
Why collaborate?
To preserve our
digital collections
Maintain integrity for
continued access
To preserve our
institutional missions
Outsourcing core mission is
dangerous
Collaboration Benefits
Co$t effective
Scalable
Transparent
Control
Responsive
Standards-based
MetaArchive History
●2004 - Founded as part of NDIIPP funding
and activities
●2006 - Educopia Institute founded to serve
as administrative home for MetaArchive
●2007 - MetaArchive becomes initial Affiliated
Community
●2014 - Celebrated 10 years of successful
preservation network
39
Community-building
Network management expertise
Administrative, legal, and financial
services
Distributed digital preservation
Institutions maintain control over their own
content
Preservation as a process, not a
push-button exercise
Simplicity in ingest, management
43
Hallmarks
MetaArchive is a cooperative, not a vendor:
All hardware and software assets are owned
by members
Membership fees and storage fees go to a
central pool of support for members’ co-op
activities
44
Cooperative Preservation
◼ Compatible with any repository system
▪ E.g., Dspace, Fedora, Archivalware, ETDb,
CONTENTdm, BePress, Digital Commons, etc
◼ Member institutions determine their own
curatorial practices
◼ MetaArchive is a community of support to
help them make informed decisions
45
Philosophy in Practice
MetaArchive Practices
Basic processes
“Producer” (OAIS) determines curation practices;
brings SIPs to MetaArchive
46
MetaArchive Practices
Basic processes
“Producer” (OAIS) determines curation practices;
brings SIPs to MetaArchive
Multiple copies of AIPs dispersed across
geographical, political, and environmental lines
47
AU AU
AU AU
AU
AU
Member
Cache
Member
Cache
Member
Cache
Member
Cache
Member
Cache
Member
Cache
MetaArchive Practices
Basic processes
“Producer” (OAIS) determines curation practices;
brings SIPs to MetaArchive
Multiple copies of AIPs dispersed across
geographical, political, and environmental lines
Checks and repairs automated across network
49
A
U
A
U A
U
A
U
A
U
A
U
A
U
A
U
A
U A
U
A
U
A
U
A
U
A
U
MetaArchive Practices
Basic processes
“Producer” (OAIS) determines curation practices;
brings SIPs to MetaArchive
Multiple copies of AIPs dispersed across
geographical, political, and environmental lines
Checks and repairs automated across network
Deaccession cycle versus data deletion
52
● Auburn University
● Boston College
● Cal Poly San Luis Obispo
● Carnegie Mellon University
● Consorci de Biblioteques
Universitaris de Catalunya
● Florida State University
● Isabella Stewart Gardner
Museum
● Greene County Public Library
● HBCU Library Alliance
● Indiana State University
● Oregon State University
● Penn State University
● Pontificia Universidade
Catolica do Rio de Janeiro
● Purdue University
● Rockefeller Archives Center
● University of Louisville
● University of North Texas
● University of South Carolina
● Virginia Tech University
Membership
54
Collaborative members: $2.5K/year
Preservation members: $3K/year
Sustaining members: $5.5K/year
Server cost: <$5K/term
Storage cost: $585/TB/year
55
Membership Levels
Membership Responsibilities
●Undertake a 3-year membership term
●Take responsibility for content preparation,
evaluation, staging, and ingest testing
●Monitor collections to ensure accurate
long-term preservation
●Host and maintain a MetaArchive cache
(server) or pay in a technology support fee
●Join and contribute to Committees
56
Governance Model
Independent
Nonprofit
Member owned, operated, governed
Governance Model
Charter
Outlines mission, goals, and organizing
principles of cooperative
Governance Procedures
Outlines roles and responsibilities for officers
and member participants
Member agreements
Outlines terms and responsibilities of
membership
Leadership
Steering Committee
Primary governing body responsible for overall
management, coordination, communication,
and reporting efforts
One representative from each Sustaining
Member
Strategic direction-setting, policy setting,
including membership costs and storage fees
Community Building
Common, neutral center
Distribution of work
Community of engagement
Building knowledge
Accomplishing preservation
Concentrated effort toward unified goals
Community Building
Communicate!
Monthly discussion-topic focused
community calls
Two Steering Committees per year
Community Listserv
MetaArchive Wiki
Sustainability
Cooperative framework
Strong organizational center
Limited dependence on any one member
Collaborative model for long-term preservation
Geographic diversity/distribution
Expertise diffusion
Maintain cost-effective, in-house options
63
www.metaarchive.org
Thank you!
www.metaarchive.org
@metaarchive
sam@educopia.org
@samalanmeister
Preservation and Community Action
Questions?
Questions?
Questions?
Questions?
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...
Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...
Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...Artefactual Systems - Archivematica
 
Bibliographic Infrastructure for Shared Print Management
Bibliographic Infrastructure for Shared Print ManagementBibliographic Infrastructure for Shared Print Management
Bibliographic Infrastructure for Shared Print ManagementConstance Malpas
 
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...Artefactual Systems - Archivematica
 
WorldCat Holdings Presentation
WorldCat Holdings PresentationWorldCat Holdings Presentation
WorldCat Holdings PresentationSue Bennett
 
Mwdl overview for_ri_20140823
Mwdl overview for_ri_20140823Mwdl overview for_ri_20140823
Mwdl overview for_ri_20140823Sandra McIntyre
 
JCDL 2015 Doctoral Consortium - A Framework for Aggregating Private and Publi...
JCDL 2015 Doctoral Consortium - A Framework for AggregatingPrivate and Publi...JCDL 2015 Doctoral Consortium - A Framework for AggregatingPrivate and Publi...
JCDL 2015 Doctoral Consortium - A Framework for Aggregating Private and Publi...Mat Kelly
 
IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...IWMW
 
Archivematica integration handshaking towards comprehensive digital preserva...
Archivematica integration  handshaking towards comprehensive digital preserva...Archivematica integration  handshaking towards comprehensive digital preserva...
Archivematica integration handshaking towards comprehensive digital preserva...Artefactual Systems - Archivematica
 

Was ist angesagt? (15)

Wittenberg Portico: Lessons From a Community Supported Archive
Wittenberg Portico: Lessons From a Community Supported ArchiveWittenberg Portico: Lessons From a Community Supported Archive
Wittenberg Portico: Lessons From a Community Supported Archive
 
Archivematica presentation to SJSU iSchool Colloquia series
Archivematica presentation to SJSU iSchool Colloquia seriesArchivematica presentation to SJSU iSchool Colloquia series
Archivematica presentation to SJSU iSchool Colloquia series
 
Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...
Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...
Avoiding the 927 Problem: Standards, Digital Preservation, and Communities of...
 
Bibliographic Infrastructure for Shared Print Management
Bibliographic Infrastructure for Shared Print ManagementBibliographic Infrastructure for Shared Print Management
Bibliographic Infrastructure for Shared Print Management
 
Personal Digital Archiving 2015 - NYU - Workshop
Personal Digital Archiving 2015 - NYU - WorkshopPersonal Digital Archiving 2015 - NYU - Workshop
Personal Digital Archiving 2015 - NYU - Workshop
 
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
Do Something Now: Why Perfect is the Enemy of Good (Enough) in Digital Preser...
 
147 eileen fenton2006fall
147 eileen fenton2006fall147 eileen fenton2006fall
147 eileen fenton2006fall
 
WorldCat Holdings Presentation
WorldCat Holdings PresentationWorldCat Holdings Presentation
WorldCat Holdings Presentation
 
Mwdl overview for_ri_20140823
Mwdl overview for_ri_20140823Mwdl overview for_ri_20140823
Mwdl overview for_ri_20140823
 
Workshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and ArchivematicaWorkshop slides - Introduction to AtoM and Archivematica
Workshop slides - Introduction to AtoM and Archivematica
 
JCDL 2015 Doctoral Consortium - A Framework for Aggregating Private and Publi...
JCDL 2015 Doctoral Consortium - A Framework for AggregatingPrivate and Publi...JCDL 2015 Doctoral Consortium - A Framework for AggregatingPrivate and Publi...
JCDL 2015 Doctoral Consortium - A Framework for Aggregating Private and Publi...
 
IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...
 
Archivematica integration handshaking towards comprehensive digital preserva...
Archivematica integration  handshaking towards comprehensive digital preserva...Archivematica integration  handshaking towards comprehensive digital preserva...
Archivematica integration handshaking towards comprehensive digital preserva...
 
2015 05-07-mac
2015 05-07-mac2015 05-07-mac
2015 05-07-mac
 
292 daniel dollar ssp yale_28_may2008
292 daniel dollar ssp yale_28_may2008292 daniel dollar ssp yale_28_may2008
292 daniel dollar ssp yale_28_may2008
 

Ähnlich wie Distributed Digital Preservation Networks: A Case Study in Collaboration

collection development policy for e-resources
collection development policy for e-resourcescollection development policy for e-resources
collection development policy for e-resourcesaditi bhandarkar
 
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShareResearch Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShareHistoric Environment Scotland
 
Seeing Connecticut Now and Then: Repository Services that Support Your Best M...
Seeing Connecticut Now and Then: Repository Services that Support Your Best M...Seeing Connecticut Now and Then: Repository Services that Support Your Best M...
Seeing Connecticut Now and Then: Repository Services that Support Your Best M...University of Connecticut Libraries
 
Using Archivemedia to preserve research data
Using Archivemedia to preserve research dataUsing Archivemedia to preserve research data
Using Archivemedia to preserve research dataARDC
 
Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...
Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...
Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...Educopia
 
Digital Preservation in Production (DPN and DuraCloud Vault)
Digital Preservation in Production (DPN and DuraCloud Vault)Digital Preservation in Production (DPN and DuraCloud Vault)
Digital Preservation in Production (DPN and DuraCloud Vault)DuraSpace
 
A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...Jenny Mitcham
 
L&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP Pilot
L&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP PilotL&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP Pilot
L&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP PilotCASRAI
 
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013SALCTG
 
Managing Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and SelectManaging Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and SelectRecollection Wisconsin
 
"Filling the Digital Preservation Gap" with Archivematica
"Filling the Digital Preservation Gap" with Archivematica"Filling the Digital Preservation Gap" with Archivematica
"Filling the Digital Preservation Gap" with ArchivematicaJenny Mitcham
 
Managing Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and SelectManaging Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and SelectRecollection Wisconsin
 
Institutional Repositories.pptx
Institutional Repositories.pptxInstitutional Repositories.pptx
Institutional Repositories.pptxSheejamolMathew
 
MetaArchive Cooperative: Case Study in Collaboration
MetaArchive Cooperative: Case Study in CollaborationMetaArchive Cooperative: Case Study in Collaboration
MetaArchive Cooperative: Case Study in CollaborationEducopia
 
Fedora Futures - CNI 2012
Fedora Futures - CNI 2012Fedora Futures - CNI 2012
Fedora Futures - CNI 2012Tom-Cramer
 

Ähnlich wie Distributed Digital Preservation Networks: A Case Study in Collaboration (20)

collection development policy for e-resources
collection development policy for e-resourcescollection development policy for e-resources
collection development policy for e-resources
 
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShareResearch Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
Research Data Services @ Edinburgh: MANTRA & Edinburgh DataShare
 
Seeing Connecticut Now and Then: Repository Services that Support Your Best M...
Seeing Connecticut Now and Then: Repository Services that Support Your Best M...Seeing Connecticut Now and Then: Repository Services that Support Your Best M...
Seeing Connecticut Now and Then: Repository Services that Support Your Best M...
 
Using Archivemedia to preserve research data
Using Archivemedia to preserve research dataUsing Archivemedia to preserve research data
Using Archivemedia to preserve research data
 
Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...
Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...
Disasters at any Scale: The MetaArchive Cooperative’s Community-based approac...
 
An Introduction to Digital Preservation
An Introduction to Digital PreservationAn Introduction to Digital Preservation
An Introduction to Digital Preservation
 
Digital Preservation in Production (DPN and DuraCloud Vault)
Digital Preservation in Production (DPN and DuraCloud Vault)Digital Preservation in Production (DPN and DuraCloud Vault)
Digital Preservation in Production (DPN and DuraCloud Vault)
 
RDM & ELNs @ Edinburgh
RDM & ELNs @ EdinburghRDM & ELNs @ Edinburgh
RDM & ELNs @ Edinburgh
 
A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...
 
Sgci esip-7-20-18
Sgci esip-7-20-18Sgci esip-7-20-18
Sgci esip-7-20-18
 
L&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP Pilot
L&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP PilotL&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP Pilot
L&P Humphrey Stewart-Shearer-Joint Session Project ARC & Federated DMP Pilot
 
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
 
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
Birgit Plietzsch “RDM within research computing support” SALCTG June 2013
 
Managing Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and SelectManaging Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and Select
 
"Filling the Digital Preservation Gap" with Archivematica
"Filling the Digital Preservation Gap" with Archivematica"Filling the Digital Preservation Gap" with Archivematica
"Filling the Digital Preservation Gap" with Archivematica
 
Managing Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and SelectManaging Digital Content Over Time: Identify and Select
Managing Digital Content Over Time: Identify and Select
 
Institutional Repositories.pptx
Institutional Repositories.pptxInstitutional Repositories.pptx
Institutional Repositories.pptx
 
RDM Programme at University of Edinburgh
RDM Programme at University of EdinburghRDM Programme at University of Edinburgh
RDM Programme at University of Edinburgh
 
MetaArchive Cooperative: Case Study in Collaboration
MetaArchive Cooperative: Case Study in CollaborationMetaArchive Cooperative: Case Study in Collaboration
MetaArchive Cooperative: Case Study in Collaboration
 
Fedora Futures - CNI 2012
Fedora Futures - CNI 2012Fedora Futures - CNI 2012
Fedora Futures - CNI 2012
 

Mehr von Educopia

Spanning Our Field Libraries: Mindfully Managing LAM Collaborations
Spanning Our Field Libraries: Mindfully Managing LAM CollaborationsSpanning Our Field Libraries: Mindfully Managing LAM Collaborations
Spanning Our Field Libraries: Mindfully Managing LAM CollaborationsEducopia
 
Communities of Action: Distributed Digital Preservation
Communities of Action: Distributed Digital PreservationCommunities of Action: Distributed Digital Preservation
Communities of Action: Distributed Digital PreservationEducopia
 
Cultivating sustaining-networks-katherine-skinner
Cultivating sustaining-networks-katherine-skinnerCultivating sustaining-networks-katherine-skinner
Cultivating sustaining-networks-katherine-skinnerEducopia
 
From act-to-impact-katherine
From act-to-impact-katherineFrom act-to-impact-katherine
From act-to-impact-katherineEducopia
 
Layers of Leadership for Archives, Libraries and Museums
Layers of Leadership for Archives, Libraries and MuseumsLayers of Leadership for Archives, Libraries and Museums
Layers of Leadership for Archives, Libraries and MuseumsEducopia
 
Building a National Agenda for Saving Online News
Building a National Agenda for Saving Online NewsBuilding a National Agenda for Saving Online News
Building a National Agenda for Saving Online NewsEducopia
 
A Preservation Compass
A Preservation CompassA Preservation Compass
A Preservation CompassEducopia
 

Mehr von Educopia (7)

Spanning Our Field Libraries: Mindfully Managing LAM Collaborations
Spanning Our Field Libraries: Mindfully Managing LAM CollaborationsSpanning Our Field Libraries: Mindfully Managing LAM Collaborations
Spanning Our Field Libraries: Mindfully Managing LAM Collaborations
 
Communities of Action: Distributed Digital Preservation
Communities of Action: Distributed Digital PreservationCommunities of Action: Distributed Digital Preservation
Communities of Action: Distributed Digital Preservation
 
Cultivating sustaining-networks-katherine-skinner
Cultivating sustaining-networks-katherine-skinnerCultivating sustaining-networks-katherine-skinner
Cultivating sustaining-networks-katherine-skinner
 
From act-to-impact-katherine
From act-to-impact-katherineFrom act-to-impact-katherine
From act-to-impact-katherine
 
Layers of Leadership for Archives, Libraries and Museums
Layers of Leadership for Archives, Libraries and MuseumsLayers of Leadership for Archives, Libraries and Museums
Layers of Leadership for Archives, Libraries and Museums
 
Building a National Agenda for Saving Online News
Building a National Agenda for Saving Online NewsBuilding a National Agenda for Saving Online News
Building a National Agenda for Saving Online News
 
A Preservation Compass
A Preservation CompassA Preservation Compass
A Preservation Compass
 

Kürzlich hochgeladen

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 

Kürzlich hochgeladen (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 

Distributed Digital Preservation Networks: A Case Study in Collaboration

  • 1. The Dark Side of Digital Preservation: Distributed Digital Preservation Sibyl Schaefer, Mary Molinaro, Katherine Skinner, Chip German, David Pcolar Sam Meister Archives Records 2016 Session 606 Saturday August 6, 2016 Atlanta, Georgia
  • 2. Briefing for SAA, August 6, 2016 Chip German, Program Director, APTrust and Senior Director, Content Stewardship, at the University of Virginia Library chip.german@aptrust.org APTrust Web Site: aptrust.org
  • 3. Sustaining Members: Columbia University Indiana University Georgetown University Johns Hopkins University North Carolina State University Pennsylvania State University Syracuse University University of Cincinnati University of Connecticut University of Maryland University of Miami University of Michigan University of North Carolina University of Notre Dame University of Virginia Virginia Tech aptrust.org
  • 4. • Quick description of APTrust: • Provides simple, collaboratively designed and built preservation environment (bit-level with fixity checking every 90 days) for the digital scholarly and cultural record (NOW) • Will provide collaboratively developed services for that content (FUTURE), which may include: • choice of preservation assurance levels/cost plans • access • format migration, emulation, software preservation • analysis tools for research • Currently uses Amazon Web Services (as back-end platform only)
  • 5. for metadata and events Three content copies in East Coast data center (three separate availability zones) S3 service (preservation)
  • 6. Three content copies in West Coast data center (three separate availability zones) + Fedora copy
  • 7. Money ● Sustaining Members pay $20,000 per year in dues ○ Engage in governance, strategic direction-setting, development, etc. ○ Get allocation of 10 TB of content (with replication, total of 60 TB) ● University of Virginia provides $380,000 per year toward staffing costs ● Institution wishes to deposit more than 10 TB of content? ○ Pass-through of incremental costs in 5 TB blocks of content ■ $2,750 per year ○ Approximately $600,000 cash reserve ○ Serves as DPN ingest/replicating node (full cost-recovery from DPN) Current metrics in our 2nd year of production ● As of August 2, we have 64,111 content objects from 8 institutions in production environment ● 17.2 TB of content, 3 million (as of June) PREMIS events ● Additional institutions with content in current testing
  • 8. Future ● Much more volume in deposits ● Develop collaborations for additional services ● Develop sophisticated reporting for depositors ● Services for smaller cultural heritage organizations, libraries, museums; new categories of APTrust members Challenges ● Drive costs downward to encourage preservation ■ Balance between preservation principles and cost (if not, just storage) ■ Technical architecture enables interchangeable back-end platforms ● Balance between ■ importance of Trusted Digital Repository certification ■ cost and workload of TDR audit (potentially $10-$15K every three years, although Sibyl is educating me about this)
  • 9. APTrust as a DPN ingest and replication node ● APTrust members can designate their deposit to DPN (individually sign a separate agreement with DPN) ● APTrust “unbags” a verified copy of deposit from APTrust bag and “rebags” it to DPN standard into AWS Glacier service in the AWS VA data center (where three copies are kept, each in a separate availability zone) ● APTrust then collaboratively replicates the data to two other DPN replication nodes
  • 10. Sibyl Schaefer // @archivelle University of California, San Diego
  • 11. Digital preservation dark storage network spanning multiple institutions and geographic regions. Partners: University of California, San Diego (UCSD) National Center for Atmospheric Research (NCAR) University of Maryland Institute for Advanced Computer Studies
  • 12. ● Funded by a series of NDIPP grants designed to develop the technological infrastructure needed for digital preservation. ● First ingest date: 2008 ● Nodes are all non-profit research centers. ● Content-agnostic - depositors form SIPS for ingest, Chronopolis runs minimal packaging processes on data ● Majority of data is UCSD digital library and research data
  • 14. Focused on active preservation – constant checking of items.
  • 15. Chronopolis JBOD Replication service ACE NCAR JBOD Replication service ACE UMIACS Ingest Ingest service Postgres Isilon Replication service ACE UCSD Intake service
  • 16. DuraCloud as a pathway to Chronopolis ● Provides an existing hosted user interface, reduced need for new development ● Simplifies the process of moving content in and out of the systems ● Shared institutional values Chronopolis as a storage provider for DuraCloud ● Extends the DuraCloud network to include a non-commercial, highly distributed, dark archive option DuraCloud Vault ● Chronopolis + DuraCloud + DPN
  • 17.
  • 18. S3 Glacier DuraCloud UI Storage Mediation Manifest Store Original Content Store REST API Sync Tool Data Owner Snapshot Bridge Bridge Storage Bridge Application Maryland UCSD NCAR DPN Node 2 DPN Node 3 DPN Network
  • 19. Chronopolis and DPN Same: ● dark storage ● 3 copies at geographically different nodes Different: ● business model ● stewardship succession ● administration ● membership involvement
  • 20. SAA August 2016 www.dpn.org Mary Molinaro Chief Operating Officer mary@dpn.org Dave Pcolar Chief Technical Officer dave@dpn.org
  • 21. DPN is a 50+ member cooperative organization investing in long-term, scalable digital preservation for the academy.
  • 22. As a project of Internet2, DPN is able to leverage established Research and Education Networks to facilitate large scale data movement into preservation environments across the United States.
  • 23. Why DPN? ● Despite institutions’ individual efforts only a fraction of what should be preserved is being preserved. ● How will we ensure that the digital assets created and collected in our institutions are available to scholars in the future? ● A solution is needed to save at risk content in a way that protects against technical failure, natural disaster, and institutional failure.
  • 24. What does a solution look like? 1. Geographically distributed 2. Content replicated 3. Audit and repair 4. Technological diversity 5. Meet best practices for repositories 6. Succession agreements
  • 25. The business model Currently: ● Members pay an annual fee of $20,000 ● Members receive 5TB of preservation storage per year ● Pay once / stored in perpetuity ● Members can purchase additional TB Now working on solutions for smaller and larger institutions - available next year
  • 26. DPN is in production and accepting deposits!
  • 27. What is next? ❖ Address problems we know about ➢ getting started ➢ workflow at the local level ❖ Continuing to examine cost ❖ Increase technical diversity within DPN ➢ evaluating new nodes ❖ Open the door to new members
  • 28. Meeting the DPN Mission ● To save the most valuable content from the academy in a dark preservation system that is geographically dispersed and is supported by replication and audit ● The content is protected by succession agreements made at the time of deposit ● This content is being preserved for humanity
  • 29.
  • 30. SAA August 2016 www.dpn.org @DPNdotOrg Mary Molinaro Chief Operating Officer mary@dpn.org Dave Pcolar Chief Technical Officer dave@dpn.org
  • 31. Isabella Stewart Gardner Museum Orientation Welcome to the Cooperative! MetaArchive Cooperative: Case Study in Collaboration Sam Meister Educopia Institute www.metaarchive.org
  • 33. To preserve our digital collections
  • 36. Outsourcing core mission is dangerous
  • 38.
  • 39. MetaArchive History ●2004 - Founded as part of NDIIPP funding and activities ●2006 - Educopia Institute founded to serve as administrative home for MetaArchive ●2007 - MetaArchive becomes initial Affiliated Community ●2014 - Celebrated 10 years of successful preservation network 39
  • 40.
  • 41.
  • 43. Distributed digital preservation Institutions maintain control over their own content Preservation as a process, not a push-button exercise Simplicity in ingest, management 43 Hallmarks
  • 44. MetaArchive is a cooperative, not a vendor: All hardware and software assets are owned by members Membership fees and storage fees go to a central pool of support for members’ co-op activities 44 Cooperative Preservation
  • 45. ◼ Compatible with any repository system ▪ E.g., Dspace, Fedora, Archivalware, ETDb, CONTENTdm, BePress, Digital Commons, etc ◼ Member institutions determine their own curatorial practices ◼ MetaArchive is a community of support to help them make informed decisions 45 Philosophy in Practice
  • 46. MetaArchive Practices Basic processes “Producer” (OAIS) determines curation practices; brings SIPs to MetaArchive 46
  • 47. MetaArchive Practices Basic processes “Producer” (OAIS) determines curation practices; brings SIPs to MetaArchive Multiple copies of AIPs dispersed across geographical, political, and environmental lines 47
  • 49. MetaArchive Practices Basic processes “Producer” (OAIS) determines curation practices; brings SIPs to MetaArchive Multiple copies of AIPs dispersed across geographical, political, and environmental lines Checks and repairs automated across network 49
  • 52. MetaArchive Practices Basic processes “Producer” (OAIS) determines curation practices; brings SIPs to MetaArchive Multiple copies of AIPs dispersed across geographical, political, and environmental lines Checks and repairs automated across network Deaccession cycle versus data deletion 52
  • 53. ● Auburn University ● Boston College ● Cal Poly San Luis Obispo ● Carnegie Mellon University ● Consorci de Biblioteques Universitaris de Catalunya ● Florida State University ● Isabella Stewart Gardner Museum ● Greene County Public Library ● HBCU Library Alliance ● Indiana State University ● Oregon State University ● Penn State University ● Pontificia Universidade Catolica do Rio de Janeiro ● Purdue University ● Rockefeller Archives Center ● University of Louisville ● University of North Texas ● University of South Carolina ● Virginia Tech University Membership
  • 54. 54
  • 55. Collaborative members: $2.5K/year Preservation members: $3K/year Sustaining members: $5.5K/year Server cost: <$5K/term Storage cost: $585/TB/year 55 Membership Levels
  • 56. Membership Responsibilities ●Undertake a 3-year membership term ●Take responsibility for content preparation, evaluation, staging, and ingest testing ●Monitor collections to ensure accurate long-term preservation ●Host and maintain a MetaArchive cache (server) or pay in a technology support fee ●Join and contribute to Committees 56
  • 58. Governance Model Charter Outlines mission, goals, and organizing principles of cooperative Governance Procedures Outlines roles and responsibilities for officers and member participants Member agreements Outlines terms and responsibilities of membership
  • 59. Leadership Steering Committee Primary governing body responsible for overall management, coordination, communication, and reporting efforts One representative from each Sustaining Member Strategic direction-setting, policy setting, including membership costs and storage fees
  • 60. Community Building Common, neutral center Distribution of work Community of engagement Building knowledge Accomplishing preservation Concentrated effort toward unified goals
  • 61. Community Building Communicate! Monthly discussion-topic focused community calls Two Steering Committees per year Community Listserv MetaArchive Wiki
  • 62. Sustainability Cooperative framework Strong organizational center Limited dependence on any one member Collaborative model for long-term preservation Geographic diversity/distribution Expertise diffusion Maintain cost-effective, in-house options