This presentation was given by Jon Wheeler and Karl Benedict of the University of New Mexico during the joint NISO-NFAIS Virtual Conference held on December 7, 2016.
Wheeler & Benedict -- Enabling the Preservation Relay
1. Enabling the Preservation Relay: Interoperable Repository Architectures
Jon Wheeler, jwheel01@unm.edu
Karl Benedict, kbene@unm.edu
University of New Mexico
College of University Libraries and Learning Sciences
2. Background
Data Management and Institutional Repository Service Drivers
◦ Data repositories as stakeholders in contrast with individuals as stakeholders
◦ Collections in contrast with individual items
◦ How do we add value (or maintain added value) to previously published data?
Scope, Cost, and Service Models
◦ Organizational priorities & sustainability
◦ Complementary roles for institutional and data repositories
6. Case Study: GSToRE
Geographic Storage, Transformation, and Retrieval Engine
◦ Part of the US National Spatial Data Infrastructure (NSDI)
◦ Underlying platform for the NM Resource Geographic Information System (NM RGIS), the NM NSF EPSCoR Program’s Data Portal, and the data hub for a second multi-state NSF EPSCoR environmental modeling project
Requirements & Capabilities
◦ Integration with external discovery and access services
◦ Support for diverse geospatial and non-geospatial data types
◦ Standards-based, tiered RESTful Services Oriented Architecture (SOA)
No Explicit Focus on Long-term Archival Storage
◦ Emphasis on value-added services
◦ Data management best practices
8. GSToRE Replication and Repository API
Initial experimentation with harvest via standard search, metadata, and data retrieval API (production services)
◦ Enables search and retrieval of data and metadata elements for all data in GSToRE
◦ Processed GSToRE items not identified by GSToRE
Experimental Repository API
◦ Enables “flagging” of individual data objects in GSToRE as harvest targets for one or more repositories – e.g., Data.gov, DataONE, LoboVault/DigitalRepository.unm.edu
◦ Enables explicit specification of harvest intention by GSToRE
◦ Intended as an interface for middleware processes
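The flagging idea can be illustrated with a small sketch. This is not the GSToRE Repository API: the `DataObject` class, the `flag_for_harvest` and `harvest_candidates` functions, and the object identifiers are all hypothetical names chosen for illustration. The sketch only shows how a source repository might record an explicit harvest intention per object, and how a middleware process might then ask which objects are intended for a given target repository.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DataObject:
    """Hypothetical stand-in for a data object record in a source repository."""
    object_id: str
    title: str
    # Names of repositories this object has been flagged for (assumed model).
    harvest_targets: Set[str] = field(default_factory=set)


def flag_for_harvest(obj: DataObject, repository: str) -> None:
    """Record that the source repository intends `repository` to harvest `obj`."""
    obj.harvest_targets.add(repository)


def harvest_candidates(objects: List[DataObject], repository: str) -> List[DataObject]:
    """Return only the objects explicitly flagged for `repository`."""
    return [obj for obj in objects if repository in obj.harvest_targets]


# Placeholder catalog; identifiers and titles are illustrative only.
catalog = [
    DataObject("obj-1", "Example dataset A"),
    DataObject("obj-2", "Example dataset B"),
]
flag_for_harvest(catalog[0], "DataONE")
flag_for_harvest(catalog[0], "Data.gov")
flag_for_harvest(catalog[1], "Data.gov")

# A middleware process harvesting on behalf of DataONE sees only flagged objects.
for obj in harvest_candidates(catalog, "DataONE"):
    print(obj.object_id)
```

Keeping the flags on the source side means the harvesting repository does not have to infer intent from a general search result, which is the contrast the slide draws with the production search and retrieval services.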
9. Overview of a Migration
Source Repositories as Stakeholders
◦ Communication, roles, and responsibilities
◦ Copyright, use, and access requirements
Collections and Items
◦ Which data to transfer?
◦ Metadata requirements
◦ Collection & Disciplinary context
◦ Repository context
◦ Item context
◦ Content and format requirements
Evolving the Conceptual Model into Practical Strategies
◦ Revisiting the GSToRE harvest prototype for Sevilleta LTER data
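As one possible way to turn the metadata requirements above into a practical check, the sketch below models a transfer item carrying collection, repository, and item context, and reports which contexts are missing before a migration. The `TransferItem` class, the `REQUIRED_CONTEXTS` tuple, the field names, and the sample values are assumptions made for illustration; they are not drawn from GSToRE or from any repository's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The three metadata contexts named in the migration overview.
REQUIRED_CONTEXTS = ("collection", "repository", "item")


@dataclass
class TransferItem:
    """Hypothetical record describing one item proposed for transfer."""
    identifier: str
    data_format: str
    # Maps a context name to its metadata fields (assumed structure).
    metadata: Dict[str, Dict[str, str]] = field(default_factory=dict)


def missing_contexts(item: TransferItem) -> List[str]:
    """List the required contexts that are absent or empty for this item."""
    return [name for name in REQUIRED_CONTEXTS if not item.metadata.get(name)]


# Placeholder item with collection and item context but no repository context.
example = TransferItem(
    identifier="example-item-1",
    data_format="text/csv",
    metadata={
        "collection": {"title": "Example collection"},
        "item": {"title": "Example item"},
    },
)

print(missing_contexts(example))
```

A check of this kind could run before any data is moved, so that gaps in context are raised with the source repository as part of the communication and responsibilities step rather than discovered after ingest.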