Infrastructure Requirements. [Diagram: characteristics of infrastructure arranged along two axes, Technical to Social and Local to Global: Embodiment of standards; Reach/scope; Links with conventions of practice; Learned as part of membership; Embeddedness; Built on an installed base; Visible on breakdown; Transparency.] Source: Florence Millerand, Cyberinfrastructure along social and technical dimensions.
Gaps in Infrastructure. [Diagram: the same infrastructure characteristics, with axis labels Technical, Social, and Global: Embodiment of standards; Reach/scope; Links with conventions of practice; Learned as part of membership; Embeddedness; Built on an installed base; Visible on breakdown; Transparency.]
Scope of OAIS Activities. SIP = Submission Information Package; AIP = Archival Information Package; DIP = Dissemination Information Package. [Diagram: OAIS functional entities (Ingest, Data Management, Archival Storage, Access, Administration, Preservation Planning) placed between Producer and Consumer, with Management above; labelled flows: SIP, descriptive info, AIP, DIP, queries, result sets, orders.]
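The package flow named on this slide can be illustrated with a minimal sketch. It is a toy model, not part of the OAIS standard or of any repository software: the class `ToyArchive`, its methods, the `aip-N` identifier scheme, and the sample metadata are all hypothetical names chosen for illustration. It assumes only what the slide labels show: a producer submits a SIP, the archive stores an AIP and its descriptive information, and a consumer queries, receives a result set, and orders a DIP.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package sent by a producer."""
    content: bytes
    descriptive_info: dict


@dataclass
class AIP:
    """Archival Information Package held in archival storage."""
    aip_id: str
    content: bytes
    descriptive_info: dict


@dataclass
class DIP:
    """Dissemination Information Package delivered to a consumer."""
    aip_id: str
    content: bytes


@dataclass
class ToyArchive:
    """Hypothetical, in-memory stand-in for an OAIS-style archive."""
    storage: dict = field(default_factory=dict)  # Archival Storage: id -> AIP
    catalog: dict = field(default_factory=dict)  # Data Management: id -> descriptive info

    def ingest(self, sip: SIP) -> str:
        # Ingest: turn the SIP into an AIP and record its descriptive info.
        aip_id = f"aip-{len(self.storage) + 1}"  # illustrative id scheme
        self.storage[aip_id] = AIP(aip_id, sip.content, dict(sip.descriptive_info))
        self.catalog[aip_id] = dict(sip.descriptive_info)
        return aip_id

    def query(self, **criteria) -> list:
        # Access: return the result set of ids whose descriptive info matches.
        return [aip_id for aip_id, info in self.catalog.items()
                if all(info.get(k) == v for k, v in criteria.items())]

    def order(self, aip_id: str) -> DIP:
        # Access: build a DIP from the stored AIP for the consumer.
        aip = self.storage[aip_id]
        return DIP(aip.aip_id, aip.content)


# Producer side: submit a SIP. Consumer side: query, then order a DIP.
archive = ToyArchive()
new_id = archive.ingest(SIP(b"temp,12.5\n", {"discipline": "ecology"}))
hits = archive.query(discipline="ecology")
dip = archive.order(hits[0])
```

The sketch leaves out Administration, Preservation Planning, and Management entirely, and a real archive would add fixity checks, format migration, and persistent identifiers.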
Repository-Centered View of Metadata Creation. [Diagram: an OAIS archive between Producer and Consumer; labelled flows: Submission Information Packages, Archival Information Packages, Dissemination Information Packages, queries, result sets, orders. Annotation: Primary Concern of Repository Developers.]
Editor's notes
During my talk today I will discuss some of the new challenges in digital preservation. In particular, I will focus on new demands for long-term archiving of scientific data coming from the scientific community. The late Jim Gray, a computer scientist and researcher at Microsoft, coined the phrase “the Fourth Paradigm” to describe the new revolution in the scientific method, in which data-intensive science allows researchers to mine existing data, find patterns, and discover emergent behaviors. This shift, along with other pressures, has created new demands to preserve scientific data for sharing and reuse. Government funding agencies increasingly require researchers to make their data available to other scientists, and a growing number of scientific journals insist that authors submit the raw data on which their findings are based as a condition of publication. In this talk, I will argue that many of our approaches to digital preservation, which were designed to preserve the scientific and scholarly journal literature or were aimed at cultural heritage resources, need to be brought together into a deeper infrastructure for digital preservation in order to scale up to the demands of preserving scientific data. I will identify some of the gaps between current practice and infrastructure development and then suggest some ways in which our community might collaborate with the producers and consumers of scientific data to develop such an infrastructure.
I am going to assume that the audience is somewhat familiar with the challenges of digital preservation generally, as these have not changed significantly for several decades. The primary difference between preserving traditional material and preserving digital material is that we cannot separate the information we are preserving from the large technological and interpretive environment on which digital information depends. This requires different strategies from those that have worked in the physical environment and, at least with the current state of practice, adds significantly to the costs. Because technology changes, digital information has to be kept in live systems that require a continuous stream of resources.
During the last two decades, the preservation community has made significant advances in digital preservation. The most significant achievement was the development and adoption of the Open Archival Information System (OAIS) Reference Model as an international standard. In addition to the Reference Model itself, there are models and tools for trusted repositories, a variety of metadata standards and standard data formats, and new software tools to manage digital archival repositories. The preservation community deserves a great deal of credit for drawing attention to these challenges, mobilizing resources for research and development, and deploying tools, typically in an open manner.