Digital preservation of geoscience information is important to ensure continued access to valuable scientific data over long periods of time. The Viking Mars mission from 1975 illustrates this need, as the original magnetic tape data degraded and became unreadable after 20 years. Proper digital preservation strategies like the Open Archival Information System model help ensure long-term access through migration to new formats, technology emulation, and institutional repositories. The presentation outlines the OAIS model and its key elements, and proposes implementing a pilot institutional repository in India to test an OAIS-compliant preservation approach for geoscience data.
3. Importance of Digital Information Preservation
1975 – Two Viking space probes sent to Mars by the USA.
Data generated by the unrepeatable mission cost $1 billion.
Data recorded on magnetic tapes was corrupted or unidentifiable after two decades, despite being kept in a climate-controlled environment.
Scientists could not access the data and were unable to decode the formats used.
4. Importance of Digital Information Preservation
The original format developers were no longer alive.
Finally, old printouts were tracked down and retyped.
NASA is therefore the biggest supporter of Digital Preservation Projects.
This illustrates the wide gap between information generation and its management.
5. Outline of Presentation
Digital information: forms and types
Geoscience information
Institutional Repositories (IR)
Digital Preservation (DP); strategies for DP
OAIS model & its implementation
Indian scenario
Research proposal & expected results
6. Digital Information
Information in digital form
Born Digital
Converted from Analog
Types of Digital Information
Electronic Publications
Organizational and Personal Records
Data
Learning Objects like articles, books
Software Tools
Unpublished Materials
Electronic Manuscripts
Entertainment Products
Images (Digitally designed or digitized)
Websites
7. Threats
Media decay and failure
Massive storage failures, outdated media
Access component obsolescence
Outdated formats, applications & systems
Human and software errors & external events
8. Information Deluge
Present & Future Projections
Yawning gap between
Our ability to create digital information
Our infrastructure and capacity to manage and preserve it over time
Cumulative effect foreseen as future “digital dark ages”
9. Need for Digital Preservation
Preserving natural and cultural heritage
Promoting academic research
Enabling public access to legacy collections
10. Geoscience Information
Encompasses a complex human-natural system
A storehouse of massive heterogeneous data sets and a wide variety of content and data types, reflecting the features of the various research fields
Every content holder aims at the needs of its particular community and works independently, with only loose collaboration and integration
Every content holder has its own digital archive system with an individual data structure, management policy and search interface; as a result, data cannot be transformed and integrated across archives transparently
Enabling and improving interoperability among heterogeneous collections is important
Source: Loudon, T.V. Geoscience after IT: Part A & Part B. Computers & Geosciences, 2000, 26(3A), A1-13.
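One way to picture the interoperability problem described above is a metadata crosswalk: each archive keeps its own field names, and a small mapping translates records into a shared schema so that one search can cover all of them. The sketch below is illustrative only. The archive names, field names and sample records are hypothetical and do not come from the presentation or from any real geoscience archive.

```python
# Hypothetical crosswalks: common field name -> field name used by each archive.
CROSSWALKS = {
    "archive_a": {"title": "map_name", "date": "survey_year", "region": "area"},
    "archive_b": {"title": "Title", "date": "Date", "region": "Location"},
}


def to_common(record: dict, source: str) -> dict:
    """Map a source-specific record onto the shared schema."""
    mapping = CROSSWALKS[source]
    common = {name: record.get(src_name) for name, src_name in mapping.items()}
    common["source"] = source  # remember which archive the record came from
    return common


def search(records: list, region: str) -> list:
    """A single search over records already expressed in the common schema."""
    return [r for r in records if r["region"] == region]


# Two records with different structures (made-up sample data).
a = {"map_name": "Geological map", "survey_year": 1998, "area": "Deccan"}
b = {"Title": "Rainfall series", "Date": 2004, "Location": "Deccan"}

merged = [to_common(a, "archive_a"), to_common(b, "archive_b")]
hits = search(merged, "Deccan")
```

The design choice here is that each content holder keeps its own structure and only the mapping is shared, which matches the loose collaboration the slide describes.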
11. Institutional Repositories (1)
Definition:
An institute-based repository is a set of services that an academic institution offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members.
Source: Clifford A. Lynch (February 2003), “Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age”, ARL Bimonthly Report 226: 1-7. http://www.arl.org/newsltr/226/ir.html
12. Institutional Repositories (2)
Main Objectives
to create global visibility for an institution's scholarly research;
to collect content at a single location;
to provide open access to institutional research output by self-archiving it;
to store and preserve other institutional digital assets, including unpublished or otherwise easily lost ("grey") literature (e.g., theses or technical reports).
13. Institutional Repositories (3)
IR Software
DSpace (dspace.mit.edu)
Eprints.org
Subject Specific IRs
arXiv (www.arXiv.org)
RePEc (Research Papers in Economics) (www.repec.org)
CogPrints (www.cogprints.org)
NASA Technical Report Server (ntrs.nasa.gov)
Networked Computer Science Technical Reference Library (www.ncstrl.org)
14. Institutional Repositories (4)
An IR is a model for a preservation system
It requires “most essentially an organizational commitment to the stewardship of … digital materials, including long-term preservation where appropriate, as well as organization and access or distribution”
Attributes of a “Trusted Digital Repository”
“…an organisation that has responsibility for the long-term maintenance of digital resources, as well as making them available [through time and across changing technologies] to communities agreed on by the depositor and the repository.”
Research Libraries Group
http://www.rlg.org/longterm/attributes01.pdf
15. Definition: Digital Preservation
The maintenance of digital materials over the long term with a view to ensuring their continued accessibility. It ensures that digital resources are stored correctly and maintained adequately in the online world, so that they remain consistently available for use over time.
“Long-term” includes timescales of decades or even centuries
16. Preservation Strategies
Technology preservation
Keep the hardware alive
Technology emulation
Create an environment to be able to run the existing software
Data migration
Convert data to new formats to run in new applications
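As an illustration of the data migration strategy, the sketch below converts fixed-width legacy records into CSV and records a checksum of both the source and the migrated text, so the migration can be audited later. The layout, field names and sample records are hypothetical, and the checksum log is an added assumption rather than something the slides prescribe.

```python
import csv
import hashlib
import io

# Hypothetical legacy layout: (field name, start column, end column).
LEGACY_LAYOUT = [("station", 0, 6), ("year", 6, 10), ("rainfall_mm", 10, 16)]


def migrate(legacy_text: str) -> str:
    """Convert fixed-width records to CSV with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for name, _, _ in LEGACY_LAYOUT])
    for line in legacy_text.splitlines():
        if line.strip():  # skip blank lines
            writer.writerow(
                [line[start:end].strip() for _, start, end in LEGACY_LAYOUT]
            )
    return out.getvalue()


def sha256(text: str) -> str:
    """Hex digest of the UTF-8 encoding of the text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Made-up sample data in the legacy layout.
legacy = "PUNE  1998 722.4\nDELHI 1998 611.0\n"
migrated = migrate(legacy)

# Keep a record of what was converted, and with which layout.
migration_log = {
    "source_sha256": sha256(legacy),
    "target_sha256": sha256(migrated),
    "layout": LEGACY_LAYOUT,
}
```

Keeping the layout description alongside the data addresses the failure in the Viking example, where the data survived but the knowledge needed to decode it did not.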
17. Open Archival Information System (OAIS)
SIP = Submission Information Package
AIP = Archival Information Package
DIP = Dissemination Information Package
Published by the Consultative Committee for Space Data Systems (CCSDS) in 2002; ISO 14721:2003 standard
An archive consists of an organization of people and systems with responsibility to preserve information and make it available to users.
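The three package types can be sketched as simple data structures with an ingest step (SIP to AIP) and a dissemination step (AIP to DIP). OAIS is a reference model and does not prescribe any implementation, so everything below is an assumption made for illustration: the class fields, the function names `ingest` and `disseminate`, the identifier scheme and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SIP:
    """Submission Information Package: what a producer hands to the archive."""
    producer: str
    filename: str
    content: bytes


@dataclass
class AIP:
    """Archival Information Package: what the archive stores and preserves."""
    identifier: str
    content: bytes
    metadata: dict = field(default_factory=dict)


@dataclass
class DIP:
    """Dissemination Information Package: what a consumer receives."""
    identifier: str
    content: bytes
    description: str


def ingest(sip: SIP) -> AIP:
    """Turn a submission into an archival package with added metadata."""
    return AIP(
        identifier=f"aip:{sip.filename}",  # hypothetical identifier scheme
        content=sip.content,
        metadata={
            "producer": sip.producer,
            "original_filename": sip.filename,
            "ingested_at": datetime.now(timezone.utc).isoformat(),
        },
    )


def disseminate(aip: AIP) -> DIP:
    """Derive a package for a consumer from the stored archival package."""
    meta = aip.metadata
    description = f"{meta['original_filename']} from {meta['producer']}"
    return DIP(aip.identifier, aip.content, description)


# Made-up sample submission.
sip = SIP("Survey team", "borehole_log.csv", b"depth_m,lithology\n12.5,basalt\n")
aip = ingest(sip)
dip = disseminate(aip)
```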
18. OAIS: Definitions
To define an Open Archival Information System
The term 'open' means that the document was developed in an open way, and does not imply that access to any OAIS should be unrestricted
An archive is defined as an "organization that intends to preserve information for access and use by a designated community." (p. 1-8)
While an OAIS itself need not be permanent, the information being maintained has been deemed to need "Long Term Preservation"
Long term = long enough for there to be a concern about the impact of changing technologies
19. OAIS: Purpose and Scope
Primary focus on digital information
Specific aims include:
A framework for the understanding and awareness of the archival concepts needed for long term preservation (access)
Terminology and concepts for describing and comparing:
Architectures and operations
Preservation strategies and techniques
Data models
Consensus on elements and processes for long term preservation
A foundation for other standards
20. OAIS: Applicability
Applicability:
Applicable to any archive, but mainly focused on organisations with responsibility for making information available for the long term
Of interest to those who create information
Conformance
An OAIS must support the information model, but the standard does not specify any particular method of implementation
Mandatory responsibilities (section 3.1)
21. Implementing OAIS (1)
Summing up the fundamentals:
OAIS is a reference model (conceptual framework), NOT a blueprint for system design
It informs the design of system architectures and the development of systems and components
It provides common definitions of terms, a common language and a means of making comparisons
But it does NOT ensure consistency or interoperability between implementations
25. Summing Up : OAIS
The OAIS model is a foundation stone for current and future digital preservation efforts
It is already widely used to inform the development of preservation tools and repositories
It could be used in the future as a basis for conformance
26. Indian Scenario (1)
Open Digital Repository
Indian Institute of Science (http://etd.ncsi.ernet.in)
National Chemical Laboratory (http://dspace.ncl.res.in/dspace/index.jsp)
Indian Statistical Institute (http://library.isibang.ac.in:8080/dspace/index/jsp)
Social Science Data
The Census of India
M.S.Swaminathan Research Foundation
Museums and Art Galleries
Ministry of Culture, GOI
The National Archives
27. Indian Scenario (2)
Institute | Resource
Central Water Commission | Command area maps
National Bureau of Soil Survey and Soil Maps | Soil maps and land use data
Survey of India (SOI) | Topographical maps, geodetic trigonometric and levelling data, gravity & geomagnetic data, GPS data, tidal data, repetitive geodetic & geophysical data
Geological Survey of India (GSI) | Geological maps on various scales, geological and seismic data
National Remote Sensing Agency (NRSA) | Satellite imageries, land use and wasteland maps on different scales
Indian Meteorological Department (IMD) | Meteorological and seismic data
Ministry of Ocean Development (MOD) | Oceanic data
28. Proposal for IRs in India
1. Providing adequate financial and technical resources for ensuring “digital preservation” in IRs
2. Entrusting the National Informatics Center (NIC) with framing guidelines and policy, or establishing a new agency, for handling digital preservation, collaboration and sharing, and for avoiding duplication
3. Trusted Digital Repository for accurate and reliable information
4. Legally sustainable digital preservation policy
5. Joining the Digital Preservation Consortium
6. Attention to collection management of digital material in libraries
7. Amendment of the Delivery of Books Act and the Press and Registration Act to cover digital material
8. Training of manpower for the management and preservation of electronic records
9. Research in the area of digital preservation
29. Research Objectives
Test a pilot IR in stand-alone mode
Implement an OAIS-compliant layer on the IR, drawing upon best practices
Develop a preservation strategy and a custom-made model addressing issues such as planning and policy for preservation, the role of different players in the process, IPR and copyright, etc.
30. Research Methodology
[Figure: flow diagram. Labels: Analog Materials; Digitization Process; Digital Materials (Born, Converted); Material Selection Process; Institutional Repository; Digital Preservation (Long Term, Short Term)]
31. Expected Results
This research would identify all the components necessary for the implementation of the OAIS model for a geoscience domain-specific institutional repository
33. Annexure 1
Preservation Description Information
Provenance
Context
Reference
Fixity
Content Data Object
Physical Object
Digital Object
Representation Information
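The four parts of the Preservation Description Information listed above can be sketched as a small record attached to a digital object, with the fixity entry used to detect undocumented changes. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, SHA-256 is an assumed choice of fixity mechanism (the annexure does not name one), and the sample data is made up.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class PreservationDescriptionInformation:
    provenance: str  # origin and handling history of the object
    context: str     # how the object relates to other information
    reference: str   # identifier by which the object can be found
    fixity: str      # SHA-256 digest recorded at ingest (assumed mechanism)


def describe(data: bytes, provenance: str, context: str, reference: str):
    """Build the PDI for a digital object, computing its fixity value."""
    digest = hashlib.sha256(data).hexdigest()
    return PreservationDescriptionInformation(
        provenance, context, reference, digest
    )


def is_intact(data: bytes, pdi: PreservationDescriptionInformation) -> bool:
    """Check that the data still matches the digest recorded in the PDI."""
    return hashlib.sha256(data).hexdigest() == pdi.fixity


# Made-up sample object and description.
original = b"lat,lon,gravity_mgal\n18.52,73.86,978.1\n"
pdi = describe(
    original,
    provenance="Field survey, digitized from a paper log",
    context="Part of a regional gravity dataset",
    reference="example:gravity/0001",
)

# Simulate silent corruption of one value on the storage medium.
corrupted = original.replace(b"978.1", b"978.7")
```

Calling `is_intact(original, pdi)` and `is_intact(corrupted, pdi)` shows how a periodic fixity check separates an unchanged object from one that has decayed or been altered.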
34. Annexure 2
OAIS Mandatory Responsibilities:
Negotiating and accepting information
Obtaining sufficient control of the information to ensure long-term preservation
Determining the "designated community"
Ensuring that information is "independently understandable"
Following documented policies and procedures
Making the preserved information available