The document discusses Mediapedia, a web-based resource created by the National Library of Australia to help manage the identification and documentation of different media carrier types and their technical requirements. Mediapedia allows users to identify carriers through various search methods and classifications. It provides descriptive information, images, and technical specifications for each carrier type. The goal is for specialists to collaborate and build a sustainable body of knowledge on carriers to help address the risks of lost access to digital content stored on aging or obsolete media over time.
Invited Demo: Mediapedia: Managing the Identification of Media Carriers
Nicholas del Pozo, Douglas Elford, David Pearson
Digital Preservation, National Library of Australia
Parkes Place, ACT 2600, Australia
ndelpozo@nla.gov.au, delford@nla.gov.au, dapearso@nla.gov.au
This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported license. You are free to share this work (copy, distribute and transmit) under the following conditions: attribution, non-commercial, and no derivative works. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/.
DigCCurr2009, April 1-3, 2009, Chapel Hill, NC, USA
ABSTRACT
All digital information is stored on physical carriers. Given the variations in carrier types and the quantity produced and in circulation, along with the potential importance of the content stored on them, not taking any steps to document and preserve the characteristics of different carrier types will make it much more difficult, and eventually impossible, to extract content even in the short term.
The Mediapedia is intended to provide a sustainable way of facilitating carrier type identification as well as documenting their technical requirements and general preservation information. By enabling a community of specialist individuals and organizations to collaborate in the documentation of these carriers, it will hopefully create a sustainable body of knowledge which can be centrally and persistently accessed via the web. From a preservation and risk management perspective, we can either approach this problem as a community or ignore it at our individual peril.
Keywords
Digital preservation, media carriers, National Library of Australia, obsolescence, open source software, Prometheus.
1. INTRODUCTION
Anyone who has material stored on obscure and older proprietary media carriers, or even on more common carriers such as audio and video materials, floppy disks and CDs, will eventually encounter problems accessing this content. Although the means of accessing current common carrier types may appear self-evident at present, this may not always be the case. Over time, if individuals or organizations are not proactive, there is a risk of losing access to the content stored on these carriers. In some cases, by the time an organization realizes there is a problem, it may already be too late to retrieve this content.
Even if there is an acceptance that access to a piece of media is ‘at risk’ of being lost, anyone trying to access its content will still be faced with the following issues:
• the carrier type may need to be identified;
• assuming the carrier type is known, then the technology (and associated dependencies for accessing those technologies) must be ascertained; and
• even when the above issues are resolved, accessing the media may still be problematic: technology (or parts of it) may not be readily available, or the carrier itself may have degraded so that it is no longer readable.
For example, 5¼ inch floppy disks are now a problematic carrier type to access (accurately or otherwise), though this was not always assumed to be the case. This is due to a number of factors, such as short-term deterioration of the physical materials, possible corruption of the data content on them, and the loss in availability of the hardware (e.g., drives, cables and motherboard) and software (e.g., drivers and operating systems) required to load, recognize and read the physical disk. These factors are not necessarily mutually exclusive. Hardware, software, and file format obsolescence can occur independently from each other. In addition, not only can hardware become obsolete, but it can also be susceptible to chemical or physical degradation. For example, magnetic tape might start to delaminate after a certain amount of time or after a certain amount of usage. Moreover, these problems are not only applicable to obsolete or older materials. Brand new carriers which are not yet in common usage may be just as inaccessible as older carriers that are no longer in use (e.g., HD-DVD).
Therefore, all carriers should be considered a temporary storage medium only. Ironically, in many cases these carriers have been perceived or marketed as long-term storage options. However, both the life-cycle of the carrier and the knowledge about it are dynamic. In the case of carrier specifications and documentation, their often ephemeral and proprietary nature means that while information may initially be readily available, it can easily disappear within a short period of time due to changing markets or business conditions. Information about older materials that pre-date the web is usually even more difficult to locate.
Because the problem is so diverse and complex, and the nature of carriers so dynamic, there is no single or simple solution. The types of carriers that content is stored on therefore have implications (both in the short and long term) that may not be immediately evident.
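The three access issues listed in this section can be read as a simple triage sequence. The following is a minimal sketch of that reading only; the class, its field names and the issue wording are hypothetical and are not taken from the Mediapedia or any existing tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CarrierAssessment:
    """Hypothetical record of what is known about one physical carrier."""
    carrier_type: Optional[str] = None          # None until the carrier is identified
    required_technology: Tuple[str, ...] = ()   # hardware/software needed for access
    available_technology: Tuple[str, ...] = ()  # what the organization actually holds
    readable: bool = True                       # False if the carrier has degraded


def outstanding_issues(assessment: CarrierAssessment) -> list:
    """Return the access issues that still need to be resolved, in order."""
    # Issue 1: the carrier type may need to be identified.
    if assessment.carrier_type is None:
        return ["identify carrier type"]
    # Issue 2: the technology and its dependencies must be ascertained.
    if not assessment.required_technology:
        return ["ascertain required technology and dependencies"]
    # Issue 3: access may still fail if technology is unavailable
    # or the carrier itself has degraded.
    issues = []
    missing = [t for t in assessment.required_technology
               if t not in assessment.available_technology]
    if missing:
        issues.append("obtain unavailable technology: " + ", ".join(missing))
    if not assessment.readable:
        issues.append("carrier has degraded and may be unreadable")
    return issues


floppy = CarrierAssessment(
    carrier_type="5.25 inch floppy disk",
    required_technology=("5.25 inch drive", "drive cable", "operating system driver"),
    available_technology=("drive cable",),
)
for issue in outstanding_issues(floppy):
    print(issue)
```

An empty result means none of the three issues is outstanding for that carrier; it says nothing about file format obsolescence, which the text notes can occur independently.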
2. MEDIAPEDIA
In order to assist in managing these risks to carriers, we need to know a range of information about them. For example: when and by whom the carrier was created; when it was used; the advertised shelf-life versus the actual shelf-life; and the requirements to access a specific carrier type. Mediapedia was designed to be an open, trusted and sustainable mechanism for documenting, retaining and disseminating this kind of knowledge [1]. The prototype of this web-based resource is intended to enable the identification of various types of carriers and their associated dependencies. Basic information which allows the identification of carrier types is provided, along with more detailed technical information about the carrier itself and the mechanisms that are needed to provide ongoing access. Future versions could include information about storage requirements, community based risk assessments and information about potential migration paths.
Unlike the Wikipedia [2], the Mediapedia does not require prior knowledge of a carrier’s name for discovery. It was specifically designed so that carrier types can be identified in a number of different ways. A user can search across other physical characteristics or descriptive details such as manufacturer, product code and other specific identifying markings that can be found on the carrier. For more advanced users, carriers can be identified through the use of a detailed and systematic classification system that was developed from several common standards, including Dublin Core Type Vocabulary [3] and the RDA/ONIX Framework for Resource Categorization [4]. The primary function of this classification system is to organize the carriers into meaningful and flexible taxonomical groupings or categories, and to make them discoverable to different audiences (see Figure 1). As such, the user can also search by Carrier or Process types within different Genres [5].
The Mediapedia doesn’t just store descriptive information about carriers, but also contains information about their dependencies and genre specific technical knowledge, and is designed to be both a human and machine harvestable resource. The data is intended to be curated and sustained by a base of trusted sources across a range of media genres. High quality images of each carrier type can be used to quickly confirm or refine search results (Figure 2). They can also be used as the basis for conducting a visual survey. This combination of a detailed classification system and the ability to search across multiple identification characteristics, attributes, descriptive text or images allows human users to quickly identify carriers. Machine users can harvest carrier information via persistent identifiers associated with each carrier type.
Figure 2 Mediapedia – Media Carrier Variant
3. CONCLUSION
Given the variations in carrier types, the number of units
produced and currently in circulation, and given the potential
importance of the information being stored on them, not taking
any steps to document and preserve this content is not an
acceptable option. By identifying and knowing their
characteristics and dependencies, we can more proactively
manage the risk for these carriers, and therefore of the content
that they contain.
It is hoped that the creation and use of the Mediapedia provides a
sustainable way of facilitating carrier type identification as well
as documenting technical and preservation information. Enabling
a community of specialist individuals and organizations to
collaborate in the documentation of these carriers will hopefully
create a sustainable body of knowledge which can be centrally
and persistently located. In addition, as this type of web-based
service is not only human readable, but eventually also machine
harvestable, it could potentially be re-used by other services and
systems. From a preservation and risk management perspective,
we can approach this problem as a community, or ignore it at our
individual peril.
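To illustrate the two access routes described in Section 2 (human search across identifying characteristics, and machine harvesting via a persistent identifier), the following minimal sketch may help. It is not the Mediapedia's actual schema or API: the record fields, the identifier scheme, the catalogue entries and the function names are all hypothetical, chosen only to mirror the attributes the text mentions (manufacturer, product code, markings, genre, carrier type, dependencies).

```python
import json
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass(frozen=True)
class Carrier:
    """Hypothetical carrier record; not the Mediapedia's real data model."""
    identifier: str                 # hypothetical persistent identifier
    name: str
    genre: str
    carrier_type: str
    manufacturer: str
    product_code: str
    markings: Tuple[str, ...] = ()
    dependencies: Tuple[str, ...] = ()


def _matches(value, wanted: str) -> bool:
    """Case-insensitive substring match against a string or a tuple of strings."""
    wanted = wanted.lower()
    if isinstance(value, (tuple, list)):
        return any(wanted in str(item).lower() for item in value)
    return value is not None and wanted in str(value).lower()


def search(carriers, **criteria):
    """Return carriers matching every criterion; no carrier name is required."""
    return [c for c in carriers
            if all(_matches(asdict(c).get(key), wanted)
                   for key, wanted in criteria.items())]


def harvest(carriers, identifier: str) -> str:
    """Return one record as JSON, looked up by its persistent identifier."""
    for c in carriers:
        if c.identifier == identifier:
            return json.dumps(asdict(c), sort_keys=True)
    raise KeyError(identifier)


# Invented example entries, for illustration only.
CATALOGUE = [
    Carrier("example:carrier/0001", "5.25 inch floppy disk", "Digital",
            "Magnetic disk", "ExampleCo", "EX-525", ("DS/DD",),
            ("5.25 inch drive", "operating system driver")),
    Carrier("example:carrier/0002", "Compact disc", "Digital",
            "Optical disc", "SampleWorks", "SW-CD1", ("700MB",),
            ("CD drive",)),
]

hits = search(CATALOGUE, manufacturer="exampleco", markings="ds/dd")
print([c.name for c in hits])
print(harvest(CATALOGUE, "example:carrier/0002"))
```

Matching on any combination of attributes reflects the point that a user may hold a carrier without knowing what it is called; keying the harvest function on an identifier rather than a name reflects the role of persistent identifiers for machine users.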
Figure 1 Example of the Mediapedia Classification System.
4. REFERENCES
[1] Mediapedia Prototype, at http://www.nla.gov.au/mediapedia/apps/
[2] Wikipedia Home Page, at http://en.wikipedia.org/wiki/Main_Page
[3] Dublin Core Type Vocabulary, at http://dublincore.org/documents/dcmi-type-vocabulary/index.shtml
[4] RDA/ONIX Framework for Resource Categorization, at http://www.loc.gov/marc/marbi/2007/5chair10.pdf
[5] Mediapedia Classification information, at http://www.nla.gov.au/mediapedia/#classificationschema