1. Introducing Digital Libraries
Eskinder Asmelash
AKSUM University
2010
2. The Current Environment
• Web 2.0 / Library 2.0
• Blogs / RSS Feeds / Wikis / Podcasts / Webcasts
• Open Source Software, Open Standards, OpenURL
• User Tagging, Automated Tagging
• Open Access (OA) -> OA Publishing (OAP) + OA Archiving (OAA)
• Open Resource Discovery Tools - Google Scholar
• E-Books, E-Journals, E-Resources
• Harvesting, Federation, Metasearching
• Digital Rights Management
3. Organizational Transformation in Libraries
• Traditional / Automated
» Organization is physical
» Shelving of documents - Based on Subject Classification
» Key - Index / Catalogues / Cards / Digital Catalogues
» Cards - Real/Virtual - Author, Title, Descriptions
• Digital
» Organization in terms of digital files /objects
» Contains material in digitized form
» Contains digital material
» Architecture
» Key - Metadata
4. Shift in Technologies / Approaches
Traditional    | Automated | Dig. Library
Limited/ Rigid | Improved  | Efficient/ Flexible
AACR2          | AACR2     | Metadata
LCCS           | ISO 2709  | DCMI -- W3C
DDC / UDC      | CCF       | EAD, TEI, DTD
Thesauri/LCSH  | MARC      | METS, MODS
               | Thesauri  | Z39.50
               |           | MARC21
5. What is a Digital Library?
• A Working definition:
“A digital library is an organized and
focused collection of digital objects,
including text, images, video and audio,
along with methods for access and
retrieval, and for selection, creation,
organization, and maintenance of the
collection.”
6. What is a DL?
Collection of digital objects (text, video, audio) along
with methods for access and retrieval, [user]
and for selection, organization, and maintenance [lib]
• Kitchens for knowledge preparation
• WWW ≠ DL: a DL adds organization and selectivity
• Nice Web site ≠ DL: a DL must make it easy to
import new documents
7. Workflow in DLs
• Selection of source documents
• Content digitization/ acquisition
• Content organization
– Metadata preparation, full-text tagging
• Content publishing
– Quality control, Content loading
• Content indexing and storage (repository)
• Access and delivery (services)
10. Key Components
1. Initial conversion of content from
physical to digital form.
2. The extraction or creation of metadata
or indexing information describing the
content to facilitate searching and
discovery, as well as administrative
and structural metadata to assist in
object viewing, management, and
preservation.
11. Key Components…
3. Storage of digital content and metadata
in an appropriate multimedia
repository.
◦ The repository will include rights
management capabilities to enforce
intellectual property rights, if required.
E-commerce functionality may also be
present if needed to handle accounting
and billing.
12. Key Components…
4. Client services for the browser,
including repository querying and
workflow
5. Content delivery via file transfer or
streaming media
6. Patron access through a browser or
dedicated client
7. A private or public network.
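Component 2 above distinguishes descriptive metadata (for searching and discovery) from administrative and structural metadata (for viewing, management, and preservation). A minimal sketch of how one record might keep the three apart is given here. The class name, the field names, and the sample values are all assumptions made for illustration; only the descriptive keys borrow Dublin Core element names.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical record grouping the three kinds of metadata for one object."""
    descriptive: dict = field(default_factory=dict)     # supports searching and discovery
    administrative: dict = field(default_factory=dict)  # supports management and preservation
    structural: list = field(default_factory=list)      # ordered parts, supports viewing

    def matches(self, term: str) -> bool:
        """Case-insensitive search over the descriptive values only."""
        term = term.lower()
        return any(term in str(value).lower() for value in self.descriptive.values())


# Illustrative values, not taken from any real collection.
record = MetadataRecord(
    descriptive={"title": "Annual Report", "creator": "Example Library", "date": "2010"},
    administrative={"format": "application/pdf", "rights": "public"},
    structural=["cover.tif", "page-001.tif", "page-002.tif"],
)
```

In this sketch a search for "annual" finds the record, while a search for "pdf" does not, because the file format is administrative rather than descriptive metadata.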
13. Digital Objects
• Technically, a digital library is built up
from simple components, notably digital
objects
• A digital object is a way of structuring
information in digital form, some of
which may be metadata, and includes
a unique identifier, called a handle.
14. Digital Objects …
• A single work may have many parts, a
complex internal structure, and one or
more arbitrary relationships to other works.
• To represent the complexity of information
in the digital library, several digital objects
may be grouped together. This is called a
set of digital objects.
• All digital objects have the same basic
form, but the structure of a set of digital
objects depends upon the information it
represents.
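The two slides above can be read as a small data model: every digital object has the same basic form (handle, metadata, content), while a set of digital objects carries whatever structure the work needs. The sketch below is one possible reading, not a standard implementation. The class names, the relation labels, and the handle strings are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DigitalObject:
    """Same basic form for every object: a handle, metadata, and content."""
    handle: str       # unique identifier (format here is made up)
    metadata: tuple   # (key, value) pairs describing the content
    content: bytes    # the digital material itself


@dataclass
class DigitalObjectSet:
    """Groups objects; the relations express the structure of one work."""
    handle: str
    members: dict = field(default_factory=dict)    # handle -> DigitalObject
    relations: list = field(default_factory=list)  # (source, relation, target)

    def add(self, obj: DigitalObject) -> None:
        if obj.handle in self.members:
            raise ValueError(f"duplicate handle: {obj.handle}")
        self.members[obj.handle] = obj

    def relate(self, source: str, relation: str, target: str) -> None:
        # Both ends of a relation must already belong to the set.
        if source not in self.members or target not in self.members:
            raise KeyError("both handles must be members of the set")
        self.relations.append((source, relation, target))


# Illustrative use: a report with one chapter.
work = DigitalObjectSet(handle="demo/report")
work.add(DigitalObject("demo/report.front", (("title", "Report"),), b"..."))
work.add(DigitalObject("demo/report.ch1", (("title", "Chapter 1"),), b"..."))
work.relate("demo/report.front", "hasPart", "demo/report.ch1")
```

Keeping relations as plain triples lets the same object form serve a book with chapters, a journal with papers, or a photograph collection.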
16. What are digital libraries for?
• Scholarly communication, education, research
– E-journals, e-books, data sets, e-learning
• Access to cultural collections
– Cultural, heritage, historical & special
collections, museums, biodiversity
• E-governance
– Improved access to government policies,
plans, procedures, rules and regulations
• Archiving and preservation
• Many more …
17. DL Software Features
Different logical document types and
levels
◦ Book/ chapter, conference/paper, journal/
paper, lecture, project report, photographs,
etc
Associate metadata with document
types
18. DL Software Features…
Different document formats
◦ Word, PDF, HTML, PS, etc.
◦ Non-Latin scripts
Document acquisition/ publishing
◦ Online/ offline
◦ Central/ distributed
◦ Quality control
19. DL Software Features…
• Indexing and storage
– Automatic metadata extraction
– Structured/full text indexing
– Data compression
• Access and delivery
– Structured search, browse, object searching,
hierarchical browsing, fine-grained search
– CD/DVD-ROM distribution
– Personalization, customization
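Full-text indexing, listed above, is commonly built on an inverted index: a map from each word to the documents that contain it. The following is a minimal sketch of that idea, not the method of any particular DL package. The function names and the sample documents are assumptions, and there is no stemming or ranking.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Illustrative documents.
docs = {
    "d1": "Digital libraries and metadata",
    "d2": "Metadata standards for archives",
    "d3": "Library automation",
}
idx = build_index(docs)
```

Here a query for "metadata" matches d1 and d2, and "digital metadata" matches only d1. Because nothing is stemmed, "library" matches d3 but not d1.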
20. DL Software Features …
• Access/ rights management
◦ Who can access? What? How much? Usage restrictions
• Usage monitoring and reporting
◦ Who is using? How much? Uptime? Response time? Recall/
Precision? Failures?
• Preservation: Long term access
– Link checks, persistent object identification, content
refreshing
• Interoperability
– OAI, Z39.50 compliance
• Standards compliance
– XML, Dublin Core, Unicode
• Scaling up – for large collections
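OAI compliance, listed under interoperability above, usually means the repository answers OAI-PMH harvesting requests, which are ordinary HTTP requests carrying a verb and a few arguments. The sketch below only builds such a request URL and does not contact any server. The base URL is a placeholder and the helper name is an assumption; `ListRecords`, `metadataPrefix`, `set`, `from`, and `oai_dc` are taken from the OAI-PMH protocol.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec    # restrict harvesting to one set
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, not a real repository.
url = list_records_url("https://repository.example.org/oai", from_date="2010-01-01")
```

A harvester would fetch this URL repeatedly, following the resumption token the server returns, and parse the Dublin Core records in the XML response.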
Editor's notes
DTD (Document Type Definition): A formal specification of the structural elements and markup definitions to be used in encoding certain types of documents in SGML. Instances of DTDs include EAD, HTML, and TEI. Encoded Archival Description (EAD) is an SGML DTD that represents a highly structured way to create digital finding aids for a grouping of archival or manuscript materials; it was adopted as a standard by the Society of American Archivists (SAA) in 1999. Text Encoding Initiative (TEI),
These components might not all be part of a discrete digital library system, but could be provided by other related or multi-purpose systems or environments. Accordingly, integration is a consistent issue cited by digital library developers. To interoperate with the existing library infrastructure, the digital library must be designed to work with existing library catalogs and incorporate industry standards, formats, and protocols. The term “digital library” is often used to describe any multimedia management system holding digitized information, but this does not mean it will deliver true library application functionality. Thus, these digital library components must also be tailored to capture, encode, and deliver information according to the standard practices adopted by the library industry. Because of the rapid pace of technological change, some standards are concrete and others are emerging.
Repositories store and manage digital objects and other information. A large digital library may have many repositories of various types, including modern repositories, legacy databases, and Web servers. The interface to this repository is called the repository access protocol (RAP). Features of RAP are explicit recognition of rights and permissions that need to be satisfied before a client can access a digital object, support for a very general range of disseminations of digital objects, and an open architecture with well defined interfaces.
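The central idea in this note, that rights and permissions must be satisfied before a client can access a digital object, can be illustrated with a toy repository. This is only a sketch of the principle and does not reproduce the actual RAP specification. The class names, method names, and permission labels are all hypothetical.

```python
class AccessDenied(Exception):
    """Raised when a client lacks the permission an object requires."""


class Repository:
    """Toy repository that checks permissions before returning content."""

    def __init__(self):
        self._objects = {}  # handle -> (content, required_permission)

    def deposit(self, handle, content, required_permission):
        self._objects[handle] = (content, required_permission)

    def access(self, handle, client_permissions):
        """Return the content only if the client holds the required permission."""
        content, required = self._objects[handle]
        if required not in client_permissions:
            raise AccessDenied(f"{handle} requires permission '{required}'")
        return content


# Illustrative use with made-up handles and permission labels.
repo = Repository()
repo.deposit("demo/thesis-1", b"full text", required_permission="campus")
text = repo.access("demo/thesis-1", client_permissions={"campus", "public"})
```

A client holding only the "public" permission would get `AccessDenied` for the same handle, so the check sits in the repository interface rather than in each client.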
From a computing view, the digital library is built up from simple components, notably digital objects. A digital object is a way of structuring information in digital form, some of which may be metadata, and includes a unique identifier, called a handle. However, the information in the digital library is far from simple. A single work may have many parts, a complex internal structure, and one or more arbitrary relationships to other works. To represent the complexity of information in the digital library, several digital objects may be grouped together. This is called a set of digital objects. All digital objects have the same basic form, but the structure of a set of digital objects depends upon the information it represents.