Digital Projects in Special Collections
1. Digital Projects in Special Collections
SUSAN MCELRATH
UNIVERSITY ARCHIVIST
AMERICAN UNIVERSITY
MARCH 7, 2012
2. Digital Collections, Exhibits, and Repositories
What is the difference?
Repository
multiple collections or institutions
Collection
one collection or theme
Exhibit
one theme – a selection of items
9. Digitization Project Planning
What work needs to be done;
How it will be done (according to which standards,
specifications, best practices);
Who should do the work (and where);
How long the work will take;
How much it will cost, both to "resource" the
infrastructure and to do the content conversion.
http://www.ncecho.org/dig/guide_1planning.shtml
http://www.nyu.edu/its/humanities/ninchguide/II/
10. Components of Digitization Projects
Planning and Project Management
Selection
File Formats – master & access derivatives
Conservation Treatment
Reformatting
Metadata Design & Creation
Quality Control
Web Platform
Open source vs. proprietary systems
Preservation
11. Selection Criteria
Should they be digitized?
Research Value
May they be digitized?
Copyright status
Can they be digitized?
Condition
Format
http://www.nedcc.org/resources/leaflets/6Reformatting/06PreservationAndSelection.php
http://www.dlib.org/dlib/september09/ooghe/09ooghe.html
12. Digitization Standards
Technical Standards
Federal Agency Digitization Guidelines Initiative (FADGI)
http://www.digitizationguidelines.gov/
NARA
California Digital Library (CDL)
http://www.cdlib.org/services/dsc/tools/docs/cdl_gdi_v2.pdf
University of Colorado
https://www.cu.edu/digitallibrary/cudldigitizationbp.pdf
14. Web Platform Options
Open Source Software
Omeka
Greenstone
DSpace
Fedora
Proprietary Software
CONTENTdm (OCLC)
Luna Insight
DigiTool
15. Web Harvesting involves:
Identifying and collecting web resources
Providing search capability for archived web
collections
Managing and preserving web resources
16. Web Harvesting
The most common web archiving technique uses web
crawlers to automate the process of collecting web
pages. Web crawlers typically view web pages in the
same manner that users with a browser see the Web,
and therefore provide a comparatively simple
method of remotely harvesting web content.
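The fetch-and-follow step described above can be sketched with Python's standard library: a crawler retrieves a page (e.g. with urllib.request), extracts its links, and queues them for the next round of harvesting. This is an illustrative sketch, not the code of any particular archiving tool; the `LinkExtractor` and `extract_links` names are assumptions for this example.

```python
# Minimal sketch of the link-discovery step of a web crawler:
# parse a fetched HTML page and collect its outgoing links.
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags, resolved against a base URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links so they can be queued for fetching.
                    self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return all links found in an HTML page, as absolute URLs."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links

# A crawler would fetch this HTML remotely, then queue each link it finds.
page = '<a href="/about">About</a> <a href="news.html">News</a>'
print(extract_links(page, "http://example.org/index.html"))
# → ['http://example.org/about', 'http://example.org/news.html']
```

In a full harvester this loop repeats: each queued link is fetched, archived, and parsed for further links until the collection scope is exhausted.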
17. Web Crawling Problems
Robots exclusion protocol may deny crawlers access
to portions of a website.
Large portions of a web site may be hidden in the
deep Web.
Crawler traps may cause a crawler to download an
infinite number of pages, so crawlers are usually
configured to limit the number of dynamic pages
they crawl.
Calendars often cause problems for crawlers.
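Two of the safeguards above can be sketched with Python's standard `urllib.robotparser`: honoring the robots exclusion protocol, and capping how many dynamic (query-string) URLs are crawled so a trap such as an endless calendar cannot run forever. The cap value, agent name, and function names here are illustrative assumptions, not settings of any real crawler.

```python
# Sketch of two crawler safeguards: robots.txt compliance and a
# cap on dynamic URLs to avoid crawler traps.
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

MAX_DYNAMIC_PAGES = 100  # assumed cap on '?'-style dynamic URLs

def build_robot_parser(robots_txt, robots_url="http://example.org/robots.txt"):
    """Parse robots.txt text (normally fetched from the site itself)."""
    parser = RobotFileParser(robots_url)
    parser.parse(robots_txt.splitlines())
    return parser

def should_crawl(url, robots, dynamic_seen, agent="archive-bot"):
    """Skip URLs denied by robots.txt, and stop once the dynamic-page cap is hit."""
    if not robots.can_fetch(agent, url):
        return False  # robots exclusion protocol denies access
    if urlparse(url).query and dynamic_seen >= MAX_DYNAMIC_PAGES:
        return False  # likely a crawler trap (e.g. an endless calendar)
    return True

robots = build_robot_parser("User-agent: *\nDisallow: /private/")
print(should_crawl("http://example.org/private/notes.html", robots, 0))   # → False
print(should_crawl("http://example.org/exhibits/index.html", robots, 0))  # → True
```

Real harvesting crawlers combine checks like these with per-site politeness delays and depth limits; the sketch shows only the decision of whether a single URL should be fetched at all.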
18. Web Harvesting Resources
International Internet Preservation Consortium
http://netpreserve.org/about/index.php
Library of Congress
http://www.loc.gov/webarchiving
Archive-It (Service)
www.archive-it.org