The document discusses the challenges of preserving born-digital records at the Smithsonian Institution Archives. It notes the Archives' mission includes acquiring and preserving born-digital collections, but that this presents difficulties due to changing file formats, software incompatibility issues, and files being incorrectly labeled. Older file formats pose particular problems, as do storage media like DAT tapes which are susceptible to damage over time. The Archives uses tools and software to process, analyze, convert and preserve born-digital records in standardized formats.
Why Can’t I Read This File? Born-Digital Challenges at the Smithsonian Institution Archives
1. Why Can’t I Read This File? Born-Digital Challenges at the Smithsonian Institution Archives. Lynda Schmitz Fuhrig. Mid-Atlantic Regional Archives Conference, Fall 2011, Bethlehem, PA
4. Create and promote products and services that broaden the understanding of the Smithsonian
5. Provide professional archival and conservation expertise. Above, a collection storage area for the Smithsonian Institution Archives, located on the third floor of Capital Gallery West. Upper left, in 1894 a room on the fourth floor, East Wing of the Smithsonian Institution Building, was converted for use as the Smithsonian Institution Archives.
6. SI Archives Digital Services Division: curate and preserve born-digital collections; digitize images, video, and audio; research digital preservation issues; promote the archives through web and outreach. SIA Accession 11-124
20. Current preservation formats (original format -> preservation format)
MS Word/WordPerfect -> PDF/A or PDF
PowerPoint, Excel -> PDF/A or PDF
GIF, JPG, BMP, etc. -> TIF
Access databases -> SIARD XML
Audio -> WAV/BWF
Websites -> crawled and captured as WARC
Email -> saved to XML following CERP/EMCAP preservation schema
Born-digital video -> not straightforward; different options
Digitized video -> Motion JPG2000
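The mapping on this slide can be sketched as a simple lookup table. This is only an illustration, not the Archives' actual workflow: the file extensions, the `PRESERVATION_TARGETS` name, and the `preservation_target` helper are assumptions made for the example, and the table covers only some of the rows above.

```python
from pathlib import Path
from typing import Optional

# Hypothetical lookup table derived from the slide. The extensions chosen
# for each row are assumptions; the slide names formats, not extensions.
PRESERVATION_TARGETS = {
    ".doc": "PDF/A or PDF",
    ".wpd": "PDF/A or PDF",
    ".ppt": "PDF/A or PDF",
    ".xls": "PDF/A or PDF",
    ".gif": "TIF",
    ".jpg": "TIF",
    ".bmp": "TIF",
    ".mdb": "SIARD XML",
}


def preservation_target(filename: str) -> Optional[str]:
    """Return the preservation format for a file name, or None if unlisted."""
    suffix = Path(filename).suffix.lower()
    return PRESERVATION_TARGETS.get(suffix)


if __name__ == "__main__":
    for name in ["Report.DOC", "photo.bmp", "notes.xyz"]:
        print(name, "->", preservation_target(name))
```

A lookup keyed on the extension alone is only a starting point, since the talk notes that files are sometimes incorrectly labeled.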
21. Tools for processing: open source and proprietary software; JHOVE, DROID, FITS (FITS is also a format); MediaInfo; in-house batch scripts; Duke Data Accessioner; evaluating Curator’s Workbench; CERP (SIA-Rockefeller Archive Center) parser
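Identification tools of the kind listed above look at a file's contents rather than trusting its name. The sketch below is a toy illustration of that idea, not how any of those tools is implemented: it checks a handful of well-known leading byte signatures, and the names `SIGNATURES`, `EXTENSION_FORMATS`, `sniff_format`, and `is_mislabeled` are hypothetical.

```python
import os
from typing import Optional

# A few well-known leading byte signatures. Real identification tools use
# far larger signature sets; this short list is for illustration only.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF"),   # big-endian TIFF
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2"),  # e.g. legacy MS Office files
    (b"BM", "BMP"),         # very short signature, so prone to false positives
]

# Assumed mapping from extension to the format the file name claims.
EXTENSION_FORMATS = {
    ".pdf": "PDF", ".gif": "GIF", ".jpg": "JPEG", ".jpeg": "JPEG",
    ".tif": "TIFF", ".tiff": "TIFF", ".bmp": "BMP",
}


def sniff_format(header: bytes) -> Optional[str]:
    """Guess a format from the first bytes of a file, or None if unknown."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def is_mislabeled(filename: str, header: bytes) -> bool:
    """True when the extension claims one known format but the bytes show another."""
    claimed = EXTENSION_FORMATS.get(os.path.splitext(filename)[1].lower())
    detected = sniff_format(header)
    return claimed is not None and detected is not None and claimed != detected
```

In practice the header would come from reading the first few bytes of the file in binary mode, for example `open(path, "rb").read(16)`.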
25. DATs (Digital Audio Tapes): Transfer them now, if you can! Machine production has ended. Tapes are susceptible to fungus and other problems. DAT recorded in 1990 for the Folk Masters radio program (SIA Accession 06-106).
32. Resources for formats
Sustainability of Digital Formats (Library of Congress): http://www.digitalpreservation.gov/formats
PRONOM (The National Archives in the UK): http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
Unified Digital Formats Registry (expected date of operation 2012): http://www.udfr.org/
FILExt (File Extension Source): http://filext.com/
TrID (File Identifier): http://mark0.net/soft-trid-e.html