Preserving Software at Scale: The Stephen Cabrinety Collection
Michael Olson, Stanford University Libraries
Douglas White, National Institute of Standards and Technology
Disclaimer
Trade names and company products are
mentioned in the text or identified. In no case
does such identification imply recommendation
or endorsement by the National Institute of
Standards and Technology, nor does it imply that
the products are necessarily the best available
for the purpose.
The Collection and NIST Grant
Collection consists of ~15,000 software titles from 1975–1995
Grant (Sept. 2013 – Aug. 2014) funded by the National Institute of Standards and Technology
Contains all media types from this period
Disk images to be added to the National Software Reference Library (NSRL) Reference Data Set
Disk images and photographs will be ingested into the Stanford Digital Repository
Initial Stanford Tasks
Page software to campus
Register software titles in Digital Object Registry (DRUID, Title, Source ID)
Enter descriptive metadata in NSRL database
Print tracking sheet
Ship to NIST
NIST NSRL Collection
Contains 14,500 pieces of computer software.
Focuses on Windows, Mac, and Linux operating systems and popular applications.
Modern formats: DVD-ROMs and CD-ROMs, 5¼ in. and 3½ in. disks.
Efforts 2005 to date:
19,500 media images
395 media errors (2%)
3,500 photograph sets
25,200 photos
SUL Cabrinety Collection
Focuses on games for Atari, Commodore, Amiga, Sega, Nintendo, and Apple systems.
27 different operating systems represented.
Several formats: 8 in., 5¼ in., and 3½ in. computer disks, cassettes, cartridges, CD-ROMs.
NIST Efforts to date:
900 media images
158 media errors (17%)
1,100 photograph sets
61,100 photos
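The error percentages on the two collection slides follow from the image and error counts. A quick check, assuming the quoted counts are rounded figures (the variable and function names here are illustrative only):

```python
# Media image and error counts as quoted on the two collection slides.
collections = {
    "NIST NSRL": {"images": 19_500, "errors": 395},
    "SUL Cabrinety": {"images": 900, "errors": 158},
}


def error_rate(images: int, errors: int) -> float:
    """Return media errors as a percentage of media images."""
    return 100.0 * errors / images


rates = {name: error_rate(c["images"], c["errors"]) for name, c in collections.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}% of media images had errors")
```

With these counts the Cabrinety rate falls between 17% and 18%, several times the rate seen on the modern NSRL media.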
NSRL Workstation
Workstation Equipment
Apple Mac mini, running Ubuntu 12.04 LTS
5000K lighting station
Canon T3i, tethered
Golden Thread Object Level Target
USB 3.5-inch floppy drive
Device Side Data FC5025 USB 5.25-inch floppy controller
ATA 5.25-inch floppy drive
USB barcode scanner
Firefox browser
Java photo organizer (custom, wraps gphoto2 etc.)
Perl media imager (custom, wraps dcfldd etc.)
Cartridge Media
Using a Retrode adapter for SEGA Genesis and Super Nintendo (SNES) games, plus plug-ins for Game Boy, Atari, and Nintendo 64.
Could not generate a complete, consistent media image.
Every cartridge has metadata in a ROM “header” area; many include a checksum, for anti-piracy use.
NSRL can calculate the SNES and SEGA Genesis checksums.
Game Boy and Nintendo are works in progress.
Detailed blog article recently published on Stanford website.
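As an illustration of the header checksum idea, here is a minimal sketch for SEGA Genesis dumps. It follows the commonly documented convention (a 16-bit big-endian checksum stored at offset 0x18E, computed as the sum of all 16-bit big-endian words from offset 0x200 to the end of the ROM). It is not the NSRL's own code, and the function names are hypothetical.

```python
import struct

HEADER_CHECKSUM_OFFSET = 0x18E  # 16-bit big-endian checksum field in the ROM header
DATA_START = 0x200              # summed region starts after the 512-byte header area


def computed_checksum(rom: bytes) -> int:
    """Sum the 16-bit big-endian words from DATA_START onward, modulo 2**16."""
    total = 0
    # A trailing odd byte, if any, is ignored in this sketch.
    for i in range(DATA_START, len(rom) - 1, 2):
        total = (total + ((rom[i] << 8) | rom[i + 1])) & 0xFFFF
    return total


def stored_checksum(rom: bytes) -> int:
    """Read the checksum the publisher recorded in the header."""
    return struct.unpack_from(">H", rom, HEADER_CHECKSUM_OFFSET)[0]


def image_is_consistent(rom: bytes) -> bool:
    """True when the dump is long enough and both checksums agree."""
    return len(rom) > DATA_START and computed_checksum(rom) == stored_checksum(rom)
```

A dump whose computed value matches the header value is likely complete. A mismatch suggests a bad read, although some cartridges shipped with incorrect header checksums, so a mismatch alone is not proof of a faulty image.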
Results to date
Just received first batch of data from NIST
– 360 GB = 870 software titles, 116,000 unique files
Capture success rate:
– 83% with no modification or intervention
– Can increase by 5% with human intervention during imaging
– Can increase by 4% with intervention during image mount
– 8% of media have many (> 10%) sector read errors
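The capture figures above can be tallied as follows. This assumes the four percentages are non-overlapping shares of the same set of media, which the slide implies but does not state; the variable names are illustrative.

```python
unassisted = 83    # % captured with no modification or intervention
imaging_help = 5   # % gained with human intervention during imaging
mount_help = 4     # % gained with intervention during image mount
heavy_errors = 8   # % of media with many (> 10%) sector read errors

# Best case if both kinds of intervention are applied.
best_case = unassisted + imaging_help + mount_help
print(f"Best-case capture rate: {best_case}%")
print(f"Share of media accounted for: {best_case + heavy_errors}%")
```

Under that assumption, intervention lifts the capture rate to 92%, and the remaining 8% corresponds to the media with heavy sector read errors.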
Lessons and Improvements
Automation; less human interaction
Photography; use RAW and convert
Hardware for legacy media:
– Apple physical formats
– Large format floppy disks (8 in.)
– Cassettes
– Cartridge batteries
Lessons and Improvements
Data modeling beginning this month for repository
Copyright letter created to send to rights holders
Create persistent URL citation page (PURL) for software
Integration into the Stanford catalog, SearchWorks, when rights allow
Questions?
Michael Olson,
email: mgolson@stanford.edu
Douglas White
email: douglas.white@nist.gov
