Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities
Gretchen Gueguen, Mark A. Matienzo, Simon Wilson, and Peter Chan
Session 502, 27 August 2011, Society of American Archivists Annual Meeting
AIMS Project
"Born-Digital Collections: An Inter-Institutional Model for Stewardship"
- Two-year project to create a framework for stewardship of born-digital archival records in collecting repositories
- Funded by the Andrew W. Mellon Foundation
Partners
Grant Goals
- Processing of Hybrid Collections
- Software Development
- Community Development: Unconference (May 2011, Charlottesville, VA); UK Symposium (June 2011, London, England); Workshop (August 2011, Chicago, IL)
- White Paper and Project Report
Framework Development
A framework for collecting and delivering the born-digital materials that are quickly beginning to constitute the collections of contemporary scholarly, literary, and political figures and organizations.
AIMS Framework Discovery and Access Accessioning
Collection Development
Gretchen Gueguen, University of Virginia
What is Collection Development?
- Actions and policies of institutions to bring in material for end users (both current and future); includes prioritizing, developing relationships with creators, assessments, negotiating agreements, and preparing for accessioning.
- Within the AIMS framework: a viable, practical method to capture and process born-digital material from hybrid collections requires sound work at the beginning (i.e. policies, practices, agreements with donors, etc.) to set up later work.
Elements of Collection Development
- Prerequisites
- Establish relationship with donor
- Analyze feasibility
- Negotiate agreements
- Prepare for accessioning
Prerequisites
Neil Beagrie, "Plenty of Room at the Bottom? Personal Digital Libraries and Collections," D-Lib Magazine (June 2005)
Blagofaire. http://xkcd.com/239/
Donor Relationship

Enhanced Curation
Analyzing Feasibility

Negotiate Agreements
 All rights reserved by Chevrolet UK
Prepare for Accessioning...
- Scope and extent determined?
- Coordination with acquisition of analog material?
- Method and time determined?
- Pre-acquisition appraisal performed?
- Enhanced curation carried out?
- Test capture if needed?
- Development of new methodologies undertaken as needed/possible?
Accessioning
Mark A. Matienzo, Yale University
What is Accessioning?
- Archival institution takes physical and legal custody of a group of records from a donor and documents the transfer in a register or other representation of the institution's holdings
- Within the AIMS framework: processes which establish physical, administrative and intellectual control over transferred records; assessment and documentation of future needs; documentation of actions taken; beginning of safe storage and maintenance
Elements of Accessioning
- Prerequisites
- Transfer records and gain administrative control
- Physical control and stabilization
- Intellectual control and documentation to support further processes
- Maintain accessioned records
Case Study: Re-Accessioning at Yale
- Collaborative capacity building across two repositories: Manuscripts and Archives; Beinecke Rare Book and Manuscript Library
- Addressing previously received accessions containing electronic records on media
- Still in testing phase, but working towards implementing in production
Types of Records and Media
- Wide variety of records creators: literary authors, university faculty, university offices, architectural firms
- Common types of media: floppy disks (5.25" and 3.5"), optical media (CD-ROM, CD-R, DVD-R, etc.), Zip disks, USB flash drives
Goals of Re-Accessioning
- Identify, document, and register media
- Mitigate risk of media deterioration and obsolescence
- Extract basic metadata from filesystems on media and files contained on filesystems
Re-Accessioning Workflow
Disk Imaging
- Using "forensic" (bit-level) imaging process
- Ensure data on media is not manipulated using write-protection
- Uses software to acquire images
- Includes hash-based verification process
Media Log
- Using SharePoint list
- Contains unique identifier of media
- Records physical/logical characteristics of media
- Documents success, failure, or status of various processes and additional notes
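The hash-based verification step listed under Disk Imaging can be illustrated with a minimal Python sketch. The slides do not name the imaging software or the hash algorithm, so the function names and the SHA-256 default below are assumptions for illustration only. The idea is to record a digest when the image is acquired and recompute it later to confirm the image has not changed.

```python
import hashlib


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to limit memory use.

    The algorithm is an assumption; the slides do not say which one was used.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(image_path, recorded_digest, algorithm="sha256"):
    """Compare a fresh digest of the image with the digest recorded at acquisition."""
    return hash_file(image_path, algorithm) == recorded_digest.lower()
```

A typical use would be to call `hash_file` immediately after imaging, store the result alongside the image, and call `verify_image` after every copy or transfer.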
Media Log
Media Log
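The kind of information the media log holds can be sketched as a small Python record type. The project kept its log in a SharePoint list; the class, field names, and status values below are hypothetical and only mirror the categories named on the Media Log slide (identifier, physical/logical characteristics, process status, notes).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MediaLogEntry:
    """One row of a media log. All field names are illustrative assumptions."""

    media_id: str                      # unique identifier of the media
    media_type: str                    # e.g. "3.5in floppy", "CD-R"
    physical_notes: str = ""           # physical characteristics (labels, damage)
    logical_notes: str = ""            # logical characteristics (filesystem, size)
    process_status: Dict[str, str] = field(default_factory=dict)
    notes: str = ""

    def record(self, process: str, status: str) -> None:
        """Document the success, failure, or other status of a process."""
        self.process_status[process] = status

    def pending(self, required: List[str]) -> List[str]:
        """Return the required processes not yet recorded as successful."""
        return [p for p in required if self.process_status.get(p) != "success"]


entry = MediaLogEntry(media_id="M0001", media_type="3.5in floppy")
entry.record("imaging", "success")
entry.record("metadata_extraction", "failure")
print(entry.pending(["imaging", "metadata_extraction", "packaging"]))
```

Keeping status per process, rather than a single overall flag, makes it straightforward to list which media still need imaging, extraction, or packaging.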
Metadata Extraction
- Can be repurposed for descriptive, administrative, and technical metadata
- Uses command-line tools (Sleuthkit, fiwalk)
- Outputs XML document
Packaging and Transfer
- Using BagIt packages/Bagger application
- Packages contain disk images, extracted metadata, imaging logs, and high-level accession information
- Transfer to storage is verified by comparison against manifest
Arrangement & Description
Simon Wilson, Hull University Archives
Purpose of Arrangement & Description
- The general objectives for Arrangement & Description are: to preserve context; to establish intellectual control of the material; to provide a means of discovery
- SAA definition, emphasis on minimizing the amount of handling
- Within the AIMS framework: processes which establish intellectual control of the material, including implementation of policies and agreements with donors etc., to enable subsequent discovery and access
Elements of Arrangement and Description
1. Prerequisites
2. Plan for processing - gather supporting information; files captured from media (accessioning); convert files (for viewing); appraisal strategy; assess arrangement options; consider preservation issues
3. Processing - implement arrangement strategy; add descriptive metadata and wider context (e.g. Collection Level Description); copyright & other legal considerations
4. Prepare for Discovery & Access - remove restricted access to born-digital material during processing
Case Study - Stephen Gallagher
Background:
- 2005: 42 boxes of paper archives
- 2010: born-digital material: 14,320 files (13.6 GB) transferred to us via external hard drive and a box of Amstrad disks
- Create integrated catalogue to accommodate paper, born-digital and future accruals
Case Study - Stephen Gallagher
Approach:
- current work higher priority in filing system
- considered each work a distinct 'project'
- structure reflects his way of working & the archival principles of control that creator, archivist & user can all understand
Series level was the most logical solution
- all related files placed in the series
- reasonable return for our effort
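To show how XML output from a tool such as fiwalk might be repurposed, here is a minimal Python sketch that pulls per-file records out of a Digital Forensics XML style document. The embedded sample is a simplified, hand-written illustration with placeholder values, not real fiwalk output, and the function names are assumptions. Element names are matched by local name so the sketch works whether or not the document declares an XML namespace.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample with placeholder values (not real tool output).
SAMPLE_XML = """
<dfxml>
  <volume>
    <fileobject>
      <filename>drafts/chapter1.doc</filename>
      <filesize>20480</filesize>
      <hashdigest type="md5">00000000000000000000000000000000</hashdigest>
    </fileobject>
    <fileobject>
      <filename>letters/agent.txt</filename>
      <filesize>1312</filesize>
      <hashdigest type="md5">11111111111111111111111111111111</hashdigest>
    </fileobject>
  </volume>
</dfxml>
"""


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def list_files(xml_text):
    """Return one dict per fileobject with its name, size, and digests."""
    records = []
    for element in ET.fromstring(xml_text).iter():
        if local_name(element.tag) != "fileobject":
            continue
        record = {}
        for child in element:
            name = local_name(child.tag)
            if name == "filename":
                record["filename"] = child.text
            elif name == "filesize":
                record["filesize"] = int(child.text)
            elif name == "hashdigest":
                # Key the digest by its declared algorithm, e.g. "md5".
                record[child.get("type", "unknown").lower()] = child.text
        records.append(record)
    return records


for item in list_files(SAMPLE_XML):
    print(item["filename"], item["filesize"])
```

The resulting dictionaries could feed a technical metadata store or a file listing for appraisal; which fields matter will depend on local practice.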
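Verification of a transfer "by comparison against manifest" can be sketched as follows. A BagIt payload manifest (for example `manifest-sha256.txt`) lists one checksum and one relative file path per line. The function below is a standard-library illustration of that check, not the Bagger application's own code; its name and the SHA-256 default are assumptions.

```python
import hashlib
from pathlib import Path


def verify_manifest(bag_dir, algorithm="sha256"):
    """Return the manifest paths that are missing or whose checksum differs.

    An empty list means every file listed in the manifest verified.
    """
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each line is "<checksum> <path relative to the bag root>".
        expected, relative_path = line.split(None, 1)
        relative_path = relative_path.strip()
        target = bag / relative_path
        if not target.is_file():
            problems.append(relative_path)
            continue
        # Reads the whole file at once; large disk images would need chunked reads.
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected.lower():
            problems.append(relative_path)
    return problems
```

Running this on the receiving side after transfer, and logging the returned list, gives a simple record of whether the package arrived intact.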
Case Study - Stephen Gallagher
- 300 files created using FinalDraft screenwriter software
- appropriate format for long term preservation
Other issues:
- commercial implications: access via repository = publication?
- re-purposing of work from one (unsuccessful) project to another
Challenges faced
Each collection is unique, approach will vary:
- one-off collection (e.g. project) or likely to be subsequent accruals?
- collection type; differs for personal papers & organisational records
- same personnel work on paper and born-digital components?
- can we appraise without knowing the contents? similar to paper material that is in a different language?
Challenges faced
Volume of material:
- depositor perception that 'storage is cheap' - does this mean we shouldn't appraise the material we receive?
- wide range of file types encountered
- not practical to describe each and every file
- risk management - if you don't check every file for sensitive information
- we need to automate as much of the processing as possible
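As one small example of the automation called for above, the following Python sketch tallies the files in a directory tree by extension to give a first overview of the range of file types. The function name is hypothetical, and a file extension is only a rough proxy for format; a format identification tool would be needed for reliable results.

```python
from collections import Counter
from pathlib import Path


def count_extensions(root):
    """Count files under root by lower-cased extension ("(none)" if absent)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `count_extensions(some_directory).most_common(10)` would list the ten most frequent extensions, which can help decide where to focus appraisal and preservation effort.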
Hypatia
Digital archivists identified a gap in current tools and used their experiences to define the requirements for a new tool.
Key features identified:
- drag'n'drop to create the intellectual arrangement
- ability to return to original order of the material
- view some file types, add descriptive metadata etc.
- high level of granularity when applying rights & permissions
Technical (acquired at accessioning) and descriptive metadata - Discovery & Access process
Discovery and Access
Peter Chan, Stanford University
What is Discovery & Access?
Discovery and Access refers to the systems and workflows that make processed or unprocessed material, and the metadata that supports it, available to users.
Goals of D&A
- find out about material
- understand whether it is available for consultation and if so, how
- access material
- To apply appropriate access restrictions in order to protect private and sensitive information as well as intellectual property.
- To provide access to material in a format and/or environment that presents the original's significant properties.
D&A – EAD
D&A – Facet Browsing

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...gurkirankumar98700
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 

KĂŒrzlich hochgeladen (20)

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍾 8923113531 🎰 Avail...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

Saa Session 502 Born Digital Archives in Collecting Repositories

  • 1. Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities Gretchen Gueguen, Mark A. Matienzo, Simon Wilson, and Peter Chan Session 502, 27 August 2011 Society of American Archivists Annual Meeting
  • 2. AIMS Project "Born-Digital Collections: An Inter-Institutional Model for Stewardship" Two-year project to create a framework for stewardship of born-digital archival records in collecting repositories Funded by the Andrew W. Mellon Foundation
  • 4. Grant Goals Processing of Hybrid Collections Software Development Community Development Unconference (May 2011, Charlottesville, VA) UK Symposium (June 2011, London, England) Workshop (August 2011, Chicago, IL) White Paper and Project Report
  • 5. Framework Development A framework for collecting and delivering the born-digital materials that are quickly beginning to constitute the collections of contemporary scholarly, literary, and political figures and organizations.
  • 6. AIMS Framework Discovery and Access Accessioning
  • 7. Collection Development Gretchen Gueguen University of Virginia
  • 8. What is Collection Development? Actions and policies of institutions to bring in material for end users (both current and future); includes prioritizing, developing relationships with creators, assessments, negotiating agreements and preparing for accessioning. Within the AIMS framework Viable, practical method to capture/process born-digital material from hybrid collections requires sound work at the beginning (i.e. policies, practices, agreements with donors, etc.) to set up later work
  • 9. Elements of Collection Development Prerequisites Establish relationship with donor Analyze Feasibility Negotiate Agreements Prepare for Accessioning
  • 10. Prerequisites
 Neil Beagrie, "Plenty of Room at the Bottom? Personal Digital Libraries and Collections," D-Lib Magazine (June 2005) Blagofaire. http://xkcd.com/239/
  • 14. Negotiate Agreements
 All rights reserved by Chevrolet UK
  • 15. Prepare for Accessioning... Scope and extent determined? Coordination with acquisition of analog material? Method and time determined? Pre-acquisition appraisal performed? Enhanced curation carried out? Test capture if needed? Development of new methodologies undertaken as needed/possible?
  • 16. Accessioning Mark A. Matienzo, Yale University
  • 17. What is Accessioning? Archival institution takes physical and legal custody of a group of records from a donor and documents the transfer in a register or other representation of the institution’s holdings Within AIMS Framework Processes which establish physical, administrative and intellectual control over transferred records; assessment and documentation of future needs; documentation of actions taken; beginning of safe storage and maintenance
  • 18. Elements of Accessioning Prerequisites Transfer records and gain administrative control Physical control and stabilization Intellectual control and documentation to support further processes Maintain accessioned records
  • 19. Case Study: Re-Accessioning at Yale Collaborative capacity building across two repositories Manuscripts and Archives Beinecke Rare Book and Manuscript Library Addressing previously received accessions containing electronic records on media Still in testing phase, but working towards implementing in production
  • 20. Types of Records and Media Wide variety of records creators Literary authors University faculty University offices Architectural firms Common types of media Floppy disks: 5.25” and 3.5” Optical media: CDROM, CD-R, DVD-R, etc. Zip disks USB flash drives
  • 21. Goals of Re-Accessioning Identify, document, and register media Mitigate risk of media deterioration and obsolescence Extract basic metadata from filesystems on media and files contained on filesystems
  • 23. Disk Imaging Using “forensic” (bit-level) imaging process Ensure data on media is not manipulated using write-protection Uses software to acquire images Includes hash-based verification process
  • 25. Media Log Using SharePoint list Contains unique identifier of media Records physical/logical characteristics of media Documents success, failure, or status of various processes and additional notes
  • 28. Metadata Extraction Can be repurposed for descriptive, administrative, and technical metadata Uses command-line tools (Sleuthkit, fiwalk) Outputs XML document
  • 29. Packaging and Transfer Using BagIt packages/Bagger application Packages contain disk images, extracted metadata, imaging logs, and high-level accession information Transfer to storage is verified by comparison against manifest
  • 31. Arrangement & Description Simon Wilson, Hull University Archives
  • 32. Purpose of Arrangement & Description The general objectives for Arrangement & Description are: - to preserve context - to establish intellectual control of the material - to provide a means of discovery. SAA definition, emphasis on minimizing the amount of handling. Within the AIMS framework: Processes which establish intellectual control of the material, including implementation of policies and agreements with donors etc., to enable subsequent discovery and access
  • 33. Elements of Arrangement and Description 1. Prerequisites 2. Plan for processing - gather supporting information; files captured from media (accessioning); convert files (for viewing); appraisal strategy; assess arrangement options; consider preservation issues 3. Processing - implement arrangement strategy; add descriptive metadata and wider context (e.g. Collection Level Description); copyright & other legal considerations 4. Prepare for Discovery & Access - remove restricted access to b-d material during processing
  • 34. Case Study - Stephen Gallagher Background: 2005: 42 boxes of paper archives. 2010: born-digital material: 14,320 files (13.6 GB) transferred to us via external hard drive and a box of Amstrad disks. Create integrated catalogue to accommodate paper, born-digital and future accruals
  • 35. Case Study - Stephen Gallagher Approach: - current work higher priority in filing system - considered each work a distinct ‘project’ - structure reflects his way of working & the archival principles of control that creator, archivist & user can all understand. Series level was most logical solution - all related files placed in the series - reasonable return for our effort
  • 37. commercial implications: access via repository = publication? - re-purposing of work from one (unsuccessful) project to another
  • 39. one-off collection (e.g. project) or likely to be subsequent accruals?
  • 40. collection type; differs for personal papers & organisational records
  • 41. same personnel work on paper and born-digital components?
  • 42. can we appraise without knowing the contents? similar to paper material that is in a different language?
  • 43. Challenges faced Volume of material: - depositor perception that ‘storage is cheap’ - does this mean we shouldn’t appraise the material we receive? - wide range of file types encountered - not practical to describe each and every file - risk management - if you don’t check every file for sensitive information - we need to automate as much of the processing as possible
  • 45. drag'n'drop to create the intellectual arrangement
  • 46. ability to return to original order of the material
  • 47. view some file types, add descriptive metadata etc
  • 48. high level of granularity when applying rights & permissions. Technical (acquired at accessioning) and descriptive metadata - Discovery & Access process
  • 49. Discovery and Access Peter Chan Stanford University
  • 50. What is Discovery & Access? Discovery and Access refers to the systems and workflows that make processed or unprocessed material, and the metadata that support it, available to users.
  • 52. find out about material
  • 53. understand whether it is available for consultation and if so, how
  • 55. To apply appropriate access restrictions in order to protect private and sensitive information as well as intellectual property.
  • 58. D&A – Facet Browsing
  • 59. D&A – Full text search
  • 60. D&A – See Contents on Web
  • 61. D&A – Tag & Annotation by Invited Persons / Public Annotation:
  • 62. Impacts from Collection Development File formats: no restriction. Computer medium: no restriction (punch card, open reel tape, 5.25 inch floppy, 3.5 inch floppy). File type: no restriction (computer program, data set, document, spreadsheet). Agreement: permission to post contents online.
  • 63. Impacts from Accessioning Built 5.25 inch floppy capture station. Asked Computer History Museum to read punch cards. Open reel tapes: still outstanding
  • 64. Impacts from Processing AccessData FTK was used to search files for restricted information, annotate files with appropriate descriptive metadata (book title, articles, etc.) and rights metadata (access restriction), and generate technical metadata for the delivery platform to act upon. Transit Solution was used to transform files to HTML format for display on the web. An XSLT program was written to transform the XSL-FO output from FTK into an XML content document. A Ruby program was written to ingest the XML content document, original files, and the display derivatives into Fedora.
  • 65. FTK – Bookmark and Label
  • 66. FTK – Full Text, Pattern Search & Fuzzy Hash
  • 68. Network Diagram for 50,000 Creeley Emails
  • 71. Want to know more? http://born-digital-archives.blogspot.com Gretchen Gueguen: gmg2n@virginia.edu; Mark Matienzo: mark.matienzo@yale.edu; Simon Wilson: s.wilson@hull.ac.uk; Peter Chan: pchan3@stanford.edu
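
As a rough illustration of the hash-based verification mentioned on slides 23 and 29 above, the sketch below records a SHA-256 checksum for each file in a package and later re-checks the files against that list. It is a minimal Python sketch under stated assumptions, not the workflow used at Yale: the function names, the `data/` directory, and the `manifest-sha256.txt` file name are illustrative (the layout is only loosely modeled on BagIt), and the slides do not say which hash algorithm was used.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large disk images need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(package_dir: Path, manifest_name: str = "manifest-sha256.txt") -> Path:
    """Record '<checksum>  <relative path>' for every file under data/ (assumed layout)."""
    manifest = package_dir / manifest_name
    lines = []
    for item in sorted((package_dir / "data").rglob("*")):
        if item.is_file():
            rel = item.relative_to(package_dir).as_posix()
            lines.append(f"{sha256_of(item)}  {rel}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_manifest(package_dir: Path, manifest_name: str = "manifest-sha256.txt") -> list[str]:
    """Return the relative paths that are missing or whose checksum no longer matches."""
    problems = []
    manifest = package_dir / manifest_name
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel = line.split(maxsplit=1)
        target = package_dir / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems
```

An empty list from `verify_manifest` would suggest the files arrived unchanged after transfer; any path it returns would need to be re-copied or investigated.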

Editor's notes

  1. Hello and welcome to session 502: Born-Digital Archives in Collecting Repositories: Turning Challenges into Byte-Size Opportunities. My name is Gretchen Gueguen and I’m Digital Archivist at the University of Virginia. This morning, along with my colleagues Mark Matienzo from Yale, Simon Wilson from the University of Hull, and Peter Chan from Stanford, I’m going to talk with you about the AIMS project.
  2. AIMS is the short title for a Mellon-funded grant project entitled Born-Digital Collections: An Inter-Institutional Model for Stewardship. This two-year project set out to create a framework for stewardship of born-digital archival records in collecting repositories.
  3. As I’ve mentioned, the grant partners are UVA, Stanford, Hull, and Yale, with Virginia serving as the PI.
  4. The grant set out to achieve its goal through four different areas of activity. The first was the processing of several hybrid collections, which you are going to hear about later this morning. The digital archivists at each institution, the four of us here this morning, were funded by the grant to carry out this processing. To facilitate this stewardship, the partners also sought to develop some software solutions. You won’t hear as much about these this morning, but they include Rubymatica, a Ruby-based reworking of Archivematica for the creation of Submission Information Packages, and functional requirements for a software tool to facilitate arrangement, description and access to born-digital archival materials. These requirements led to work on developing Hypatia, which is what is known as a “Hydra Head”, or a module for the Fedora/Solr/Blacklight Hydra stack, for access to born-digital materials. The partners also hosted several events to garner feedback and to encourage communication among the archival community, including a workshop that took place here in Chicago earlier this week. The final project deliverables will include a White Paper that synthesizes the research done during the project and a project report to the Mellon Foundation.
  5. A large part of the White Paper focuses on what we are currently referring to as the AIMS framework: “A framework for collecting and delivering the born-digital materials that are quickly beginning to constitute the collections of contemporary scholarly, literary, and political figures and organizations.” This is really a high-level look at the tools, strategies, methodologies, and practices needed to effectively manage b-d content.
  6. The framework is characterized by four main functions of stewardship: Collection Development, Accessioning, Arrangement and Description, and Discovery and Access. You’ll notice that we do not include “preservation” as an explicit function here. That is an intentional omission because we believe that preservation is implicit in all of these functions. In addition, aspects such as developing a preservation repository or undertaking preservation activities are outside of this scope because they are larger institutional initiatives. They are mentioned as prerequisites to being able to do work in many steps, but since there are many guidelines out there we didn’t feel the need to reiterate them here. We are going to focus the rest of our presentation this morning on these four areas and share with you some of the work we have done. If you are interested in more on the background of the project, I encourage you to check out our project blog, called Born Digital Archives; I’ll put up a URL for the blog at the end of the presentation.
  7. We are starting our model with activities related to Collection Development. These are the activities undertaken in order to bring material in to the institution. These include activities we may be very familiar with, like prioritizing, developing relationships with creators, doing assessments and negotiating agreements. Within the concept of the AIMS model, which is primarily a hybrid collection environment, this work will be necessary to develop sound capturing and processing activities later.
  8. We’ve defined collection development as having five distinct stages, which I’m going to go over with you this morning: Prerequisites; Establish relationship with donor; Analyze Feasibility; Negotiate Agreements; Prepare for Accessioning.
  9. The first step is going through some prerequisites, like having an appraisal process: how will you assess or evaluate materials? How will you be able to determine value? You also need to evaluate your storage capacity: Do you have enough space to keep this material in both the short and long term? What about future transfers? Do you have a sound data preservation strategy or methodology? One of the most important prerequisites is establishing collection policies. Defining what it is that we want to collect raises a couple of different questions. The first might be what types of material we are interested in, in the traditional collecting sense: prominent people, organizational records, etc. Next, we need to consider what part of those figures’ lives we are collecting. We use our digital devices for private activities as well as more public ones; which are we interested in collecting? The next logical step then is to think about where this information might be on digital devices: stored files probably yes, but do we also need software, operating systems, hardware, internet activity or cloud material? All of these factors, and more, come together in a collection development policy, and it can be very difficult to write, especially when you are just starting and don’t know
  10. Assuming that you have the needed prerequisites in place or have the capacity to work on them, you can move on to the actual work of collection development. The first step is establishing a relationship with the donor. In many ways this is parallel to existing analog work, but when dealing with born-digital materials you should start thinking early about how digital archive staff need to be involved. This is potentially going to be very different from access to physical materials, and now is the time to discuss options. Now is also the time to discuss the creation of the data with the donor and capture any documentation that will help with later processing and access. But how comfortable is your donor with digital concepts and access to digital materials? As an example of the difficulty that this can cause, I’d like to show some work that the AIMS project did in this regard. This is a digital donor survey that the AIMS project created based on one created for the PARADIGM workbook. The original intention was that a donor could fill this out before accessioning. This is the first page, and this is the second, and the third, and the fourth, and this is part two! We quickly realized that this would be overwhelming to potential donors, especially ones who hadn’t really thought much about things like their online persona or email preservation. We changed tactics and now recommend that this survey be used as a prompt sheet for the archivist in an interview.
  11. Such an interview may be part of a program of enhanced curation, something Jeremy Leighton John at the British Library describes as “not only collect[ing] the original archive but add[ing] value to it.” “Enhanced curation” techniques include things like documenting the creator’s workspace with high-resolution digital photography, creating a digital film of an oral history interview with donors about their computers and their computing habits, and perhaps capturing video of screencasts of the donor describing the organization on their computer. This type of information can be invaluable as materials are accessioned and processed, as the level of abstraction or unfamiliarity with a new system can make it difficult to gain intellectual control.
  12. Okay, so you are ready to move on to considering whether or not you even *can* acquire this material or, more likely, whether it is worth the costs. What is the cost analysis and risk analysis? Try a test capture: how does it work? Do you have the needed infrastructure and policies or can you create them? Can you even view files in order to appraise them? Do you need these guys to accomplish this? Or maybe these guys? It’s very easy to say “analyze costs” or “evaluate your home institution infrastructure”, but if you’ve never encountered a particular software or hardware it’s difficult to be prepared for them. This is where having technologists or digital archivists involved early in the process can help. If possible, during a test capture they can do a triage to determine if there are serious preservation concerns, or if any forensic processing might be needed to recover damaged or deleted files, etc.
  13. Moving on then, the next step is negotiating agreements. One of the big problems here is that there is a lack of models for agreements and appraisals. Many elements of standard agreements remain applicable in the hybrid or born-digital archive, but have different implications. It’s not the same to provide unrestricted access to paper documents in a reading room and unrestricted access to digital materials online. Furthermore, you have a much larger potential for capturing and inadvertently exposing sensitive electronic information like financial and health information, passwords and other personal data. The legal agreement with the donor needs to specify: an agreement about copyright, either transferred to the repository/institution or remaining with the creator/heirs; an understanding that the collecting repository will be the “sole” repository of b-d material; an understanding of capabilities/limits for capturing b-d material (currently); an understanding of preservation strategies and capabilities; an understanding of delivery capabilities and limits (current); an understanding of what/how files will be restricted or deleted and how this will be confirmed; an understanding of capabilities/limits of appraisal, viewing, and description/processing of b-d material; and an understanding of the creative process and relationship with b-d materials, computers, hand-held devices, cloud computing, etc.
  14. The final step in collection development is to prepare for accessioning. This may seem a little odd in a traditional sense, but what we are alluding to here is making sure that all of your technical steps for transfer, which may not be in the agreement, are planned ahead of time. Specifically: scope and extent determined; method and time determined; pre-acquisition appraisal performed; test capture if needed; development of new methodologies undertaken as needed/possible; enhanced curation carried out; coordination with acquisition of analog material. This is really the “action” step where many of the activities you have been planning are carried out. Overall, the steps in Collection Development help to set up later activities. By the end of the collection development step, the institution should be ready to take legal and physical custody of material. Doing this in a forward-thinking, planful manner will help later processes go much more smoothly. You’ve made it to the finish line of collection development, but now we need to move on to Accessioning.
  15. Accessioning is generally understood as the set of processes wherein a repository takes physical and legal custody of records from a donor and formally documents, or "registers," the transfer. The processes have clear links to both collection development and arrangement and description, and in some cases, institutions may view them as part of those processes. However, we have situated accessioning as a primary function within the AIMS framework. Within our framework, accessioning serves a vital role in allowing a collecting repository to establish physical, administrative, and intellectual control over records that have been transferred. The accessioning processes allow archivists to gather a wide variety of information that will inform and prioritize other processes, such as arrangement and description, further appraisal, and requirements for access. Accessioning also provides an environment in which archivists can document their actions and ultimately transfer the accessioned records into an environment for their storage and maintenance. The goals of accessioning therefore reflect the need to establish control over, and ensure the authenticity and reliability of, transferred records. Archivists must therefore be diligent during accessioning and understand the potential impact of the actions they take during these processes. If a collecting repository is unable to establish an adequate level of control over transferred electronic records, then it is likely that it has not successfully accessioned them. Accordingly, archivists with "legacy" accessions of electronic records, such as those containing computer media, may want to consider "reaccessioning" those transfers to establish a suitable level of control.
  16. The prerequisites, like the other areas of the AIMS model, broadly fall into several categories; in this case, they are policies, procedures, and infrastructure. There are many policies required to support accessioning properly. These may range from departmental preferences to requirements set at the institutional level. Procedures may account for a number of different options, such as minimal processing, accessioning of born-digital materials with paper records, deferment of digital accessioning, accessioning as resources allow, and retrospective accessioning of previously received electronic records. Infrastructure to support accessioning includes a wide variety of software, hardware, and expertise. This infrastructure will take resources to build, and archivists are urged to consider collaborative partnerships to allow for the better sharing of knowledge. The transfer and administrative control processes in the AIMS framework are very similar to those for other formats of records. Archivists working with electronic records should be familiar with the various types of transfers and their implications. Types of transfers can include receipt of retired media formerly in use by a creator, records copied to media only used for transfer (such as external hard drives, CDs, or DVDs), or a direct transfer using disk imaging software or by copying files across a network. Once the records are under administrative control, archivists should focus their efforts on gaining physical control over records and media. Much of this work concerns identifying and potentially addressing threats and preservation issues in the records, such as viruses, unknown file formats, and the physical condition of media if appropriate. Archivists next need to establish intellectual control and gather documentation that will enable further work necessary to process, maintain, or use the records.
For some transfers, a listing of directories or files may be repurposed for archival description if the existing arrangement appears to be of value. Finally, the archivist should prepare the records to be maintained over time. This may include actions such as normalizing to preservation formats. Ultimately, the records should also be transferred to a secure storage location that can be monitored by the collecting repository.
  17. At Yale University, we have worked on a reaccessioning project that has allowed us to develop our thinking about how accessioning of electronic records could best be realized for us going forward. Two repositories, Manuscripts and Archives and the Beinecke Rare Book and Manuscript Library, have worked in collaboration to implement software, hardware, and procedures that can be shared to support accessioning. In our reaccessioning project, we are working to establish better control over previously transferred accessions that contain electronic records on media such as floppy disks and CD-ROMs. These pieces of media were often received as part of a hybrid accession that also contained paper records, but in some cases we have received accessions of boxes containing only media.
  18. The goals of our reaccessioning project are fairly straightforward and relate to the three types of control discussed previously. First, we seek to establish administrative control of the media by identifying what it is and documenting its physical and logical characteristics and by assigning a unique identifier to each piece. Secondly, we are working towards gaining physical control of the media, which will allow us to mitigate the risks of media deterioration and obsolescence. Finally, we are trying to establish a basic level of intellectual control by extracting metadata about the filesystems and files contained on the media, such as file names, directory structures, and creation, access, and modification dates.
  19. Our reaccessioning workflow roughly looks like the following. We begin by retrieving the media and bringing it to the electronic records workstation, documenting its change in location within the Archivists’ Toolkit. We then assign unique identifiers to each of the media. We establish the best means by which to write-protect the media for imaging and record its identifying characteristics in a media log. We then put the media in the appropriate drive and create a forensic bit-level disk image, which includes all the files, the filesystem metadata, unused space – in other words, the entirety of the data on the media. We verify the image against the raw contents of the media and extract metadata from the disk image. Finally, we package the images and metadata and transfer the package into storage and complete the rest of the documentation.
  20. To acquire the data off media, we are using a forensic imaging process that extracts the entirety of the data off the media at the lowest level possible. To ensure that we do not intentionally or accidentally manipulate any of the data on the original media, we write-protect the media or reader. For floppy disks, we can use physical write-protect tabs. For USB flash media, hard drives, and the like, we connect the drive or reader to a write-blocker, which is a piece of hardware connected to the computer that blocks low-level write signals from the computer. We use a variety of software to acquire the images, such as FTK Imager. The imaging software extracts the data from the media and calculates a cryptographic hash of the data on the media and the data within the image file. If the checksums match, the imaging is viewed as successful.
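The hash comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what FTK Imager does internally: the notes do not say which hash algorithm is used, so SHA-256 is an assumption here, and the function names `sha256_of` and `image_matches_source` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large disk images do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_source(source_path, image_path):
    """True when the source data and the image file hash identically."""
    return sha256_of(source_path) == sha256_of(image_path)
```

In practice the source would be the raw device behind the write-blocker rather than an ordinary file, which this sketch does not attempt to handle.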
  21. This is a screenshot of FTK Imager, which we use to image media and to inspect disk images. You can see that the file listing includes regular files, slack or unused space on the disk, and deleted files, as denoted by the red X on the file icons.
  22. Our media log is a SharePoint list that contains identifying characteristics and physical and logical information about the media, such as the type of media, when it was imaged, the text of a label or writing on the media, and the type of filesystem or filesystems it contains. We assign each piece of media a unique identifier, which is a combination of the accession number and an incremental number. The media log also contains the workflow status of the accessioning process for each piece of media and whether processes succeeded or failed.
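A media log entry of this kind could be modeled roughly as follows. This is a sketch under stated assumptions: the notes do not give the actual identifier format or the SharePoint column names, so the underscore separator, the zero padding, the sample accession number, and every field name below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def media_identifier(accession_number, sequence, width=4):
    """Join an accession number and a zero-padded incremental number."""
    if sequence < 1:
        raise ValueError("sequence numbers start at 1")
    # Separator and padding width are illustrative choices only.
    return f"{accession_number}_{sequence:0{width}d}"


@dataclass
class MediaLogEntry:
    """One row of a media log; field names are hypothetical."""
    identifier: str
    media_type: str
    label_text: str = ""
    filesystems: List[str] = field(default_factory=list)
    imaged_on: Optional[str] = None  # ISO date string once imaged
    status: str = "not started"


entry = MediaLogEntry(
    identifier=media_identifier("2011-M-001", 3),
    media_type="3.5 inch floppy disk",
    label_text="Drafts, chapter 2",
)
```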
  23. The first screenshot is an overview for several pieces of media. You can see the unique media identifiers, the media format, and the workflow status.
  24. This expanded view shows all the fields, including further documentation about the disk image, the filesystem contained, and additional notes.
  25. If imaging is successful, we then extract metadata from the filesystem and files within the image. This is a software-based process that provides metadata such as file names, directory structures, creation and modification times, and approximate categorization of the types of files. This metadata can be repurposed in a variety of ways and provides a basic level of intellectual control that is comparable to a box list or other type of inventory for paper records. We are using open source software such as Sleuthkit and fiwalk to perform this extraction, but occasionally we need to rely on other tools for older or less common types of file systems.
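To give a sense of what such an inventory looks like, here is a minimal Python sketch. It is a swapped-in technique, not the Sleuth Kit or fiwalk approach: those tools read the disk image directly, whereas this sketch assumes the files are already reachable as an ordinary directory tree (for example, an image mounted read-only). The function name `list_file_metadata` and the output keys are hypothetical.

```python
import os
from datetime import datetime, timezone


def list_file_metadata(root):
    """Yield one dict per file under root with path, size, and modified time."""
    for directory, _subdirs, filenames in os.walk(root):
        for name in sorted(filenames):
            full_path = os.path.join(directory, name)
            info = os.stat(full_path)
            yield {
                # Path relative to root, so the listing preserves directory structure.
                "path": os.path.relpath(full_path, root),
                "size_bytes": info.st_size,
                "modified": datetime.fromtimestamp(
                    info.st_mtime, tz=timezone.utc
                ).isoformat(),
            }
```

A listing like this is the kind of output that can stand in for a box list. Note that walking a mounted tree will not surface deleted files or unused space, which the forensic tools can report.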
  26. Finally, we create a transfer package using the BagIt specification as developed by the Library of Congress and the California Digital Library. To create the packages, we are using the Library of Congress-developed Bagger application. These packages contain the disk images, extracted metadata, and logs generated by the disk imaging software during the acquisition process. The BagIt packages also contain high-level information about the accession. For the time being, we are making a rough correspondence of one bag per accession, but we realize we may need to modify this depending on the size of the accessions.
  27. This is an overview of a sample bag, showing the structure and high-level metadata. Once packaged, we transfer the package to storage and verify the success of the transfer using procedures for the BagIt specification which compare the contents of the package against its manifest. If successful, we complete the rest of the documentation and record the success in the media log. We also record the storage location of the transferred package within the Archivists’ Toolkit and add the date of completion.
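The manifest check can be sketched as follows. This is a minimal illustration rather than the Bagger procedure itself: it relies only on the BagIt conventions that payload files live under `data/` and that a `manifest-<algorithm>.txt` file lists one checksum and one relative path per line. The function name `verify_bag` and the choice of SHA-256 as the default algorithm are assumptions.

```python
import hashlib
import os


def verify_bag(bag_dir, algorithm="sha256"):
    """Return a list of problems found; an empty list means the bag verified."""
    problems = []
    manifest_path = os.path.join(bag_dir, f"manifest-{algorithm}.txt")
    listed = {}
    with open(manifest_path, encoding="utf-8") as manifest:
        for line in manifest:
            if line.strip():
                # Each manifest line is "<checksum> <relative path>".
                checksum, relative_path = line.strip().split(None, 1)
                listed[relative_path] = checksum.lower()

    # Every file named in the manifest must exist and hash to the listed value.
    for relative_path, expected in listed.items():
        file_path = os.path.join(bag_dir, *relative_path.split("/"))
        if not os.path.isfile(file_path):
            problems.append(f"missing: {relative_path}")
            continue
        digest = hashlib.new(algorithm)
        with open(file_path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected:
            problems.append(f"checksum mismatch: {relative_path}")

    # Every payload file must also appear in the manifest.
    for directory, _subdirs, filenames in os.walk(os.path.join(bag_dir, "data")):
        for name in filenames:
            full_path = os.path.join(directory, name)
            relative = os.path.relpath(full_path, bag_dir).replace(os.sep, "/")
            if relative not in listed:
                problems.append(f"not in manifest: {relative}")
    return problems
```

Returning a list of problems rather than a single true/false value makes it easier to record in a log which file failed.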
  28. The SAA definition for description puts emphasis on minimizing the amount of handling; it needs to be updated to consider preservation actions due to file format obsolescence, etc.
  29. - reasonable return for our effort for us to describe the ‘project’ and indicative content that we held
  30. What is sensitive will vary from collection to collection. It may be information (social security numbers, personal e-mail addresses, mobile numbers, etc.) - it could also be the discussion behind a decision (Larkin 25 funding).
  31. As a result of our experiences tackling arrangement and description, the AIMS digital archivists defined the requirements for a new tool: - designed to work with technical and professional standards - uses drag'n'drop to create intellectual arrangement, which changes a relationship between digital assets (the asset doesn’t move) using Fedora "sets" - rights & permissions applied to a single file, a discrete series, or the entire collection
  32. “the systems and workflows that make processed or unprocessed material and the metadata that support it available to users.” Discovery and access is also not possible without completion of many of the prior steps described in this model. The outcomes of those steps have a significant impact on what is either appropriate or achievable in terms of discovery and access. Given the impact of these prior steps on discovery and access, it is crucial to consider the desired outcomes for discovery and access as early as possible, ideally during the Collection Development phase, and to continue to update and revise these plans as work on the collection progresses.
  33. Overall though, we have three major goals in discovery and access. The first is to make material available to user communities. This includes ensuring that the users can find the material, understand if it’s available, and get access to it if possible. However, that access must follow guidelines for access restrictions related to privacy and intellectual property. An overarching goal of all three is to ensure that the significant properties of the material are inherent in whichever form the delivery takes.
  34. We plan to deliver the Stephen Jay Gould papers in the Hypatia platform. Hypatia is a Fedora platform. In Hypatia, we have one EAD for the hybrid collection. Series 6 is for born-digital material. We provide a link for people to go to an interface where they can browse and perform full-text search on the born-digital material of the papers.
  35. Facets: Subjects; Teaching materials; Books; Articles; NSF reports
  36. Convert files in obsolete file formats such as WordPerfect to HTML. Otherwise, people have to download the files and either find a viewer or create an emulated environment to view them.
  37. Discovery and access is also not possible without completion of many of the prior steps described in this model. Some institutions accept only certain file formats.
  38. Researchers may also need to bookmark or label the files they found.
  39. In addition to Hypatia, mentioned above, Stanford is also trying to use FTK (the software we use for processing) to deliver born-digital materials. One of the features of FTK that I believe will interest researchers is the ability to generate fuzzy hashes. Files with the same hash are the same in content. What about similar files? A fuzzy hash tells you how close files are. Full-text search. How many characters mis-spelt. Fuzzy hashing is a tool which provides the ability to compare two different files and determine a fundamental level of similarity. This similarity is expressed as a score from 0-100. The higher the score reported, the more similar the two pieces of data. A score of 100 would indicate that the files are close to identical. Alternatively, a score of 0 would indicate no meaningful common sequence of data between the two files.
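To make the 0-100 scale concrete, here is a small Python sketch. It is not fuzzy hashing and not how FTK computes its scores: it compares the two byte sequences directly with the standard library's `difflib`, as a stand-in that produces a score read the same way (100 for matching data, 0 for no common sequence). The function name `similarity_score` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity_score(data_a, data_b):
    """Return a 0-100 similarity score for two byte strings."""
    # autojunk=False keeps frequent bytes from being ignored on longer inputs.
    matcher = SequenceMatcher(None, data_a, data_b, autojunk=False)
    return round(matcher.ratio() * 100)


draft = b"Born-digital archives in collecting repositories"
revision = b"Born digital archive in collecting repositories"
score = similarity_score(draft, revision)
```

Direct comparison like this is slow on large files, which is one reason real fuzzy hashing works from compact signatures instead of the full content.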
  40. I mentioned before that the goal of D&A is to ensure that the significant properties of the material are inherent in whichever form the delivery takes. For design files, I believe a VM is the appropriate platform. I have built a virtual machine containing some design files with the associated fonts. People want to know the exact fonts, font spacing, etc. used. They don’t have the fonts, so even if they download the file, they cannot recreate the appearance of the file. The virtual machine was created using Parallels Desktop.
  41. How to deliver 50,000 emails? I worked with a colleague at Stanford to produce a network graph of 50,000 emails. Name of the network software: Gephi, an open-source software for visualizing and analyzing large network graphs.
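Preparing email data for a network graph could be sketched as below. This is an illustration only, not the process used at Stanford: it assumes each message has already been reduced to a sender and a list of recipients, and it writes a weighted edge table with `Source`, `Target`, and `Weight` columns, a layout that Gephi's spreadsheet import is generally understood to accept. The function names and the sample addresses are hypothetical.

```python
import csv
import io
from collections import Counter


def build_edge_list(messages):
    """Count how often each sender writes to each recipient."""
    weights = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            # Lower-case addresses so the same correspondent is counted once.
            weights[(sender.lower(), recipient.lower())] += 1
    return weights


def edges_to_csv(weights):
    """Render the weighted edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


sample_messages = [
    ("a@example.org", ["b@example.org", "c@example.org"]),
    ("A@example.org", ["b@example.org"]),
]
edge_csv = edges_to_csv(build_edge_list(sample_messages))
```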
  42. I was very lucky to meet a Computer Science candidate at Stanford, Sudheendra Hangal. Email visualization tool for sentiment analysis. Psychology literature is used to define what words constitute happiness, love, etc. Topic analysis using software.
  43. Annotation, see individual email