Catherine Jones
Science and Technology Facilities Council
DPC Advanced Practitioners Course
University of Glasgow, 17th July 2013
Digital Preservation Policy
Why is it needed for SCAPE watch and planning
tools?
What is policy?
• Policy is the written representation of the aims and
objectives of an organisation.
• It sets the environment for all other activities being
undertaken.
• It is influenced by many things: political,
environmental, technical, financial and legal issues.
• It can be hard to make policy in a new & developing
area – such as Digital Preservation
What is digital preservation policy?
• The organisation’s aims and objectives about the
long term care of digital objects:
• Preservation strategies and acceptable actions
• Decisions about the digital objects (formats, significant
properties, etc.)
• Who the material is being preserved for
• Resourcing
• Responsibilities
Part of wider policy landscape
• IT infrastructure policy
• Digital preservation policy
• Organisational resourcing policy
• Collection management policy
The role of policy in planning and watch
SCAPE Policy Levels
Guidance
• High level
• General objectives
• Applies to all parts of the organisation and collections
• Written in natural language to be read by a human being
Preservation Procedure
• More detailed level
• General approaches
• Written in natural language to be read by a human being
Control
• Specific, measurable objectives
• Applies to specific collections or formats
• In two forms: natural language and machine-readable form (RDF)
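As a rough illustration of the two forms a control-level statement takes, the Python sketch below holds one objective as natural-language text and also writes it out as RDF in N-Triples syntax. The `http://example.org/policy#` namespace and every term in it are placeholders invented for this sketch; they are not the SCAPE vocabulary.

```python
# Hypothetical namespace: these URIs are placeholders, not the SCAPE vocabulary.
EX = "http://example.org/policy#"


def iri(local_name: str) -> str:
    """Wrap a local name in the placeholder namespace as an N-Triples IRI."""
    return f"<{EX}{local_name}>"


def literal(text: str) -> str:
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialise (subject, predicate, object) terms, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Natural-language form of one control statement.
statement_text = "The file format MUST have documentation"

# Machine-readable form of the same statement (all names are assumptions).
triples = [
    (iri("objective1"), iri("appliesToContentSet"), iri("exampleCollection")),
    (iri("objective1"), iri("measure"), iri("formatDocumentationAvailable")),
    (iri("objective1"), iri("requiredValue"), literal("true")),
    (iri("objective1"), iri("description"), literal(statement_text)),
]

print(to_ntriples(triples))
```

Keeping the natural-language text as a literal next to the measurable property means a human reader and a tool can both work from the same record.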
Guidance policy
• This will be at a high level that a Director of an
organisation would understand. Topics:
• Preservation goals & strategies of an organisation
• Designated Community/Stakeholders
• Digital Objects
• Metadata
• Authenticity
• Rights
• Standards
• Organisation
• Storage
Preservation Procedure
• Preservation Procedure: a natural-language, human-readable
policy which may encompass the whole organisation or
focus on a particular collection or material type,
depending on the needs of the organisation
• The SCAPE outcome in this area will be information and
guidance on how to construct this level of policy and which
factors to take into consideration when composing it, for
areas of particular interest in watch and planning.
The list of suitable data formats for digital preservation will be based on the following criteria:
• Openness of the format: Is the format well described and is documentation available? Is
the format subject to any patents? Is a licence or permission required to use the format?
• Distribution of the format: Is the format widely used? Will many programs be able
to understand the format?
• Error tolerance of the format: Will a single bit error make the whole file unreadable? Has
the format been compressed (lossless or lossy data compression)?
• Acceptance of the format as a preservation format: How is the format evaluated on
corresponding lists of recommended formats?
• Dependency of the format on external sources of information, for example fonts or pictures
with external references.
• Ability of the format to embed data in other formats, for example embedding of video in a
PDF file.
Based on these criteria the owner of the digital collections can add a data format to the list as
“Recommended” or “Accepted”.
Control
• These are statements derived from the Preservation
Procedure level, which are in both a human-readable and
machine-readable form.
• The model links a particular content set (collection) and a
particular user community (specific requirements) to
specific, measurable objectives which can be tested
automatically
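The model described above can be pictured as a small data structure. The following Python sketch is illustrative only: the class and field names (`ControlPolicy`, `Objective`, `measure`, `test`) are assumptions made for this example, not SCAPE's actual model.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Objective:
    """One specific, measurable objective (names are illustrative)."""
    description: str               # natural-language form
    measure: str                   # name of the measured property
    test: Callable[[Any], bool]    # automatic test on the measured value


@dataclass
class ControlPolicy:
    """Links a content set and a user community to testable objectives."""
    content_set: str
    user_community: str
    objectives: List[Objective] = field(default_factory=list)

    def evaluate(self, measurements: Dict[str, Any]) -> Dict[str, bool]:
        """Test every objective; a missing measurement counts as a failure."""
        results = {}
        for obj in self.objectives:
            if obj.measure in measurements:
                results[obj.description] = obj.test(measurements[obj.measure])
            else:
                results[obj.description] = False
        return results


policy = ControlPolicy(
    content_set="example collection",
    user_community="example user community",
    objectives=[
        Objective("Format must be documented", "format_documented",
                  lambda value: value is True),
    ],
)

print(policy.evaluate({"format_documented": True}))
```

The same content set can carry a different `ControlPolicy` for each user community, since each community may bring its own requirements.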
Creating Control policy statements
Stage 1: Whole policy activities
1. Identify the content set the policy addresses
2. Identify the user communities/roles required by the policy
3. Map policy statements to high level concepts.
Stage 2: Policy statements within the whole policy
1. Clarification of implicit meaning
2. Identification of control policy preservation case
3. Identification of objectives
4. Generate control statements
Stage 3: Review the Preservation Cases and identify any
rationalisation required
This work was partially supported by the SCAPE Project.
The SCAPE project is co‐funded by the European Union under FP7 ICT‐2009.4.1 (Grant Agreement number
Worked Example
“3.1.1 All raw data will be curated in well-defined formats for which the means of
reading the data will be made available by the Facility”
Express some of the implicit information and rewrite to:
• “All data curated will be in well-defined formats”
• “Approved well-defined formats will be able to be read”
• “The reader will be supplied by at least the ISIS Facility”
Also need to express what “curated” means
Goals/Objectives:
1. The file format must be an approved format for the content set
2. The file format should have documentation
3. Any instrument specific schema should be documented
4. There should be at least one piece of software which can read the files
5. This file reader should be available from the organisation holding the data
6. This file reader should be able to be used by the designated user
community
7. The file format should be able to be validated
8. Fixity checks should be undertaken
Using the content set “2011 LET Calibration” and a user community of domain-specific
researchers:
i. The file reader MUST be available to the designated user community
Using the content set “2011 LET Calibration” and a user community of ISIS data
managers:
i. File format MUST be NeXus
ii. The file format MUST have documentation
iii. Any instrument-specific schema MUST be documented
iv. NeXus file reader software available > 1
v. NeXus file reader MUST be located at STFC
vi. The file format MUST be able to be validated
vii. Fixity checks MUST be able to be undertaken
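The control statements for the ISIS data managers can be turned into automatic checks along the following lines. This is a minimal Python sketch, not the SCAPE tooling: the property names (`format`, `reader_count` and so on) and the sample measurements are invented for illustration. Statement iv is written on the slide as "> 1", while objective 4 asks for at least one reader; the sketch assumes the latter.

```python
# Each control statement becomes (label, property name, test on the value).
# Property names are hypothetical; a real system would take them from a
# characterisation or watch tool.
CONTROL_STATEMENTS = [
    ("i. File format MUST be NeXus", "format", lambda v: v == "NeXus"),
    ("ii. The file format MUST have documentation", "format_documented", bool),
    ("iii. Any instrument specific schema MUST be documented", "schema_documented", bool),
    # Assumption: "at least one" reader, following objective 4.
    ("iv. NeXus file reader software available", "reader_count", lambda v: v >= 1),
    ("v. NeXus file reader MUST be located at STFC", "reader_location", lambda v: v == "STFC"),
    ("vi. The file format MUST be able to be validated", "can_validate", bool),
    ("vii. Fixity checks MUST be able to be undertaken", "can_check_fixity", bool),
]


def check(measurements):
    """Return the labels of control statements that the measurements violate."""
    failures = []
    for label, prop, test in CONTROL_STATEMENTS:
        # A property that was never measured is treated as a violation.
        if prop not in measurements or not test(measurements[prop]):
            failures.append(label)
    return failures


# Invented sample measurements for a file in the 2011 LET Calibration content set.
sample = {
    "format": "NeXus",
    "format_documented": True,
    "schema_documented": True,
    "reader_count": 2,
    "reader_location": "STFC",
    "can_validate": True,
    "can_check_fixity": False,
}

print(check(sample))
```

With the invented sample above, only statement vii is reported as violated. Keeping the statements in a table makes it straightforward to hold a different set for each user community.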
Conclusion
• Having explicit policy in natural language is important
• Expressing policy in machine testable ways is more
complex but can bring benefit through use of tools
• Points to note:
• Natural-language preservation procedure policy defines
acceptable states as statements, whereas the control level defines
measurable attributes as questions
• Written policy is at a fairly abstract level; practicalities may be
addressed in an implementation plan, job procedure document or
one-off project plan
• Implicit information that a human audience understands needs to be
expressed explicitly for computers
Thank you
• Partners in the work package are Barbara Sierman
(KB & lead); Gry Elstrom (SB); Sean Bechhofer
(University of Manchester) and Catherine Jones
(STFC)
• Any further questions about SCAPE policy
Catherine.jones@stfc.ac.uk
