Thomas Stensitzki discusses the importance of archiving for long-term preservation. He outlines key terms like outsourcing, filing, backup and archiving. He explains the differences between document management and long-term preservation. Stensitzki also discusses why archiving is important from a legal and regulatory perspective to comply with requirements. He describes what types of documents and data should be archived, including metadata requirements. Finally, he covers technical aspects of how archiving can be done, including recommended file formats, storage media options, and other considerations.
3. Terms
Outsourcing, Filing, Backup, Archiving
Outsourcing
- Data (e.g. of a specific period) is being exported from a source system and converted (if required)
- Outsourced data is not available in the source system
- Outsourced data can be backed up or archived
- Importing of outsourced data might require conversion, when the target data structure is different
Filing
- Storage of objects in a folder of the file system
- Filed objects can be backed up or archived depending on their file location
4. Terms
Outsourcing, Filing, Backup, Archiving
Backup
- Copy of existing objects to a storage medium to be able to restore data in the case of data
corruption or accidental deletion
- Performed periodically
- Storage medium is overwritten over time, so older versions of an object cannot be restored
- Old versions of an object can be restored for a specific period only
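The effect of a periodically overwritten backup medium can be illustrated with a minimal sketch. The class name RotatingBackup and the number of slots are hypothetical and only model the behaviour described above; they do not represent any specific backup product.

```python
from collections import deque


class RotatingBackup:
    """Hypothetical model: keeps only the most recent `slots` backup generations."""

    def __init__(self, slots: int) -> None:
        # A deque with maxlen drops its oldest entry when a new one is appended
        # to a full deque, which mimics a storage medium being overwritten.
        self._generations = deque(maxlen=slots)

    def run(self, label: str, content: bytes) -> None:
        """Store one backup generation under the given label."""
        self._generations.append((label, content))

    def restorable(self) -> list:
        """Return the labels of all generations that can still be restored."""
        return [label for label, _ in self._generations]


backup = RotatingBackup(slots=3)
for day in ["mon", "tue", "wed", "thu"]:
    backup.run(day, f"report as of {day}".encode())

print(backup.restorable())  # ['tue', 'wed', 'thu']
```

The Monday version is gone after the fourth run. An archive, in contrast, would keep the object for the full retention period.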
Archiving
- Copy of a file or document to an external storage medium
- Standardized file format (tif, jpg) (if required)
- Storage for a longer period
5. Terms
Document management vs. Long-term preservation
Document management
- Management of documents being edited using Check-In, Check-Out and Versioning
- Documents can be found by attribute value search or full-text search
- Attributes and document links are managed by DMS
- Documents are stored in the file system or a DMS database
6. Terms
Document management vs. Long-term preservation
Long-term preservation
- Auditable and unchangeable storage of completed objects for a long time
- Copy of objects (e.g. files, documents) to an external storage medium
- Files and raw data are archived in original format
- Documents are converted and archived in standardized format (black/white = TIF, colour = JPEG
or PDF/A)
- Document lookup via index
- Archived files and raw data can be provided in original format
- Archived documents can be provided using a viewer software
7. Terms
Long-term preservation
Digital archiving
- Database-driven, long-term, secure and unchangeable storage of digital information objects
which are reproducible at any time
Digital long-term preservation
- Storage of digital information for a period longer than 10 years
Auditable digital archiving
- Storage of digital business-related information in accordance with the requirements of
- Handelsgesetzbuch § 239, § 257 HGB
- Abgabenordnung § 146, § 147, § 200 AO
- GoBS
- Secure and orderly storage of business-related documents with retention periods of six to ten
years
8. Why
Sources of documents/objects
Documents, lifecycle of documents
- Creating and editing documents: in process (e.g. DMS, SharePoint)
- Completed documents: final version of a document
- Additional editing creates new version
Other documents
- Correspondence, reports, rules, pictures, films, letters, invoices, quotations, certificates from
different sources
Workflows
- Information from workflow based systems (with digital signatures)
- Final document can be created from related data as the final workflow step
IT systems
- Raw data is usually available in databases or files
9. Why
Dealing with documents/objects
Documents
- Documents in process and/or final documents are stored in DMS, SharePoint or a disk drive (local
or network share)
- Documents stored on network shares are backed up automatically
- Documents in SharePoint and emails in Outlook are deleted after retention period has expired
- Deleted documents on a network share cannot be restored after the backup period has expired
- Final documents signed by hand are archived in paper and/or scanned to PDF and stored as file
(attached to an email)
10. Why
Dealing with documents/objects
Other documents
- Emails are deleted from the inbox automatically after retention period has expired
- Reports, images, films, invoices, quotations, certificates, etc. available as files are considered
documents
- Documents in paper, e.g. correspondence, letters, certificates, etc. are stored in files
11. Why
Dealing with documents/objects
Workflow vs. documents
- Information created in workflow systems is stored with data of digital signatures in databases
- All data of a finalized workflow is stored digitally within the database (usually), final document can be
created using a template
- Print-out is treated as a copy of the original digital document
- Digitally signed documents are treated equally to documents signed by hand
IT systems vs. raw data
- Raw data is stored in databases or files which grow over time
- Data can be outsourced or exported to reduce the storage size, but the data is not instantly
accessible for the application
- Software manufacturers must guarantee that release changes do not impact the capability to import
outsourced data
12. Why
Legal and regulatory requirements for archiving
Legal requirements for business documents
- Handelsgesetzbuch (HGB) § 257 regulates which business documents have to be archived
- Legal retention period for business letters is 6 years, for other documents 10 years
- Abgabenordnung (AO) §§ 146, 147 describe similar requirements for administrative regulations
- Digital archiving of those documents must comply with the principles of proper accounting (GoB)
and GoBS which describe the requirements for process documentation
- Process documentation is the proof of correct operation of the system and describes the overall
organizational and technical process of archiving (collection, indexing, storage, retrieval,
protection against loss / corruption and reproduction of archived information)
13. Why
Legal and regulatory requirements for archiving
- Digitally signed documents are as legally binding as conventional paper documents
- Each country has different requirements depending on the business of the company (e.g.
Sarbanes-Oxley Act regarding internal controls)
- Subject to audits and inspections
14. Why
Legal and regulatory requirements for archiving
Industry-specific requirements for documentation / archiving
- Gefahrengutverordnung (GGAV)
- Environmental liability and product liability law
- Operational directives and regulations
- Good Practice quality guidelines and regulations
- etc.
Agree on the archiving process with internal departments (QA, Legal, Controlling) and, where
appropriate, with the authorities
15. What
Retention policies for information life-cycle in Outlook and SharePoint
Recommendations
Outlook folder                               Retention period
Inbox                                        60 days
Other folders, Sent Items, Drafts, Outbox    2 years
Deleted items                                7 days
Calendar, Tasks                              2 years
Contacts                                     Duration of employment
Class in SharePoint                          Retention period
Standard                                     2 years
Review                                       7 years
Long-Term                                    10 years
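The recommended retention periods can be expressed as a simple lookup. The following is a minimal sketch, assuming the period is counted from the creation or receipt date (the slide does not state the starting point). The dictionary and function names are hypothetical and are not part of Outlook or SharePoint.

```python
from datetime import date, timedelta

# Values taken from the recommendations above; the names are illustrative only.
SHAREPOINT_RETENTION_YEARS = {"Standard": 2, "Review": 7, "Long-Term": 10}
OUTLOOK_RETENTION_DAYS = {"Inbox": 60, "Deleted items": 7}


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return start.replace(year=start.year + years, day=28)


def sharepoint_expiry(created: date, doc_class: str) -> date:
    """Assumed rule: retention starts on the creation date of the document."""
    return add_years(created, SHAREPOINT_RETENTION_YEARS[doc_class])


def outlook_expiry(received: date, folder: str) -> date:
    """Assumed rule: retention starts on the date the item was received."""
    return received + timedelta(days=OUTLOOK_RETENTION_DAYS[folder])


print(sharepoint_expiry(date(2020, 2, 29), "Review"))  # 2027-02-28
print(outlook_expiry(date(2024, 1, 1), "Inbox"))       # 2024-03-01
```

Keeping the periods in one table-like structure makes it easier to agree on them with the business units and to change them later without touching the logic.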
16. What
Which documents and data
Business units determine
- Which documents have to be archived, how, and for how long
(storage form, file plan, retention periods)
- Document classes (logical archive)
- Document types
- Index data
17. What
Requirements
Requirements for long-term preservation are specified by the business
- Processes, workflows, interfaces
- Documents, objects, source, meta data
- Archiving period
- Regulatory aspects
- Permissions, roles, user management, responsibilities
- Purpose of archiving (e.g. display of documents in 15 years)
- Confidentiality, data integrity, sensitive data, availability
- Capacity (data volume, number of users, performance)
- etc.
18. What
Meta data
Meta data provides structured index and search capabilities to archived objects
- Source of meta data (e.g. master data systems)
- Who maintains the master data?
- Shall meta data be selected or manually entered?
- Is meta data document-dependent?
- Is meta data transferred automatically from other systems?
- Is an audit-trail required? (Who has changed which meta-data, when, why)
Coordination of the meta data in early stages is highly recommended
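The audit-trail question (who has changed which meta data, when, why) can be made concrete with a small sketch. The class and field names below are hypothetical and only illustrate one possible structure. A real archiving system defines its own data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    """One immutable record: who changed which meta data, when, and why."""
    changed_by: str
    changed_at: datetime
    field_name: str
    old_value: Optional[str]
    new_value: str
    reason: str


@dataclass
class ArchiveMetadata:
    """Hypothetical index record for one archived object."""
    object_id: str
    values: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def set_value(self, name: str, value: str, user: str, reason: str) -> None:
        # Record the change first, then apply it, so every update is traceable.
        entry = AuditEntry(
            changed_by=user,
            changed_at=datetime.now(timezone.utc),
            field_name=name,
            old_value=self.values.get(name),
            new_value=value,
            reason=reason,
        )
        self.audit_trail.append(entry)
        self.values[name] = value


meta = ArchiveMetadata(object_id="INV-0001")
meta.set_value("customer", "ACME", user="jdoe", reason="initial indexing")
meta.set_value("customer", "ACME GmbH", user="asmith", reason="name corrected")

for entry in meta.audit_trail:
    print(entry.changed_by, entry.field_name, entry.old_value, "->", entry.new_value)
```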
19. What
Requirements
If raw data has to be archived
- Raw data is stored as is, bit-wise
- Primary goal is the ability to import raw data as 1:1 copy of the original data
- IT system generating raw data must be able to handle imported raw data even after a long time
- Format of raw data must be coordinated
- Software manufacturers must guarantee that release changes do not impact the capability to import
outsourced data
- Meta data must be defined
- Processing of long-term preserved raw data is the responsibility of the generating IT system,
not of the archiving system
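Whether raw data is still a 1:1 copy of the original can be checked by comparing checksums. The sketch below is one common approach and not a requirement stated in this deck. SHA-256 and the function names are assumptions chosen for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large raw data files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_identical_copy(original: Path, restored: Path) -> bool:
    """True if both files have the same SHA-256 digest."""
    return sha256_of(original) == sha256_of(restored)


# Usage example with temporary files standing in for raw data exports.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "raw_2015.dat"
    copy_ok = Path(tmp) / "raw_2015_archived.dat"
    copy_bad = Path(tmp) / "raw_2015_damaged.dat"
    source.write_bytes(b"\x00\x01\x02 measurement data")
    copy_ok.write_bytes(source.read_bytes())
    copy_bad.write_bytes(b"\x00\x01\x03 measurement data")

    same = is_identical_copy(source, copy_ok)
    damaged = is_identical_copy(source, copy_bad)
    print(same, damaged)  # True False
```

Storing the checksum as part of the meta data at archiving time allows the same check to be repeated after every media migration.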
20. How
Technical aspects
Selection of eligible file formats
- Should the document be displayed as original incl. embedded graphics?
- Should the original document properties (paper size, font size, header, footer, logos,
color, hand-written notes, etc.) be reproduced?
- Should documents be archived in different formats but with same content (e.g. XML and graphic)?
- Legal requirements?
- Is “loss of information” acceptable when converting into graphical representations (jpeg)?
- Is the conversion process revision-safe?
- Is the archived document format suitable for the archiving period?
21. How
BSI approved formats
Graphics
- TIFF, storage of screened black-white images
- JPEG, storage of colour and gray scale images
Structure formats
- XML, can be used for long-term preservation of digital documents
Schema and layout have to be archived as well
- PDF/A, subset of PDF, standardized for long-term preservation
Format with structure and layout information and graphical objects
Documents must be validated to be PDF/A compliant
22. How
Storage media
Possible storage media
- Paper
- Microfilm
- Magnetic tapes, floppy disks
- Optical storage media (e.g. CD-R, CD-ROM, DVD, WORM)
- Hard drives
- etc.
The selected media types have a limited lifetime and durability. When storage
technology changes, long-term preserved objects must be copied unchanged to
new media.
23. How
Additional topics
- Storage of sensitive data
- Restart of the archiving system after system outage in a disaster
- Integration in current IT environment
- Migration of archived objects is expensive depending on data volume
- User management
- Usage of storage media must be regulated
- Firewall based separation of archiving system
- A long-term archiving solution should remain in use for a long time; supplier selection should
take this into account
24. How
Pros & Cons
Pros
- Single storage of documents/objects
- Saves storage space
- Documents/objects available to authorized persons
- Documents/objects available from every workplace
- Structured search of documents/objects
Cons
- Usage of source documents must be regulated
- Personnel must be trained (end-user, administrator)
- Ongoing maintenance costs
- Complex IT system and IT infrastructure required
25. We would be happy to help.
Do You Have
Any Questions?
http://www.granikos.eu
info@granikos.eu
@Granikos_DE