Is long-term preservation of emails and documents important?
Why should/must items be archived?
What should/must be archived?
How can archiving be done?
2. Agenda
1 Long-term preservation
2 Why should/must items be archived?
3 What should/must be archived?
4 How can archiving be done?
3. Terms
Outsourcing, Filing, Backup, Archiving
• Outsourcing
- Data (e.g. of a specific period) is exported from a source system and
converted if required
- Outsourced data is no longer available in the source system
- Outsourced data can be backed up or archived
- Importing outsourced data may require conversion if the target data
structure is different
• Filing
- Storage of objects in a folder of the file system
- Filed objects can be backed up or archived depending on their file location
4. Terms
Outsourcing, Filing, Backup, Archiving
• Backup
- Copy of existing objects to a storage medium so that data can be restored in
case of data corruption or accidental deletion
- Performed periodically
- Storage media are overwritten over time, so older versions of an object
eventually can no longer be restored
- Old versions of an object can be restored for a specific period only
• Archiving
- Copy of a file or document to an external storage medium
- Standardized file format (TIFF, JPEG) if required
- Storage for a longer period
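The difference between backup and archiving can be illustrated with a small sketch. It assumes a backup scheme that reuses a fixed number of storage media; the class name `BackupRotation`, its methods and the media count are hypothetical and only serve the illustration.

```python
from collections import deque


class BackupRotation:
    """Hypothetical backup scheme that reuses a fixed number of media."""

    def __init__(self, media_count: int) -> None:
        # A deque with maxlen silently drops the oldest entry when a new one
        # is appended, which mimics overwriting the oldest storage medium.
        self._versions = deque(maxlen=media_count)
        self._next_version = 1

    def backup(self, content: bytes) -> int:
        """Store a copy of the object and return its version number."""
        version = self._next_version
        self._versions.append((version, content))
        self._next_version += 1
        return version

    def restore(self, version: int) -> bytes:
        """Return the stored copy, or fail if its medium was overwritten."""
        for stored_version, content in self._versions:
            if stored_version == version:
                return content
        raise LookupError(f"version {version} is no longer on any backup medium")
```

With two media and three backup runs, the first version can no longer be restored. An archive, by contrast, would keep every stored object for the whole retention period.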
5. Terms
Document management vs. Long-term preservation
• Document management
- Management of documents that are still being edited, using check-in,
check-out and versioning
- Documents can be found by attribute value search or full-text search
- Attributes and document links are managed by the document management system (DMS)
- Documents are stored in the file system or in a DMS database
6. Terms
Document management vs. Long-term preservation
• Long-term preservation
- Auditable and unchangeable storage of completed objects for a long period
- Copy of objects (e.g. files, documents) to an external storage medium
- Files and raw data are archived in their original format
- Documents are converted and archived in a standardized format (black and
white: TIFF; colour: JPEG or PDF/A)
- Document lookup via index
- Archived files and raw data can be provided in their original format
- Archived documents can be displayed using viewer software
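The two properties "unchangeable storage" and "document lookup via index" can be sketched as follows. This is a minimal in-memory illustration, not a real archive system: `WriteOnceArchive`, the index key format and the use of a SHA-256 digest as archive ID are assumptions made for the example.

```python
import hashlib


class WriteOnceArchive:
    """In-memory stand-in for an unchangeable archive with an index."""

    def __init__(self) -> None:
        self._objects = {}  # archive ID (SHA-256 digest) -> stored bytes
        self._index = {}    # index key, e.g. an invoice number -> archive ID

    def store(self, index_key: str, content: bytes) -> str:
        """Archive an object once; later changes to the same key are refused."""
        if index_key in self._index:
            raise PermissionError(f"{index_key!r} is already archived and cannot be changed")
        archive_id = hashlib.sha256(content).hexdigest()
        self._objects[archive_id] = bytes(content)
        self._index[index_key] = archive_id
        return archive_id

    def lookup(self, index_key: str) -> bytes:
        """Find an object via the index and verify that it is unchanged."""
        archive_id = self._index[index_key]
        content = self._objects[archive_id]
        # Recompute the digest to detect silent changes to the stored bytes.
        if hashlib.sha256(content).hexdigest() != archive_id:
            raise ValueError("stored object no longer matches its checksum")
        return content
```

Storing a second object under the same index key raises an error instead of overwriting the first one, which is the behaviour an auditable archive needs.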
7. Terms
Long-term preservation
• Digital archiving
- Database-driven, long-term, secure and unchangeable storage of digital
information objects which are reproducible at any time
• Digital long-term preservation
- Storage of digital information for a period longer than 10 years
• Auditable digital archiving
- Storage of business-related digital information in accordance with the
requirements of
  - Handelsgesetzbuch (HGB, German Commercial Code) §§ 239, 257
  - Abgabenordnung (AO, German Fiscal Code) §§ 146, 147, 200
  - GoBS (principles of proper IT-supported accounting systems)
- Secure and orderly storage of business-related documents with retention
periods of six to ten years
8. Why
Sources of documents/objects
• Documents and their lifecycle
- Creation and editing: documents in process (e.g. in a DMS or SharePoint)
- Completed documents: final version of a document
- Additional editing creates a new version
• Other documents
- Correspondence, reports, rules, pictures, films, letters, invoices, quotations,
certificates from different sources
• Workflows
- Information from workflow-based systems (with digital signatures)
- A final document can be created from the related data as the final workflow step
• IT systems
- Raw data is usually available in databases or files
9. Why
Dealing with documents/objects
• Documents
- Documents in process and/or final documents are stored in a DMS, in
SharePoint or on a disk drive (local or network share)
- Documents stored on network shares are backed up automatically
- Documents in SharePoint and emails in Outlook are deleted after the retention
period has expired
- Documents deleted from a network share cannot be restored once the backup
period has been exceeded
- Final documents signed by hand are archived on paper and/or scanned to PDF
and stored as a file (attached to an email)
10. Why
Dealing with documents/objects
• Other documents
- Emails are deleted from the inbox automatically after the retention period has
expired
- Reports, images, films, invoices, quotations, certificates, etc. that are
available as files are considered documents
- Paper documents, e.g. correspondence, letters, certificates, etc., are kept in
paper files
11. Why
Dealing with documents/objects
• Workflow vs. documents
- Information created in workflow systems is stored in databases together with
the digital signature data
- All data of a finalized workflow is usually stored digitally in the database;
a final document can be created from it using a template
- A printout is treated as a copy of the original digital document
- Digitally signed documents are treated the same as documents signed by hand
• IT systems vs. raw data
- Raw data is stored in databases or files which grow over time
- Data can be outsourced or exported to reduce the storage size, but it is then
no longer instantly accessible to the application
- Software manufacturers must guarantee that new releases do not impair the
ability to import outsourced data
12. Why
Legal and regulatory requirements for archiving
• Legal requirements for business documents
- Handelsgesetzbuch (HGB) § 257 regulates which business documents have
to be archived
- The legal retention period is 6 years for business letters and 10 years for
other documents
- Abgabenordnung (AO) §§ 146, 147 describe similar requirements for
administrative regulations
- Digital archiving of those documents must comply with the principles of proper
accounting (GoB) and with GoBS, which describes the requirements for process
documentation
- Process documentation is the proof of correct operation of the system and
describes the overall organizational and technical process of archiving
(collection, indexing, storage, retrieval, protection against loss or corruption,
and reproduction of archived information)
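The retention periods above can be turned into a simple date calculation. The sketch below uses the 6 and 10 years named on the slide; that the period starts at the end of the calendar year in which the document was created is an assumption of this example and should be checked against the applicable legal text. The names `RETENTION_YEARS`, `retention_end` and `may_be_deleted` are hypothetical.

```python
from datetime import date

# Retention periods in years as stated on the slide.
RETENTION_YEARS = {
    "business_letter": 6,
    "other": 10,
}


def retention_end(created: date, document_kind: str) -> date:
    """Return the last day on which the document must still be kept."""
    years = RETENTION_YEARS[document_kind]
    # Assumption: the period starts at the end of the calendar year of
    # creation, so it ends on 31 December of (creation year + years).
    return date(created.year + years, 12, 31)


def may_be_deleted(created: date, document_kind: str, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    return today > retention_end(created, document_kind)
```

Under these assumptions, a business letter created on 10 March 2015 would have to be kept until 31 December 2021.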
13. Why
Legal and regulatory requirements for archiving
- Digitally signed documents are as legally binding as conventional paper
documents
- Requirements differ by country and by the company's line of business
(e.g. the Sarbanes-Oxley Act regarding internal controls)
- Archiving is subject to audits and inspections
14. Why
Legal and regulatory requirements for archiving
• Industry-specific requirements for documentation / archiving
- Gefahrgutverordnung (GGAV, dangerous goods regulation)
- Environmental liability and product liability law
- Operational directives and regulations
- Good Practice quality guidelines and regulations
- etc.
Agree the archiving process with internal departments (QA, Legal, Controlling)
and, where appropriate, with the authorities
15. What
Retention policies for information life-cycle in Outlook and SharePoint
• Recommendations
Outlook folder     Retention period
Inbox              60 days
Other folders      2 years
Sent Items
Drafts
Outbox
Deleted items      7 days
Calendar           2 years
Tasks
Contacts           Duration of employment

Classes in SharePoint    Retention period
Standard                 2 years
Review                   7 years
Long-Term                10 years
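A retention policy like the one recommended here is easiest to maintain as data. The sketch below includes only rows whose period is explicitly stated in days or years; that the period is counted from the date an item was received or last modified is an assumption, as are all function and variable names.

```python
from datetime import date, timedelta

# Only periods that are explicitly stated in the table are included.
OUTLOOK_RETENTION_DAYS = {"Inbox": 60, "Deleted items": 7}
SHAREPOINT_RETENTION_YEARS = {"Standard": 2, "Review": 7, "Long-Term": 10}


def add_years(start: date, years: int) -> date:
    """Add whole years to a date, mapping 29 February to 28 February."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February does not exist in the target year.
        return start.replace(year=start.year + years, day=28)


def outlook_expiry(folder: str, received: date) -> date:
    """Date on which an email in the given folder would be deleted."""
    return received + timedelta(days=OUTLOOK_RETENTION_DAYS[folder])


def sharepoint_expiry(document_class: str, last_modified: date) -> date:
    """Date on which a SharePoint document of the given class would expire."""
    return add_years(last_modified, SHAREPOINT_RETENTION_YEARS[document_class])
```

Anything that must outlive these dates has to be moved to the archive before the automatic deletion takes place.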
16. What
Which documents and data
• Business units determine
- Which documents have to be archived, how, and for how long
(storage form, file plan, retention periods)
- Document classes (logical archive)
- Document types
- Index data
17. What
Requirements
• Requirements for long-term preservation are specified by the
business
- Processes, workflows, interfaces
- Documents, objects, source, meta data
- Archiving period
- Regulatory aspects
- Permissions, roles, user management, responsibilities
- Purpose of archiving (e.g. display of documents in 15 years)
- Confidentiality, data integrity, sensitive data, availability
- Capacity (data volume, number of users, performance)
- etc.
18. What
Meta data
• Meta data provides structured indexing and search capabilities for
archived objects
- Source of meta data (e.g. master data systems)
- Who maintains the master data?
- Should meta data be selected from a list or entered manually?
- Is meta data document-dependent?
- Is meta data transferred automatically from other systems?
- Is an audit trail required (who changed which meta data, when and why)?
Coordinating the meta data at an early stage is highly recommended
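The audit-trail question can be made concrete with a small data structure that records who changed which meta data, when and why. This is a sketch under assumptions: `IndexRecord`, `AuditEntry` and the attribute names in the usage note are hypothetical and not taken from any specific archive product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    """One change to one meta data attribute."""
    user: str
    attribute: str
    old_value: Optional[str]
    new_value: str
    reason: str
    changed_at: datetime


@dataclass
class IndexRecord:
    """Meta data of one archived object plus its change history."""
    document_id: str
    attributes: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def set_attribute(self, name: str, value: str, user: str, reason: str) -> None:
        # Record the previous value before overwriting it, so the history
        # answers "who changed what, when and why".
        entry = AuditEntry(
            user=user,
            attribute=name,
            old_value=self.attributes.get(name),
            new_value=value,
            reason=reason,
            changed_at=datetime.now(timezone.utc),
        )
        self.attributes[name] = value
        self.audit_trail.append(entry)
```

For example, setting a `customer_no` attribute twice leaves the current value in `attributes` and two entries in `audit_trail`, the second of which still holds the first value as `old_value`.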
19. What
Requirements
• If raw data has to be archived
- Raw data is stored as is, bit for bit
- The primary goal is the ability to import raw data as a 1:1 copy of the
original data
- The IT system that generates the raw data must be able to handle imported raw
data even after a long time
- The raw data format must be agreed
- Software manufacturers must guarantee that new releases do not impair the
ability to import outsourced data
- Meta data must be defined
- Processing of long-term preserved raw data is the responsibility of the
generating IT system, not of the archiving system
20. How
Technical aspects
• Selection of eligible file formats
- Should the document be displayed like the original, including embedded graphics?
- Should the original document properties (paper size, font size, header,
footer, logos, colour, hand-written notes, etc.) be reproduced?
- Should documents be archived in different formats but with the same content
(e.g. XML and an image)?
- Are there legal requirements?
- Is "loss of information" acceptable when converting to graphical
representations (JPEG)?
- Is the conversion process audit-proof (revision-safe)?
- Is the archived document format suitable for the archiving period?
21. How
BSI approved formats
• Graphics
- TIFF: storage of rasterized black-and-white images
- JPEG: storage of colour and greyscale images
• Structure formats
- XML: can be used for long-term preservation of digital documents
  Schema and layout have to be archived as well
- PDF/A: subset of PDF, standardized for long-term preservation
  Format with structure and layout information and graphical objects
  Documents must be validated to be PDF/A compliant
22. How
Storage media
• Possible storage media
- Paper
- Microfilm
- Magnetic tapes, floppy disks
- Optical storage media (e.g. CD-R, CD-ROM, DVD, WORM)
- Hard drives
- etc.
The selected media types have a limited lifetime and durability. Objects under
long-term preservation must be copied unchanged to new media when
technology changes in the storage media require it.
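Copying objects "unchanged" to new media can be verified with checksums. The following sketch assumes that both media are reachable as file system paths and uses SHA-256 as the checksum; `sha256_of` and `migrate` are hypothetical helper names.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source: Path, target: Path) -> str:
    """Copy one archived object to new media and verify it is unchanged."""
    expected = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    actual = sha256_of(target)
    if actual != expected:
        # Remove the faulty copy so it cannot be mistaken for a valid one.
        target.unlink()
        raise OSError(f"checksum mismatch after copying {source} to {target}")
    return actual
```

The returned digest can be stored with the index data, so that later migrations can be checked against the value recorded at archiving time.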
23. How
Additional topics
- Storage of sensitive data
- Restart of the archiving system after an outage or disaster
- Integration into the current IT environment
- Migration of archived objects can be expensive, depending on data volume
- User management
- Usage of storage media must be regulated
- Firewall-based separation of the archiving system
- A long-term archiving solution will be in use for a long time; supplier
selection should take this into account
24. How
Pros & Cons
Pros
• Single storage of documents/objects
• Saves storage space
• Documents/objects available to authorized persons
• Documents/objects available from every workplace
• Structured search of documents/objects
Cons
• Usage of source documents must be regulated
• Personnel must be trained (end users, administrators)
• Ongoing maintenance costs
• Complex IT system and IT infrastructure required
25. Do You Have
Any Questions?
We would be happy to help.
http://www.sf-tools.net
Info@sf-tools.net