Websites often have a lifespan of only 3 to 5 years. With every relaunch
and overhaul we face content migration and short-term incentives to delete content that may be
valuable. But what is the value of our content? Can we assess it
meaningfully? Do we really know in which context it is used?
Scientists have stated that while we produce more and more digital artifacts, we fail to
preserve them in a way that will let us find and use them more
than a few years from now.
This talk introduces aspects of digital preservation, with a special look at how TYPO3 is
preparing to help its users create a digital heritage.
This talk is part of the ForgetIT project ("Concise Preservation by combining Managed Forgetting and
Contextualized Remembering"). The ForgetIT project is funded by the EC within the
7th Framework Programme under the objective "Digital Preservation" (GA 600826).
4. Agenda
Some questions for the attendees in the room
Digital Dark Age - What is the problem?
Why preservation is valuable
How to preserve
The ForgetIT project
Outlook for TYPO3 CMS
Q&A
Friday, 31 May 2013
5. Olivier Dobberkau
CEO and founder of dkd
45 years old
TYPO3 “Reverend Neverend”
Member of the EAB TYPO3 Association
@T3RevNeverEnd
http://www.dkd.de
6. Disclaimer!
I am not a data curator or preservation
specialist :-)
8. How old is your website?
How old is your website?
When did you last make a backup?
Are you sure you are keeping the right stuff?
And will it still work 5 years from now?
11. 300 Funston Avenue
The Internet Archive
founded in 1996 by Brewster Kahle
more than 2 petabytes of data
growing at 20 terabytes per month
http://archive.org/about/faqs.php
13. 77 days
75 days is the average lifetime of a website.
Source:
http://m.guardian.co.uk/technology/2013/apr/26/brewster-kahle-internet-archive
14. What is the problem?
A closer look into the digital dark age of websites
15. Digital Dark Age
Wikipedia says:
"The digital dark age is a possible future situation
where it will be difficult or impossible to read
historical electronic documents and multimedia,
because they have been stored in an obsolete and
obscure file format."
http://en.wikipedia.org/wiki/Digital_dark_age
16. Digital Dark Age
first mentioned in 1997
all digitally produced data is subject to it
problems arise from different angles
storage medium (disks, tapes, DVDs, etc.)
format of the data
availability of the software and operating
systems
possible encryption
17. One example
NASA Viking Mars landing 1976
Magnetic tapes in 1976
Format was not documented
Programmers left or died
Only through extensive reverse engineering
was NASA able to extract the images
18. Websites: a soon-to-be-extinct species?
The risk of a digital dark age also applies to the
websites we create and maintain
Risk factors we see
Relaunch from scratch
Technical standards change
Browser usage (aka Browser wars)
Marketing expectations
The jungle we create in our daily work
...
20. Why is preservation valuable?
Preservation is well established in memory
institutions such as national libraries and archives
Still in infancy in most other organizations
Preservation is perceived as a "strategic" rather than
an "operational" goal
Preservation is sometimes done only because of
legal requirements
Quick wins are hard to come by
21. Why is preservation valuable?
Digital data is your organization's raw material of
the future
Everyone is going "Big Data"
Preservation helps you achieve sustainability
within your organization
22. How to preserve?
There is no silver bullet for preservation in
organisations. Preservation is a long-term strategic
goal.
27. ForgetIT Project
Consortium of 10 partners
funded by the EC
started in 2013
3 years of research & development
http://www.forgetit-project.eu/
The ForgetIT project is funded by the EC within
the 7th Framework Programme under the
objective "Digital Preservation" (GA 600826).
28. 3 Concepts of ForgetIT
Managed Forgetting
Contextualized Remembering
Synergetic Preservation
29. Managed Forgetting
Managed Forgetting models resource selection as
a function of attention and significance
dynamics.
It is inspired by the important role of forgetting in
human memory and focuses on characteristic
signals of reduction in salience. For this purpose it
relies on multi-faceted information assessment
and offers customizable preservation options
such as full preservation, removing of
redundancy, resource condensation, and also
complete digital forgetting.
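To make the idea more tangible, here is a minimal Python sketch of selecting a preservation option from attention and significance. It is only an illustration under stated assumptions, not the ForgetIT model: the `Resource` fields, the exponential attention decay with a 180-day half-life, the equal weighting, and the thresholds 0.6 and 0.3 are all hypothetical. Only the four option names come from the slide.

```python
from dataclasses import dataclass
from enum import Enum


class PreservationOption(Enum):
    # The four options named on the slide.
    FULL_PRESERVATION = "full preservation"
    REMOVE_REDUNDANCY = "removing of redundancy"
    CONDENSE = "resource condensation"
    FORGET = "complete digital forgetting"


@dataclass
class Resource:
    # All fields are hypothetical, chosen for illustration only.
    name: str
    attention: float          # score in [0, 1] at the time of last access
    significance: float       # score in [0, 1], e.g. assigned by an editor
    days_since_access: int
    has_duplicate: bool = False


def decayed_attention(resource: Resource, half_life_days: float = 180.0) -> float:
    """Assumed model: attention halves every `half_life_days` without access."""
    return resource.attention * 0.5 ** (resource.days_since_access / half_life_days)


def preservation_value(resource: Resource) -> float:
    """Assumed equal weighting of decayed attention and significance."""
    return 0.5 * decayed_attention(resource) + 0.5 * resource.significance


def choose_option(resource: Resource) -> PreservationOption:
    """Map a resource to one of the preservation options (thresholds are assumptions)."""
    if resource.has_duplicate:
        return PreservationOption.REMOVE_REDUNDANCY
    value = preservation_value(resource)
    if value >= 0.6:
        return PreservationOption.FULL_PRESERVATION
    if value >= 0.3:
        return PreservationOption.CONDENSE
    return PreservationOption.FORGET


if __name__ == "__main__":
    pages = [
        Resource("annual report", attention=1.0, significance=1.0, days_since_access=0),
        Resource("event recap", attention=0.8, significance=0.4, days_since_access=180),
        Resource("old teaser", attention=0.2, significance=0.1, days_since_access=720),
    ]
    for page in pages:
        print(page.name, "->", choose_option(page).value)
```

In a CMS, scores like these would have to come from real signals such as access statistics or editorial ratings; the sketch only shows how such signals could be combined into a preserve-or-forget decision.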
30. Contextualized Remembering
Contextualized Remembering targets keeping
preserved content meaningful and useful.
It will be based on a process of dynamic
evolution-aware contextualization, which
combines context extraction and packaging with
evolution detection and intelligent
recontextualization.
31. Synergetic Preservation
Synergetic Preservation crosses the chasm that
exists between active information use and
preservation management by making intelligent
preservation processes an integral part of the
content lifecycle in information management and
by developing solutions for smooth bi-directional
transitions.
32. Expected Outcomes
Foundations and Models
Approaches for managed forgetting,
contextualized remembering and joint model
for synergetic preservation
Algorithms and methods
preservation-oriented summarization and
aggregation
multifaceted information assessment methods
evolution-aware contextualization and
recontextualization
33. Expected Outcomes
Infrastructure and services
Flexible and extensible Preserve-or-Forget
framework, providing an extensible and
adaptable set of services for extending
information management solutions with
intelligent preservation management
34. Expected Outcomes
Application pilots
Personal preservation focusing on multimedia
coverage of personal events
Organizational preservation focusing on
smooth preservation in organizational content
management
35. Expected Outcomes
Best Practices & Adoption Blueprints
Understand opportunities and barriers for
personal preservation
Form guidelines for offering personal
preservation as a service
36. Partners in the ForgetIT Project
Centre for Research and Technology Hellas
dkd Internet Service GmbH
Deutsches Forschungszentrum für Künstliche
Intelligenz GmbH
EURIX Srl
Gottfried Wilhelm Leibniz Universität Hannover
IBM Israel - Science and Technology Ltd
37. Partners in the ForgetIT Project
Luleå Tekniska Universitet
The Chancellor, Masters and Scholars of the
University of Oxford
The University of Edinburgh
The University of Sheffield
Turk Telekomunikasyon AS
39. Working on the following
Content Dashboard
Metadata Directory
Semantic Layer
ForgetIT Backend Module
Feedback & Conflicts Module
Recycling & Inducing Module
CMIS integration & transposing
40. Open to the TYPO3 Community
We are open to the TYPO3 Community
We want to raise awareness of
preservation
We will publish our modules under open source
licenses
Want to stay informed?
http://www.forgetit-project.eu/
41. Slides will be available at
http://de.slideshare.net/olivierdo
Contact
Follow me on Twitter: @T3RevNeverEnd
email: olivier.dobberkau@dkd.de
de.linkedin.com/in/olivierdobberkau/