T Bahr M Lindlar Goportis Digital Preservation Pilot
1. The Goportis Digital Preservation Pilot Project
Experiences made, lessons learned
Michelle Lindlar and Thomas Bähr
Future Perfect 2012
March 26th 2012, Wellington – New Zealand
2. The Goportis Pilot project
Conducted from October 2009 – October 2011
Goportis consists of the three German national subject libraries:
the German National Library of Science and Technology (TIB)
the German National Library of Medicine (ZB MED)
the German National Library of Economics (ZBW)
Goal: To determine and evaluate technological, institutional and
organisational needs for a cooperatively operated digital
preservation system.
Cooperatively operated means that all partners
can work equally in the system.
3. „Building a digital preservation system“
Pyramid diagram, from top to bottom:
dps
processes, workflows
hardware, software, people
organisation, mandate
4. „Pyramid foundation“
Organisation
• type: library, archive, research institute, …
• level: national, state, university, …
• size: holdings, staff, budget, users, …
• defines your (national) position!
Mandate
• Given by: act/law, superordinate organization/institution, self-given, …
• For content: (sub)collection level, type of content, …
• Including action: collecting, archiving, making available, …
• defines your (national) role!
5. A distributed national (research) library system
national bibliography
- German National Library
- Sammlung Deutscher Drucke („Collection of German prints“): 6 libraries
national research literature and information supply
- German National Subject Libraries
- Sondersammelgebiete („special subject collections“): 33 libraries
6. Goportis
TIB
Subjects: engineering, architecture, information technology, chemistry, mathematics, physics
Staff: 212
Holdings: 8.9 million units
ZB MED
Subjects: medicine, nutrition, environmental science, agricultural science
Staff: 122
Holdings: 1.6 million units
ZBW
Subject: economics
Staff: 239
Holdings: 4.4 million units
different technical infrastructures (e.g. repositories, cataloguing systems)
different digital collections (e.g. AV, 3D objects)
7. Pyramid foundation – lessons learned
One for all or all for one?
Partner oriented model: almost identical organizations / mandates
Service oriented model: different forms of organizations / mandates
More partners mean more complexity (communication, documentation, methods of operation)
Think about hierarchy within your institution
- digital preservation is a cross-functional task and an organisational change process
- during the implementation phase it is beneficial to position digital preservation in the hierarchy as close to library management as possible
A permanent position of digital preservation within an institution will have to be found post-pilot
8. Pyramid 1st floor
Hardware / Infrastructure
• Central or decentralized
• Open infrastructure?
• Scalability and reliability
Software
• System or service
• Custom-built or off-the-peg
• Commercial or open source
People
• Size and structure
• Qualifications / know-how
• Outsourcing possible?
9. System Choice: System vs. Service
System
Advantages:
- control over your data
- decision regarding actions
- institutional/organizational needs can be met (flexibility)
Drawbacks:
- time from project start to roll-out
- cost of hardware / software
- more staff needed
Service
Advantages:
- low staff costs
- no hardware / software cost
- time from project start to roll-out
Drawbacks:
- no control over data
- actions based on service provider decisions
- access only in pre-defined cases
10. System choice: Off-the-peg vs. Custom build
Custom build
Advantages:
- licensing cost low
- modularity
- quality
- transparency
- community
Drawbacks:
- integration and development costs
- time from project start to roll-out
- support
- ongoing IT costs for development
Off-the-peg
Advantages:
- lower IT resources (development)
- continued development
- time from project start to roll-out
- central end-to-end system
- support/service
Drawbacks:
- licensing cost
- integration of other systems
- dependency on company
- drawbacks in fulfillment of specific needs
11. Pyramid 1st level – lessons learned „software“
Is the software ready for you?
high value of a user community
system is close to user needs – but are the user needs YOUR needs?
high value of a flexible system (in regards to configuration, integration points, …)
clear exit scenario has to be defined
12. People – qualification / know-how
strategic direction: Goportis Steering committee
operative level:
- project management / preservation specialists (1 FTE from each library)
- library specialists (2 from each library temporarily)
- IT specialists (1 FTE for administration, 1 from each library temporarily for implementation)
13. People – know-how
Preservation specialists
- Excellent understanding of formats, preservation procedures, risks, …
- Good understanding of workflow procedures
- Basic understanding of IT procedures
Library staff
- Good understanding of digital preservation
- Experts for one or more workflows
- Experts for descriptive metadata (DC, MARC, MAB, …)
IT specialists
- Good understanding of digital preservation
- Programming skills
- Database expert
14. Pyramid 1st level – lessons learned
Are you ready for digital preservation?
„Know-how“ is a continuous process
three pillars of knowledge: library processes, IT, preservation practice
„spread the word“ within your institution!
„community sourcing“
15. Pyramid – 2nd level
Processes
• Specific tasks within your institution related to preservation
• Organizational process
• Technological process
• Can involve humans and/or systems
• Community building
Workflows
• Combination of tasks/processes to form a meaningful chain
• In library context „workflows“ usually pertain to handling materials (or users)
• Can involve humans and/or systems
• Traditional workflows
• Digital workflows
16. Processes – Community Building
Value of Communities for digital preservation
- we all have similar problems
- „universal“ knowledge regarding formats, risks
- new developments often part of „projects“
- „keeping tools alive“
- standardization for digital preservation
Contributions of Communities – a few examples
- DPC Technology Watch Reports http://www.dpconline.org
- OPF Blogs and Wiki http://openplanetsfoundation.org/
- KEEP public deliverables http://www.keep-project.eu
- DPOE workshops http://www.digitalpreservation.gov/education/
- nestor working groups http://www.langzeitarchivierung.de/eng/index.htm
- PREMIS standard http://www.loc.gov/standards/premis/
18. Pyramid 2nd level – Lessons learned „processes“
Processes – Community Building
- community involvement is the process to keep your know-how up to date
- think about the level of community involvement right for you (related to organization structure)
- try to plan how much time you can spend on community activities
- never underestimate the role of institutional level communities!
19. Workflows – Traditional vs. Digital workflows
Traditional workflows (e.g. cataloguing, selection for acquisition)
- basis of library procedures
- handling materials throughout their lifecycle in the library’s holdings
- often static
- always require human interaction
Digital workflows (e.g. ingest, risk management)
- configuration within a digital system
- handling digital objects throughout their lifecycle in the library’s digital system(s)
- changes in the system may require changes in the workflow
- may be automated
20. Workflows – Ingest in the Goportis Pilot Project
Manual ingest („dissertations“)
- Files are loaded into Rosetta by librarian
- Librarian enters minimal set of descriptive metadata
- Objects are „validated“ (identified, characterized, virus check, etc.)
- Problems in the validation process need to be solved by preservation specialist
- Objects are manually linked with cataloguing system
- Objects are double-checked, „approved“ and passed to archival storage
Automated ingest („repository“)
- Files are picked up by Rosetta from a predefined directory
- Minimal set of descriptive metadata is supplied by repository with file
- Objects are „validated“ (identified, characterized, virus check, etc.)
- Problems in the validation process need to be solved by preservation specialist
- Objects are automatically linked with cataloguing system
- Objects are passed to archival storage
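The automated ingest path above can be sketched as a small pipeline. This is only an illustration in Python and does not use Rosetta's actual interfaces: the names `validate`, `automated_ingest` and `IngestResult`, the pickup directory and the metadata dictionary are assumptions made for the example, and the validation step is a simple placeholder for real identification, characterization and virus checking. Linking with the cataloguing system is left out.

```python
import hashlib
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class IngestResult:
    archived: list = field(default_factory=list)      # would be passed to archival storage
    needs_review: list = field(default_factory=list)  # for the preservation specialist


def validate(path: Path) -> list:
    """Hypothetical placeholder for identification, characterization and virus check."""
    problems = []
    if path.stat().st_size == 0:
        problems.append("empty file")
    if not path.suffix:
        problems.append("format could not be identified")
    return problems


def automated_ingest(pickup_dir: Path, metadata: dict) -> IngestResult:
    """Pick up files from a predefined directory and route them by validation result."""
    result = IngestResult()
    for path in sorted(p for p in pickup_dir.iterdir() if p.is_file()):
        descriptive = metadata.get(path.name)  # supplied by the repository with the file
        problems = validate(path)
        if descriptive is None:
            problems.append("descriptive metadata missing")
        record = {
            "file": path.name,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "metadata": descriptive,
            "problems": problems,
        }
        if problems:
            result.needs_review.append(record)
        else:
            result.archived.append(record)
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pickup = Path(tmp)
        (pickup / "thesis.pdf").write_bytes(b"%PDF-1.4 sample")
        (pickup / "broken.pdf").write_bytes(b"")
        outcome = automated_ingest(
            pickup,
            {"thesis.pdf": {"title": "Sample thesis"},
             "broken.pdf": {"title": "Broken file"}},
        )
        print("archived:", [r["file"] for r in outcome.archived])
        print("needs review:", [r["file"] for r in outcome.needs_review])
```

The only branch in the sketch is the validation result: objects without problems go straight on, everything else waits for a person. The manual path would add an explicit approval step before archiving.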
21. Pyramid 2nd level – Lessons learned „Workflows“
Workflows – Traditional vs. Digital
- is the main difference between „traditional“ and „digital“ a move towards automation?
- automation is not always a technical problem
- good understanding of benefits and drawbacks of automated processes/workflows
- think about your institutional approach towards preservation and what should not be automated in that context
- just because something can be automated, should it be?
- your workflows need to be in line with your overall archival policy
- define which sources you trust and why!
22. You think you‘re done? Forget it!
Organization
position digital preservation as a fixed unit / department / … in your institution
Mandate
compare your mandate to your digital preservation strategy and to the legal situation
Hardware / Infrastructure
plan ahead for scalability and consistently check your reliability procedures
Software
consistently check your exit strategy; look for tools to help you with different preservation tasks (e.g. migration tools)
23. You think you‘re done? Forget it!
People
Include digital preservation as a fixed part of the work description of all staff involved – on paper and in their heads!
Processes
Integrate community activities as a fixed slot in your institution.
Workflows
Integrate more collections into your digital preservation system.
Find the right balance between traditional and digital/automated workflows for your institution/your collections.
A lot of people think that when we talk about „dps“ we mean software. It is more than that.
All building blocks are connected. The lower ones are a strict prerequisite for the ones above them.
- Goportis – profile and mission
- „Pyramid foundation“
- Technology
- System off the peg vs. custom made
- Resources
- Skills & know-how
- Community building
- Workflows
- Business-led workflows
- Technology-led workflows
- An in-between?
- Summary
- Highs & lows
- Experiences made & lessons learned
A dps differs greatly depending on different factors. Organisation and mandate are two of the basic ones. It is also important to look at the things you DO NOT have a mandate for. There is a fine line between mandate and right! E.g. I can have the mandate to archive something, but am not allowed to migrate it due to intellectual property rights. This is something very dependent on national law and will not be further elaborated on.
With one of its oldest grant programmes, the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) supports a cooperative system for providing science and academia with specialised literature: the system of Special Subject Collections (Sondersammelgebiete, SSG) in research libraries. The changes in the information world caused by the digital revolution demand adaptations that affect the entire library system. Along with that, researchers’ working methods and their expectations of an optimal information infrastructure have changed significantly. The DFG has therefore commissioned an evaluation study to assess the performance of the SSG system in terms of the scientific community’s needs and to identify opportunities for development. Special attention was given to the role of digital media.
http://www.dfg.de/download/pdf/dfg_im_profil/evaluation_statistik/programm_evaluation/ib02_2011en.pdf
Also mention here what the Goportis partners have in common: different materials, international users, etc.
Methods of operation = action of „how things are done“ within an organisation.
Where is the break-even point in the congruence between „organisation“ and „mandate“ that allows a cooperatively operated system, instead of a „service“?
Different communication channels in different organisations are often underestimated and must be taken into account. Do not just „let people know“, but also make sure that the other side understands what is meant.
More partners = more complexity
- in regards to communication („communication“ is more than „information“!)
- in regards to documentation (not only the IT crowd hates to document!)
- in regards to methods of operation (things are different than how they are presented to be!)
Custom build to system choice: comparison of systems available
Even if developments are tailored towards user needs, individual customers still need to find creative workarounds.
Everybody in the library has something to contribute to digital preservation, e.g. stacks: how materials are handled …
Know-how is not just the know-how of the people directly involved in the digital preservation team, but the know-how of the institution at large.
Trusted source
Unfortunately, mandate and legal restrictions do not always go hand in hand.