Goobi in the Wellcome Library
1. Goobi in the Wellcome Library
Digitisation Roadshow, Linz, Feb 2013
Dave Thompson
Digital Curator, Wellcome Library
2. Goobi in the Wellcome Library
• In production March 2012.
• 6 Servers running Goobi – test & production.
• 11 staff users, some part time.
• 1.2 million images processed & available via Library website.
• Can upload fewer than 1000 objects into SDB per 24 hrs.
• Total space allocated to Goobi is 40 TB.
6. A strategic approach
• Library transformation strategy, physical to digital.
• From ‘project’ to ‘production’.
• Digitisation as a sustainable end-to-end process.
• 18 month pilot/implementation project.
• Just taken into production.
7. Diverse sources of content
• In-house digitisation.
• External contractors.
• Contractors working in-house.
• External organisations digitising their content for
us.
8. Where did Goobi come from?
• Late 2010/early 2011: as plans for developing SDB
grew, we realised that we needed a means of mass
import of digital content.
• Began to think about high volume production &
the management of that.
• Early modelling of our systems suggested that we
needed a tool to manage production of content.
• Began looking at workflow tracking systems.
10. Perceived benefits of Goobi
• Web-based distributed access for concurrent
users.
• Flexible workflow-based processing, managed
through ‘Projects’.
• Workflow process enforced, ensures accuracy &
efficiency.
• Adaptable to different types of content.
• Initiates & manages external processes via
Intranda task manager (ITM).
• METS as basis of access & access control.
11. Rapid evolution of Goobi
• Goobi we have now quite different to what we
bought.
• Initial configuration to import MARC XML DMD &
to automate ingest into SDB.
• Initially Goobi didn’t scale to meet our ambition.
• Initial install monolithic, now running Goobi as
distributed services.
• Developed new features with Intranda, e.g.
Jpylyzation.
12. Working with DMD
• Upload MARC XML DMD exported from Sierra
using standard Goobi features.
• MARC fields edited to provide a consistent Goobi
process title, e.g. using shelf mark.
• MARC Leader 6 field identifies content type, e.g.
‘Archive’ or ‘Monograph’.
• Content ‘type’ used by Goobi to set default METS
access conditions.
• DMD not delivered to end user, that comes from
live catalogue.
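As an illustration of the mapping described on this slide, here is a minimal Python sketch that reads position 6 of the MARC leader from a MARC XML record and derives a content type and a default access condition. It is not Goobi's own code. The leader codes chosen, the type names they map to, and the access values ("Open", "Restricted") are all assumptions made for the example; only the idea of using Leader 6 to pick a content type, and the content type to pick a default, comes from the slide.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Standard MARC XML ("MARC21 slim") namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical mapping from leader position 6 to a content type.
CONTENT_TYPES = {"a": "Monograph", "p": "Archive", "t": "Archive"}

# Hypothetical default access condition per content type.
DEFAULT_ACCESS = {"Monograph": "Open", "Archive": "Restricted"}


def classify_record(marcxml: str) -> Tuple[str, str]:
    """Return (content type, default access) for one MARC XML record."""
    root = ET.fromstring(marcxml)
    leader = root.findtext(".//marc:leader", namespaces=MARC_NS)
    if leader is None or len(leader) < 7:
        raise ValueError("record has no usable MARC leader")
    # Leader positions are counted from 0, so position 6 is index 6.
    content_type = CONTENT_TYPES.get(leader[6], "Unknown")
    # Assumption: unrecognised types fall back to the stricter default.
    access = DEFAULT_ACCESS.get(content_type, "Restricted")
    return content_type, access


SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000npc a2200000 a 4500</leader>
</record>"""

print(classify_record(SAMPLE))
```

Falling back to the stricter value for unknown types is a design choice of the sketch, so that an unexpected record is never opened up by default.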
13. Uploading content
• Content upload using the Sync2Goobi Tool for
bulk import.
• Drag ‘n drop interface.
• Can be either TIFF or JP2.
• Project based workflow templates manage either
format.
• Use Goobi Mount Tool (GMT) to access/manage
content already uploaded.
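Because uploads may be either TIFF or JP2, a workflow has to know which format it has been given. The Python sketch below shows one simple way to tell the two apart from the first bytes of a file, using the published file signatures. It is an illustration only and says nothing about how Sync2Goobi or Goobi actually detect formats; the function names are hypothetical. It is also not a validator: it does not cover BigTIFF and does not replace a tool such as Jpylyzer.

```python
# JP2 files begin with a 12-byte signature box.
JP2_SIGNATURE = bytes.fromhex("0000000c6a5020200d0a870a")

# Classic TIFF: byte-order mark ("II" or "MM") followed by the number 42.
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")


def sniff_format(header: bytes) -> str:
    """Classify a file from its leading bytes as 'JP2', 'TIFF' or 'unknown'."""
    if header.startswith(JP2_SIGNATURE):
        return "JP2"
    if header.startswith(TIFF_SIGNATURES):
        return "TIFF"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of a file to classify it."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(len(JP2_SIGNATURE)))


print(sniff_format(b"II*\x00\x08\x00\x00\x00"))
```

A check like this could route each file to the matching workflow template before any heavier validation runs.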
14. Using METS Editor
• Main point of human interaction with Goobi. Goobi
automates METS creation.
• METS basis for access control & usage conditions
for material.
• Basis for retrieval of content from SDB by using
SDB PUIDs.
• Goobi automates ingest of content into SDB &
receives AMD in return.
15. How we use METS
• Setting material type & default values for access
based on DMD.
• Access restrictions can be at the item level.
• DMD in METS not delivered to end user, serves
only to help a human identify content when
snagging.
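The combination of type-based defaults and item-level restrictions can be sketched as a small lookup with an override. This Python example is a hedged illustration, not a description of the METS the Library produces: the class, the field names and the access values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default access values per material type.
DEFAULT_ACCESS = {"Monograph": "Open", "Archive": "Restricted"}


@dataclass
class Item:
    """One item within a digitised object (hypothetical structure)."""
    identifier: str
    access_override: Optional[str] = None  # set only when restricted individually


def effective_access(material_type: str, item: Item) -> str:
    """Item-level restriction wins; otherwise use the default for the type."""
    if item.access_override is not None:
        return item.access_override
    # Assumption: unknown material types get the stricter default.
    return DEFAULT_ACCESS.get(material_type, "Restricted")


pages = [Item("page-001"), Item("page-002", access_override="Closed")]
for page in pages:
    print(page.identifier, effective_access("Monograph", page))
```

Keeping the override on the item means the default for a whole material type can change without touching individually restricted items.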
17. Shared development
• Wellcome Trust is not a development house. Rely
on Intranda to provide development support.
• Developed specific requirements for extensions
to Goobi, e.g. Jpylyzer for JPEG2000 validation.
• Development proposals from both sides. We have
an idea, Intranda helps us make that idea a reality.
• Benefit from community developments
commissioned by others.
18. Additional Tools
• Lurawave for converting TIFF to JPEG2000.
• Jpylyzer for validating JPEG2000 files.
• Sync2Goobi Tool for bulk upload of content.
• Goobi Mount Tool/MS Windows File Explorer for
access to ‘Home’ folders.
19. Goobi – the future
• Built-in OCR & creation of ALTO files.
• Further refinement of Sync2Goobi Tool.
• Further development/integration of validation
tools.
• Integration of FTP with Goobi for 3rd party direct
upload of content.
• Establishment of separate database server for
Goobi.
20. Lessons learned - systems
• We were ambitious but underestimated what
capacity we would require.
• Underestimated storage requirements.
• Underestimated the desirability of high levels of
automation.
• Focus human interaction at as few points as
possible.
21. Lessons learned - Intranda
• Have relied heavily on input & support from
Intranda.
• Share information with Intranda & trust them to
provide answers.
• Be prepared to share development. But be
prepared to accept some pain.
22. Lessons learned - Goobi
• In less than a year Goobi has become key to
delivering the Library’s content.
• Centralised user activities in one system – Goobi
– less to learn, more efficient.
• Streamline & automate. High volume efficient
production essential.
• Streamline other digitisation & access processes
to match Goobi.
• METS an efficient single place for access related
metadata.
23. Thank you
Questions now, questions later…?
Dave Thompson, Digital Curator
Wellcome Library
d.thompson@wellcome.ac.uk
http://wellcomelibrary.org/