Central Registry for Digitized Objects:
Linking Production and Bibliographic Control



Ralf Stockmann
Göttinger Digitization Center
As things are now
• Huge ventures in
  – Digitization
     •   Google
     •   Microsoft
     •   National programs
     •   Local centers
  – Accessibility
     •   World Digital Library
     •   European Digital Library
     •   National portals
     •   Google Book Search
As things are now
• We are only at the dawn of mass digitization
  – Leaving behind the stage of manual production
  – Entering industrialization
  – Scanning Robots
  – Accessible Full Text (OCR)
Lack of …
• Coordination in digitization activities
   – Who scans what, where, when, in what quality,
     and how will it be accessible?
      • How is “quality” defined?
      • Do we agree on “what”?
Facing the Consequences
[Figure: costs / value (vertical axis) plotted against the number of digitized items per volume (horizontal axis). Labels in the chart: Technical Improvements, Costs, Additional Benefit, Waste of Resources.]
The Solution
• Central registry for digitized objects
• Focused on the production context (no user
  frontend)
• API driven
  – Application Programming Interface
  – Query / Ingest
  – Simple implementation into existing workflow-tools
• Batch mode (lists)
• Open Source / free service
• Matching on volume level
  – Score / probability
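A minimal sketch of what the Query / Ingest calls and the batch mode could look like. Everything here is an assumption for illustration: the `Registry` class, the `Entry` fields, and the use of a plain title similarity from Python's `difflib` as the score are hypothetical stand-ins, not the interface proposed in the slides.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Entry:
    """Hypothetical registry entry for one volume."""
    title: str
    institution: str
    status: str  # e.g. "digitized", "in progress", "intended"


class Registry:
    """Hypothetical in-memory stand-in for the central registry."""

    def __init__(self):
        self._entries = []

    def ingest(self, entry):
        """Ingest call: register one volume."""
        self._entries.append(entry)

    def query(self, title):
        """Query call: return (score, entry) for the best title match.

        The score is a similarity between 0.0 and 1.0; an empty
        registry yields (0.0, None).
        """
        best_score, best_entry = 0.0, None
        for entry in self._entries:
            score = SequenceMatcher(None, title.lower(), entry.title.lower()).ratio()
            if score > best_score:
                best_score, best_entry = score, entry
        return best_score, best_entry

    def query_batch(self, titles):
        """Batch mode: one (score, entry) result per submitted title."""
        return [self.query(title) for title in titles]


registry = Registry()
registry.ingest(Entry("Kritik der reinen Vernunft", "Library A", "digitized"))
results = registry.query_batch(["Kritik der reinen Vernunft", "Faust"])
for score, entry in results:
    print(round(score, 2), entry.institution if entry else None)
```

A workflow tool would submit its scan list in batch mode and skip or review every volume whose score exceeds a threshold it chooses itself.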
Implementation
[Diagram: layers from top to bottom]
• Backend Services: EROMM / EDL / OCLC / …
• Registry / Meta Data Store
• Aggregator / Normalizer / Mapping
• API: Query (marked “? ? ?”) and Ingest (marked “! ! !”)
• Clients: Present Collections, Running Project, Notice of Intent
Metadata Store
• Bibliographic (matching / score: “what”)
  – Title
  – Author
  – Date
  – Place of publication
  – Number of Pages (?)
  – Language
  – Print / Format
  – Edition
• Technical
  – Resolution
  – Color depth
  – File type / compression
• Accessibility (additional judging: “who, where, which quality, how accessible”)
  – Institution
  – Persistent identifier
  – Rights
  – URL
• Status (decisive factor: “when”)
  – Digitized
  – In Progress
  – Intended (Timeline?)
  – Requested?
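The metadata groups above could be modelled roughly as follows. This is a sketch under stated assumptions: the field names, the types, and especially the weights in `WEIGHTS` are invented for illustration, and exact string comparison stands in for whatever matching the registry would really use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # Bibliographic fields, used for matching ("what")
    title: str
    author: str
    date: str
    place: str
    language: str
    edition: str
    # Technical fields
    resolution_dpi: Optional[int] = None
    color_depth_bits: Optional[int] = None
    file_type: Optional[str] = None
    # Accessibility fields
    institution: Optional[str] = None
    persistent_id: Optional[str] = None
    rights: Optional[str] = None
    url: Optional[str] = None
    # Status: "digitized", "in progress", "intended" or "requested"
    status: str = "intended"


# Assumed weights in percentage points; they sum to 100.
WEIGHTS = {"title": 40, "author": 25, "date": 15,
           "place": 10, "language": 5, "edition": 5}


def match_score(a, b):
    """Return the share (0.0 to 1.0) of weighted bibliographic fields that agree."""
    points = 0
    for name, weight in WEIGHTS.items():
        if getattr(a, name).strip().lower() == getattr(b, name).strip().lower():
            points += weight
    return points / 100


planned = Record("Faust", "Goethe", "1808", "Tübingen", "de", "1")
existing = Record("Faust", "Goethe", "1808", "Leipzig", "de", "2",
                  institution="Library A", status="digitized")
print(match_score(planned, existing))  # title, author, date and language agree
```

Only the bibliographic group feeds the score; the technical, accessibility and status fields are left for the judgement that follows a match.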
Obstacles
• (open source) Tools for automated matching /
  scoring?
• Interface for manual comparison / decision making
• Multivolume works: low rate of uniformity (near
  50% of physical SUB stock before 1900)
• Unicode
• Transliteration tables
• Randomly bound books
• Reliable identifier
   – ISBN for old books?

• Anticipated rate of accuracy: 50 – 70 %
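The Unicode and transliteration obstacles could be approached with a normalization step applied before matching. The sketch below is an assumption, not part of the proposal: the `TRANSLIT` table is a tiny hypothetical example, and a real table would need to cover far more scripts and cataloguing conventions.

```python
import unicodedata

# Hypothetical transliteration table (lower-case keys only).
TRANSLIT = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "æ": "ae", "ø": "o"}


def normalize(text):
    """Reduce a title or name to a comparable ASCII-like key."""
    # Compose first, so that "o" + combining diaeresis becomes "ö".
    text = unicodedata.normalize("NFC", text).casefold()
    # Apply the transliteration table character by character.
    text = "".join(TRANSLIT.get(ch, ch) for ch in text)
    # Decompose and drop any remaining combining marks (accents).
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse runs of whitespace.
    return " ".join(stripped.split())


print(normalize("Göttingen"))
print(normalize("  Café   de Flore "))
```

With this step, spellings such as "Göttingen" and "GOETTINGEN" map to the same key, which should reduce false mismatches between catalogues that transliterate differently.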
Appreciation of Values
• The goal is NOT to build a reliable database in terms of
  library standards,
• but to prevent further waste of resources.
• If we manage to achieve just 50% precision,
  we save at least 50% of the funding!
Work Packages
• Define metadata model
• Set up database
• Implement mapping tools
• Define API calls
• Implement API
• Build some connectors to popular mass digitization workflow
  tools (e.g. “Goobi”)
• Establish ISBN workflow
• Harvest existing sources
• Start with a community of actual projects

• Get some (!) funding
• Estimated schedule: 6 months
Thank You
(stockmann@uni-goettingen.de)

Central Registry for Digitized Objects: Linking Production and Bibliographic Control (2007)
