a centre of expertise in data curation and preservation




Experience is a hard teacher…

           Curation and the Digital Record
                  Chris Rusbridge
              Endeavor EndUser 2006
 This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5
 UK: Scotland License. To view a copy of this license, visit
 http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ or send a letter to
 Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

      “Experience is a hard teacher
     because she gives the test first,
         the lesson afterwards”
 • Vernon Sanders Law, ex baseball player

 • (Or perhaps, in the case of digital preservation,
   the test occurs long after you are dead?)




                        Contents
   •   Curation
   •   Sustainability
   •   Data resources
   •   Preservation & curation issues
   •   OAIS Review




                        Curation
   • Data increasingly important as evidence
      • Experimental verifiability (the basis of science)
      • Unrepeatable observations & experiments
        (particularly environmental in broadest sense)
      • Legal, compliance & transactions
      • Cultural resources


   • For evidential value, data must be curated


                        Curation
   • “Maintaining and adding value to a trusted
     body of digital information for current and
     future use”




                   Lynch remarks
   • Closing the 2005 Curation Conference
   • 3 views of digital curation
      • Collection as a living thing
      • Whole life process, evolving object(s)
      • Finite process, handover to preservation




         •This is what you do!




     Sustainability and exit strategy
   • Most critical resource for curation: present
     and future money supply!
   • Plan for the long term, but have a succession
     plan
   • Sustained approach not project mentality




        •This is what you do!
      Some illustrations: UK census
   • 1881 census (UKDA)
      • Hand-written individual return forms: data conversion issue
        (reference form available): digitisation and access issues
   • 1961 census (TNA/NDAD)
      • First census analysed by computer (first major UK-wide
        computer project?); individual returns closed until 2062: data
        preservation issue!
   • 2001 census (ONS/CDU)
      • Data corrections and adjustments: curation issue




Endeavor EndUser 2006
a centre of expertise in data curation and preservation




                Curation of emails
   • Lots of metadata and context (RFC 822)
   • Often highly distributed
   • Split conversations
   • Unknown numbers of copies
   • Personal choice of clients

   • Legal requirements!
   • Controlled filing and controlled deletion
     needed…
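
A minimal sketch of how the RFC 822 header metadata mentioned above could be pulled into a filing record, using only Python's standard `email` package. The sample message, the addresses and the `filing_record` helper are hypothetical and only illustrate the idea; the `In-Reply-To` header is what lets split conversations be stitched back together.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message; every address and identifier is made up.
RAW = """\
From: alice@example.org
To: bob@example.org
Subject: Re: Census data corrections
Date: Mon, 02 Oct 2006 10:15:00 +0000
Message-ID: <msg2@example.org>
In-Reply-To: <msg1@example.org>

Agreed, see attached.
"""

def filing_record(raw: str) -> dict:
    """Extract the header fields a records manager might file against."""
    msg = message_from_string(raw)
    return {
        "message_id": msg["Message-ID"],    # identifies this copy of the message
        "in_reply_to": msg["In-Reply-To"],  # links it to the rest of the conversation
        "sender": msg["From"],
        "subject": msg["Subject"],
        "sent": parsedate_to_datetime(msg["Date"]).isoformat(),
    }

record = filing_record(RAW)
```

Keying records on `message_id` is one way to cope with unknown numbers of copies: duplicates held in different mailboxes collapse to a single entry.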

   Online Public Access Catalogues
   • Long term, curated databases
   • Often high quality (not always)
   • Well known interchange standards (MARC),
     classification standards (several), name
     authorities…
   • Still significant problems combining sources




   Pre 2000
   Post 2000
   © Chris Rusbridge

   “… The database provides indexing of 527 key international English-language business
   periodicals including Business Week, Forbes, The Wall Street Journal, The New York Times
   and more. Also included are product reviews, interviews, biographical sketches, corporate
   profiles, reports of associations, societies and conferences. Broad areas of coverage
   include accounting, acquisitions and mergers, advertising, banking, chemicals, …”
                                                                           • Peter Buneman


   • Storage
      – Redundant, Distributed
      – Persistent
      – Readable
   • Clear standards for citation
   • Historical record (old data is useful)
   • Well understood ownership/IP

   • Storage
      – Single-source
      – Volatile
      – Centralised
      – Internal DBMS format
   • No standards for citation
   • No historical record
   • Mind-boggling legal issues
                                         © Chris Rusbridge                  • Peter Buneman



   SDSS (Visual)                                     2MASS (Infrared)
                                                  Slide from Rajendra Bose



                          Example…
   • National Virtual Observatory
      • Johns Hopkins press release: “Scientists working to create the
        NVO, an online portal for astronomical research unifying dozens of
        large astronomical databases, confirmed discovery of [a] new
        brown dwarf recently. The star emerged from a computerized
        search of information on millions of astronomical objects in two
        separate astronomical databases. Thanks to an NVO prototype,
        that search, formerly an endeavor requiring weeks or months of
        human attention, took approximately two minutes.”
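
The press release does not say how the search was carried out. As a rough illustration of what searching two separate catalogues can involve, here is a sketch of a positional cross-match: pairing objects whose sky coordinates agree to within a small radius. The catalogue entries, field names and the 2 arcsecond radius are all assumptions made for the example, not NVO data or NVO code.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def cross_match(cat_a, cat_b, radius_arcsec=2.0):
    """Return (id_a, id_b) pairs lying within radius_arcsec of each other."""
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for a in cat_a:          # brute force: compares every pair of objects
        for b in cat_b:
            sep = angular_separation_deg(a["ra"], a["dec"], b["ra"], b["dec"])
            if sep <= radius_deg:
                matches.append((a["id"], b["id"]))
    return matches

# Made-up entries standing in for an optical and an infrared catalogue.
optical = [{"id": "opt-1", "ra": 10.0000, "dec": 20.0000},
           {"id": "opt-2", "ra": 150.0, "dec": -5.0}]
infrared = [{"id": "ir-9", "ra": 10.0002, "dec": 20.0001},
            {"id": "ir-7", "ra": 200.0, "dec": 45.0}]

pairs = cross_match(optical, infrared)
```

The nested loop is fine for a handful of rows but grows with the product of the catalogue sizes, so a search over millions of objects would need some form of spatial indexing. Either way, the match is only possible if both catalogues record positions in a documented, compatible form.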




                        Context
   • Data meaningless without context
      • Linkage
      • Metadata of many kinds
      • Workflow!
   • Provenance
      • Computational lineage
      • Authenticity




               Access and re-use
   • Ethics and rights control access
      • Currently weak at expressing these for the long term
   • Collaboration tools
      • Annotation, discussion, review
      • Re-use leading to change and development
   • “Publication”
      • Not just in “print”
      • Underlying data should be “published”, too
   • Citation…
                              Citation
   • Needs a stable resource to cite…

       OWL Web Ontology Language
       Reference
       W3C Proposed Recommendation 15 December 2003
       This version:
       http://www.w3.org/TR/2003/PR-owl-ref-20031215/
       Latest version:
       http://www.w3.org/TR/owl-ref/
       Previous version:
       http://www.w3.org/TR/2003/CR-owl-ref-2003081



                        Citation…
   • The date alone (as in common web citation
     approaches) is not enough!
               •[6] The CIA World Factbook.
               •www.cia.gov/cia/publications/factbook/.
               •Retrieved on 8 Jan 2006.
      • Cited object likely to have changed…
      • Citation should link to the cited object as it was!




                   Citation needs…
   • An efficient way to reference and access “archived”
     past states of a changing dataset (work in progress,
     Buneman et al)
   • Less important for original observations
      • Don’t mess with those data
   • Less important for incremental datasets
      • Later stuff should not invalidate earlier
   • Very important for revisable datasets
      • Eg Genomics… datasets that result from the combined work
        of curators, or contain opinions or facts likely to change


XMLArch: System Architecture
   [Diagram; labels: Relational Database, Data Extractor, XML Snapshot at time t,
    XML Archiver, Pre-processor, Version Merger, XML Archive at time t - 1,
    XML Archive at time t]
                                                                       • Carwyn Edwards



           Preservation & curation
   • Use preserves
   • Money preserves
   • Redundancy good, monoculture bad?
      • LOCKSS-type & other approaches…
   • Bits are fragile and robust
      • Don’t rely on portable media
      • Look after them well
   • Technology changes…
      • How fast? What impact?
   • Metadata matters! (Know what you’ve got)
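
One routine way of looking after bits held redundantly is a fixity audit: compute a checksum for each copy and flag any copy that disagrees with the others. The sketch below uses SHA-256 from Python's standard library and a simple majority vote. It is not the LOCKSS polling protocol, which is considerably more elaborate; the replica names, the sample content and the `audit_replicas` helper are hypothetical.

```python
import hashlib
from collections import Counter

def fingerprint(data: bytes) -> str:
    """SHA-256 digest of one copy's content, as a hex string."""
    return hashlib.sha256(data).hexdigest()

def audit_replicas(replicas: dict) -> dict:
    """Compare copies and report which ones disagree with the majority digest."""
    digests = {name: fingerprint(data) for name, data in replicas.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(name for name, d in digests.items() if d != majority_digest)
    return {"majority_digest": majority_digest, "votes": votes, "damaged": damaged}

# Three made-up copies of the same object; one has a corrupted byte.
copies = {
    "site-a": b"census returns",
    "site-b": b"census returns",
    "site-c": b"census returnz",
}
report = audit_replicas(copies)
```

Storing the expected digest as metadata at ingest also lets a single surviving copy be checked against what was originally deposited.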
     Formats, migration, significant
             properties…
   • “We MUST preserve the look and feel!”

   • Well…
      • Think about a book like “Kenilworth” by Walter
        Scott
      • Think about the BBC Domesday emulation


   • You may be better with a preserved
     “desiccated” version… than nothing at all!

 The Project Gutenberg EBook of Kenilworth, by Sir Walter Scott

 This eBook is for the use of anyone anywhere at no cost and with
 almost no restrictions whatsoever. You may copy it, give it away or
 re-use it under the terms of the Project Gutenberg License included
 with this eBook or online at www.gutenberg.org

 Title: Kenilworth
 Author: Sir Walter Scott
 Release Date: February 21, 2006 [EBook #1606]
 Language: English
 Character set encoding: ASCII

 *** START OF THIS PROJECT GUTENBERG EBOOK KENILWORTH ***
 Produced by An Anonymous Volunteer and David Widger

 KENILWORTH.

 by Sir Walter Scott, Bart.

 INTRODUCTION

 A certain degree of success, real or supposed, in the delineation of
 Queen Mary, naturally induced the author to attempt something similar
 respecting "her sister and her foe," the celebrated Elizabeth. He
 will not, however, pretend to have approached the task with the same
 feelings; for the candid Robertson himself confesses having felt the
 prejudices with which a Scottishman is tempted to regard the subject;
 …




   • But there ARE limits!




          Preservation is not cheap
   • But it’s not expensive…
      • Compared with the alternative!




      • Postcard sent to me anonymously

          Curation: whose job is it?
   • Yours!
      • With your archivists
      • And your Records Managers
      • And your scientists and scholars…




           Preservation & curation
   • We can’t do it alone
      • Collective responsibility
   • We can’t rely on anyone else
      • Institutional responsibility




                 It’s about time…
   • From the very short
      • Good management (don’t under-estimate but don’t
        over-estimate)
   • Through the medium term
      • Curation: use it or lose it
      • Gather ye metadata while ye may!
      • Preservation relay
   • To the very long term
      • High commitment, high cost, high risk
      • Harder to do en masse

                     Supplier role?
   • Work together with libraries…
      • Multi-supplier, Multi-platform
      • Open source mix
      • The library is not simple any more
   • Library 2.0?
      • Power of crowds, economy of attention, generation X…
      • Wikicat?
   • Web 2.0?
      • Mix, mashup
      • What you see is… not there?


                        BEWARE WEB 2.0!!!








                              OAIS
   • “Announcement of a Comment Period for the Five
     Year Review of the Reference Model for an Open
     Archival Information System (OAIS) Standard”
      • “… must be reviewed every five years and a determination
        made to reaffirm, modify, or withdraw the existing standard.”
      • “…any revision must remain backward compatible with
        regard to major terminology and concepts.”
      • “… we do not plan to expand the general level of detail”
      • “… reduce ambiguities and fill in any missing or weak
        concepts”
   • Make suggestions and express interest until 30/10/06
      • OAIS-support@delight.gsfc.nasa.gov


                        To close…
   • Your library is currently taking the curation
     test…
   • Your children will learn the answer!
   • But




Endeavor EndUser 2006

Weitere ähnliche Inhalte

Ähnlich wie Dcc endeavour-2006

Presentation on the Warsaw Conference on National Bibliographies August 2012
Presentation on the Warsaw Conference on National Bibliographies August 2012Presentation on the Warsaw Conference on National Bibliographies August 2012
Presentation on the Warsaw Conference on National Bibliographies August 2012nw13
 
Metadata: Standards Basics for the Independent Publishing Community, with Gra...
Metadata: Standards Basics for the Independent Publishing Community, with Gra...Metadata: Standards Basics for the Independent Publishing Community, with Gra...
Metadata: Standards Basics for the Independent Publishing Community, with Gra...bisg
 
An Introduction to Providence Information center
An Introduction to Providence Information centerAn Introduction to Providence Information center
An Introduction to Providence Information centersreejatunnu
 
Creation of LSE Digital Library
Creation of LSE Digital LibraryCreation of LSE Digital Library
Creation of LSE Digital LibraryEd Fay
 
Building a Digital Library
Building a Digital LibraryBuilding a Digital Library
Building a Digital LibraryEd Fay
 
Moving the repository upstream
Moving the repository upstreamMoving the repository upstream
Moving the repository upstreamChris Rusbridge
 
Institutional Repositories: Dealing with Data Challenges
Institutional Repositories: Dealing with Data ChallengesInstitutional Repositories: Dealing with Data Challenges
Institutional Repositories: Dealing with Data ChallengesChris Okiki
 
User Participation in Digital Library Development
User Participation in Digital Library DevelopmentUser Participation in Digital Library Development
User Participation in Digital Library DevelopmentEd Fay
 
BEA2014 - Understanding New Developments in Metadata
BEA2014 - Understanding New Developments in MetadataBEA2014 - Understanding New Developments in Metadata
BEA2014 - Understanding New Developments in MetadataBookExpoAmerica
 
BEA 2014--Understanding New Developments in Metadata
BEA 2014--Understanding New Developments in MetadataBEA 2014--Understanding New Developments in Metadata
BEA 2014--Understanding New Developments in MetadataBowker
 
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...Beniamino Murgante
 
Supporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many FrontsSupporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many FrontsJohn Kunze
 
IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...IWMW
 
Webinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data MonetizationWebinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data MonetizationStorage Switzerland
 
Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...
Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...
Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...AVEVA Group plc
 
Applying Traditional Principles of Authenticity and Trust to Digital Archives...
Applying Traditional Principles of Authenticity and Trust to Digital Archives...Applying Traditional Principles of Authenticity and Trust to Digital Archives...
Applying Traditional Principles of Authenticity and Trust to Digital Archives...Ed Fay
 

Ähnlich wie Dcc endeavour-2006 (20)

Presentation on the Warsaw Conference on National Bibliographies August 2012
Presentation on the Warsaw Conference on National Bibliographies August 2012Presentation on the Warsaw Conference on National Bibliographies August 2012
Presentation on the Warsaw Conference on National Bibliographies August 2012
 
Metadata: Standards Basics for the Independent Publishing Community, with Gra...
Metadata: Standards Basics for the Independent Publishing Community, with Gra...Metadata: Standards Basics for the Independent Publishing Community, with Gra...
Metadata: Standards Basics for the Independent Publishing Community, with Gra...
 
An Introduction to Providence Information center
An Introduction to Providence Information centerAn Introduction to Providence Information center
An Introduction to Providence Information center
 
Creation of LSE Digital Library
Creation of LSE Digital LibraryCreation of LSE Digital Library
Creation of LSE Digital Library
 
Building a Digital Library
Building a Digital LibraryBuilding a Digital Library
Building a Digital Library
 
Moving the repository upstream
Moving the repository upstreamMoving the repository upstream
Moving the repository upstream
 
NISO Webinar: What to Expect When You're Expecting a Platform Change: Perspec...
NISO Webinar: What to Expect When You're Expecting a Platform Change: Perspec...NISO Webinar: What to Expect When You're Expecting a Platform Change: Perspec...
NISO Webinar: What to Expect When You're Expecting a Platform Change: Perspec...
 
Institutional Repositories: Dealing with Data Challenges
Institutional Repositories: Dealing with Data ChallengesInstitutional Repositories: Dealing with Data Challenges
Institutional Repositories: Dealing with Data Challenges
 
Ch 1 intro_dw
Ch 1 intro_dwCh 1 intro_dw
Ch 1 intro_dw
 
User Participation in Digital Library Development
User Participation in Digital Library DevelopmentUser Participation in Digital Library Development
User Participation in Digital Library Development
 
Managing an Increasingly Complex and Interconnected World of Content
Managing an Increasingly Complex and Interconnected World of Content	Managing an Increasingly Complex and Interconnected World of Content
Managing an Increasingly Complex and Interconnected World of Content
 
BEA2014 - Understanding New Developments in Metadata
BEA2014 - Understanding New Developments in MetadataBEA2014 - Understanding New Developments in Metadata
BEA2014 - Understanding New Developments in Metadata
 
BEA 2014--Understanding New Developments in Metadata
BEA 2014--Understanding New Developments in MetadataBEA 2014--Understanding New Developments in Metadata
BEA 2014--Understanding New Developments in Metadata
 
The future of the DCC
The future of the DCCThe future of the DCC
The future of the DCC
 
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...Claudia Bauzer Medeiros  Digital preservation – caring for our data to foster...
Claudia Bauzer Medeiros Digital preservation – caring for our data to foster...
 
Supporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many FrontsSupporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many Fronts
 
IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...IWMW 2006: Archiving the Web What can Institutions learn from National and In...
IWMW 2006: Archiving the Web What can Institutions learn from National and In...
 
Webinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data MonetizationWebinar: Designing Storage and Apps to Enable Data Monetization
Webinar: Designing Storage and Apps to Enable Data Monetization
 
Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...
Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...
Risking all you have, for what you can’t leave behind by Adam Cooke, Yancoal ...
 
Applying Traditional Principles of Authenticity and Trust to Digital Archives...
Applying Traditional Principles of Authenticity and Trust to Digital Archives...Applying Traditional Principles of Authenticity and Trust to Digital Archives...
Applying Traditional Principles of Authenticity and Trust to Digital Archives...
 

Mehr von Chris Rusbridge

Cautious Optimism: Cultivate your Garden
Cautious Optimism: Cultivate your GardenCautious Optimism: Cultivate your Garden
Cautious Optimism: Cultivate your GardenChris Rusbridge
 
Create, curate, re-use: the expanding life course of digital research data
Create, curate, re-use: the expanding life course of digital research dataCreate, curate, re-use: the expanding life course of digital research data
Create, curate, re-use: the expanding life course of digital research dataChris Rusbridge
 
"Tomorrow, and tomorrow, and tomorrow": the players on the curation stage
"Tomorrow, and tomorrow, and tomorrow": the players on the curation stage"Tomorrow, and tomorrow, and tomorrow": the players on the curation stage
"Tomorrow, and tomorrow, and tomorrow": the players on the curation stageChris Rusbridge
 
Curation of scientifica data: Challenges for repositories
Curation of scientifica data: Challenges for repositoriesCuration of scientifica data: Challenges for repositories
Curation of scientifica data: Challenges for repositoriesChris Rusbridge
 
LOCKSS UK, with a focus on reporting experience
LOCKSS UK, with a focus on reporting experienceLOCKSS UK, with a focus on reporting experience
LOCKSS UK, with a focus on reporting experienceChris Rusbridge
 
Saving private data, sharing Open Data? Role of libraries and institutional r...
Saving private data, sharing Open Data? Role of libraries and institutional r...Saving private data, sharing Open Data? Role of libraries and institutional r...
Saving private data, sharing Open Data? Role of libraries and institutional r...Chris Rusbridge
 
Curating data for integrated science
Curating data for integrated scienceCurating data for integrated science
Curating data for integrated scienceChris Rusbridge
 
Trust and repository audit: can repository managers assure trustworthiness?
Trust and repository audit: can repository managers assure trustworthiness?Trust and repository audit: can repository managers assure trustworthiness?
Trust and repository audit: can repository managers assure trustworthiness?Chris Rusbridge
 
Disciplinary dimensions of digital curation: introduction and synthesis
Disciplinary dimensions of digital curation: introduction and synthesisDisciplinary dimensions of digital curation: introduction and synthesis
Disciplinary dimensions of digital curation: introduction and synthesisChris Rusbridge
 
Reference Model for Economically Sustainable Digital Curation
Reference Model for Economically Sustainable Digital CurationReference Model for Economically Sustainable Digital Curation
Reference Model for Economically Sustainable Digital CurationChris Rusbridge
 
Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...Chris Rusbridge
 
Blue Ribbon Task Force on Sustainable Digital Preservation
Blue Ribbon Task Force on Sustainable Digital PreservationBlue Ribbon Task Force on Sustainable Digital Preservation
Blue Ribbon Task Force on Sustainable Digital PreservationChris Rusbridge
 
Sustainable Digital Preservation and Access
Sustainable Digital Preservation and AccessSustainable Digital Preservation and Access
Sustainable Digital Preservation and AccessChris Rusbridge
 
Data curation issues for repositories
Data curation issues for repositoriesData curation issues for repositories
Data curation issues for repositoriesChris Rusbridge
 

Dcc endeavour-2006

  • 1. a centre of expertise in data curation and preservation Experience is a hard teacher… Curation and the Digital Record Chris Rusbridge Endeavor EndUser 2006 Funded by: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
  • 2. “Experience is a hard teacher because she gives the test first, the lesson afterwards” • Vernon Sanders Law, ex-baseball player • (Or perhaps, in the case of digital preservation, the test occurs long after you are dead?)
  • 3. Contents • Curation • Sustainability • Data resources • Preservation & curation issues • OAIS Review
  • 4. Curation • Data increasingly important as evidence • Experimental verifiability (the basis of science) • Unrepeatable observations & experiments (particularly environmental in broadest sense) • Legal, compliance & transactions • Cultural resources • For evidential value, data must be curated
  • 5. Curation • “Maintaining and adding value to a trusted body of digital information for current and future use”
  • 6. Lynch remarks • Closing the 2005 Curation Conference • 3 views of digital curation • Collection as a living thing • Whole life process, evolving object(s) • Finite process, handover to preservation
  • 7. [No text on this slide]
  • 8. This is what you do!
  • 9. Sustainability and exit strategy • Most critical resource for curation: present and future money supply! • Plan for the long term, but have a succession plan • Sustained approach not project mentality
  • 10. Sustainability and exit strategy • Most critical resource for curation: present and future money supply! • Plan for the long term, but have a succession plan • Sustained approach not project mentality • This is what you do!
  • 11. Some illustrations: UK census • 1881 census (UKDA) • Hand-written individual return forms: data conversion issue (reference form available): digitisation and access issues • 1961 census (TNA/NDAD) • First to use computers for analysis (first major UK-wide computer project?); individual returns closed until 2062: data preservation issue!!! • 2001 census (ONS/CDU) • Data corrections and adjustments: curation issue
  • 12. Curation of emails • Lots of metadata and context (RFC 822) • Often highly distributed • Split conversations • Unknown numbers of copies • Personal choice of clients • Legal requirements! • Controlled filing and controlled deletion needed…
  • 13. [No text on this slide]
  • 14. Online Public Access Catalogues • Long term, curated databases • Often high quality (not always) • Well known interchange standards (MARC), classification standards (several), name authorities… • Still significant problems combining sources
  • 15. [No text on this slide]
  • 16. [No text on this slide]
  • 17. Pre 2000 • Post 2000 • Sample text shown on the slide: “… The database provides indexing of 527 key international English-language business periodicals including Business Week, Forbes, The Wall Street Journal, The New York Times and more. Also included are product reviews, interviews, biographical sketches, corporate profiles, reports of associations, societies and conferences. Broad areas of coverage include accounting, acquisitions and mergers, advertising, banking, chemicals, …” • © Chris Rusbridge • Peter Buneman
  • 18. First list: • Storage – Redundant, Distributed – Persistent – Readable • Clear standards for citation • Historical record (old data is useful) • Well understood ownership/IP. Second list: • Storage – Single-source – Volatile – Centralised – Internal DBMS format • No standards for citation • No historical record • Mind-boggling legal issues. (Same sample text as slide 17.) • © Chris Rusbridge • Peter Buneman
  • 19. TWOMASS (Infrared) SDSS (Visual) • Slide from Rajendra Bose
  • 20. Slide from Rajendra Bose
  • 21. Example… • National Virtual Observatory • Johns Hopkins press release: “Scientists working to create the NVO, an online portal for astronomical research unifying dozens of large astronomical databases, confirmed discovery of [a] new brown dwarf recently. The star emerged from a computerized search of information on millions of astronomical objects in two separate astronomical databases. Thanks to an NVO prototype, that search, formerly an endeavor requiring weeks or months of human attention, took approximately two minutes.”
  • 22. Context • Data meaningless without context • Linkage • Metadata of many kinds • Workflow! • Provenance • Computational lineage • Authenticity
  • 23. Access and re-use • Ethics and rights control access • Weak in expressing this long-term • Collaboration tools • Annotation, discussion, review • Re-use leading to change and development • “Publication” • Not just in “print” • Underlying data should be “published”, too • Citation…
  • 24. Citation • Needs a stable resource to cite… • OWL Web Ontology Language Reference, W3C Proposed Recommendation, 15 December 2003 • This version: http://www.w3.org/TR/2003/PR-owl-ref-20031215/ • Latest version: http://www.w3.org/TR/owl-ref/ • Previous version: http://www.w3.org/TR/2003/CR-owl-ref-2003081
  • 25. Citation… • The date alone (as in common web citation approaches) is not enough! • [6] The CIA World Factbook. www.cia.gov/cia/publications/factbook/. Retrieved on 8 Jan 2006. • Cited object likely to have changed… • Citation should link to the cited object as it was!
  • 26. Citation needs… • An efficient way to reference and access “archived” past states of a changing dataset (work in progress, Buneman et al.) • Less important for original observations • Don’t mess with those data • Less important for incremental datasets • Later stuff should not invalidate earlier • Very important for revisable datasets • E.g. genomics… datasets that result from the combined work of curators, or contain opinions or facts likely to change
  • 27. XMLArch: System Architecture • [Diagram labels: Relational Database; Data Extractor; XML Snapshot at time t; XML Archiver; Pre-processor; Version Merger; XML Archive at time t - 1; XML Archive at time t] • Carwyn Edwards
  • 28. Preservation & curation • Use preserves • Money preserves • Redundancy good, monoculture bad? • LOCKSS-type & other approaches… • Bits are fragile and robust • Don’t rely on portable media • Look after them well • Technology changes… • How fast? What impact? • Metadata matters! (Know what you’ve got)
  • 29. Formats, migration, significant properties… • “We MUST preserve the look and feel!” • Well… • Think about a book like “Kenilworth” by Walter Scott • Think about the BBC Domesday emulation • You may be better with a preserved “desiccated” version… than nothing at all!
  • 30. The Project Gutenberg EBook of Kenilworth, by Sir Walter Scott This eBook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with this eBook or online at www.gutenberg.org Title: Kenilworth Author: Sir Walter Scott Release Date: February 21, 2006 [EBook #1606] Language: English Character set encoding: ASCII *** START OF THIS PROJECT GUTENBERG EBOOK KENILWORTH *** Produced by An Anonymous Volunteer and David Widger KENILWORTH. by Sir Walter Scott, Bart. INTRODUCTION A certain degree of success, real or supposed, in the delineation of Queen Mary, naturally induced the author to attempt something similar respecting "her sister and her foe," the celebrated Elizabeth. He will not, however, pretend to have approached the task with the same feelings; for the candid Robertson himself confesses having felt the prejudices with which a Scottishman is tempted to regard the subject; …
  • 31. But there ARE limits!
  • 32. Preservation is not cheap • But it’s not expensive…
  • 33. Preservation is not cheap • But it’s not expensive… • Compared with the alternative!
  • 34. Preservation is not cheap • But it’s not expensive… • Compared with the alternative! • Postcard sent to me anonymously
  • 35. Curation: whose job is it? • Yours! • With your archivists • And your Records Managers • And your scientists and scholars…
  • 36. Preservation & curation • We can’t do it alone • Collective responsibility • We can’t rely on anyone else • Institutional responsibility
  • 37. It’s about time… • From the very short • Good management (don’t under-estimate but don’t over-estimate) • Through the medium term • Curation: use it or lose it • Gather ye metadata while ye may! • Preservation relay • To the very long term • High commitment, high cost, high risk • Harder to do en masse
  • 38. Supplier role? • Work together with libraries… • Multi-supplier, Multi-platform • Open source mix • The library is not simple any more • Library 2.0? • Power of crowds, economy of attention, generation X… • Wikicat?
  • 39. [No text on this slide]
  • 40. Supplier role? • Work together with libraries… • Multi-supplier, Multi-platform • Open source mix • The library is not simple any more • Library 2.0? • Power of crowds, economy of attention, generation X… • Wikicat? • Web 2.0? • Mix, mashup • What you see is… not there?
  • 41. BEWARE WEB 2.0!!!
  • 42. OAIS • “Announcement of a Comment Period for the Five Year Review of the Reference Model for an Open Archival Information System (OAIS) Standard” • “… must be reviewed every five years and a determination made to reaffirm, modify, or withdraw the existing standard.” • “…any revision must remain backward compatible with regard to major terminology and concepts.” • “… we do not plan to expand the general level of detail” • “… reduce ambiguities and fill in any missing or weak concepts” • Make suggestions and express interest until 30/10/06 • OAIS-support@delight.gsfc.nasa.gov
  • 43. To close… • Your library is currently taking the curation test… • Your children will learn the answer! • But
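
The email slide (12) notes that messages carry a lot of metadata and context in their RFC 822 headers, and that conversations get split across copies. As a rough illustration only, the sketch below pulls a few headers that a curator might want to record from a single message, using Python's standard `email` package. The sample message, the addresses, and the `curation_metadata` helper are all made up for this example; they are not from the talk.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample message (invented for illustration).
RAW = """\
From: alice@example.org
To: bob@example.org
Subject: Re: census returns
Date: Mon, 09 Jan 2006 10:15:00 +0000
Message-ID: <2@example.org>
In-Reply-To: <1@example.org>

Body text.
"""


def curation_metadata(raw: str) -> dict:
    """Extract a few RFC 822 style headers from one raw message."""
    msg = message_from_string(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        # Normalise the Date header to an ISO 8601 timestamp.
        "sent": parsedate_to_datetime(msg["Date"]).isoformat(),
        # Message-ID and In-Reply-To let split conversations be re-linked.
        "message_id": msg["Message-ID"],
        "in_reply_to": msg["In-Reply-To"],
    }


if __name__ == "__main__":
    for field, value in curation_metadata(RAW).items():
        print(f"{field}: {value}")
```

Recording `Message-ID` and `In-Reply-To` is one possible way to reconnect a thread whose copies live in different mailboxes; it does not address the filing and deletion policies the slide calls for.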

Editor's notes

  1. Initially we have concentrated on data extracted from relational databases, mainly because this is where the IUPHAR data is. 1) Extract to XML (a friendly hierarchical format). 2) Next, merge with the archive containing the previous versions. 3) Process and merge. 4) New archive with the latest version added. Demo ....
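
The note above describes a cycle: extract a snapshot, merge it into the archive of earlier versions, and end up with a new archive from which any past state can be cited. The following is a minimal sketch of that idea only, assuming keyed records held in plain Python dictionaries rather than XML. The names `merge_snapshot` and `snapshot_at`, the version numbers, and the sample records are hypothetical and do not reflect how XMLArch itself is implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical archive layout: each record key maps to a list of
# (value, versions) pairs, where versions lists every snapshot
# number in which the record held that value.
Archive = Dict[str, List[Tuple[str, List[int]]]]


def merge_snapshot(archive: Archive, snapshot: Dict[str, str], version: int) -> Archive:
    """Return a new archive with `snapshot` merged in as `version`."""
    # Copy the old archive so the archive at time t - 1 is left untouched.
    merged: Archive = {
        key: [(value, list(versions)) for value, versions in history]
        for key, history in archive.items()
    }
    for key, value in snapshot.items():
        history = merged.setdefault(key, [])
        for stored_value, versions in history:
            if stored_value == value:
                # Unchanged value: just record that it is also valid now.
                versions.append(version)
                break
        else:
            # New or changed value: store it once, tagged with this version.
            history.append((value, [version]))
    return merged


def snapshot_at(archive: Archive, version: int) -> Dict[str, str]:
    """Rebuild the dataset exactly as it stood at `version`."""
    return {
        key: value
        for key, history in archive.items()
        for value, versions in history
        if version in versions
    }


if __name__ == "__main__":
    archive = merge_snapshot({}, {"r1": "agonist", "r2": "unknown"}, 1)
    archive = merge_snapshot(archive, {"r1": "agonist", "r2": "antagonist"}, 2)
    print(snapshot_at(archive, 1))
    print(snapshot_at(archive, 2))
```

Storing each distinct value once and tagging it with the versions in which it held keeps the archive compact when most records do not change between snapshots, while still letting a citation point at the dataset as it was at a given version.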