a centre of expertise in data curation and preservation




          The future of the DCC

                 Chris Rusbridge
          E-Science Workshop April 2009

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5
UK: Scotland License. To view a copy of this license, (a) visit
http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or (b) send a letter to Creative
Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.




             Contents
•   Curation & integrated science
•   Poetry & Philosophy of D H Rumsfeld
•   Designated Community & Knowledge Base
•   DCC services
•   Future of the DCC




                E-Science Workshop




                   Curation
• Wikipedia
   • Curator: a content specialist responsible for an institution's
     collections and, together with a publications specialist, their
     associated collections catalogs.
   • Digital Curation: the curation, preservation, maintenance,
     collection and archiving of digital assets
   • Sheer curation: an approach to digital curation where
     curation activities are quietly integrated into the normal work
     flow of those creating and managing data and other digital
     assets.
• DCC: Digital curation is maintaining and adding value
  to a trusted body of digital information for current and
  future use.





       Integrated Science
• The application of multiple scientific
  disciplines to one or more core scientific
  challenges

• Examples of integrated sciences?
   • Archaeology
   • Environmental sciences








Integrated Science implications
 • Scientists will be using unfamiliar data,
   therefore
 • Data curators and managers must make their
   data available for unfamiliar users!




   • And now for something unfamiliar?







Poetry & Philosophy of D H Rumsfeld
Hart Seely, April 2, 2003,
SLATE http://www.slate.com/id/2081042/








           A Confession
‘Once in a while,
I'm standing here, doing something.
And I think,
"What in the world am I doing here?"
It's a big surprise.’
—May 16, 2001, interview with the New York Times








                    Clarity
‘I think what you'll find,
I think what you'll find is,
Whatever it is we do substantively,
There will be near-perfect clarity
As to what it is.

‘And it will be known,
And it will be known to the Congress,
And it will be known to you,
Probably before we decide it,
But it will be known.’
—Feb. 28, 2003, Department of Defense briefing






             The Unknown
‘As we know,
There are known knowns.
There are things we know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don't know
We don't know.’
—Feb. 12, 2002, Department of Defense news briefing






      The 4th Rumsfeld?
• 3 epistemological classes (???)
  • Known knowns
  • Known unknowns
  • Unknown unknowns
• 4th class?
  • Unknown knowns?
  • Critical issue for cross-disciplinary sciences








    Some OAIS Concepts?
• Knowledge Base: allows a consumer to understand
  something
• Designated Community: the set of consumers for
  whom the archive curates something
• Representation Information: helps you interpret a
  data object yielding an information object
   • The amount and nature of RepInfo required is dependent on
     the Knowledge Base of the Designated Community
   • If you curate for project colleagues in the short term, little if
     any RepInfo required
   • If you curate for those unfamiliar with the data, more RepInfo
     is needed
   • (All broadly interpreted!)

CCSDS (2002). Reference Model for an Open Archival Information System (OAIS).
Retrieved from http://public.ccsds.org/publications/archive/650x0b1.pdf.





                     Time
• KB is f1(DC, t)
• DC is f2(t)
• RepInfo needed is f3(f1(DC, t), f2(t))
   • (but none of these concepts can be precisely defined!)


• If DC is small and t is short (months to year or so),
  then both may be ignored, and RepInfo be assumed
  part of the KB
• If DC is extensive (eg cross-discipline) and t is long (5
  years to 25 plus), then RepInfo must be articulated
• If t is very long, most bets are off (post-hoc
  reconstruction likely to be needed)
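
The rule of thumb above could be sketched as a small function. This is only an illustration: the slide itself says none of these concepts can be precisely defined, so the numeric thresholds, the constant names and the function `repinfo_needed` are all assumptions made here, not DCC or OAIS definitions.

```python
# Illustrative thresholds only: the slide gives ranges, not precise cut-offs.
SHORT_TERM_YEARS = 1      # "months to year or so"
LONG_TERM_YEARS = 5       # "5 years to 25 plus"
VERY_LONG_YEARS = 50      # assumed; the slide does not say when "very long" begins


def repinfo_needed(community_is_broad: bool, years: float) -> str:
    """Map Designated Community breadth and time horizon to a RepInfo effort level."""
    if years >= VERY_LONG_YEARS:
        # Most bets are off at this horizon.
        return "post-hoc reconstruction likely"
    if community_is_broad and years >= LONG_TERM_YEARS:
        # Cross-discipline community over a long period.
        return "must be articulated"
    if not community_is_broad and years <= SHORT_TERM_YEARS:
        # Project colleagues in the short term.
        return "little or none (assumed part of the KB)"
    # Cases the slide does not cover explicitly.
    return "judgement call"
```

For example, `repinfo_needed(True, 10)` falls in the "must be articulated" case, while a small project team curating for six months falls in the first bullet's case.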




What might RepInfo include
• Structure information: file format definitions, etc
• Semantic information: data dictionaries, code books etc
• Robust methods (working code?)
• Not to mention many kinds of metadata, provenance,
  documentation of hidden assumptions, etc
• Cross-domain schemas one approach to articulating
  RepInfo?
    • (Never perfect, of course)
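
As one possible illustration of semantic RepInfo, a data dictionary can be kept alongside the data and used to turn a bare data object into something an unfamiliar user can read. The column names, units and the helper `interpret` below are hypothetical, invented for this sketch.

```python
import csv
import io

# Hypothetical data dictionary (semantic RepInfo): column -> meaning, unit, type.
DATA_DICTIONARY = {
    "stn": {"meaning": "station identifier", "unit": None, "type": str},
    "t_max": {"meaning": "daily maximum air temperature", "unit": "degC", "type": float},
    "rain": {"meaning": "daily rainfall total", "unit": "mm", "type": float},
}

# The raw data object: meaningless to an outsider without the dictionary.
RAW_CSV = "stn,t_max,rain\nED01,14.2,3.0\n"


def interpret(raw_text, dictionary):
    """Combine a data object with its RepInfo to yield information objects."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        row = {}
        for column, value in record.items():
            entry = dictionary[column]
            # Replace the cryptic column name with its meaning and attach the unit.
            row[entry["meaning"]] = (entry["type"](value), entry["unit"])
        rows.append(row)
    return rows


info = interpret(RAW_CSV, DATA_DICTIONARY)
```

Here `info[0]` maps "daily maximum air temperature" to the value 14.2 together with its unit, which is the kind of context project colleagues carry in their heads and outsiders do not.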








  What about Rumsfeld 4?
• Biggest concern with unfamiliar user is
  clashing concepts, eg different baselines,
  units, geographies, granularity
  • Especially where terms are ambiguous or
    differently interpreted
  • The KBs of two DCs conflict, potentially silently
  • Happens all the time, of course
• The unspoken: tacit knowledge, unknown
  knowns!
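
A silent clash of this kind can be sketched with a units example. The two sites, their values and the `Depth` type are invented for illustration; the point is only that bare numbers combine without complaint, while values that carry their unit force the assumption into the open.

```python
from dataclasses import dataclass

# Conversion factors to metres (1 ft is exactly 0.3048 m).
TO_METRES = {"m": 1.0, "ft": 0.3048}


@dataclass(frozen=True)
class Depth:
    """A measurement that carries its unit instead of leaving it tacit."""
    value: float
    unit: str

    def in_metres(self) -> float:
        return self.value * TO_METRES[self.unit]


def mean_depth_metres(depths):
    """Average depths after converting each one to metres."""
    return sum(d.in_metres() for d in depths) / len(depths)


site_a = [10.0, 12.0]   # hypothetical community A records metres
site_b = [30.0, 36.0]   # hypothetical community B records feet

# Naive merge: no error is raised, but the result mixes units.
naive_mean = sum(site_a + site_b) / len(site_a + site_b)

# Explicit merge: the unit travels with each value.
explicit_mean = mean_depth_metres(
    [Depth(v, "m") for v in site_a] + [Depth(v, "ft") for v in site_b]
)
```

The naive mean comes out at 22.0, more than double the explicit mean of roughly 10.53 m, and nothing in the naive calculation signals a problem.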





                Timing
• Curation starts before creation
  • Before project proposal!
• Data acquisition should not happen at the end
  • Continuous acquisition much better?
• Enforcement… or credit for data?








Other curation issues of concern
  •   Sustainability (work on your survival)
  •   Succession (what happens to your data if you don’t)
  •   Data audit (know what you’ve got)
  •   Data risk assessment (assess your chances of loss)
  •   Repository external audit???
  •   Provenance & computational lineage
  •   Archiving database changes
  •   Community proxy roles: help your communities
      develop data standards & data practices

  • DCC has tools & support for some of these…




 … and Research Outputs?
• Need more semantically aware texts to
  support cross-community understanding
• Coded up (cf microformats, RDFa)
  •   People
  •   Citations & references
  •   Science features (eg chemicals, reactions)
  •   Graphs, spectra, tables linking to
  •   Supplementary data
• PDF is pretty bad at this






             DCC Phase 3
•   Post January 2010?
•   Smaller (2/3 budget if we’re lucky)
•   Joint planning with JISC
•   More tightly managed (hub and spoke)
•   No development (says JISC)
•   Core services plus optional additional services
•   1st draft seen by JSR
•   Evaluation reported to JISC
•   Feedback session next week







   Proposed core services
• Reference Resources and Exemplars
• Training and Staff Development
• Expertise, Advice, Consultancy and Hands-on
  Support
• Community-building and Information-sharing
  activities
• Data Management and Sharing Plans
• Policy and Strategic Development
• Providing Access to Tools and Toolkits





Possible additional services
• Development of Tools, Toolkits, Wizards and
  Templates
• Infrastructure Services
• Model licences for data
• Data citation guidelines








  Relationship to UKRDS?
• Overlap of territory
• Aiming for complementarity rather than
  conflict
• DCC becomes core part of UKRDS
• Some issues about the vision, though








What do you want from the DCC?




          E-Science Workshop

Weitere ähnliche Inhalte

Was ist angesagt?

DuraSpace is OPEN, OR2016
DuraSpace is OPEN, OR2016DuraSpace is OPEN, OR2016
DuraSpace is OPEN, OR2016DuraSpace
 
ESI Supplemental Webinar 2 - DataONE presentation slides
ESI Supplemental Webinar 2 - DataONE presentation slides ESI Supplemental Webinar 2 - DataONE presentation slides
ESI Supplemental Webinar 2 - DataONE presentation slides DuraSpace
 
Research Data Management in the Humanities and Social Sciences
Research Data Management in the Humanities and Social SciencesResearch Data Management in the Humanities and Social Sciences
Research Data Management in the Humanities and Social SciencesCelia Emmelhainz
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersJez Cope
 
Intro to Digitization Projects
Intro to Digitization ProjectsIntro to Digitization Projects
Intro to Digitization Projectszsrlibrary
 
DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?
DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?
DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?Incremental Project
 
A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...Jenny Mitcham
 
Data Management Planning
Data Management PlanningData Management Planning
Data Management PlanningSarah Jones
 
Data management plans
Data management plansData management plans
Data management plansBrad Houston
 
Research Data in the Arts and Humanities: A Few Tricky Questions
Research Data in the Arts and Humanities: A Few Tricky QuestionsResearch Data in the Arts and Humanities: A Few Tricky Questions
Research Data in the Arts and Humanities: A Few Tricky QuestionsMartin Donnelly
 
Research Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford BrookesResearch Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford BrookesMarieke Guy
 
Research Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsResearch Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsAaron Collie
 
Open Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the HumanitiesOpen Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the HumanitiesOpen Knowledge Maps
 

Was ist angesagt? (20)

Preparing Your Research Material for the Future - 2018-06-08 - Humanities Div...
Preparing Your Research Material for the Future - 2018-06-08 - Humanities Div...Preparing Your Research Material for the Future - 2018-06-08 - Humanities Div...
Preparing Your Research Material for the Future - 2018-06-08 - Humanities Div...
 
DuraSpace is OPEN, OR2016
DuraSpace is OPEN, OR2016DuraSpace is OPEN, OR2016
DuraSpace is OPEN, OR2016
 
ESI Supplemental Webinar 2 - DataONE presentation slides
ESI Supplemental Webinar 2 - DataONE presentation slides ESI Supplemental Webinar 2 - DataONE presentation slides
ESI Supplemental Webinar 2 - DataONE presentation slides
 
Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...
Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...
Preparing Your Research Material for the Future - 2017-02-22 - Humanities Div...
 
Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...
Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...
Preparing Your Research Material for the Future - 2016-11-16 - Humanities Div...
 
Research Data Management in the Humanities and Social Sciences
Research Data Management in the Humanities and Social SciencesResearch Data Management in the Humanities and Social Sciences
Research Data Management in the Humanities and Social Sciences
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchers
 
Intro to Digitization Projects
Intro to Digitization ProjectsIntro to Digitization Projects
Intro to Digitization Projects
 
DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?
DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?
DATA MANAGEMENT – WHAT DOES IT MEAN FOR RESEARCHERS?
 
A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...
 
Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...
Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...
Introduction to Research Data Management - 2017-02-15 - MPLS Division, Univer...
 
Research Data Management Plan: How to Write One - 2017-02-01 - University of ...
Research Data Management Plan: How to Write One - 2017-02-01 - University of ...Research Data Management Plan: How to Write One - 2017-02-01 - University of ...
Research Data Management Plan: How to Write One - 2017-02-01 - University of ...
 
Data Management Planning
Data Management PlanningData Management Planning
Data Management Planning
 
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data ServicesNISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
 
Data management plans
Data management plansData management plans
Data management plans
 
Research Data in the Arts and Humanities: A Few Tricky Questions
Research Data in the Arts and Humanities: A Few Tricky QuestionsResearch Data in the Arts and Humanities: A Few Tricky Questions
Research Data in the Arts and Humanities: A Few Tricky Questions
 
Digital Destiny
Digital DestinyDigital Destiny
Digital Destiny
 
Research Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford BrookesResearch Data Management for Librarians at Oxford Brookes
Research Data Management for Librarians at Oxford Brookes
 
Research Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering StudentsResearch Data Management Fundamentals for MSU Engineering Students
Research Data Management Fundamentals for MSU Engineering Students
 
Open Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the HumanitiesOpen Data and the Panton Principles in the Humanities
Open Data and the Panton Principles in the Humanities
 

Andere mochten auch

Curation of scientifica data: Challenges for repositories
Curation of scientifica data: Challenges for repositoriesCuration of scientifica data: Challenges for repositories
Curation of scientifica data: Challenges for repositoriesChris Rusbridge
 
Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...Chris Rusbridge
 
Sustainable Digital Preservation and Access
Sustainable Digital Preservation and AccessSustainable Digital Preservation and Access
Sustainable Digital Preservation and AccessChris Rusbridge
 
Moving the repository upstream
Moving the repository upstreamMoving the repository upstream
Moving the repository upstreamChris Rusbridge
 
Saving private data, sharing Open Data? Role of libraries and institutional r...
Saving private data, sharing Open Data? Role of libraries and institutional r...Saving private data, sharing Open Data? Role of libraries and institutional r...
Saving private data, sharing Open Data? Role of libraries and institutional r...Chris Rusbridge
 
Cautious Optimism: Cultivate your Garden
Cautious Optimism: Cultivate your GardenCautious Optimism: Cultivate your Garden
Cautious Optimism: Cultivate your GardenChris Rusbridge
 
Sandinista revolution in nicaragua
Sandinista revolution in nicaraguaSandinista revolution in nicaragua
Sandinista revolution in nicaraguaPaul Treadwell
 

Andere mochten auch (7)

Curation of scientifica data: Challenges for repositories
Curation of scientifica data: Challenges for repositoriesCuration of scientifica data: Challenges for repositories
Curation of scientifica data: Challenges for repositories
 
Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...
 
Sustainable Digital Preservation and Access
Sustainable Digital Preservation and AccessSustainable Digital Preservation and Access
Sustainable Digital Preservation and Access
 
Moving the repository upstream
Moving the repository upstreamMoving the repository upstream
Moving the repository upstream
 
Saving private data, sharing Open Data? Role of libraries and institutional r...
Saving private data, sharing Open Data? Role of libraries and institutional r...Saving private data, sharing Open Data? Role of libraries and institutional r...
Saving private data, sharing Open Data? Role of libraries and institutional r...
 
Cautious Optimism: Cultivate your Garden
Cautious Optimism: Cultivate your GardenCautious Optimism: Cultivate your Garden
Cautious Optimism: Cultivate your Garden
 
Sandinista revolution in nicaragua
Sandinista revolution in nicaraguaSandinista revolution in nicaragua
Sandinista revolution in nicaragua
 

Ähnlich wie Expertise in data curation and preservation

Issues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineeringIssues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineeringChris Rusbridge
 
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅kulibrarians
 
The Data Management Ecosystem
The Data Management EcosystemThe Data Management Ecosystem
The Data Management EcosystemJohn Kunze
 
Planning for Research Data Management
Planning for Research Data ManagementPlanning for Research Data Management
Planning for Research Data Managementdancrane_open
 
Love Your Data Locally
Love Your Data LocallyLove Your Data Locally
Love Your Data LocallyErin D. Foster
 
Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016IzzyChad
 
Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10Jeroen Rombouts
 
Guy avoiding-dat apocalypse
Guy avoiding-dat apocalypseGuy avoiding-dat apocalypse
Guy avoiding-dat apocalypseENUG
 
RDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management EcosystemRDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management EcosystemASIS&T
 
Impact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and EducationImpact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and EducationMANENDRASINGH30
 
Managing active data: storage, access, academic dropbox services
Managing active data: storage, access, academic dropbox servicesManaging active data: storage, access, academic dropbox services
Managing active data: storage, access, academic dropbox servicesMarieke Guy
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...Projeto RCAAP
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycleMarieke Guy
 
New Metaphors: Data Papers and Data Citations
New Metaphors: Data Papers and Data CitationsNew Metaphors: Data Papers and Data Citations
New Metaphors: Data Papers and Data CitationsJohn Kunze
 
Supporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many FrontsSupporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many FrontsJohn Kunze
 
2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...
2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...
2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...datacite
 
“Filling the digital preservation gap” an update from the Jisc Research Data ...
“Filling the digital preservation gap”an update from the Jisc Research Data ...“Filling the digital preservation gap”an update from the Jisc Research Data ...
“Filling the digital preservation gap” an update from the Jisc Research Data ...Jenny Mitcham
 
Managing Research Data in the Life Sciences
Managing Research Data in the Life SciencesManaging Research Data in the Life Sciences
Managing Research Data in the Life Sciencesalwerhane
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Sarah Anna Stewart
 

Ähnlich wie Expertise in data curation and preservation (20)

Issues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineeringIssues in long-term knowledge retention in engineering
Issues in long-term knowledge retention in engineering
 
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
20170222 ku-librarians勉強会 #211 :海外研修報告:英国大学図書館を北から南へ巡る旅
 
The Data Management Ecosystem
The Data Management EcosystemThe Data Management Ecosystem
The Data Management Ecosystem
 
Planning for Research Data Management
Planning for Research Data ManagementPlanning for Research Data Management
Planning for Research Data Management
 
Love Your Data Locally
Love Your Data LocallyLove Your Data Locally
Love Your Data Locally
 
Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016
 
Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10Elag workshop sessie 1 en 2 v10
Elag workshop sessie 1 en 2 v10
 
Guy avoiding-dat apocalypse
Guy avoiding-dat apocalypseGuy avoiding-dat apocalypse
Guy avoiding-dat apocalypse
 
RDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management EcosystemRDAP13 John Kunze: The Data Management Ecosystem
RDAP13 John Kunze: The Data Management Ecosystem
 
What is-rdm
What is-rdmWhat is-rdm
What is-rdm
 
Impact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and EducationImpact of Covid-19 on Learning and Education
Impact of Covid-19 on Learning and Education
 
Managing active data: storage, access, academic dropbox services
Managing active data: storage, access, academic dropbox servicesManaging active data: storage, access, academic dropbox services
Managing active data: storage, access, academic dropbox services
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycle
 
New Metaphors: Data Papers and Data Citations
New Metaphors: Data Papers and Data CitationsNew Metaphors: Data Papers and Data Citations
New Metaphors: Data Papers and Data Citations
 
Supporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many FrontsSupporting Data-Rich Research on Many Fronts
Supporting Data-Rich Research on Many Fronts
 
2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...
2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...
2013 DataCite Summer Meeting - Purdue University Research Repository (PURR) (...
 
“Filling the digital preservation gap” an update from the Jisc Research Data ...
“Filling the digital preservation gap”an update from the Jisc Research Data ...“Filling the digital preservation gap”an update from the Jisc Research Data ...
“Filling the digital preservation gap” an update from the Jisc Research Data ...
 
Managing Research Data in the Life Sciences
Managing Research Data in the Life SciencesManaging Research Data in the Life Sciences
Managing Research Data in the Life Sciences
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
 

Mehr von Chris Rusbridge

The Distributed National Electronic Resource and the Electronic Libraries Pro...
The Distributed National Electronic Resource and the Electronic Libraries Pro...The Distributed National Electronic Resource and the Electronic Libraries Pro...
The Distributed National Electronic Resource and the Electronic Libraries Pro...Chris Rusbridge
 
JISC Digital Library initiatives
JISC Digital Library initiativesJISC Digital Library initiatives
JISC Digital Library initiativesChris Rusbridge
 
Practical steps towards digital preservation at institutional levels
Practical steps towards digital preservation at institutional levelsPractical steps towards digital preservation at institutional levels
Practical steps towards digital preservation at institutional levelsChris Rusbridge
 
Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...Frequently-asked questions on Freedom of Information and Environmental Inform...
Frequently-asked questions on Freedom of Information and Environmental Inform...Chris Rusbridge
 
Create, curate, re-use: the expanding life course of digital research data
Create, curate, re-use: the expanding life course of digital research dataCreate, curate, re-use: the expanding life course of digital research data
Create, curate, re-use: the expanding life course of digital research dataChris Rusbridge
 
"Tomorrow, and tomorrow, and tomorrow": the players on the curation stage
"Tomorrow, and tomorrow, and tomorrow": the players on the curation stage"Tomorrow, and tomorrow, and tomorrow": the players on the curation stage
"Tomorrow, and tomorrow, and tomorrow": the players on the curation stageChris Rusbridge
 
LOCKSS UK, with a focus on reporting experience
LOCKSS UK, with a focus on reporting experienceLOCKSS UK, with a focus on reporting experience
LOCKSS UK, with a focus on reporting experienceChris Rusbridge
 
Trust and repository audit: can repository managers assure trustworthiness?
Trust and repository audit: can repository managers assure trustworthiness?Trust and repository audit: can repository managers assure trustworthiness?
Trust and repository audit: can repository managers assure trustworthiness?Chris Rusbridge
 
Disciplinary dimensions of digital curation: introduction and synthesis
Disciplinary dimensions of digital curation: introduction and synthesisDisciplinary dimensions of digital curation: introduction and synthesis
Disciplinary dimensions of digital curation: introduction and synthesisChris Rusbridge
 
Reference Model for Economically Sustainable Digital Curation
Reference Model for Economically Sustainable Digital CurationReference Model for Economically Sustainable Digital Curation
Reference Model for Economically Sustainable Digital CurationChris Rusbridge
 
Blue Ribbon Task Force on Sustainable Digital Preservation
Blue Ribbon Task Force on Sustainable Digital PreservationBlue Ribbon Task Force on Sustainable Digital Preservation
Blue Ribbon Task Force on Sustainable Digital PreservationChris Rusbridge
 
Data curation issues for repositories
Data curation issues for repositoriesData curation issues for repositories
Data curation issues for repositoriesChris Rusbridge
 


Expertise in data curation and preservation

  • 1. a centre of expertise in data curation and preservation The future of the DCC Chris Rusbridge E-Science Workshop April 2009 Funded by: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
  • 2. Contents • Curation & integrated science • Poetry & Philosophy of D H Rumsfeld • Designated Community & Knowledge Base • DCC services • Future of the DCC
  • 3. Curation • Wikipedia • Curator: a content specialist responsible for an institution's collections and, together with a publications specialist, their associated collections catalogs. • Digital Curation: the curation, preservation, maintenance, collection and archiving of digital assets • Sheer curation: an approach to digital curation where curation activities are quietly integrated into the normal work flow of those creating and managing data and other digital assets. • DCC: Digital curation is maintaining and adding value to a trusted body of digital information for current and future use.
  • 4. Integrated Science • The application of multiple scientific disciplines to one or more core scientific challenges • Examples of integrated sciences? • Archaeology • Environmental sciences
  • 5. Integrated Science implications • Scientists will be using unfamiliar data, therefore • Data curators and managers must make their data available for unfamiliar users! • And now for something unfamiliar?
  • 6. Poetry & Philosophy of D H Rumsfeld Hart Seely, April 2, 2003, SLATE http://www.slate.com/id/2081042/
  • 7. A Confession ‘Once in a while, I'm standing here, doing something. And I think, "What in the world am I doing here?" It's a big surprise.’ —May 16, 2001, interview with the New York Times
  • 8. Clarity ‘I think what you'll find, I think what you'll find is, Whatever it is we do substantively, There will be near-perfect clarity As to what it is. ‘And it will be known, And it will be known to the Congress, And it will be known to you, Probably before we decide it, But it will be known.’ —Feb. 28, 2003, Department of Defense briefing
  • 9. The Unknown ‘As we know, There are known knowns. There are things we know we know. We also know There are known unknowns. That is to say We know there are some things We do not know. But there are also unknown unknowns, The ones we don't know We don't know.’ —Feb. 12, 2002, Department of Defense news briefing
  • 10. The 4th Rumsfeld? • 3 epistemological classes (???) • Known knowns • Known unknowns • Unknown unknowns • 4th class? • Unknown knowns? • Critical issue for cross-disciplinary sciences
  • 11. Some OAIS Concepts? • Knowledge Base: allows a consumer to understand something • Designated Community: the set of consumers for whom the archive curates something • Representation Information: helps you interpret a data object yielding an information object • The amount and nature of RepInfo required is dependent on the Knowledge Base of the Designated Community • If you curate for project colleagues in the short term, little if any RepInfo required • If you curate for those unfamiliar with the data, more RepInfo is needed • (All broadly interpreted!) • CCSDS (2002). Reference Model for an Open Archival Information System (OAIS). Retrieved from http://public.ccsds.org/publications/archive/650x0b1.pdf.
  • 12. Time • KB is f1(DC, t) • DC is f2(t) • RepInfo needed is f3(f1(DC, t), f2(t)) • (but none of these concepts can be precisely defined!) • If DC is small and t is short (months to year or so), then both may be ignored, and RepInfo be assumed part of the KB • If DC is extensive (eg cross-discipline) and t is long (5 years to 25 plus), then RepInfo must be articulated • If t is very long, most bets are off (post-hoc reconstruction likely to be needed)
  • 13. What might RepInfo include • Structure information: file format definitions, etc • Semantic information: data dictionaries, code books etc • Robust methods (working code?) • Not to mention many kinds of metadata, provenance, documentation of hidden assumptions, etc • Cross-domain schemas one approach to articulating RepInfo? • (Never perfect, of course)
  • 14. What about Rumsfeld 4? • Biggest concern with unfamiliar user is clashing concepts, eg different baselines, units, geographies, granularity • Especially where terms are ambiguous or differently interpreted • The KBs of two DCs conflict, potentially silently • Happens all the time, of course • The unspoken: tacit knowledge, unknown knowns!
  • 15. Timing • Curation starts before creation • Before project proposal! • Data acquisition should not happen at the end • Continuous acquisition much better? • Enforcement… or credit for data?
  • 16. Other curation issues of concern • Sustainability (work on your survival) • Succession (what happens to your data if you don’t) • Data audit (know what you’ve got) • Data risk assessment (assess your chances of loss) • Repository external audit??? • Provenance & computational lineage • Archiving database changes • Community proxy roles: help your communities develop data standards & data practices • DCC has tools & support for some of these…
  • 17. … and Research Outputs? • Need more semantically aware texts to support cross-community understanding • Coded up (cf microformats, RDFa) • People • Citations & references • Science features (eg chemicals, reactions) • Graphs, spectra, tables linking to • Supplementary data • PDF is pretty bad at this
  • 18. DCC Phase 3 • Post January 2010? • Smaller (2/3 budget if we’re lucky) • Joint planning with JISC • More tightly managed (hub and spoke) • No development (says JISC) • Core services plus optional additional services • 1st draft seen by JSR • Evaluation reported to JISC • Feedback session next week
  • 19. Proposed core services • Reference Resources and Exemplars • Training and Staff Development • Expertise, Advice, Consultancy and Hands-on Support • Community-building and Information-sharing activities • Data Management and Sharing Plans • Policy and Strategic Development • Providing Access to Tools and Toolkits
  • 20. Possible additional services • Development of Tools, Toolkits, Wizards and Templates • Infrastructure Services • Model licences for data • Data citation guidelines
  • 21. Relationship to UKRDS? • Overlap of territory • Aiming for complementarity rather than conflict • DCC becomes core part of UKRDS • Some issues about the vision, though
  • 22. What do you want from the DCC?
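
As a rough illustration of the Time slide (slide 12), the three rules of thumb relating the Designated Community (DC), elapsed time t and Representation Information (RepInfo) can be sketched in Python. This is a minimal sketch and not something from the talk: the function name, the enum and the numeric thresholds (1, 5 and 50 years) are assumptions chosen to stand in for "months to year or so", "5 years to 25 plus" and "very long". The slide itself notes that none of these concepts can be precisely defined, so cases it does not cover are returned as UNCLEAR rather than guessed.

```python
from enum import Enum


class RepInfoNeed(Enum):
    """Possible answers to 'how much RepInfo is needed?' (labels are illustrative)."""
    IMPLICIT = "RepInfo may be assumed part of the Knowledge Base"
    ARTICULATED = "RepInfo must be articulated"
    RECONSTRUCTION = "post-hoc reconstruction likely to be needed"
    UNCLEAR = "not covered by the rules of thumb on the slide"


# Hypothetical thresholds in years; the slide gives only loose ranges.
SHORT_TERM_YEARS = 1.0      # "months to year or so"
LONG_TERM_YEARS = 5.0       # lower end of "5 years to 25 plus"
VERY_LONG_TERM_YEARS = 50.0  # assumption: the slide does not quantify "very long"


def repinfo_need(community_is_extensive: bool, years: float) -> RepInfoNeed:
    """Apply the slide's three rules of thumb to a DC size and a time horizon."""
    if years < 0:
        raise ValueError("years must be non-negative")
    if years >= VERY_LONG_TERM_YEARS:
        # "If t is very long, most bets are off"
        return RepInfoNeed.RECONSTRUCTION
    if not community_is_extensive and years <= SHORT_TERM_YEARS:
        # Small DC, short t: RepInfo assumed part of the KB
        return RepInfoNeed.IMPLICIT
    if community_is_extensive and years >= LONG_TERM_YEARS:
        # Extensive (eg cross-discipline) DC, long t: RepInfo must be articulated
        return RepInfoNeed.ARTICULATED
    # Anything in between is left open by the slide.
    return RepInfoNeed.UNCLEAR


if __name__ == "__main__":
    for extensive, years in [(False, 0.5), (True, 10), (True, 100), (False, 3)]:
        print(extensive, years, "->", repinfo_need(extensive, years).value)
```

The UNCLEAR case is deliberate: a small community over a medium horizon, or a cross-discipline community over a short one, is exactly where curators have to judge how much of the Knowledge Base can be taken for granted.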