Data management plans
Wisconsin Cyberinfrastructure Days
November 5, 2010
Dorothea Salo & Brad Houston
• Document describing data (and/or digital materials) that have been or will be gathered in a study or project.
• Often includes details on how data will be organized, preserved, and accessed
• Facilitates re-use of data sets by either the PI or other researchers
• Required component of grant proposals for MANY agencies (e.g., NSF and NIH)
• Starting January 2011 for NEW, non-collaborative proposals
• Not voluntary – “integral part” of proposal
• Data Management Plans for all data resulting from any level of NSF funding
• Supplementary 2-page document (max)
• Optional: Also part of 15-page (max) Project Description
• Must address both physical and digital data
• “Efficiency and effectiveness” of the DMP will be considered by NSF and disciplinary division or directorate
• Must include sufficient information that peer reviewers and project monitors can assess present proposal and past performance
“Such dissemination of data is necessary for the community to stimulate new advances as quickly as possible and to allow prompt evaluation of the results by the scientific community.” – NSF (emphasis on “stimulate new advances” and “allow prompt evaluation” mine)
Part of Openness trend in federal government (data.gov - Open Government Initiative)
NIH Public Access Policy (2008)
Public access to federally funded research hearings - Information Policy, Census and National Archives Subcommittee of U.S. Congress (July 2010)
• It makes your research easier!
• Data available in case you need it later
• Helps avoid accusations of fraud or bad science
• To share it for others to use and learn from
• To get credit for producing it
• To keep from drowning in irrelevant stuff
  • ... especially at grant/project end
Gene expression microarray data: “Publicly available data was significantly (p=0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, and author country of origin.”
• Piwowar, Heather et al. “Sharing detailed research data is associated with increased citation rate.” PLoS ONE 2007. DOI: 10.1371/journal.pone.0000308
• Maybe there’s an advantage here!
• Discuss specific requirements for NSF Data Management Plans
• Suggest ways to manage, share, and archive data more effectively
• Provide resources for more information
Requirements, retention, and planning
• What data are you collecting or making?
  • Can it be recreated? How much would that cost?
  • How much of it? How fast is it growing? Does it change?
  • What file format(s)?
• What’s your infrastructure for data collection and storage like?
  • How do you find it, or find what you’re looking for in it?
  • How easy is it to get new people up to speed? Or share data with others?
• Who are the audiences for your data?
  • You (including Future You), your lab colleagues (including future ones), your PIs
  • Disciplinary colleagues, at your institution or at others
  • Colleagues in allied disciplines
  • The world!
• What are your obligations to others?
  • Funder requirements
  • Confidentiality issues
  • IP questions
  • Security
• How do you and your lab get from where you are to where you need to be?
  • Document, document, document all decisions and all processes!
• Secret sauce: the more you strategize upfront, the less angst and panic later.
  • “Make it up as you go along” is very bad practice!
  • But the best-laid plans go agley... so be flexible.
  • And watch your field! Best practices are still in flux.
• All submitted plans must include, at minimum:
  1. Expected data: types, physical/electronic collections, materials to be produced
  2. Standards for data and metadata format and content
  3. Policies for access and sharing, including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, etc.
  4. Policies and provisions for re-use, re-distribution, and the production of derivatives
  5. Plans for archiving data, samples, and other research products, and for preservation of access to them
• Four kinds of data defined by OMB:
  • Observational
    • Examples: sensor data, telemetry, survey data, sample data, neuroimages.
  • Experimental
    • Examples: gene sequences, chromatograms, toroid magnetic field data.
  • Simulation
    • Examples: climate models, economic models.
  • Derived or compiled
    • Examples: text and data mining, compiled database, 3D models, data gathered from public documents.
• Preliminary analyses
• Raw data is included in this definition
• Drafts of scientific papers
• Plans for future research
• Peer reviews or communications with colleagues
• Physical objects, such as gel samples
• As early as possible, but no later than guidelines laid down by relevant Directorate
  • Engineering Section: “no later than the acceptance for publication of the main findings of the final data”
  • Earth Sciences: “No later than two (2) years after the data were collected.”
  • Social and Economic Sciences: “within one year after the expiration of an award”
• Be aware of concerns that may require earlier or later disclosure
  • FERPA? Human Subjects data? HIPAA?
• Again, specific retention periods will depend on the type of data and the Directorate
  • Example: Engineering Section suggests retention period of “three years after either completion of the grant project or public release of research data, whichever is later”
• Certain types of data will need to be retained longer
  • Patent data, longitudinal data sets, etc.
• Ask: is your data of permanent value?
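The "whichever is later" rule quoted above is simple date arithmetic. The sketch below is only an illustration of that one quoted example: the function names and the sample dates are hypothetical, and the retention period that actually applies to a project has to come from the relevant Directorate's guidance.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only raised when start is Feb 29 and the target year is not a leap year.
        return start.replace(year=start.year + years, day=28)


def retention_end(project_completed: date, data_released: date,
                  retention_years: int = 3) -> date:
    """End of the retention period under a 'whichever is later' rule.

    The 3-year default mirrors the Engineering example on the slide;
    it is an assumption for any other Directorate.
    """
    return add_years(max(project_completed, data_released), retention_years)


if __name__ == "__main__":
    # Hypothetical project: grant ends Aug 2013, data released Feb 2014.
    print(retention_end(date(2013, 8, 31), date(2014, 2, 15)))
```

Here the public release is the later of the two events, so the clock starts on the release date rather than at the end of the grant.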
• Analyzed data (incl. images, tables, and tables of numbers used for making graphs)
• Metadata that defines how data was generated, such as experiment descriptions, computer code, and computer-calculation input
• Investigators are expected to preserve/share primary data, samples, physical collections, & supporting materials
• Provide easily accessible information about data holdings, including quality assessments and guidance/finding aids
• Data may be made available through submission to a national data center; publication in a journal, book, or accessible website; or institutional archives
• Data Management Plans are required even if a project is not expected to generate data that requires sharing
• DMP should clearly explain non-sharing in light of COI standards (peer review)
• Between the lines: Not sharing will require justification and close scrutiny by NSF
• Sharing is preferred
Preparing, sharing, and archiving your data sets
• Think about where you will put your data
  • Local? Network drive? Online data management system?
• Think about how you (or others) will find your data
• Think about how others may use your data, when found
• Think about how to store your data in the long term (or whether to store it long-term at all)
• Will anybody be able to read these files at the end of your time horizon?
• Where possible, prefer file formats that are:
  • Open, standardized
  • Documented
  • In wide use
  • Easy to data-mine, transform, recast
• If you need to transform data for durability, do it now, not later.
• Fundamental question: What would someone unfamiliar with your data need in order to find, evaluate, understand, and reuse them?
• Consider the differences between someone inside your lab, someone outside your lab but in your field, and someone outside your field.
• Two parts: metadata and methods
• About the project
  • Title, people, key dates, funders and grants
• About the data
  • Title, key dates, creator(s), subjects, rights, included files, format(s), versions, checksums
  • Interpretive aids: codebooks, data dictionaries, algorithms, code
• Keep this with the data
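One way to cover the "included files, format(s), checksums" part of that list is a small machine-readable manifest stored next to the data. The following is a minimal sketch, not a metadata standard: the field names (`project_title`, `creators`, `files`, and so on) are hypothetical, and a disciplinary repository may prescribe its own schema.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir, project_title, creators):
    """Describe every file under data_dir: relative path, format, size, checksum.

    The keys used here are illustrative assumptions, not a required schema.
    """
    root = Path(data_dir)
    files = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(root).as_posix(),
            "format": path.suffix.lstrip(".").lower() or "unknown",
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return {
        "project_title": project_title,
        "creators": list(creators),
        "files": files,
    }
```

Calling `build_manifest("my_dataset", "Demo project", ["A. Researcher"])` (all three arguments are placeholders) returns a plain dictionary that can be written out with `json.dump` and kept with the data. Recomputing the checksums later shows whether any file has changed or been corrupted.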
• Reason #1 for not reusing someone else’s data: “I don’t know enough about how it was gathered to trust it.”
• Document what you did. (A published article may or may not be enough.)
• Document any limitations of what you did.
• If you ran code on the data, document the code and keep it with the data.
• Need a codebook? Or a data dictionary?
  • If I can’t identify at sight what each bit of your dataset means, yes, you do need a codebook or data dictionary.
  • DO NOT FORGET UNITS!
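A data dictionary can be as simple as one entry per column, and a short script can flag columns that lack an entry or lack units. This is a sketch under assumptions: the column names, descriptions, and the convention of writing "none" for unitless columns are all invented for illustration.

```python
import csv
import io

# Hypothetical data dictionary: one entry per column, units always stated.
DATA_DICTIONARY = {
    "site_id": {"description": "Sampling site identifier", "units": "none"},
    "water_temp": {"description": "Water temperature at 1 m depth", "units": "degrees Celsius"},
    "flow": {"description": "Stream flow rate", "units": ""},  # units forgotten
}


def undocumented_columns(csv_text, dictionary):
    """Return (column, problem) pairs for columns a stranger could not interpret."""
    header = next(csv.reader(io.StringIO(csv_text)))
    problems = []
    for column in header:
        entry = dictionary.get(column)
        if entry is None:
            problems.append((column, "no data dictionary entry"))
        elif not entry.get("units"):
            problems.append((column, "units missing"))
    return problems


if __name__ == "__main__":
    sample = "site_id,water_temp,flow,ph\nA1,12.5,3.2,7.1\n"
    for column, problem in undocumented_columns(sample, DATA_DICTIONARY):
        print(f"{column}: {problem}")
```

Running a check like this before depositing or sharing a data set is a cheap way to catch the forgotten-units case while the people who know the answer are still in the lab.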
• Your own drive (PC, server, flash drive, etc.)
  • And if you lose it? Or it breaks?
• Somebody else’s drive
  • Departmental or campus drive
  • “Cloud” drive
  • Do they care as much about your data as you do?
  • What about versioning?
• Library motto: Lots Of Copies Keeps Stuff Safe.
  • Two onsite copies, one offsite copy.
  • Keep confidentiality and security requirements in mind, of course
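The "two onsite, one offsite" rule of thumb can be checked mechanically if each copy is recorded with its location and checksum. The sketch below assumes a hypothetical inventory format (a list of dictionaries with `name`, `location`, and `sha256` keys); it is not tied to any particular backup tool.

```python
def check_copies(copies, reference_sha256):
    """Return a list of problems with a set of stored copies.

    Each copy is a dict with 'name', 'location' ('onsite' or 'offsite'),
    and 'sha256'. This inventory layout is an assumption for illustration.
    """
    problems = []
    onsite = sum(1 for copy in copies if copy["location"] == "onsite")
    offsite = sum(1 for copy in copies if copy["location"] == "offsite")
    if onsite < 2:
        problems.append(f"only {onsite} onsite copies, want at least 2")
    if offsite < 1:
        problems.append("no offsite copy")
    for copy in copies:
        # A copy that no longer matches the reference checksum is not a real copy.
        if copy["sha256"] != reference_sha256:
            problems.append(f"{copy['name']}: checksum differs from reference")
    return problems


if __name__ == "__main__":
    inventory = [
        {"name": "lab-server", "location": "onsite", "sha256": "aa11"},
        {"name": "external-disk", "location": "onsite", "sha256": "aa11"},
        {"name": "campus-archive", "location": "offsite", "sha256": "bb22"},
    ]
    for problem in check_copies(inventory, reference_sha256="aa11"):
        print(problem)
```

Counting copies alone is not enough, which is why the checksum comparison is included: a silently corrupted offsite copy would otherwise pass the count check.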
• If data need to persist beyond project end, you have to deal with a new kind of risk: organizational risk.
  • Servers come and go. So do labs. So do entire departments.
  • This is especially important if you share data! Don’t let it 404!
• You need to find a trustworthy partner.
  • On campus: try the library or your campus research office. (No, campus IT is usually not good enough.)
  • Off campus: look for a disciplinary data repository, or a journal that accepts data. (It’s a good idea to do this as part of your planning process.)
• Let somebody else worry! You have new projects to get on with.
