The Codex of Business Writing Software for Real-World Solutions 2.pptx
Hands-On Data Management Planning
1. Hands-On Data Management
Planning for Engineering
Bill Corey
Data Management Consultant
University of Virginia Library
wtc2h@virginia.edu
Sherry Lake
Data Management Consultant
University of Virginia Library
shLake@virginia.edu
Data Life Cycle
Re-Purpose
Re-Use Deposit
Data
Collection
Data
Analysis
Data
Sharing
Proposal
Planning
Writing
Data
Discovery
End of
Project
Data
Archive
Project
Start Up
2. Goals for the workshop
• Learn about data management planning
• Learn about available resources
• Develop rough draft of a data management
plan for a grant
• Gain peer and expert feedback
3. (Good) Data Management…
…helps research to be:
Replicated and verified
Preserved for future use
Linked with other research products
Shared and reused
…helps researchers:
Meet funding requirements
Increase visibility of research
Save time and effort (avoid data loss)
Deal with an ever-increasing amount of data
http://www.healthcare-informatics.com/article/guest-blog-data-management-challenge-
unlocking-value-clinical-data-many-times-requires-enter
5. Recent News
• Memo released February 22, 2013
• Direct results of federally funded scientific
research are made available…
• Federal research agencies funding more than
$100M/year must develop plan to make the
results (papers and data) of federally funded
research available to the public within one year
of publication
http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
6. Require a Data Management Plan (DMP) Require Sharing of Results – per a Data
Policy
• National Science Foundation
• National Institutes of Health
• National Oceanographic and
Atmospheric Research (NOAA)
• Institute of Museum and Library
Services (IMLS)
• National Endowment of Humanities
– office of digital humanities (NEH)
• NASA
• NEH – Preservation & Access
Who’s Requiring Data Management?
This list is not inclusive.
7. Example: National Science Foundation
• Data Sharing Policy: Awards & Administration
Guide Chapter IV.D.4
• Data Management Plan requirement: Grant
Proposal Guide Chapter II.C.2.j
• Additional requirements from individual
Directorates and Divisions (e.g., BIO, CISE,
EHR, GEO, MPS, SBE): Dissemination and
Sharing of Results
8. NSF: Dissemination & Sharing of Research Results:
“Investigators are expected to share with
other researchers, at no more than
incremental cost and within a reasonable
time, the primary data, samples, physical
collections and other supporting materials
created or gathered in the course of work
under NSF grants. Grantees are expected
to encourage and facilitate such sharing.”
Award & Administration Guide (AAG) Chapter
VI.D.4
8
9. Plans for Data Management & Sharing
Since January 18, 2011:
• Proposals must include a supplementary
document of no more than two pages labeled:
“Data Management Plan”
• Document should describe how the proposal with
conform to NSF sharing policy
NSF: Grant Proposal Guide (GPG) Chapter II.C.2.j
of the Products of Research
10. Parts of a (Generic) NSF Data Management Plan
I. Products of the Research: The types of data, samples, physical
collections, software, curriculum materials, and other materials to be
produced in the course of the project.
II. Data Formats: The standards to be used for data and metadata format and
content (where existing standards are absent or deemed inadequate, this
should be documented along with any proposed solutions or remedies).
III. Access to Data and Data Sharing Practices and Policies: Policies for
access and sharing including provisions for appropriate protection of
privacy, confidentiality, security, intellectual property, or other rights or
requirements.
IV. Policies for Re-Use, Re-Distribution, and Production of Derivatives.
V. Archiving of Data: Plans for archiving data, samples, and other research
products, and for preservation of access to them.
10
Grant Proposal Guide (GPG) Chapter II.C.2.j
http://www.nsf.gov/pubs/policydocs/pappguide/nsf13001/gpg_2.jsp#dmp
11. I. Types of Data & Other Information
• Types of data produced
• Relationship to existing data
• How/when/where will the data be captured or
created?
• How will the data be processed?
• Quality assurance & quality control measures
• Security: version control, backing up
• Who will be responsible for data
management during/after project?
biology.kenyon.edu
http://en.wikipedia.org/wiki/File:
eCmmt002.svg
Images by Antti-Pekka Hynninen
12. Wired.com
II. Data & Metadata Standards
• Identify the formats of data files created over the
course of the project
• What metadata are needed to make the data
meaningful?
• How will you create or capture these metadata?
• Why have you chosen particular standards and
approaches for metadata?
13. III. Policies for Access & Sharing
• Are you under any obligation to share data?
• How, when, & where will you make the data available?
• What is the process for gaining access to the data?
• Who owns the copyright and/or intellectual property?
• Will you retain rights before opening data to wider use? How
long?
• Are permission restrictions necessary?
• Embargo periods for political/commercial/patent reasons?
• Ethical and privacy issues?
• Who are the foreseeable data users?
• How should your data be cited?
IV. Policies for Re-use & Re-distribution
14. V. Plans for Archiving & Preservation
• What data will be preserved for the long term? For how
long?
• Where will data be preserved?
• What data transformations need to occur before
preservation?
From Flickr by theManWhoSurfedTooMuch
• What metadata will be submitted
alongside the datasets?
• Who will be responsible for preparing
data for preservation?Who will be the
main contact person for the archived
data?
15. Which NSF Requirement to Use?
Which Guideline Should I follow?
First: follow the requirements laid out in the
specific solicitation, if any.
Second: follow the guidelines published by the
appropriate NSF directorate and/or division. If
there is a conflict, the latter takes precedence.
Third: follow the more general guidelines.
Interdisciplinary Proposals
Use guidelines appropriate to the lead program (if
there are specific guidelines)
16. Data Management Planning Resources
http://dmptool.org – Helps you create a
data management plan to meet grant
requirements and identify UVA support
resources and policies
http://databib.org – Helps you find an
appropriate place to deposit your data
http://libra.virginia.edu - Helps UVA
faculty, graduate students, and staff by
providing a place to deposit and share
datasets
17. Step-by-step wizard for generating DMP
Create | edit | re-use | share | save | generate
Open to community
Links to institutional resources
Directorate information & updates
http://dmptool.org
18. Goals of the DMPTool
I. To provide researchers a simple way to
create a DMP for their funding agency
• Questions asked by the agency
• Additional explanation/context provided by the
agency
• Links to the agency website for
policies, help, guidance
19. Goals of the DMPTool
II. To provide researchers with DMP
information from their home institution
• Resources and services to help them manage data
• Help text for specific questions
• Suggested answers to questions; easy to cut-N-
paste
• News & events related to data management on
campus
20. What is a Data Management Plan?
• A comprehensive plan of how you will
manage your research data throughout the
lifecycle of your research project
AND
• Brief description of how you will comply with
funder’s data sharing policy
• Reviewed as part of a grant application
21. Data Management Plans
• Grant Driven
– Requirements
– Sharing and public access to research
• Operational
– Research continuity
– Avoiding data loss
– Efficiency
22. Team Exercise
30 minutes
1. Identify a grant that you have or might apply for
2. Locate the requirements for that grant in the
DMPTool: http://dmptool.org
3. Go through the sections in the DMPTool workflow
to produce draft plan
Be sure to address metadata, access policies, repositories .
4. Identify solutions and available support through
DMPTool sections or ask for guidance
5. Record issues and questions for discussion
23. Presentation of Draft DMPs
15 minutes
• Identify grant
• Describe project briefly
• Explain requirements
• Describe planned solutions
– Must address metadata, access policies, and
repositories.
25. Follow-up
• Contact the Data Management Consulting
Group for help with DMP preparation
Grant driven and operational:
http://dmconsult.library.virginia.edu/plan/
Email: DMConsult@virginia.edu
Hinweis der Redaktion
If have journal article, have record of what you did stored in journals,..But the data underlying the results are really important,funders careColleagues – potential collaboratorsInstitutions (not shown here)Tenure committees more in the future.You: need to care you might need to go back to it in a few years… need good description.Future scientists – potentially use your data to discover important things. Need to be thinking about the future. (providing data for them)Slide from Carly Strasser http://www.slideshare.net/carlystrasser
In addition to addressing the issue of public access to scientific publications, the memorandum requires that agencies start to address the need to improve upon the management and sharing of scientific data produced with Federal funding. Strengthening these policies will promote entrepreneurship and jobs growth in addition to driving scientific progress. Also requires researchers to better account for and manage dataDATA: OMB circular A-110:“data” = digital recorded factual material commonly accepted in the scientific community as necessary to validate research findingsincluding data sets used to support scholarly publications, but does not include laboratory notebooks, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as laboratory specimens
Read calls for proposals carefully and ask program director about specific data management requirements. Build time into your proposal development to formulate a data management plan!Private & public – in the US, UK and other countriesOther agencies require sharing, but do not explicitly require a DMP as part of a proposal – NASA, NEH access & preservationNEH Sustainability of project deliverables and datasets – long term preservationDissemination – sharingNew NSF as of Jan. 2013 – Bio Sketch can include products of research
The rest of this talk will focus on the Data Management Plan requirements for the NSFHas been in the Grant Policy Manual since 2002.Even though this “sharing” requirement was in the Admin Guide, there had been little if any enforcement. There was only a “check box” in the Fast Lane system.
Types: experimental, observational, raw or derived, physical collections, models, simulations, curriculum materials, software etc.How will data be collected?Are there tools or software needed to create/process/visualize the data?SciDaC Tip: Describe in general any descriptive or analytical statistics that will run on the data. ORCould include data generated by computer, data collected from sensors or instruments, images, audio files, video files, reports, surveys, patient records, and or other.Qualityassurance & quality control measuresSecurity: version control, backing upWho will be responsible for data management during/after project?
Data documentation (metadata) explains:How data was createdWhat the data meanWhat the content & structure isWhat manipulations have taken placeIt ensures data understanding in the long-termData documentation includes information on:The ProjectData Collection MethodsStructure of the data filesData sources usedAt the data-level, information on:Labels and descriptions for variables & recordsCodes and classificationsDerived data algorithmsWhat metadata are needed to make the data meaningful?How will you create or capture these metadata? Why have you chosen particular standards and approaches for metadata?
Embargo period:Does the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?Are you under any obligationto share data? What is the process for gaining access to the data? How should your data be cited?Question we have been asking…. Who owns the data you collect during your research grant?See SciDaC guidelines….. Data Rights and Responsibilities GuidanceOn our web pageCover copyright, licensing if required.Who owns the copyrightand/or intellectual property?Will youretain rights before opening data to wider use? How long?Embargoperiodsfor political/commercial/patent reasons? Ethicaland privacy issues?If you are planning on restricting access, use or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions. Who are the foreseeable data users?
UVA policy states that “data will be preserved for a minimum of five years upon completion of the project” – explain if you’ll be preserving the data longer than five yearsPolicy: Laboratory Notebook and Recordkeeping https://policy.itc.virginia.edu/policy/policydisplay?id=RES-002Places to archive your data:The University of Virginia is developing an institutional repository (Libra), which will serve as an ideal long-term storage facility for digital research data. Deposit in discipline specific repositoryDeposit in Institutional RepositoryMake accessible on online project web page Make accessible on institutional web siteInformally on a peer-to-peer basisSubmitting to a journalWhat datatransformationsneed to occur before preservation?What metadata will be submitted alongside the datasets?Who will be responsible for preparing data for preservation? Who will be the main contact person for the archived data?
With differing guidelines, which one should you use? Guidelines should be followed in this order:First, follow the requirements laid out in the specific solicitation, if any. These can generally be found in a section entitled "Proposal Preparation Instructions." Contact the program officer with any questions. Second, follow the guidelines published by the appropriate NSF directorate and/or division. Not all directorates and divisions have published data management guidelines; check the NSF's page on Dissemination and Sharing of Research Results for updates (1st link in handout) Third, follow the more general guidelines in the Grant Proposal Guide.
The DMPTool was launched in October 2011.Online Data Management Plan creation toolHelps researchers meet requirements of NSF and other U.S. funding agenciesStep-by-step wizard for generating a DMP Open to anyone, even those not affiliated with an institution. Links to institutional resources Directorate information & updates
Helps researchers meet requirements of NSF and other US Funding agencies.Guides researchers thru the process of creating a DMP
Tool is for multi-institutionsIt is available to everyone, even those not affiliated with an institution.Provides additional help for researchers @UVa and Virginia Tech.