Presentation for ITCE Spatial Camp at the University of Utah. Discusses the importance of research data management, its challenges, and some best practices. Includes a slide of tools and resources for learning more. Some slides and information are borrowed from Carly Strasser's presentation on the same topic, as well as from the curriculum of the Reproducible Science Workshop presented at iDigBio in 2015.
6. What Do I Do?
• Data Management Plans (DMPs)
• Courses
• Consultations
• Research Projects
• DataONE, RDA, eScience Institute
• Institutional Data Repository (DRUW)
27. “The best thing to do with your data will be thought of by someone else.”
“We need open data because we don’t just want to use a car we want to poke around in the engine, see how it works and then rebuild it.”
~ Rufus Pollock
Founder and President of Open Knowledge Foundation (www.okfn.org)
29. Wicherts JM, Bakker M, Molenaar D (2011) Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE 6(11): e26828. doi:10.1371/journal.pone.0026828
31. Data planning is more efficient than data forensics.
DATA MANAGEMENT PLANNING
• What will be collected
• Methods
• Standards
• Sharing/access
• Long-term storage
34. AVOID
• spaces
• punctuation
• special characters
• case sensitivity
20130503_DOEProject_DesignDocument_Smith_v2-01.docx
20130709_DOEProject_MasterData_Jones_v1-00.xlsx
20130825_DOEProject_Ex1Test1_Data_Gonzalez_v3-03.xlsx
20130825_DOEProject_Ex1Test1_Documentation_Gonzalez_v3-03.xlsx
20131002_DOEProject_Ex1Test2_Data_Gonzalez_v1-01.xlsx
20141023_DOEProject_ProjectMeetingNotes_Kramer_v1-00.docx
Eaffinis_nanaimo_2010_counts.xls
• Eaffinis: Study organism
• nanaimo: Site name
• 2010: Year
• counts: What was measured
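The naming pattern in the dated examples above can be checked automatically. The following is a minimal sketch, assuming the pattern is date, project, description, author, and a two-part version number; the regular expression and the function names `build_name`, `is_valid`, and `case_collisions` are hypothetical and not part of the original slides.

```python
import re
from datetime import date

# Hypothetical pattern inferred from the dated example names:
# YYYYMMDD_Project_Description_Author_vMAJOR-MINOR.ext
PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<project>[A-Za-z0-9]+)_(?P<description>[A-Za-z0-9_]+)"
    r"_(?P<author>[A-Za-z]+)_v(?P<major>\d+)-(?P<minor>\d{2})\.(?P<ext>[A-Za-z0-9]+)$"
)


def build_name(day, project, description, author, major, minor, ext):
    """Assemble a file name with no spaces, punctuation, or special characters."""
    return f"{day:%Y%m%d}_{project}_{description}_{author}_v{major}-{minor:02d}.{ext}"


def is_valid(name):
    """Return True if the name follows the assumed pattern."""
    return PATTERN.match(name) is not None


def case_collisions(names):
    """Return pairs of names that differ only by letter case."""
    seen = {}
    clashes = []
    for name in names:
        key = name.lower()
        if key in seen and seen[key] != name:
            clashes.append((seen[key], name))
        seen.setdefault(key, name)
    return clashes


name = build_name(date(2013, 5, 3), "DOEProject", "DesignDocument", "Smith", 2, 1, "docx")
print(name, is_valid(name))
```

A check like this could run before files are shared or archived, so that names with spaces or names that differ only by case are caught early.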
36. Noble, William S. (2009) "A Quick Guide to Organizing Computational Biology Projects." PLoS Computational Biology 5(7). doi:10.1371/journal.pcbi.1000424
• Pick a method that works for you and stick to it
• DOCUMENT IT!
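One way to "pick a method and document it" is to create the folder layout and its description in the same step. This is a sketch only: the folder names and their purposes in `LAYOUT` are illustrative assumptions, not a layout prescribed by the slides, and `create_project` is a hypothetical helper.

```python
from pathlib import Path

# Hypothetical layout; folder names and purposes are assumptions for illustration.
LAYOUT = {
    "data": "Raw data, never edited by hand",
    "src": "Analysis scripts",
    "results": "Generated output that can be regenerated from data and src",
    "doc": "Notes, manuscripts, and documentation",
}


def create_project(root):
    """Create the folders and write a README that documents the layout."""
    root = Path(root)
    for folder in LAYOUT:
        (root / folder).mkdir(parents=True, exist_ok=True)
    lines = ["Project layout", ""]
    lines += [f"{folder}/  {purpose}" for folder, purpose in LAYOUT.items()]
    readme = root / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Calling `create_project("my_project")` would create the four folders and a `README.txt` listing what each one is for, so the documentation exists from the first day of the project.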
38. Digital context
• Name of the data set
• The name(s) of the data file(s) in the data set
• Date the data set was last modified
• Example data file records for each data type file
• Pertinent companion files
• List of related or ancillary data sets
• Software (including version number) used to prepare/read the data set
• Data processing that was performed
Personnel & stakeholders
• Who collected
• Who to contact with questions
• Funders
Scientific context
• Scientific reason why the data were collected
• What data were collected
• What instruments (including model & serial number) were used
• Environmental conditions during collection
• Temporal & spatial resolution
• Standards or calibrations used
Information about parameters
• How each was measured or produced
• Units of measure
• Format used in the data set
• Precision & accuracy if known
Information about data
• Definitions of codes used
• Quality assurance & control measures
• Known problems that limit data use (e.g. uncertainty, sampling problems)
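The checklist above can be kept as a machine-readable record stored next to the data. The sketch below assumes a small subset of the listed items; the field names in `REQUIRED_FIELDS` and the helpers `blank_record` and `missing_fields` are hypothetical, and a real project would likely follow a community metadata standard instead.

```python
import json

# Hypothetical field names covering a subset of the checklist items.
REQUIRED_FIELDS = [
    "dataset_name",      # Digital context
    "file_names",
    "last_modified",
    "software",
    "collected_by",      # Personnel & stakeholders
    "contact",
    "funders",
    "purpose",           # Scientific context
    "instruments",
    "units",             # Information about parameters
    "code_definitions",  # Information about data
    "known_problems",
]


def blank_record():
    """Return a metadata record with every field present but empty."""
    return {field: "" for field in REQUIRED_FIELDS}


def missing_fields(record):
    """List the fields that are absent or still empty."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = blank_record()
record["dataset_name"] = "Eaffinis_nanaimo_2010_counts"
record["file_names"] = ["Eaffinis_nanaimo_2010_counts.xls"]
print(json.dumps(record, indent=2))
print("Still to document:", missing_fields(record))
```

Writing the record as JSON keeps it readable by both people and software, and the list of missing fields works as a to-do list while the data set is being prepared.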
39. WORKFLOW
Simple: Flow chart
• Inputs: Temperature data, Salinity data
• Data import into Excel → Data in spreadsheet
• Quality control & data cleaning → “Clean” T & S data
• Analysis: mean, SD → Summary statistics
• Graph production
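The same workflow can also be recorded as a script, which documents each step in a repeatable form. This is a minimal sketch that swaps the Excel steps for plain Python; the function names and the plausibility ranges used for quality control are assumptions for illustration, and graph production is left out.

```python
from statistics import mean, stdev


def clean(values, low, high):
    """Quality control: drop missing readings and values outside [low, high]."""
    return [v for v in values if v is not None and low <= v <= high]


def summarize(values):
    """Analysis: count, mean, and sample standard deviation."""
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}


def run_workflow(temperature, salinity):
    """Run cleaning and summary statistics on both input series."""
    clean_t = clean(temperature, low=-2.0, high=40.0)  # assumed plausible range
    clean_s = clean(salinity, low=0.0, high=42.0)      # assumed plausible range
    return {"temperature": summarize(clean_t), "salinity": summarize(clean_s)}


# Illustrative readings only, not real measurements.
summary = run_workflow(
    temperature=[10.0, 12.0, None, 999.0, 14.0],
    salinity=[30.0, 32.0, -5.0, 34.0],
)
print(summary)
```

Because every step is a named function, the script itself serves as the workflow documentation, and rerunning it on corrected data repeats the same cleaning and analysis.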
43. BACKING UP: 3 places, 3 ways
• Original
• Near
• Far
What software?
What hardware?
What personnel?
How often?
Set up reminders!
Test system
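Testing the system can be as simple as confirming that every copy matches the original. The sketch below copies a file to two backup folders and compares SHA-256 checksums; the function names and the idea of passing the "near" and "far" locations as folder paths are assumptions for illustration, and real off-site storage would usually involve a separate tool or service.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original, destinations):
    """Copy the original into each destination folder and verify each copy.

    Returns a dict mapping each copy's path to True if its checksum
    matches the original.
    """
    original = Path(original)
    expected = sha256(original)
    report = {}
    for folder in destinations:
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)
        copy = folder / original.name
        shutil.copy2(original, copy)
        report[str(copy)] = sha256(copy) == expected
    return report
```

A call such as `back_up("MasterData.xlsx", ["/path/to/near", "/path/to/far"])`, with hypothetical paths, would return a report showing whether each copy is intact. Running it on a schedule covers both "How often?" and "Test system".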