This document is a learning material for a workshop on research data management. It introduces key concepts in research data management (RDM) such as defining RDM and digital curation. It discusses the research process and challenges around managing large amounts of diverse research data. It also covers drivers for taking RDM seriously such as funder mandates, benefits of open data, and the strategic context of RDM as an emerging field. Learners are guided through reflection activities to relate RDM concepts to their own research interests and roles as information professionals.
1. The basics
May-15
Learning material produced by RDMRose
http://www.sheffield.ac.uk/is/research/projects/rdmrose
Research Data Management
Workshop 1.1
2. Learning outcomes
At the end of Workshop 1 you will be able to:
• Discuss the definition of ‘Research Data Management’ and ‘Digital
curation’
• Outline the research process and reflect on the nature of research
data
• Compare different models of the data lifecycle
• Describe the content of a data management plan (DMP)
• Describe the strategic context within which RDM has appeared on
the agenda and the key drivers and issues for researchers
• Reflect on the potential of the area for your interests/career
• Know where to find out more
3. Session 1.1 overview
• What is research like?
• What is data?
• The RDM challenge
• What is research data management?
4. WHAT IS RESEARCH LIKE?
5. Activity 1
• What is your understanding of the nature of
“research”?
• What is your experience with it?
6. The research cycle (RIN, 2010)
• Conceptualising and networking
• Proposal writing and research design
• Collecting and analysing data
• Infrastructuring
• Documenting and describing
• Publishing and reporting
• Engaging and translating
7. Features of research
• Cyclic
• Iterative
• Non-linear
• Complex through collaboration
– Large scale
– Remote collaborators
9. Activity 2
• Name some examples of research data!
10. A list we came up with earlier...
• Weather measurements
• Photographs
• Results from experiments
• Government records
• GIS data
• Simulation data
• Log data
• Field notes
• Software
• Images (e.g. brain scans)
• Quantitative data (e.g. household survey data)
• Historical documents
• Moving images
• Physical objects, such as bones or blood samples
• Digitised photos / born-digital photos
• Social media data, e.g. tweets
• Metadata
11. What is data?
• Some researchers use other terms, e.g. “sources”
• Complex: data can be produced from other data
• “Volume, Variety, Velocity”
• Fragile
• Which of these is the data: the sound files of interviews, the transcripts, summaries of interviews, or notes on interviews?
12. Definitions of data
• “The data, records, files or other evidence, irrespective
of their content or form (eg in print, digital, physical or
other forms), that comprise a research project’s
observations findings or outcomes, including primary
materials and analysed data” (Monash University,
2010)
• “Qualitative or quantitative statements or numbers
that are (or assumed to be) factual. Data may be raw or
primary data (eg direct from measurement), or
derivative of primary data, but are not yet the product
of analysis or interpretation other than calculation”
(Royal Society, 2012: 12)
14. Imagine
If you went round researchers’ offices asking them about their data:
• How much do they have?
• How do they store it and back it up?
• Can they always find it again?
• Do they share it?
• Who owns it?
15. Duffy (2013) on scale of the data issue at University of Birmingham
• 3,000 items in institutional repository
• 50,000 items in special collections
• 75,000 publications for REF
• 2,700,000 items in library
• 700,000,000 folders in top 100 accounts
• Perhaps 1,000,000,000 folders for the whole
university
16. Complexity of information practices
• Information flow maps for life science research
(RIN, 2009) e.g. in neuroscience illustrate
– Multiple data sources, of different types
• Visual images, quantitative data, secondary data
– Storage devices
– Multiple analytic tools
• Some requiring grid power
– Supporting complex scholarly communication
• Different communities do things differently, e.g. in terms of file types and tools used
17. A short (incomplete) history
of research data policy in UK
• National data centres have existed for a number of decades
• 1990s Growing interest in “digital curation” (Higgins, 2011)
• Late 90s cyber-science, e-science, e-research
• 2004 DCC founded
• 2004, 2007 OECD “principles and guidelines”
• 2005 UK research funders’ first phase of policy
• 2009 UKRDS not funded; first JISC MRD programme
• 2010 UK general election
• 2011 new RCUK joint statement and EPSRC policy framework and
expectations
– Harmonisation, shift from curation to sharing, more detail in policy (Jones,
2012)
• Institutional policies; second JISC MRD programme
• 2012 Royal Society’s “Science as an open enterprise”
18. Mandating good RDM
• Funders’ mandates
– Research Councils UK Common Principles on Data Policy:
http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
– EPSRC principles and expectations:
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/default.aspx
19. Activity 3
• Read your or another institution’s research
data policy:
– What are the two most important points you pick
up from this document?
– According to this policy, what are the incentives to
take Research Data Management seriously?
• You can find research data policies at
http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies
20. Science as an open enterprise
• Data is not a private
preserve
• Credit for data
communication – an open
data culture
• Common standards
• Scientific journals require
data communication
• More data scientists
• New software tools
• “legitimate boundaries”:
– Commercial value
– Privacy
– Safety
– Security
“Open inquiry is at the heart of the scientific enterprise”
21. What should “data communication” be like?
• Accessible – can be found
• Intelligible – must be understandable to other
researchers
• Assessable – potential to be evaluated
• Usable – should be in a form suitable for reuse
22. What is data sharing?
• With future self
• With collaborators
• With collaborators beyond the institution
• By request
• Linked to a publication
• Open data in a repository
• Link to “open access” agenda?
24. WHAT IS RESEARCH DATA MANAGEMENT?
25. RDM: definition
• “Research data management concerns the
organisation of data, from its entry to the
research cycle through to the dissemination
and archiving of valuable results.” (Whyte &
Tedds, 2011)
26. Digital curation
• “Digital curation, broadly interpreted, is about maintaining and
adding value to a trusted body of digital information for current and
future use.” (DCC, n.d.: 6)
– Managing digital material from the point it is created
– Adding value so that it can be used and re-used
– Includes the destruction of data
– Beyond archiving and preservation
• “Digital curation is concerned with actively managing data for as
long as it continues to be of scholarly, scientific, research and/or
administrative interest, with the aim of supporting reproducibility of
results, reuse of and adding value to that data, managing it from its
point of creation until it is determined not to be useful, and
ensuring its long-term accessibility and preservation, authenticity
and integrity.”
27. Practical RDM
• Store data securely
• Back data up
• Use filename conventions and version control
– objective
– meaningful
– concise
– standardised
• Dispose of data
• Understand legal issues (e.g. Data Protection Act,
Freedom of Information Act), copyright and licensing
issues
Data loss stories:
https://code.soundsoftware.ac.uk/projects/sodamat/wiki/Evidence_Promoting_Good_Data_Management
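A filename convention and simple version numbering, as listed above, might be sketched in Python as follows. This is only an illustration: the pattern `<project>_<description>_<YYYYMMDD>_v<NN>.<ext>`, the function names and the example values are all assumptions made for this sketch, not a convention recommended by RDMRose or any funder. A real project should agree its own convention and, for anything beyond simple files, use a proper version control system.

```python
import re
from datetime import date

# Hypothetical convention (an assumption for this sketch):
#   <project>_<description>_<YYYYMMDD>_v<NN>.<ext>
# lower case only, hyphens inside the description, two-digit version.
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"
    r"(?P<description>[a-z0-9-]+)_"
    r"(?P<date>\d{8})_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def build_filename(project, description, created, version, ext):
    """Build a name that is meaningful, concise and standardised."""
    # Replace runs of spaces and punctuation in the description with hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{project.lower()}_{slug}_{created:%Y%m%d}_v{version:02d}.{ext.lower()}"


def is_valid_filename(name):
    """Return True if the name follows the hypothetical convention."""
    return FILENAME_PATTERN.match(name) is not None


def next_version(name):
    """Return the same filename with its version number increased by one."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Filename does not follow the convention: {name}")
    version = int(match.group("version")) + 1
    return (
        f"{match['project']}_{match['description']}_"
        f"{match['date']}_v{version:02d}.{match['ext']}"
    )


if __name__ == "__main__":
    name = build_filename("RDM", "Interview 3 transcript", date(2015, 5, 1), 1, "txt")
    print(name)
    print(is_valid_filename(name))
    print(is_valid_filename("final FINAL v2.doc"))
    print(next_version(name))
```

Putting the date in YYYYMMDD form and zero-padding the version means that files sort chronologically in an ordinary folder listing, which is one reason conventions of this kind are often suggested.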
28. What might you be asked?
• Where to locate data for reuse in research
• How to complete a DMP for a research
proposal
• How to write an ethics proposal to ensure that the research
can produce open research data
• How to cite data
• How to store data in the short or long run
30. Activity 4: Reflection
• Which aspects of support to research are you
most interested in, and why?
• How do they fit into your future role as an
information professional?
32. Images
• Slide 31:
– Jones, S., Pryor, G. & Whyte, A. (2013). ‘How to
Develop Research Data Management Services - a
guide for HEIs’. DCC How-to Guides. Edinburgh:
Digital Curation Centre. Available online:
http://www.dcc.ac.uk/resources/how-guides
33. References
• DCC. (n.d.). DC 101: What is digital curation? Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/webfm_send/437
• Duffy, S. (2013). Managing research data in an open access world. RLUK AGM, April. Retrieved from http://www.rluk.ac.uk/events/rluk-agm-2013-exeter/
• Higgins, S. (2011). Digital Curation: the Emergence of a New Discipline. International Journal of Digital Curation, 6(2), 78-88.
• Jones, S. (2012). Developments in Research Funder Data Policy. International Journal of Digital Curation, 7(1), 114-125.
• Monash University. (2010). Monash University Research Data Policy.
• RIN. (2009). Patterns of information use and exchange: case studies of researchers in the life sciences. London. Retrieved from http://rinarchive.jisc-collections.ac.uk/our-work/using-and-accessing-information-resources/patterns-information-use-and-exchange-case-studie
• RIN. (2010). Open to All? Case Studies of Openness in Research. London. Retrieved from http://rinarchive.jisc-collections.ac.uk/our-work/data-management-and-curation/open-science-case-studies
• The Royal Society. (2012). Science as an open enterprise. Retrieved from https://royalsociety.org/policy/projects/science-public-enterprise/Report/
• Whyte, A., & Tedds, J. (2011). Making the case for Research Data Management. Edinburgh: Digital Curation Centre. Retrieved from http://www.dcc.ac.uk/webfm_send/487