Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Open Data - strategies for research data management & impact of best practices
1. Facilitate Open Science Training for European Research
Open Data: Strategies for Research Data Management,
and impact of best practices?
Martin Donnelly
Digital Curation Centre
University of Edinburgh
NCP Academy Webinar
16 June 2017
2. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
3. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
4. Background (me)
• Academic background in cultural heritage computing…
• Which led me to work in digital preservation…
• Which led to my current involvement in research data
management and the broader topic of Open Science
• I’ve been involved to various degrees in the development
of early DMP resources (DCC Checklist, DMPonline,
DMPTool, book chapter on DMP…)
• Member of the original FOSTER consortium
• Also involved in consultancy, advocacy, events, training
etc, e.g. as external expert reviewer of Horizon 2020
DMPs
5. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source)
= Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
6. Open Access + Open Data = Open Science
• Openness in research is situated within a context of ever
greater transparency, accessibility and accountability
• As Open Access to publications became normal (if not yet
ubiquitous), the scholarly community turned its attention to the
data which underpins the research outputs, and eventually to
consider it a first-class output in its own right. The development
of the OA and research data management (RDM) agendas are
closely linked as part of a broader trend in research, sometimes
termed ‘Open Science’ or ‘Open Research’
• “The European Commission is now moving beyond open access towards
the more inclusive area of open science. Elements of open science will
gradually feed into the shaping of a policy for Responsible Research and
Innovation and will contribute to the realisation of the European
Research Area and the Innovation Union, the two main flagship
initiatives for research and innovation”
http://ec.europa.eu/research/swafs/index.cfm?pg=policy&lib=science
7. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
8. Good practice in RDM
RDM is “the active
management and appraisal
of data over the lifecycle of
scholarly and scientific
interest”
What sorts of activities?
- Planning and describing data-
related work before it takes
place
- Documenting your data so that
others can find and understand
it
- Storing it safely during the
project
- Depositing it in a trusted
archive at the end of the
project
- Linking publications to the
datasets that underpin them
9. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
10. Benefits
• IMPACT and LONGEVITY: Open data (and publications) receive
more citations, over longer periods
• SPEED: The research process becomes faster
• ACCESSIBILITY: Interested third parties can (where
appropriate) access and build upon publicly-funded research
outputs with minimal barriers to access
• EFFICIENCY: Data collection can be funded once, and used
many times for a variety of purposes
• TRANSPARENCY and QUALITY: The evidence that underpins
research can be made open for anyone to scrutinise, and
attempt to replicate findings. This leads to a more robust
scholarly record, and reduces academic fraud for example
• DURABILITY: simply put, fewer important datasets will be lost
11. “In genomics research, a large-scale
analysis of data sharing shows that
studies that made data available in
repositories received 9% more
citations, when controlling for other
variables; and that whilst self-reuse
citation declines steeply after two
years, reuse by third parties
increases even after six years.”
(Piwowar and Vision, 2013)
Van den Eynden, V. and Bishop, L.
(2014). Incentives and motivations for
sharing research data, a researcher’s
perspective. A Knowledge Exchange
Report,
http://repository.jisc.ac.uk/5662/1/KE
_report-incentives-for-sharing-
researchdata.pdf
Benefits: Impact and Longevity
12. “Data is necessary for
reproducibility of
computational research”
Victoria Stodden, “Innovation and Growth
through Open Access to Scientific Research:
Three Ideas for High-Impact Rule Changes” in
Litan, Robert E. et al. Rules for Growth:
Promoting Innovation and Growth Through Legal
Reform. SSRN Scholarly Paper. Rochester, NY:
Social Science Research Network, February 8,
2011. http://papers.ssrn.com/abstract=1757982.
Benefits: Quality
13. Baker, M. (2016)
“1,500 scientists
lift the lid on
reproducibility”,
Nature,
533:7604,
http://www.nat
ure.com/news/1
-500-scientists-
lift-the-lid-on-
reproducibility-
1.19970
14. “Conservatively, we estimate that the value of data in
Australia’s public research to be at least $1.9 billion
and possibly up to $6 billion a year at current levels of
expenditure and activity. Research data curation and
sharing might be worth at least $1.8 billion and possibly
up to $5.5 billion a year, of which perhaps $1.4 billion to
$4.9 billion annually is yet to be realized.”
• “Open Research Data”, Report to the Australian National Data Service (ANDS),
November 2014 - John Houghton, Victoria Institute of Strategic Economic
Studies & Nicholas Gruen, Lateral Economics
Benefits: Financial
15. J. Manyika et al. "Open data: Unlocking innovation
and performance with liquid information" McKinsey
Global Institute, October 2013
16. “If we are going to wait
five years for data to
be released, the Arctic
is going to be a very
different place.”
Bryn Nelson, Nature, 10 Sept 2009
http://www.nature.com/nature/jour
nal/v461/n7261/index.html
Benefits: Speed
https://www.flickr.com/photos/gsfc/7348953774/
- CC-BY
17. Benefits: Durability
Vines et al. “examined the availability of data from 516 studies between 2 and 22 years
old”
- The odds of a data set being reported as extant fell by 17% per year
- Broken e-mails and obsolete storage devices were the main obstacles to data sharing
- Policies mandating data archiving at publication are clearly needed
“The current system of leaving data with authors means that almost all of it is lost
over time, unavailable for validation of the original results or to use for entirely new
purposes” according to Timothy Vines, one of the researchers. This underscores the
need for intentional management of data from all disciplines and opened our
conversation on potential roles for librarians in this arena. (“80 Percent of Scientific
Data Gone in 20 Years” HNGN, Dec. 20, 2013,
http://www.hngn.com/articles/20083/20131220/80-percent-of-scientific-data-gone-in-
20-years.htm.)
Vines et al., The Availability of Research Data Declines Rapidly with Article Age,
Current Biology (2014), http://dx.doi.org/10.1016/j.cub.2013.11.014
18. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
19. Risks of getting this wrong
• Legal – sensitive data is protected by law (and contracts)
and needs to be protected
• Financial – non-compliance with funder policies can lead
to reduced access to income streams
• Scientific – potential discoveries may be hidden away in
drawers, on USB
• Opportunity cost – reduced visibility for research > lost
opportunities for collaboration
• Quality – the scholarly record becomes less robust
• Reputational – responsible data management is
increasingly considered a core element of good scholarly
practice in the 21st century
20. Growing momentum and ubiquity…
Data management
is a part of good
research practice.
- RCUK Policy and Code of
Conduct on the
Governance of Good
Research Conduct
21. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
22. Step 1. Be clear about who is involved
• RDM is a hybrid activity, involving multiple stakeholder groups…
• The researchers themselves
• Research support personnel
• Partners based in other institutions, funders, data centres, commercial
partners, etc
• No single person does everything, and it makes no sense to duplicate
effort or reinvent wheels
• Data Management Planning (DMP) underpins and pulls together
different strands of data management activities. DMP is the process
of planning, describing and communicating the activities carried
out during the research lifecycle in order to…
• Keep sensitive data safe
• Maximise data’s re-use potential
• Support longer-term preservation
• Data Management Plans are a means of communication, with
contemporaries and future re-users alike
23. Step 2. Write things down
• In a data management plan / record
• In metadata to describe the data and help others to
understand it
• In workflows and README files
• In version management
• In justifying decisions re. access, embargo, selection
and appraisal… the list can be very long…
Communication is crucial!
24. Step 3. Don’t try to do everything yourself
• See Step 1 ;)
25. RDM / Open Data in practice: key points
1. Understand your funder’s policies (and perhaps national policy
initiatives – see recent SPARC-Europe reports)
2. Create a data management plan (e.g. with DMPonline)
3. Decide which data to preserve (e.g. using the DCC How-To
guide and checklist, “Five Steps to Decide what Data to Keep”)
4. Identify a long-term home for your data (e.g. via re3data.org)
5. Link your data to your publications with a persistent identifier
(e.g. via DataCite)
• N.B. Many archives, including Zenodo, will do this for you
6. Investigate EU infrastructure services and resources
26. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
27. A few do’s and don’ts for RDM
DO DON’T
Have a plan for your data Make it up as you go along
Keep backups. Make this easy with automated
syncing services like Dropbox, provided your
data isn’t too sensitive
Carry the only copy around on a memory
card, your laptop, your phone, etc
Describe your data as you collect it. This
makes it possible for others to interpret it,
and for you to do the same a few years down
the line
Leave this till the end. The quality of
metadata decreases with time, and the
best metadata is created at the moment of
data capture
Save your work in open file formats, where
possible, and use accepted metadata
standards to enable like-with-like comparison
Invent new ‘standards’ where community
norms already exist
Deposit your data in a data centre or
repository, and link it to your publications
Be afraid to ask for help. This will exist
both within your institution, and via
national / European support organisations
28. Rules of thumb
• Without intervention, data + time = no data
• See Vines, above
• Prioritise: could anyone die or go to jail?
• Legal issues (e.g. protecting vulnerable subjects) are the most
important
• Storage is not the same as management
• Think of data as plants and the servers as a greenhouse
• The plants still need to be fed, watered, pruned, etc… and
sometimes disposed of
• Management is not the same as sharing
• Not all data should be shared
• Approach: “As open as possible, as closed as necessary”
29. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
30. • Phase 1 (2014-2016): Spread
the Seeds of Open Science and
Open Access
• Creation of Open Science
Taxonomy
• 2000+ training materials,
categorized in the FOSTER
Portal
• More than 100 f2f training
events in 28 countries and 25
online courses, totalling more
than 6300 participants
FacilitateOpenScienceTrainingforEuropeanResearch
The project
http://fosteropenscience.eu
31. • Phase 2 (2017-2019): Let the Flowers of Open Science Bloom
• Focus on:
• Training for the practical implementation of Open Science (face to face
and online) including RDM and Open Data
• Developing intermediate/advanced level/discipline-specific training
resources in collaboration with three disciplinary communities (and
related RIs): Life Sciences (ELIXIR), Social Sciences (CESSDA) and
Humanities (DARIAH)
• Update the FOSTER Portal to support moderated learning, badges and
gamification
• In concrete terms:
• 150 new training resources
• Over 50 training events (outcome-oriented, providing participants with
tangible skills) and 20 e-learning courses
• Multi-module Open Science Toolkit
• Trainers Network, Open Science Bootcamp, Open Science Training
Handbook, and more…
FacilitateOpenScienceTrainingforEuropeanResearch
The project
http://fosteropenscience.eu
32. Overview
1. Background
2. Context: Open Access + Open Data (+ Open Source) =
Open Science (or Open Research)
3. What is good RDM practice?
4. What are the benefits of good RDM?
5. What are the risks of poor RDM?
6. A step by step approach
7. Do’s and don’ts / Rules of thumb
8. About the FOSTER project
9. About the DCC / contact details
33. The Digital Curation Centre (DCC)
• UK national centre of expertise in digital preservation
and data management, est. 2004
• Principal audience is the UK higher education sector, but
we increasingly work further afield (continental Europe,
North America, South Africa, Asia…)
• Provide guidance, training, tools (e.g. DMPonline) and
other services on all aspects of research data
management and Open Science
• Tailored consultancy/training
• Organise national and international events and webinars
(International Digital Curation Conference, Research
Data Management Forum)
34. Contact details
• For more information about the
FOSTER project:
• Website: www.fosteropenscience.eu
• Principal investigator: Eloy Rodrigues
(eloy@sdum.uminho.pt)
• General enquiries: Gwen Franck
(gwen.franck@eifl.net)
• Twitter: @fosterscience
• My contact details:
• Email: martin.donnelly@ed.ac.uk
• Twitter: @mkdDCC
• Slideshare:
http://www.slideshare.net/martindo
nnelly
This work is licensed under the
Creative Commons Attribution
2.5 UK: Scotland License.