Supporting Research Data Management at the University of Stirling
1. Because good research needs good data
Supporting Research Data Management
at the University of Stirling
Graham Pryor and Martin Donnelly
Digital Curation Centre
27 April 2012
This work is licensed under a Creative Commons Attribution 2.5 UK: Scotland License
2. The Digital Curation Centre is
• a consortium comprising units from the Universities of
Bath (UKOLN), Edinburgh (DCC Centre) and Glasgow
(HATII)
• launched 1st March 2004 as a national centre for
solving challenges in digital curation that could not be
tackled by any single institution or discipline
• funded by JISC
• with additional HEFCE funding from 2011 for
– the provision of support to national cloud services
– targeted institutional development
3. The DCC Mission
Helping to build capacity,
capability and skills in data
management and curation
across the UK’s higher
education research community
– DCC Phase 3
Business Plan
4. DCC institutional stakeholders
University managers
Researchers
Research support staff with a role to play in data management, particularly those from
• University libraries
• IT services
• The research and innovation office
• Digital repositories
5. Why manage research data?
The impact of e-Science and the global network
• “Research data is a form of infrastructure, the basis
for data intensive research across many domains” –
EC Riding the Wave report, 2010
• “Funders expect research to be international in
scope. A third of all articles published are
internationally collaborative” – Royal Society, 2011
The governmental and funder imperative
• “Publicly-funded research data must be made
available for secondary scientific research” – ESRC
research data policy
6. Why manage research data?
The researcher incentive
• “By making their data available via licensed
platforms researchers stand to improve their
status as researchers through the mandatory
citing and attribution of their original work”
– Mark Hahnel, FigShare, IDCC 2011
The same demanding, sometimes competing
community of perspectives that the Digital Curation
Centre was created to unravel…
8. Where is the data in research?
The six datacentric phases of the research lifecycle
13. “For science to effectively function,
and for society to reap the full
benefits from scientific endeavours,
it is crucial that science data be
made open”
http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/publications.html#november-2009
14. Open to all? Case studies of openness
in research
Choices are made according to context, with
degrees of openness reached according to:
• The kinds of data to be made available
• The stage in the research process
• The groups to whom data will be made
available
• On what terms and conditions it will be
provided
Default position of most:
• YES to protocols, software, analysis tools,
methods and techniques
• NO to making research data content freely
available to everyone
After all, where is the incentive?
Angus Whyte, RIN/NESTA, 2010
15. “While many researchers are positive about sharing data in principle, they are almost universally reluctant in practice. ..... using these data to publish results before anyone else is the primary way of gaining prestige in nearly all disciplines.”
“Data sharing was more readily discussed by early career researchers.”
INCREMENTAL Project
16. Rules and regulations…
Compliance
Data Protection Act 1998
• Rights, Exemptions, Enforcement
Freedom of Information Act 2000
• Climategate, Tree Rings, Tobacco and…(what’s next?)
Computer Misuse Act 1990
• etc. etc. etc………..
17. Policy
• Public good
• Preservation
• Discovery
• Confidentiality
• First use
• Recognition
• Public funding
18. RCUK Policy and Code of Conduct on the
Governance of Good Research Conduct (updated Oct 2011)
UNACCEPTABLE RESEARCH CONDUCT includes mismanagement or inadequate preservation of data and/or primary materials, including failure to:
• keep clear and accurate records of the research procedures followed and the results obtained, including interim results;
• hold records securely in paper or electronic form;
• make relevant primary data and research evidence accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 yrs (in some cases 20 yrs or longer);
• manage data according to the research funder’s data policy and all relevant legislation;
• wherever possible, deposit data permanently within a national collection.
Responsibility for proper management and preservation of data and primary
materials is shared between the researcher and the research organisation.
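As a rough illustration of the retention periods quoted on this slide, the sketch below works out the earliest date on which a dataset would leave its retention window. The function name `retention_end` and the example dates are hypothetical and not part of the RCUK policy; the 10-year default is the "normal" period quoted above, and some cases need 20 years or longer.

```python
from datetime import date


def retention_end(completion: date, years: int = 10) -> date:
    """Return the date `years` after the completion of the research.

    Hypothetical helper for illustration only: the default of 10 years is
    the period the slide says data should normally be preserved for.
    """
    try:
        return completion.replace(year=completion.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year,
        # so fall back to 28 February of the target year.
        return completion.replace(year=completion.year + years, day=28)


# Example: a project finishing on the date of this presentation.
normal_case = retention_end(date(2012, 4, 27))
longer_case = retention_end(date(2012, 4, 27), years=20)
print(normal_case.isoformat(), longer_case.isoformat())
```

A calculation like this only gives a minimum; the funder's own data policy may set a longer period.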
20. EPSRC’s nine expectations and
a roadmap - implications for HEIs
http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx
22. Regulation, regulation…
…….addressing where
European copyright and
database law poses flaws and
obstacles to the access to
research data
Intellectual Property Rights and Digital Preservation
21.11.2011 at the Clifton Hill House, Bristol University
“a poor fit between technology, processes and
regulations constrains preservation actions and
significantly inhibits the benefits which long-term
access ought to deliver”
24. Management – infrastructure and
data storage challenges...
Scalable
Cost-effective (rent on-demand)
Secure (privacy and IPR)
Robust and resilient
Low entry barrier / ease-of-use
Has data-handling / transfer /
analysis capability
Cloud services?
The case for cloud computing in genome
informatics. Lincoln D Stein, May 2010
25. http://www.flickr.com/photos/mattimattila/3003324844/
“Departments don’t have guidelines or
norms for personal back-up and researcher
procedure, knowledge and diligence varies
tremendously. Many have experienced
moderate to catastrophic data loss”
Incremental Project Report, June 2010
29. DCC Institutional Support:
Tools and Services
Martin Donnelly
Digital Curation Centre
University of Edinburgh
University of Stirling
27 April 2012
30. Institutional Engagements
With funding from HEFCE we’re:
• Working intensively with 18 HEIs to increase RDM capability
– 60 days of effort per HEI drawn from a mix of DCC staff
– Deploy DCC & external tools, approaches & best practice
• Support varies based on what each institution wants/needs
• Lessons & examples to be shared with the community
www.dcc.ac.uk/community/institutional-engagements
31. Some current IE activities
• Assessing needs
• Piloting tools, e.g. DataFlow
• RDM roadmaps
• Policy development
• Policy implementation
32. Support offered by the DCC
• DCC support team
• Assess needs
• DAF & CARDIO assessments
• Workflow assessment
• Institutional data catalogues
• Pilot RDM tools
• Develop support and services
• Guidance and training
• RDM policy development
• Customised Data Management Plans
• Make the case
• Advocacy to senior management
• …and support policy implementation
33. DATA MANAGEMENT STRATEGY
(Research and Admin)
Five components:
• Policy
• Advocacy
• Planning
• Tools
• Training
35. Your Data as Assets: DAF
• What are the characteristics of
research data assets?
– Number?
– Scale?
– Complexity?
– Dependencies?
– Liabilities?
• Why do researchers act the way they
do with respect to data?
• What do they need to do research?
36. IN BRIEF
The Data Asset Framework provides a methodology
and online tool to identify research data assets and
find out how they are being managed. This
information will enable institutions to develop a data
strategy so their assets are preserved and remain
accessible in the long term. It is usually applied at
research group / department level to ensure the
scope is manageable.
URL: http://www.data-audit.eu
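To make the idea of an asset inventory concrete, here is a minimal sketch of how a research group might record its data assets and summarise the characteristics listed on the previous slide (number, scale, dependencies, liabilities). It is not the DAF online tool or its data model: the `DataAsset` fields, the `summarise` function and the sample records are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """One entry in a hypothetical research data inventory."""
    name: str
    owner: str
    size_gb: float
    depends_on: list[str] = field(default_factory=list)  # software, instruments, other datasets
    backed_up: bool = False


def summarise(assets: list[DataAsset]) -> dict:
    """Summarise number, scale, dependencies and an obvious liability."""
    return {
        "number": len(assets),
        "scale_gb": sum(a.size_gb for a in assets),
        "with_dependencies": sum(1 for a in assets if a.depends_on),
        "not_backed_up": [a.name for a in assets if not a.backed_up],
    }


# Invented sample records for a single research group.
inventory = [
    DataAsset("Survey responses 2011", "A. Researcher", 0.5, backed_up=True),
    DataAsset("Sensor readings", "B. Researcher", 120.0, depends_on=["logger firmware"]),
    DataAsset("Imaging archive", "C. Researcher", 2000.0, backed_up=True),
]
summary = summarise(inventory)
print(summary)
```

Keeping the scope to one group or department, as the slide suggests, keeps an inventory like this small enough to compile by interview.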
37. Data Management Planning:
DMP Online
• A growing requirement from
funders, publishers and HEIs,
in the UK and internationally
• Supportive of good research
practice, according to RCUK
• A cross-cutting activity
involving multiple stakeholder
types (researchers, librarians,
IT managers, support staff)
38. IN BRIEF
DMP Online is the DCC's web-based data
management planning tool. It allows you to build and
edit DMPs according to the requirements of the
major UK funders.
The tool also contains helpful guidance and links for
researchers and other data professionals. The
structure of the tool is based on the DCC’s Checklist
for a Data Management Plan.
URL: http://www.dcc.ac.uk/dmponline
40. Capacity Assessment and
Building: CARDIO
• How well does an institution (or
department, School, etc) manage its data?
• Depends on:
– Finances
– Technology
– Policy management
– Organisational will
• Demands acknowledgement of many
perspectives
41. IN BRIEF
CARDIO is an online tool which helps departments or research
groups to identify and communicate their current data
management capabilities, and subsequently identify
coordinated pathways for future enhancement via a
dedicated knowledge base.
CARDIO emphasises a collaborative, consensus-
driven approach, and enables benchmarking with
other groups and institutions.
URL: http://cardio.dcc.ac.uk/
43. Risk Management: DRAMBORA
• A variety of risk factors, both internal
and external, affect the management of
digital objects such as research data
• Risks can be tangible (fire/flood) or
intangible (accidental data loss leading
to reputational impact)
• They may exist in isolation, or lead to
other risks if not adequately managed
44. IN BRIEF
DRAMBORA is an audit methodology and tool for
identifying and planning for the management of risks
which may threaten the availability and/or usability of
content in a digital repository or archive.
URL: http://www.repositoryaudit.eu
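The sketch below shows one common way to rank risks of the kind described above: a simple register in which each risk gets a probability and an impact rating, and the product is used to prioritise. This is a generic approach assumed for illustration, not DRAMBORA's own scoring scheme; the 1 to 5 scales, the `Risk` class and the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One entry in a hypothetical risk register."""
    description: str
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int       # assumed scale: 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        # Simple probability x impact product, used only for ranking.
        return self.probability * self.impact


def prioritise(risks: list[Risk]) -> list[Risk]:
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Invented examples covering tangible and intangible risks.
register = [
    Risk("Fire or flood in the server room", probability=1, impact=5),
    Risk("Accidental data loss with no back-up", probability=4, impact=4),
    Risk("Undocumented data after staff leave", probability=3, impact=3),
]
ranked = prioritise(register)
for risk in ranked:
    print(risk.score, risk.description)
```

A flat product like this does not capture risks that lead to other risks; that relationship would need to be recorded separately.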
45. DCC Services
• Policy
• Strategy
• Training
• Other services…
46. Policy (i)
The DCC has a number of guidance resources related to
research data policy. We can guide institutions on their
requirements to manage/share data, and offer practical
steps to help them develop data policies by:
- Providing templates and examples to demonstrate
what aspects could be incorporated into a data policy;
- Coordinating / contributing to meetings of relevant
stakeholders to ensure all activities and perspectives are
addressed;
- Reviewing and feeding back on draft policies;
- Assisting with communications to launch and
implement the policy.
47. Policy (ii)
Benefits of developing a data policy:
- Compliance with funder guidelines, e.g. the EPSRC
expectation that HEIs have an RDM roadmap in place by
May 2012, and be fully compliant by May 2015;
- Assuring the good conduct of research in line with
Research Integrity guidelines (see RCUK & UKRIO docs);
- Clarity for researchers and demonstrable institutional
commitment to RDM;
- The prestige of joining a small but growing group of
leading institutions with a data policy:
http://www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies
48. Strategy (i)
We offer a half-day workshop in which key stakeholders
from an institution (e.g. librarians, senior IT staff, research
administration, repository staff, researchers, etc) convene
to discuss and develop an institutional strategy for RDM.
Benefits:
- Coherence across service providers and agreed
direction for RDM services;
- Ability to reference strategy / commitment to RDM (the
University of Oxford policy may be a useful example of
this - http://www.admin.ox.ac.uk/rdm);
- A move towards more efficient management of data.
49. Strategy (ii)
Through practical breakout sessions, senior DCC staff can
lead and mediate discussion to help the institution
determine its priorities and define practical next steps.
These might include the development of infrastructure (e.g.
data repositories), new services (e.g. DMP support), policy
development, improved guidance or data management
training provision.
Suggested actions will depend on gaps/areas for
improvement as perceived by the institution.
50. Training (i)
We offer a variety of training courses:
- DC101 introduction to data management
- Tools of the Trade courses which give practical
overviews and hands-on exercises using DCC tools
- Train-the-Trainer, which equips information professionals
to teach RDM courses.
We also organise regional data management roadshow
events which can incorporate a training element.
Generic training materials are available online, and
hardcopy packs can be produced.
51. Training (ii)
The DCC can:
- Run courses, tailoring content to institutional needs;
- Assist in the development of online learning materials
(screencasts, audio-synced slides);
- Develop resources such as guidance documents, case
studies and manuals.
Key benefits of training provision are:
- Improved data management capacity;
- The opportunity to profile and raise awareness of
institutional support services.
52. Other services...
• CARDIO: Used at research group or department level to assess activity and data management infrastructure and contribute to an institution-wide view
• Data Asset Framework: DAF is a structured mechanism used to identify what data exists and understand how research data are being managed and shared
• Customised DMP: We can work with you to develop an institution-specific instance of DMP Online for developing data management plans that fit funder requirements before and after an award of grant
• Policy development: We can assist in the development of institutional policy
• Workflow assessment: Using tested methodologies we can analyse current research data workflows
• Training: We can train people in the use of many of the above tools and in generic skills such as data quality assessment
• Costing: We can assist with the development of costing and pricing for data management services
• Risk management: Working with you to identify risks in current or planned research data management practice, we will make recommendations on mitigation and the elimination of those risks
• Institutional data catalogues: We can recommend options for exposing metadata about your research data via CRIS systems, repositories, or a mix of these
53. Recap: support offered by the DCC
• DCC support team
• Assess needs
• DAF & CARDIO assessments
• Workflow assessment
• Institutional data catalogues
• Pilot RDM tools
• Develop support and services
• Guidance and training
• RDM policy development
• Customised Data Management Plans
• Make the case
• Advocacy with senior management
• …and support policy implementation
54. Practicalities
• University Modernisation Fund provides
resource for 18 “institutional engagements”
between DCC and HEIs
• Up to 60 days of effort available per
institution, between now and March 2013
• Institution agrees a schedule of work with
the DCC, and each assigns a primary
contact / programme manager