This presentation was given by Joy Davidson from the Digital Curation Centre at the KAPTUR training event held on Monday 19th November and supported by DCC through the Institutional Engagement project.
1. Because good research needs good data
What support will you need and when will
you need it?
Joy Davidson and Sarah Jones
Digital Curation Centre, Glasgow
joy.davidson@glasgow.ac.uk
sarah.jones@glasgow.ac.uk
University of the Arts, DCC101, London, 19th November 2012
2. Because good research needs good data
Key questions researchers may have:
• when do I need to develop a data management plan?
• what data do I need to keep?
• what are the policies of GU, funders and data repositories?
• where should I store my data?
• what kind of licence will I need for my data?
• how can I make the most impact with my data?
3. Because good research needs good data
When do researchers need to develop a
data management plan?
GU policy
Collaborate with support staff in colleges/schools and central
services to undertake sound research data management as a
fundamental part of good research practice.
As early as possible!
4. Because good research needs good data
What do I need to consider when creating
data?
GU Policy
Undertake research activity to the standards outlined in the
University’s Code of Good Research Practice.
Data should be kept to allow for validation of results.
Minimise risk of damage to reputation for
you and the university.
5. Because good research needs good data
Where should I store my data?
GU Policy
Work with IT Services and College IT teams to identify
storage requirements that may exceed that currently
offered by the institution.
Work with research support staff to identify key research
data outputs that must be retained to enable validation
(and potentially reuse).
• SoundSoftware Repository Service
• DataCite list of data repositories
http://datacite.org/repolist
6. Because good research needs good data
Why should I deposit my data in a
repository?
Data will be safeguarded for longer-term
access.
Data will be more visible for validation, citation
and reuse.
Browsing data centre content can be a good way
for researchers to find potential collaborators and
gaps in research.
7. Because good research needs good data
AHRC expectations
Technical summary and plan to replace technical appendix as of
December 1, 2012
• Section 1: Summary of Digital Outputs and Digital Technologies
• Section 2: Technical Methodology
• 2a: Standards and Formats
• 2b: Hardware and Software
• 2c: Data Acquisition, Processing, Analysis and Use
• Section 3: Technical Support and Relevant Experience
• Section 4: Preservation, Sustainability and Use
• 4a: Preserving Your Data
• 4b: Ensuring Continued Access and Use of Your Digital Outputs
http://www.ahrc.ac.uk/Funding-Opportunities/Research-funding/RFG/Application-guidance/Pages/Technical-Plan.aspx
8. Because good research needs good data
Funders’ expectations
• The British Academy expects data to be offered for
deposit at the AHDS or ESDS “within a reasonable time
after the completion of a project”.
• Wellcome Trust expects that, as an absolute minimum,
researchers make relevant data available to
others on publication of their research. Opportunities
for timely and responsible pre-publication
sharing of data should also be maximised. The Trust will
provide grantholders with additional funding, through
their institutions, to cover open access charges.
• The EU states that all files and documents have to be kept for up
to five years after the end of the project for auditing
purposes. A data management plan is a requirement.
9. Because good research needs good data
Institutional Policies and Strategies
Support staff should familiarise themselves with
institutional codes of research practice, standards
and strategies.
http://www.arts.ac.uk/research/data-management/
10. Because good research needs good data
Issues to consider when advising researchers
Sustainability of repository
Prior to advising on places of data deposit, check the data
repository’s sustainability claims – both for researchers’ data and
for the repository itself.
Exeter Data Archive (EDA) Example
EDA regularly backs up its files according to current best
practice.
In the event of Exeter Data Archive being closed down, the
database will be transferred to another appropriate archive.
11. Because good research needs good data
Issues to consider when advising researchers
Formats
Prior to advising on a place of deposit, check to make sure that
the data repository accepts the format(s) the researcher will be
working with.
Check to see if there are normalisation procedures (ingest,
preservation).
Exeter Data Archive Example
EDA collects, preserves and makes available the University's
research data. The content policy states that EDA will accept all
types of data.
12. Because good research needs good data
Issues to consider when advising researchers
What information do researchers need to provide?
Most repositories have a clearly defined set of minimum
information requirements.
EDA Example
• Title
• Data creator
• Department
• Date of publication
• Dataset description
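A minimum-information check of this kind can be sketched in a few lines of Python. This is an illustration only: the field keys below are hypothetical names chosen to mirror the EDA list above, not the repository's actual schema, and the sample record is invented.

```python
# Hypothetical field keys mirroring the EDA minimum information list above.
REQUIRED_FIELDS = (
    "title",
    "data_creator",
    "department",
    "date_of_publication",
    "dataset_description",
)


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a deposit record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Invented example record: department is blank and the description is absent.
record = {
    "title": "Studio recordings, series 1",
    "data_creator": "A. Researcher",
    "department": "",
    "date_of_publication": "2012-11-19",
}
print(missing_fields(record))
```

Running a check like this before deposit gives researchers a simple list of what they still need to supply.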
13. Because good research needs good data
DCC guidance
• details of how the data have been encoded (database
structures, file formats);
• a list of software known to work with the data and their
supporting information;
• indications of how the data relate to other data assets;
• administrative information (grant info, identifiers, checksums);
14. Because good research needs good data
DCC guidance
• explanations of what the data represent (e.g. for sensor data,
what the sensor was measuring and in what units);
• the processing history of the data (how they were generated
and subsequently transformed, when and by whom);
• a narrative describing the context (why the data were
generated/collected, what methodology was used and why).
This information is particularly important
for users as they interpret the data, and determine whether and
how they can be integrated with other data.
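The administrative information mentioned in the guidance includes checksums. As a minimal sketch of how one might be produced, the Python below computes a SHA-256 checksum with the standard library. The choice of SHA-256 and the helper name sha256_of are assumptions made for illustration; a repository may ask for a different algorithm.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so that large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a small temporary file standing in for a data file.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "sample.dat"
    sample.write_bytes(b"abc")
    sample_checksum = sha256_of(sample)
    print(sample.name, sample_checksum)
```

Recording the checksum at deposit lets anyone later confirm that a copy of the file is unchanged by recomputing it and comparing the two values.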
15. Because good research needs good data
Issues to consider when advising researchers
Access
Prior to advising on place of deposit, check to make sure that the
data repository’s policy on access meets researchers’ and/or
funders’ needs.
EDA Example
• Anyone may access full items free of charge.
• Copies of full items generally can be:
(a) reproduced, displayed or performed, given to third parties,
and stored in a database in any format or medium
(b) for personal research or study, educational, or not-for-profit
purposes without prior permission or charge
16. Because good research needs good data
Issues to consider when advising researchers
Restrictions on Access
Are there any restrictions on access to researchers’ data that the
repository should be made aware of?
EDA Example
• Items can be deposited at any time, but will not be made
publicly visible until any publishers' or funders' embargo period
has expired.
• This repository is not the publisher; it is merely the online
archive.
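The embargo rule in the EDA example can be expressed as a small check. This is a sketch under stated assumptions, not the repository's implementation: the function name is hypothetical, and embargo_ends is assumed to be the last day on which the embargo applies.

```python
from datetime import date
from typing import Optional


def is_publicly_visible(embargo_ends: Optional[date], today: date) -> bool:
    """Return True if a deposited item may be shown publicly.

    Assumption: embargo_ends is the last day of the embargo, so the item
    becomes visible on the following day. None means there is no embargo.
    """
    return embargo_ends is None or today > embargo_ends


# Invented dates: deposited now, embargoed until the end of June 2013.
print(is_publicly_visible(date(2013, 6, 30), today=date(2012, 11, 19)))
print(is_publicly_visible(date(2013, 6, 30), today=date(2013, 7, 1)))
print(is_publicly_visible(None, today=date(2012, 11, 19)))
```

The point for support staff is that deposit and public release are separate events, so an embargo is no reason to delay deposit.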
17. Because good research needs good data
Issues to consider when advising researchers
Licensing
Prior to deposit, work with researchers to determine the best
licence for their data. Make sure that researchers’ data licence
respects limits associated with any external data they are using.
EDA Example
• Full items must not be sold commercially in any format or
medium without formal permission of the copyright holders.
• Any copyright violations are entirely the responsibility of the
authors/depositors.
• Some full items are individually tagged with different rights
permissions and conditions.
18. Because good research needs good data
General guidance for data licensing
Taken from DCC How-to guide on licensing data
www.dcc.ac.uk/resources/how-guides
Two key issues to consider:
• Licensing – legal instrument stating what people can and can’t
do with data
• Waivers – legal instrument allowing author to give up rights
19. Because good research needs good data
Creative Commons
• Attribution condition - allows others to copy, distribute, display,
and perform the work as long as the creator is given due credit.
• Non-commercial – users cannot use the work for commercial
purposes
• Share-alike – all derivative works must be released
under the same licence as the original work
Attribution Non-Commercial Share Alike (CC-BY-NC-SA)
http://creativecommons.org/
20. Because good research needs good data
Issues to consider when advising researchers
Citation
Work with researchers to help make their data citable to increase
their potential impact (for researchers and REF).
EDA Example
Once your work has been approved for entry into the EDA, you
will receive a notification via email. This email will contain a
permanent link to your work - you should cite this link in
preference to the URL of the item as it provides continuing
persistent access in case the URL should ever change.
21. Because good research needs good data
General guidance for data citation
Taken from DCC How-to guide on data citation
www.dcc.ac.uk/resources/how-guides
If you have generated/collected data to be used as evidence in
an academic publication, you should deposit them with a suitable
data archive or repository as soon as you are able.
If the archive or repository does not provide you with a persistent
identifier or URL for your data, encourage it to do so.
22. Because good research needs good data
General guidance for data citation
When citing a dataset in a paper, use the citation style required
by the editor/publisher. If no form is suggested for datasets, take
a standard data citation style and adapt it to match the style for
textual publications.
Give dataset identifiers in the form of a URL wherever possible,
unless otherwise directed.
Include data citations alongside those for textual publications.
Some reference management packages now include support for
datasets, which should make this easier.
23. Because good research needs good data
General guidance for data citation
Cite datasets at the finest-grained level available that meets your
need. If that is not fine enough, provide details of the subset of
data you are using at the point in the text where you make the
citation.
If a dataset exists in several versions, be sure to cite the exact
version you used.
When you publish a paper that cites a dataset, notify the
repository that holds the dataset, so it can add a link from that
dataset to your paper.
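To make the citation guidance concrete, the sketch below assembles a dataset citation that names the exact version and gives the identifier as a URL. The element order is an assumption loosely modelled on the form DataCite recommends; a publisher's own style takes precedence. All values, including the example.org identifier, are invented placeholders.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    creator: str
    year: int
    title: str
    version: str     # cite the exact version that was used
    publisher: str   # the archive or repository holding the data
    identifier: str  # persistent identifier, given as a URL


def cite(ds: Dataset) -> str:
    """Format a dataset citation (assumed style: creator, year, title, version, publisher, identifier)."""
    return (
        f"{ds.creator} ({ds.year}): {ds.title}. "
        f"Version {ds.version}. {ds.publisher}. {ds.identifier}"
    )


# Invented example values; the identifier is a placeholder, not a real record.
example = Dataset(
    creator="Researcher, A.",
    year=2012,
    title="Studio recordings, series 1",
    version="1.2",
    publisher="Example Data Archive",
    identifier="https://example.org/dataset/123",
)
print(cite(example))
```

If only part of a dataset was used, the details of that subset belong in the text at the point of citation rather than in the citation string itself.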
24. Because good research needs good data
General advice for support provision:
• Discipline-specific support is better than generic support
• Advise on where to store data during the active phase of
research as well as on long-term deposit
• Real life examples of good practice are helpful
25. Because good research needs good data
But remember!
The validity and authenticity of the data are the
responsibility of the researcher.
This is true for any data repository.
Make sure researchers are aware of their
responsibility.
26. Because good research needs good data
But remember!
You can’t do the work for researchers – this is a
collaborative process.
Start conversations at the bid stage/start of PhD
and keep communications going.
Feed into the KAPTUR project as it progresses
to help shape effective infrastructure and support
services.
27. Because good research needs good data
Useful resources - DCC Tools catalogue
Managing data
http://www.dcc.ac.uk/resources/external/tools-services/managing-active-research-data
Sharing and tracking reuse
http://www.dcc.ac.uk/resources/external/tools-services/sharing-output-and-tracking-impact
28. Because good research needs good data
Any questions?
For DCC guidance, tools and case studies see:
www.dcc.ac.uk/resources
Follow us on Twitter @digitalcuration and #ukdcc