RDM for trainee physicians
1. Stuart Macdonald
Associate Data Librarian
EDINA & Data Library
University of Edinburgh
stuart.macdonald@ed.ac.uk
Research Data Management:
What you need to know
Research - an introduction for trainee physicians
Royal College of Physicians of Edinburgh
22 March 2016
2. Running order
Defining research data & data types
Research Data Management (RDM)
Funder requirements
Data (and software) management planning
Organising data
File formatting
Documentation & metadata
Storage & security
Data protection, rights & access
Preservation, sharing & licensing
3. Defining research data
Research data are collected, observed or created, for the
purposes of analysis to produce and validate original research
results.
Data can also be created by researchers for one purpose and
used by another set of researchers at a later date for a
completely different research agenda.
Digital data can be:
o created in a digital form ('born digital')
o converted to a digital form (digitised)
5. Research Data Management (RDM)
• RDM is a general term covering how you organise, structure,
store, and care for the data used or generated during the
lifetime of a research project.
• It includes:
– How you deal with data on a day-to-day basis over the lifetime of a
project,
– What happens to data after the project concludes.
RDM is considered an essential part of good research practice.
Good research needs good data!
6. Activities involved in RDM
Data management
Planning
Creating data
Documenting data
Storage and backup
Sharing data
Preserving data
7. Why manage your data?
So you can find and understand it when needed.
To avoid unnecessary duplication.
To validate results if required.
So your research is visible and has impact.
To get credit when others cite your work.
8. Drivers of RDM
“Publicly funded research data are a public good, produced
in the public interest, which should be made openly
available with as few restrictions as possible in a timely
and responsible manner that does not harm
intellectual property.”
RCUK Common Principles on Data Policy
http://www.rcuk.ac.uk/research/datapolicy/
9. Funding bodies’ requirements
Funders are increasingly requiring researchers to meet certain data
management criteria.
When applying for funding, you need to submit a technical or data
management plan.
You are expected to make your data publicly available where appropriate at the
end of your project and include a short statement, describing how and on what
terms any supporting research data may be accessed.
The Horizon 2020 Open Data Pilot is driving many national RDM pilots across Europe.
This parallels the response to the EPSRC data policy in the UK.
10. EPSRC Policy Framework on Research Data
http://www.epsrc.ac.uk/about/standards/researchdata/impact/
11. EPSRC
Expects that:
• published research papers should include a short statement,
describing how and on what terms any supporting research data
may be accessed,
• metadata on the research data they hold will be published by
institutions within 12 months of data generation,
• data will be securely preserved for a minimum of 10 years from
the date of last 3rd party access.
https://www.epsrc.ac.uk/about/standards/researchdata/expectations/
https://www.epsrc.ac.uk/files/aboutus/standards/clarificationsofexpectationsresearchdatamanagement/
12. RCUK Concordat
Research Councils UK (RCUK) published a draft Concordat on Open
Research Data (17 August 2015)
The 10 principles aim to ensure that research data generated by UK
researchers are made openly available for re-use:
• in a manner consistent with relevant legal, ethical and regulatory
frameworks
• recognising the autonomy of researchers
• emphasising responsibilities and accountabilities (research institutions,
universities, funders)
• without intending to mandate specific activities.
http://www.rcuk.ac.uk/RCUK-prod/assets/documents/documents/ConcordatOpenResearchData.pdf
13. University’s RDM Policy
The University of Edinburgh was one
of the first universities in the UK
to adopt a policy for
managing research data:
http://www.ed.ac.uk/is/research-data-policy
The policy was approved by the
University Court on 16 May 2011.
It’s acknowledged that this is an
aspirational policy and that
implementation will take some
years.
15. What would you do if you lost all your data?
• Dropping your laptop
• Hard drive failures
• Software updates
• Obsolescence
• Poorly described data
(metadata)
• Theft of equipment
• Overwriting
data/versioning
• File formats
• Media degradation – CDRs,
memory sticks…
It could happen to you too!
17. What to do?
Consider:
Having a Data Management Plan (DMP).
Organising your data:
o structure
o file names and versions.
File formats.
Documentation & metadata.
Secure data storage & regular backup.
18. What is a Data Management Plan
(DMP)
DMPs are written at the start of a project to define:
What data will be collected or created?
How will the data be documented and described?
Where will the data be stored?
Who will be responsible for data security and backup?
Which data will be shared and/or preserved?
How will the data be shared, and with whom?
DMPs are often submitted as part of grant applications, but are
useful in their own right whenever you are creating data.
19. DMPonline
Free and open web-based tool to
help researchers write plans:
https://dmponline.dcc.ac.uk/
It features:
o Templates based on different
funder requirements
o Tailored guidance (disciplinary,
funder etc.)
o Customised exports to a variety of
formats
o Ability to share DMPs with others
DMPonline screencast:
http://www.screenr.com/PJHN
20. Tips to share
Keep it simple, short and specific.
Avoid jargon.
Seek advice - consult and collaborate.
Base plans on available skills and support.
Make sure implementation is feasible.
Justify any resources or restrictions needed.
Also see: http://www.youtube.com/watch?v=7OJtiA53-Fk
21. Software Management Plans
SMPs are relatively new for research proposals.
The EPSRC Software for the Future call requires SMPs as part of the
Pathways to Impact. NSF SI2 funding requires software to be addressed
as part of mandatory data management plans.
A prototype Software Management Plan (SMP) Service has been
developed by the Software Sustainability Institute to help researchers
write software management plans.
A guide on writing & using a software management plan is available:
http://www.software.ac.uk/resources/guides/software-management-plans
22. Organising data
Why? To ensure your research data files are identifiable by you and
others in the future.
Organising and labelling your research data files and folders will help to:
prevent file loss through overwriting, deleting, misplacing
facilitate location and future retrieval
save you time (mostly in the future)
How? With a consistent & disciplined approach:
Setting conventions at the start of your project
Adopting an appropriate file naming & versioning convention
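A naming & versioning convention is easier to follow when the names are generated rather than typed by hand. The sketch below is one possible convention, not a prescribed one: the pattern (project, description, ISO date, two-digit version) and the function name build_filename are assumptions made for illustration.

```python
import re
from datetime import date


def build_filename(project: str, description: str, version: int,
                   created: date, extension: str) -> str:
    """Build a file name of the form project_description_YYYY-MM-DD_vNN.ext.

    The pattern is an illustrative convention, not a required standard.
    """
    def slug(text: str) -> str:
        # Lower-case the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{slug(project)}_{slug(description)}_"
            f"{created.isoformat()}_v{version:02d}.{extension.lstrip('.')}")


# Hypothetical project and file description.
name = build_filename("Asthma Study", "Baseline survey", 2, date(2016, 3, 22), ".csv")
print(name)  # asthma-study_baseline-survey_2016-03-22_v02.csv
```

ISO dates (YYYY-MM-DD) and zero-padded version numbers sort correctly in a file browser, which is the main reason for choosing them here.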
23. File formats
Type             Recommended                                          Avoid for sharing
Tabular data     CSV, TSV, SPSS portable                              Excel
Text             Plain text, HTML, RTF; PDF/A only if layout matters  Word
Media            Container: MP4, Ogg; Codec: Theora, Dirac, FLAC      Quicktime, H264
Images           TIFF, JPEG2000, PNG                                  GIF, JPG
Structured data  XML, RDF                                             RDBMS
Files encoded as text or binary files:
• Text encoding: machine- and human-readable. Less likely to become obsolete
.txt, .csv, .html, .xml, .tex, etc.
• Binary encoding: only readable with appropriate software .fcp, .xlsx, .docx, .psd,
.nc, etc.
24. File formatting
If you need to convert or migrate your data files to another format be
aware of the potential risk of loss or corruption of your data.
Always test the files you convert or migrate
You may also use the data normalisation process i.e. convert data from
one format (e.g. proprietary) into another for use or preservation (e.g.
into raw ASCII).
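One way to test a conversion is to read the converted file back and compare it with the original values. The sketch below does this for a small table converted to CSV using Python's standard csv module. The function names and the sample rows are hypothetical, and a real check would read the source file with whatever software produced it.

```python
import csv
import io


def to_csv_text(rows):
    """Serialise a list of rows (lists of values) as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def from_csv_text(text):
    """Parse CSV text back into a list of rows (all values become strings)."""
    return [row for row in csv.reader(io.StringIO(text))]


def conversion_is_lossless(rows):
    """Return True if the rows survive a round trip through CSV unchanged."""
    return from_csv_text(to_csv_text(rows)) == rows


# Hypothetical sample data with awkward characters.
original = [
    ["participant_id", "age", "note"],
    ["P001", "34", "reports cough, mild"],  # embedded comma
    ["P002", "51", 'said "fine"'],          # embedded quotes
]
print(conversion_is_lossless(original))        # True
print(conversion_is_lossless([["P001", 34]]))  # False: 34 comes back as "34"
```

The second call shows a typical silent change: CSV stores no data types, so a number is read back as text. Whether that matters depends on how the data will be used.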
When compressing your data files (storage, sending, sharing) you
encode the information using fewer bits than the original
representation.
Compression tools such as zip, gzip and bzip2 (the last two often combined
with tar) produce files such as .zip, .tar.gz, .tar.bz2
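Lossless compression can be checked by decompressing and comparing with the original bytes. A minimal sketch using Python's standard gzip module follows; the function name compress_and_verify and the sample data are illustrative assumptions.

```python
import gzip


def compress_and_verify(data: bytes) -> bytes:
    """Compress data with gzip and confirm it decompresses to the same bytes."""
    compressed = gzip.compress(data)
    if gzip.decompress(compressed) != data:
        raise ValueError("round trip failed: decompressed data differs")
    return compressed


# Hypothetical, highly repetitive sample data (repetition compresses well).
data = ("participant_id,age\n" + "P001,34\n" * 1000).encode("utf-8")
compressed = compress_and_verify(data)
print(len(compressed) < len(data))  # True
```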
25. Documentation and metadata
Documentation (intended for reading by humans)
Contextual information
o Aims & objectives of the originating project
Explanatory material
o data source
o collection methodology & process
o questionnaire, codebook
o dataset structure
o technical information
Metadata (intended for reading by machines)
‘data about data’
descriptors to facilitate cataloguing and discoverability.
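As an illustration of machine-readable metadata, the sketch below stores a small record as JSON and checks that no descriptor is left empty. The field names are borrowed from the Dublin Core element set (title, creator, date, description, format, rights); the choice of which fields to require, the function name missing_fields and all the values are assumptions for illustration, not a real dataset or a repository's actual schema.

```python
import json

# Fields treated as required in this sketch (an assumption, not a standard rule).
REQUIRED_FIELDS = ("title", "creator", "date", "description", "format", "rights")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Hypothetical record; every value is made up.
record = {
    "title": "Example clinic survey (hypothetical)",
    "creator": "A. Researcher",
    "date": "2016-03-22",
    "description": "Illustrative record only; not a real dataset.",
    "format": "text/csv",
    "rights": "",  # left blank on purpose to show the check
}

print(missing_fields(record))        # ['rights']
print(json.dumps(record, indent=2))  # the record as it could be stored alongside the data
```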
26. Why it is necessary
To help you …
remember the details of your data
archive your data for future access & re-use
To help others …
discover your data
understand the aims and conduct of the originating research
verify your findings
replicate your results
27. Data Storage - basic principles
Use managed, network services
whenever possible to ensure:
o Regular back-up
o Data Security
o Accessibility
Avoid relying on portable hard drives,
USB memory sticks, CDs or DVDs, to
prevent:
o Data loss due to damage or failure
o Quality control issues due to version
confusion
o Unnecessary security risks e.g. theft
Digital Preservation Coalition’s new promotional USB stick:
https://twitter.com/digitalfay/status/411444578122600450/photo/1
28. Secure storage & backup
Make at least 3 copies of the data:
o on at least 2 different media,
o keep storage devices in separate
locations with at least 1 offsite,
o check they work regularly,
o ensure you know the back-up
procedure and follow it.
Ensure you can keep track of
different versions of data,
especially when backing-up to
multiple devices.
o Use version control software, e.g.
Subversion, TortoiseSVN
One copy = risk of data loss
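Checking that backup copies "work" can be partly automated by comparing checksums of each copy against the master file. The sketch below uses SHA-256 from Python's standard hashlib module. The function names and the temporary demo files are hypothetical, and matching checksums show only that the bytes are identical, not that the storage medium will stay healthy.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copies: list) -> dict:
    """Map each copy's path to True if its checksum equals the original's."""
    expected = sha256_of(original)
    return {str(copy): sha256_of(copy) == expected for copy in copies}


# Demo with temporary files standing in for a master file and two backups.
with tempfile.TemporaryDirectory() as folder:
    master = Path(folder) / "data.csv"
    good = Path(folder) / "backup_good.csv"
    bad = Path(folder) / "backup_bad.csv"
    master.write_bytes(b"id,age\nP001,34\n")
    good.write_bytes(master.read_bytes())
    bad.write_bytes(b"id,age\nP001,43\n")  # simulated corruption
    results = copies_match(master, [good, bad])

print(list(results.values()))  # [True, False]
```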
• CC image by Sharyn Morrow on Flickr
• CC image by momboleum on Flickr
29. Keeping sensitive data secure
Ensure PCs, laptops, and portable
data storage devices are stored
securely and encrypted if necessary
- BitLocker (Windows), FileVault
(Mac).
Be aware that encrypted data will
be lost if the password/encryption
key is lost or if the hard disk fails.
Give access to data to authorised
people only
System lock: Image by Yuri Yu. Samoilov - Flickr (CC-
BY)
https://www.flickr.com/photos/110751683@N02/
30. Data disposal
Dispose of confidential data
securely.
o Hard drives: use software for secure
erasing such as BC Wipe, Wipe File,
DeleteOnClick, Eraser for Windows;
‘secure empty trash’ for Mac.
o USB Drives: physical destruction is the
only way
o Paper and CDs/optical Discs: shredding
UoE has a comprehensive guide on
the disposal of confidential and/or
sensitive waste held on paper, CDs,
DVDs, tapes, discs, hard drives etc.
http://www.ed.ac.uk/schools-departments/estates-buildings/waste-recycling/how/confidential-waste
31. Things to think about …
Ethics
Requirements relating to data that relates to human subjects.
Privacy, confidentiality & disclosure
Data protection
Intellectual Property Rights (IPR)
Copyright
32. Ethics
Ethics committees
Review research applications and advise on whether they are ethical.
Safeguard the rights of research participants.
Participants
Must be fully informed as to the purpose and intended uses of the research,
and advised of what their involvement will entail.
Participation must be voluntary, fully informed and free of any coercion.
Confidentiality of information collected and anonymity of subjects must be
respected at all times.
33. Privacy, confidentiality & disclosure
Privacy
An entitlement of an individual subject.
Handling, storage and sharing of data must be managed to preserve the
privacy of the subject.
Confidentiality
Refers to the behaviour of the researcher, whereby the privacy of the
subject is maintained at all times.
Disclosure
Must be guarded against!
Various techniques exist to avoid it, whether for ethical, legal or
commercial reasons, e.g.
o removing identifiers from personal information (e.g. D.o.B, Nat. Ins. No.)
o aggregating geographical data to reduce precision
o anonymising data – but without overdoing it!
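The first two techniques can be sketched in a few lines. The example below drops direct identifiers, keeps only the year of birth, and reduces a UK postcode to its outward code (the part before the space). The field names, the function name deidentify and the sample record are hypothetical. Steps like these reduce disclosure risk but do not by themselves guarantee anonymity.

```python
# Fields treated as direct identifiers in this sketch (hypothetical field names).
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "national_insurance_number"}


def deidentify(record: dict) -> dict:
    """Return a copy of the record with identifiers removed or coarsened."""
    cleaned = {key: value for key, value in record.items()
               if key not in DIRECT_IDENTIFIERS}
    # Keep the year of birth only (assumes an ISO date string, YYYY-MM-DD).
    if "date_of_birth" in record:
        cleaned["year_of_birth"] = record["date_of_birth"][:4]
    # Reduce geographic precision: keep only the outward part of the postcode
    # (assumes the postcode is written with a space, e.g. 'AB12 3CD').
    if "postcode" in cleaned:
        cleaned["postcode_district"] = cleaned.pop("postcode").split()[0]
    return cleaned


# Entirely made-up record for illustration.
patient = {
    "name": "Jane Example",
    "date_of_birth": "1980-05-17",
    "national_insurance_number": "XX000000X",
    "postcode": "AB12 3CD",
    "diagnosis": "asthma",
}
print(deidentify(patient))
```

The original record is left unchanged, so the identifiable version can be kept in secure storage while the coarsened copy is used for analysis or sharing.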
34. Data protection &
Intellectual Property Rights (IPR)
The UK Data Protection Act 1998 is an
Act of Parliament defining the law on
the processing of data on living people.
It is the main piece of legislation that
governs the protection of personal
data in the UK.
Research data falls within the scope of
this Act.
Failure to observe it can result in:
monetary penalty notices,
prosecutions
enforcement notices
audit without consent
IPR are the legally recognised rights
and protections given to persons for
‘creations of the mind’
e.g. music, literature, and other
artistic & scholarly works; discoveries,
inventions, symbols, and designs
IPR grants exclusive rights to
creators to:
Publish a work
License its distribution to others
Sue if it is copied or used
unlawfully
35. Copyright
Can be contentious & complex!
When data are archived or shared, the
creator retains copyright.
Data structured within a database as a
result of intellectual investment attract
an additional ‘database right’.
This can sit alongside the copyright
attached to the data contents.
36. Freedom of Information
The Freedom of
Information Act 2000
… gives a right of access to
information held by ‘public
authorities’, which includes most
universities
… covers all records and
information held by them,
whether digital or print, current or
archived.
Some research data are exempt
(data about human subjects,
commercial partners, national
security)
37. Data preservation …
Preservation is key to the long term existence and future accessibility of research
data and is worth thinking about at the planning stage. For the purposes of
preservation data should be deposited in a trusted repository.
Research-funders
ESRC data store: http://store.data-archive.ac.uk/store/
Zenodo (EU): https://zenodo.org/
Institutional (UoE)
Edinburgh DataShare: http://datashare.is.ed.ac.uk/
Discipline-specific
Archaeology Data Service:
http://archaeologydataservice.ac.uk/
Discipline-agnostic
Figshare: http://figshare.com/
Mapping the preservation process, workflow devised by Higgins, S., DCC (Digital Curation Centre)
38. Data sharing ...
... is making your research available for others to reuse & build upon.
Benefits for the researcher
Comply with funder requirements
Research can be validated
Increase impact through citation (reputation)
Increase visibility of research
Long-term data storage (preservation)
Enables future re-use (you & others)
Benefits for research & society
Avoid duplication of effort & resources
Publicly funded research is available
Academic & scientific integrity:
o increases transparency & accountability
o facilitates scrutiny of research findings
o prevents fraud
Extends reach of original research & fosters collaboration
39. Barriers to sharing
“Scientists would rather share their toothbrush than their data!”
Carol Goble, Keynote address, EGEE (Enabling Grid for EsciencE) ’06 Conference
Valid reasons not to share:
Research conducted in clinical settings (e.g. clinical trials)
Research that includes confidential data pertaining to human subjects
Research for national security (e.g. with MoD)
Research with commercial partners to develop patents (e.g. for drug development)
Future ‘share-ability’ of the data - issues to consider:
Format, Software, Documentation, Ethics, Consent & Confidentiality, Anonymisation
Timescale for release (embargo)
Infrastructure for sharing
Rights & licensing
http://openclipart.org/detail/172856/toothbrush-by-bpcomp-172856
40. Data licensing
Why?
The licence explicitly states how
your data may be used
Makes them available to others
(where appropriate)
Ensures your data are open!
How?
Repository rights statement
Creative Commons (CC):
http://wiki.creativecommons.org
Open Data Commons (ODC):
http://opendatacommons.org/