Preparing Research Data
for Sharing
An overview for LSHTM students
Gareth Knight & Victoria Cranna
This work is licensed under a
Creative Commons Attribution 2.0 UK:
England & Wales License
LSHTM eThesis session
Presented on 10th and 18th July 2013
Data Sharing in the News
Research Data
“Data produced during the research activity
should be managed appropriately, ensuring
that it is stored, organised and documented in
a manner that allows it to be understood and
used for the intended purpose.”
Research Degrees Handbook: Academic Year 2012-13
To Share or not to Share
1. Is the sharing justified?
• What benefits will it provide?
• What are the risks associated with sharing data?
2. Do you have the ability to share?
• Intellectual Property Rights (IPR)
• Participant Consent
• Other obligations, e.g. confidentiality
3. Are there any conditions associated with sharing?
• What measures need to be in place to protect data? (e.g. record access
requests, specific use only)
Information Commissioner's Office: Data Sharing Code of Practice
http://www.ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing
Reasons for
• Encourages validation of research
findings
• Increases visibility of research
findings through attribution and
further analysis
• Complies with sponsor obligations
• Complies with journal publisher
requirements
• Simple way to deal with annoying
data requests
Reasons against
• Ownership issues, e.g. 3rd party
rights
• Participant confidentiality: DPA
1998 (does not apply to anonymised data)
• Sensitivity - Implications of release
(e.g. geo-references for animal
migration).
• Commercial/Research exploitation
• Contractual, regulatory, & legislative restrictions
What are the risks of data release?
Protection of
Research Participants
“Researchers must ensure the confidentiality
of personal information relating to research
participants”
“Prior to publication or depositing data in a
public depository, data should be fully
anonymised”
LSHTM Guidelines on Good Research Practice
Data Protection Act 1998
Personal Data
Information that can be used to identify an
individual in isolation, or in tandem with
other information, e.g. name, age, address.
Sensitive Personal Data
racial or ethnic origin
political opinions
religious beliefs
trade union membership
physical or mental health
sexual life
criminal convictions
Protects living individuals’ fundamental rights and freedoms in
relation to storage, processing, and disclosure of information held
about them
Data Protection Principles
Eight principles which broadly state that personal data shall be:
1. Fairly and lawfully processed
2. Obtained only for specified purposes, and shall not be further processed
for other purposes that are incompatible with the original reason
3. Adequate, relevant and not excessive in comparison to original purpose
4. Accurate and where necessary, kept up to date
5. Held no longer than is necessary
6. Processed in accordance with the data subject’s rights
7. Kept securely and safely with appropriate measures to prevent
unauthorised or unlawful processing of the data and against accidental
loss, destruction or damage
8. Not transferred to countries without adequate protection
Potential Exemptions
No blanket exemption, but...
• Certain exemptions for research purposes including
statistical or historical purposes.
• If the research processing is not targeted at a particular
individual & does not cause substantial distress or
damage to a data subject, then:
• 2nd principle - personal data can be processed for purposes other
than for which they were originally obtained
• 5th principle - personal data can be held indefinitely
• Analysis results do not identify data subjects
Information Commissioner's Office: Guide to Data Protection
http://www.ico.org.uk/for_organisations/data_protection/the_guide
Reducing Disclosure risk
Disclosure Types:
• Identity: Identify a person directly
• Attribute: Identify sensitive information about a subject
• Inferential: Determine value of a subject’s
characteristic more accurately than would have
been otherwise possible
Techniques:
• Remove obvious identifiers (DPA 1998)
• Replace real data with synthetic
• Limit variables that are made available
• Sampling with a larger group
• Group significant values / Top/bottom coding
• Limit geographic detail
Avoiding inappropriate attribution of information to a data subject
Information Commissioner's Office: Anonymisation Code of Practice
http://www.ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
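Three of the techniques listed above (removing obvious identifiers, top coding, and limiting geographic detail) can be sketched in a few lines of Python. This is a minimal illustration only: the field names, the age threshold of 85, and the choice to keep just the first part of a UK postcode are assumptions made for the example, not LSHTM or ICO rules, and applying these steps does not by itself guarantee that a dataset is anonymised.

```python
from typing import Any

# Hypothetical field names; adapt to the dataset being prepared.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}
AGE_TOP_CODE = 85  # assumed threshold: older ages are reported as 85


def reduce_disclosure_risk(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a record with three risk-reduction steps applied."""
    # 1. Remove obvious identifiers.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # 2. Top-code age so unusually old participants do not stand out.
    if "age" in out:
        out["age"] = min(out["age"], AGE_TOP_CODE)

    # 3. Limit geographic detail: keep only the first part of the postcode.
    if "postcode" in out:
        parts = out["postcode"].split()
        out["postcode"] = parts[0] if parts else ""

    return out


record = {
    "name": "A. Participant",
    "age": 97,
    "postcode": "WC1E 7HT",
    "outcome": "positive",
}
safe = reduce_disclosure_risk(record)
print(safe)
```

The remaining variables still need review for attribute and inferential disclosure, for example rare combinations of values.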
Ensuring continued access
Problems:
1. User doesn’t possess relevant
software package
2. User runs a different operating
system than the creator (e.g.
Linux, MacOS)
3. Software package is obsolete
Options:
• Emulation of original
environment
• Export to other format
Choosing File Formats
Format should be:
• Accessible using a wide range of
software tools
• In widespread use
• Support relevant information
attributes without loss
• Based upon a public specification
• Able to be created without DRM or
other limitations
“turning [a] PDF into XML is like turning a hamburger into a cow”
Peter Murray-Rust
Recommended Formats
Quantitative tabular:
• Preferred: SPSS portable format (.por), delimited txt & command/setup file
• Acceptable: SPSS (.sav), Stata (.dta), MS Access & other proprietary formats
Geospatial:
• Preferred: ESRI Shapefile, Geo-referenced TIFF (.tif, .tfw)
• Acceptable: ESRI Geodatabase format (.mdb), MapInfo Interchange Format (.mif),
Keyhole Mark-up Language (KML) (.kml)
Qualitative text:
• Preferred: XML-encoded text (e.g. DDI, TEI), Open Document Format (ODF), Rich
Text Format (RTF)
• Acceptable: MS Word, NVivo
Still Images:
• Preferred: TIFF, Uncompressed lossless JPEG 2000
• Acceptable: PNG, RAW, Compressed JPEG 2000
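For the quantitative tabular case, the "delimited text plus setup file" option can be sketched with Python's standard csv module. The variable names, labels, and the plain-text layout of the setup file below are hypothetical; a real setup file would normally follow the syntax of the statistical package it targets.

```python
import csv
import io

# Hypothetical variables and labels, for illustration only.
VARIABLES = {
    "id": "Study identifier",
    "age": "Age in years at enrolment",
    "sex": "Sex (1 = female, 2 = male)",
}
rows = [
    {"id": 1, "age": 34, "sex": 1},
    {"id": 2, "age": 51, "sex": 2},
]


def to_delimited(rows, fieldnames, delimiter="\t"):
    """Serialise rows as delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_setup_text(variables):
    """List each variable with its label, one per line."""
    return "".join(f"{name}\t{label}\n" for name, label in variables.items())


data_text = to_delimited(rows, list(VARIABLES))
setup_text = to_setup_text(VARIABLES)
print(data_text)
print(setup_text)
```

Keeping the labels in a separate file means the data file stays readable by any tool that can open tab-delimited text.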
Ensuring Understandability
Researcher Qs:
• What does the variable mean?
• How were the results produced?
• What are the boundaries of the
measurement?
• What instruments and measures
were used?
A user (a 3rd party or your future self) has difficulty understanding
some aspect of the research data
Source:
• Lab notebooks & research protocols
• Codebooks and data dictionaries
• Equipment settings &
instrument calibration
Approach:
1. Check requirements in your field (e.g. Clinical)
2. Look at other collections (e.g. UKDS)
3. Consider questions that a user may have when accessing the data
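One way to answer the researcher questions above is to keep a small data dictionary alongside the data. The sketch below is an assumption about how such an entry might be structured; the variable, its range, and the instrument name are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str        # column name in the data file
    meaning: str     # what the variable means
    unit: str        # unit of measurement
    minimum: float   # lower boundary of the measurement
    maximum: float   # upper boundary of the measurement
    instrument: str  # instrument or measure used


def codebook_entry(v: Variable) -> str:
    """Format one variable as a plain-text codebook entry."""
    return (
        f"{v.name}: {v.meaning}\n"
        f"  unit: {v.unit}\n"
        f"  valid range: {v.minimum} to {v.maximum}\n"
        f"  instrument: {v.instrument}\n"
    )


def out_of_range(v: Variable, values: list[float]) -> list[float]:
    """Return the values that fall outside the documented boundaries."""
    return [x for x in values if not v.minimum <= x <= v.maximum]


# Hypothetical variable definition.
sbp = Variable(
    "sbp", "Systolic blood pressure", "mmHg", 60, 260, "Hypothetical digital monitor"
)
entry = codebook_entry(sbp)
flagged = out_of_range(sbp, [118, 30, 142, 400])
print(entry)
print(flagged)
```

Because the boundaries are recorded in one place, the same definition documents the variable and supports a basic range check.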
Ensuring Usability
Scenarios:
1. Uncertain if permitted to
analyse data – does not use.
2. Researcher uses data in research
for non-permitted purpose
End user is unsure of the permitted use of data
Licence should specify:
• Data that the licence applies to;
• Who owns each component;
• Who is permitted access & use;
• Conditions associated with use
1. Standard licence model
Creative Commons
Attribution (BY): Creator must be credited
No Derivatives (ND): No editing or manipulation
Non-Commercial (NC): Cannot be used for commercial purposes
Share Alike (SA): Share under same licence
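The Creative Commons elements above combine into a short licence code such as CC BY-NC-SA. The helper below is a hypothetical sketch of that combination, assuming the standard six-licence suite in which Attribution is always present; it also rejects the one pairing that suite does not offer, since Share Alike governs derivatives and No Derivatives forbids them.

```python
def cc_licence_code(no_derivatives=False, non_commercial=False, share_alike=False):
    """Build a Creative Commons licence code from the chosen elements."""
    if no_derivatives and share_alike:
        # SA sets terms for derivative works, ND forbids them entirely.
        raise ValueError("ND and SA cannot be combined")
    parts = ["BY"]  # attribution is part of every licence in the suite
    if non_commercial:
        parts.append("NC")
    if no_derivatives:
        parts.append("ND")
    if share_alike:
        parts.append("SA")
    return "CC " + "-".join(parts)


print(cc_licence_code())
print(cc_licence_code(non_commercial=True, share_alike=True))
```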
Open Data Commons
Public Domain Dedication & License
(PDDL)
Attribution License (ODC-By)
Open Database License (ODC-ODbL)
Attribution Share-Alike
Various software Licence Models
GNU General Public License (GPL)
GNU Lesser General Public License (LGPL)
BSD license
Etc.
2. Tailored Licence form
• National Cancer Research Institute - Data and Material Transfer Agreement template
http://www.ncri.org.uk/default.asp?s=1&p=8&ss=9
• UK Data Service licence
http://ukdataservice.ac.uk/deposit-data/support/licence.aspx
• CELSIUS Data Access Agreement
http://celsius.lshtm.ac.uk/documents/Data%20Access%20Agreement.doc
• Participant Consent form
http://www.lshtm.ac.uk/research/ethicscommittees/
Digital Curation Centre: How to License Research Data
http://www.dcc.ac.uk/resources/how-guides/license-research-data
LSHTM Data Repository
• Public: data made available for
anonymous access
• Registered: End user required to
register for time-limited access
• Approved: End user must state
purpose they wish to use data for.
• Embargoed: Data withheld for a
designated time period, e.g. 5 years.
• Request: Data not held in the
repository may be requested from
the creator
In-development service capable of
curating, preserving, and sharing LSHTM research data
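The access levels above can be read as a simple decision rule. The sketch below is not the repository's implementation; the function name, its parameters, and the assumptions that approved access also needs registration and that embargoed data becomes available once the embargo ends are all hypothetical.

```python
from datetime import date
from enum import Enum


class AccessLevel(Enum):
    PUBLIC = "public"
    REGISTERED = "registered"
    APPROVED = "approved"
    EMBARGOED = "embargoed"
    REQUEST = "request"


def may_download(level, *, registered=False, purpose_approved=False,
                 embargo_end=None, today=None):
    """Decide whether a user may download a dataset from the repository."""
    today = today or date.today()
    if level is AccessLevel.PUBLIC:
        return True  # anonymous access
    if level is AccessLevel.REGISTERED:
        return registered
    if level is AccessLevel.APPROVED:
        # Assumption: the stated purpose must have been accepted.
        return registered and purpose_approved
    if level is AccessLevel.EMBARGOED:
        # Assumption: data is released once the embargo date has passed.
        return embargo_end is not None and today >= embargo_end
    return False  # REQUEST: data is not held here, ask the creator


print(may_download(AccessLevel.PUBLIC))
print(may_download(AccessLevel.EMBARGOED,
                   embargo_end=date(2018, 7, 1), today=date(2015, 1, 1)))
```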
A Few Useful References
• MANTRA – Data Management training for PhD students
http://datalib.edina.ac.uk/mantra/
• UK Data Archive – Managing and Sharing Data
http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
• LSHTM Information Management support material
http://intra.lshtm.ac.uk/infoman/
• Data Protection web pages: http://intra.lshtm.ac.uk/infoman/data/
• Guidelines on good research practice: Implementing research governance:
http://www.lshtm.ac.uk/research/ethicscommittees/good_research_practice.pdf
• Research Degrees Handbook:
http://www.lshtm.ac.uk/study/currentstudents/studentinformation/rd_handbook_12_13.pdf
• Information Management and Security Policy:
http://intra.lshtm.ac.uk/infoman/security/index.html
Contact
Open Access
Andrew.gray@lshtm.ac.uk
Data Protection
Victoria.cranna@lshtm.ac.uk
Data Management
gareth.knight@lshtm.ac.uk
Image References
• “Sharing” (CC BY-NC 2.0) http://www.flickr.com/photos/tobanblack/3773116901/
• “Women slicing tomatoes for food preparation” (CC BY-NC 2.0) http://www.flickr.com/photos/45796762@N03/7999269493/
• “Warned” (CC BY 2.0) http://www.flickr.com/photos/figgenhoffer/2598487764/
• “Day 114, Project 365 - 2.13.10” (CC BY 2.0) http://www.flickr.com/photos/93841400@N00/4355611690/
• “license” (CC BY 2.0) http://www.flickr.com/photos/flowizm/3861998999/
• “Rosetta Stone” (CC BY-NC 2.0) http://www.flickr.com/photos/65713088@N00/6268592919/
• “Obsolete Packages” (CC BY-SA 2.0) http://www.flickr.com/photos/floydwilde/160475157/
• “Activity SpreadSheet. Aug. 1” (CC BY-NC 2.0) http://www.flickr.com/photos/bitchcakes/7993211140/
• “2006-06-14 012 - Cow” (CC BY-NC 2.0) http://www.flickr.com/photos/chrisq/167074953/
• “My favorite” (CC BY-SA 2.0) http://www.flickr.com/photos/erwss/3129884643/

Preparing research data for sharing

  • 1. Preparing Research Data for Sharing An overview for LSHTM students Gareth Knight & Victoria Cranna This work is licensed under a Creative Commons Attribution 2.0 UK: England & Wales License LSHTM eThesis session Presented on 10th and 18th July 2013
  • 2. Data Sharing in the News
  • 3. Research Data “Data produced during the research activity should be managed appropriately, ensuring that it is stored, organised and documented in a manner that allows it to be understood and used for the intended purpose.” Research Degrees Handbook: Academic Year 2012-13
  • 4. To Share or not to Share 1. Is the Sharing justified? • What benefits will it provide? • What are the risks associated with sharing data? 2. Do you have the ability to share? • Intellectual Property Rights (IPR) • Participant Consent • Other obligations, e.g. confidentiality 3. Are there any conditions associated with sharing? • What measures need to be in place to protect data? (e.g. record access requests, specific use only) Information Commissioner Office. Data Sharing Code of Practice http://www.ico.org.uk/for_organisations/data_protection/topic_guides/data_sharing
  • 5. Reasons for • Encourages validation of research findings • Increase visibility of research findings through attribution and further analysis • Comply with sponsor obligations • Comply with journal publisher req. • Simple way to deal with annoying data requests
  • 6. Reasons against • Ownership issues , e.g. 3rd party rights • Participant Confidentiality - DPA 1998 –not apply to anonymised data • Sensitivity - Implications of release (e.g. geo-references for animal migration). • Commercial/Research exploitation • Contractual, regulatory, & legislative What are the risks of data release?
  • 7. Protection of Research Participants “ Researchers must ensure the confidentiality of personal information relating to research participants” “Prior to publication or depositing data in a public depository, data should be fully anonymised” LSHTM Guidelines on Good Research Practice
  • 8. Data Protection Act 1998 Personal Data Info that can be used to identify individual in isolation, or in tandem with other info. E.g. Name, age, address, etc. Sensitive Personal Data racial or ethnic origin political opinions religious beliefs trade union membership physical or mental health sexual life criminal convictions Protect living individual’s fundamental rights and freedoms in relation to storage, processing, and disclosure of information held about them
  • 9. Data Protection Principles Eight principles which broadly state that personal data shall be: 1. Fairly and lawfully processed 2. Obtained only for specified purposes, and shall not be further processed for other purposes that are incompatible with the original reason 3. Adequate, relevant and not excessive in comparison to original purpose 4. Accurate and where necessary, kept up to date 5. Held no longer than is necessary 6. Processed in accordance with the data subject’s rights 7. Kept securely and safely with appropriate measures to prevent unauthorised or unlawful processing of the data and against accidental loss, destruction or damage 8. Not transferred to countries without adequate protection
  • 10. Potential Exemptions No blanket exemption, but... • Certain exemptions for research purposes including statistical or historical purposes. • If the research processing is not targeted at particular individual & does not cause substantial distress or damage to a data subject, then: • 2nd principle - personal data can be processed for purposes other than for which they were originally obtained • 5th principle - personal data can be held indefinitely • Analysis results do not identify data subjects Information Commissioner Office: Guide to Data Protection http://www.ico.org.uk/for_organisations/data_protection/the_guide
  • 11. Reducing Disclosure risk Disclosure Types: • Identity: Identify person directly • Attribute: ID sensitive info on subject • Inferential: Determine value of a subject’s characteristic more accurately than would have been otherwise possible Techniques: • Remove obvious identifiers (DPA 1998) • Replace real data with synthetic • Limit variables that are made available • Sampling with a larger group • Group significant values / Top/bottom coding • Limit geographic detail Avoiding inappropriate attribution of information to a data subject Information Commissioner Office: Anonymisation Code of Practice http://www.ico.org.uk/for_organisations/data_protection/topic_guides/anonymisation
  • 12. Ensuring continued access Problems: 1. User doesn’t possess relevant software package 2. User runs a different operating system than the creator (e.g. Linux, MacOS) 3. Software package is obsolete Options: • Emulation of original environment • Export to other format
  • 13. Choosing File Formats Format should be: • Accessible using a wide range of software tools • In widespread use • Able to support relevant information attributes without loss • Based upon a public specification • Able to be created without DRM or other limitations “turning [a] PDF into XML is like turning a hamburger into a cow” Peter Murray-Rust
  • 14. Recommended Formats Quantitative tabular: • Preferred: SPSS portable format (.por), delimited text (.txt) & command/setup file • Acceptable: SPSS (.sav), Stata (.dta), MS Access & other proprietary formats Geospatial: • Preferred: ESRI Shapefile, Geo-referenced TIFF (.tif, .tfw) • Acceptable: ESRI Geodatabase format (.mdb), MapInfo Interchange Format (.mif), Keyhole Mark-up Language (KML) (.kml) Qualitative text: • Preferred: XML-encoded text (e.g. DDI, TEI), Open Document Format (ODF), Rich Text Format (RTF) • Acceptable: MS Word, NVivo Still Images: • Preferred: TIFF, Uncompressed lossless JPEG 2000 • Acceptable: PNG, RAW, Compressed JPEG 2000
  • 15. Ensuring Understandability Researcher Qs: • What does the variable mean? • How were the results produced? • What are the boundaries of the measurement? • What instruments and measures were used? A user (a 3rd party or your future self) has difficulty understanding some aspect of the research data Source: • Lab notebooks & research protocols • Codebooks and data dictionaries • Equipment settings & instrument calibration Approach: 1. Check requirements in your field (e.g. Clinical) 2. Look at other collections (e.g. UKDS) 3. Consider Qs that a user may have when accessing the data
  • 16. Ensuring Usability Scenarios: 1. Uncertain if permitted to analyse data – does not use. 2. Researcher uses data in research for non-permitted purpose End user unsure on permitted use of data Licence should specify: • Data that the licence applies to; • Who owns each component; • Who is permitted access & use; • Conditions associated with use
  • 17. 1. Standard licence model Creative Commons Attribution (BY): Creator must be credited No Derivatives (ND): No editing or manipulation Non-Commercial (NC): Cannot be sold Share Alike (SA): Share under same licence Open Data Commons Public Domain Dedication & License (PDDL) Attribution License (ODC-By) Open Database License (ODC-ODbL) Attribution Share-Alike Various software Licence Models GNU General Public License (GPL) GNU Lesser General Public License (LGPL) BSD license Etc.
  • 18. 2. Tailored Licence form • National Cancer Research Institute - Data and Material Transfer Agreement template http://www.ncri.org.uk/default.asp?s=1&p=8&ss=9 • UK Data Service licence http://ukdataservice.ac.uk/deposit-data/support/licence.aspx • CeLSIUS Data Access Agreement http://celsius.lshtm.ac.uk/documents/Data%20Access%20Agreement.doc • Participant Consent form http://www.lshtm.ac.uk/research/ethicscommittees/ Digital Curation Centre: How to License Research Data http://www.dcc.ac.uk/resources/how-guides/license-research-data
  • 19. LSHTM Data Repository • Public: Data made available for anonymous access • Registered: End user required to register for time-limited access • Approved: End user must state the purpose they wish to use the data for • Embargoed: Associated data withheld for a designated time period, e.g. 5 years • Request: Data not held in the repository may be requested from the creator In-development service capable of curating, preserving, and sharing LSHTM research data
  • 20. A Few Useful References • MANTRA – Data Management training for PhD students http://datalib.edina.ac.uk/mantra/ • UK Data Archive – Managing and Sharing Data http://www.data-archive.ac.uk/media/2894/managingsharing.pdf • LSHTM Information Management support material http://intra.lshtm.ac.uk/infoman/ • Data Protection web pages: http://intra.lshtm.ac.uk/infoman/data/ • Guidelines on good research practice: Implementing research governance: http://www.lshtm.ac.uk/research/ethicscommittees/good_research_practice.pdf • Research Degrees Handbook: http://www.lshtm.ac.uk/study/currentstudents/studentinformation/rd_handbook_12_13.pdf • Information Management and Security Policy: http://intra.lshtm.ac.uk/infoman/security/index.html
  • 22. Image References • “Sharing” (CC BY-NC 2.0) http://www.flickr.com/photos/tobanblack/3773116901/ • “Women slicing tomatoes for food preparation” (CC BY-NC 2.0) http://www.flickr.com/photos/45796762@N03/7999269493/ • “Warned” (CC BY 2.0) http://www.flickr.com/photos/figgenhoffer/2598487764/ • “Day 114, Project 365 - 2.13.10” (CC BY 2.0) http://www.flickr.com/photos/93841400@N00/4355611690/ • “license” (CC BY 2.0) http://www.flickr.com/photos/flowizm/3861998999/ • “Rosetta Stone” (CC BY-NC 2.0) http://www.flickr.com/photos/65713088@N00/6268592919/ • “Obsolete Packages” (CC BY-SA 2.0) http://www.flickr.com/photos/floydwilde/160475157/ • “Activity SpreadSheet. Aug. 1” (CC BY-NC 2.0) http://www.flickr.com/photos/bitchcakes/7993211140/ • “2006-06-14 012 - Cow” (CC BY-NC 2.0) http://www.flickr.com/photos/chrisq/167074953/ • “My favorite” (CC BY-SA 2.0) http://www.flickr.com/photos/erwss/3129884643/
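
Some of the techniques listed on slide 11 (remove obvious identifiers, top/bottom coding, limit geographic detail) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the field names (`name`, `address`, `age`, `postcode`), the top-code threshold of 85, and the 10-year band width are all hypothetical choices, not values taken from the slides or from the ICO code of practice. Applying these steps does not by itself guarantee that a dataset is anonymised; the remaining disclosure risk still has to be assessed.

```python
# Hypothetical field names; adapt to the dataset's own codebook.
DIRECT_IDENTIFIERS = {"name", "address"}
AGE_TOP_CODE = 85  # assumed threshold: ages at or above this are reported as "85+"


def band_age(age, width=10, top=AGE_TOP_CODE):
    """Group an exact age into a band such as '30-39', top-coding at `top`."""
    if age >= top:
        return f"{top}+"
    lower = (age // width) * width
    upper = min(lower + width - 1, top - 1)
    return f"{lower}-{upper}"


def coarsen_postcode(postcode):
    """Keep only the outward part of a UK postcode ('WC1E 7HT' -> 'WC1E').

    Assumes the two parts are separated by a space.
    """
    return postcode.strip().upper().split(" ")[0]


def reduce_disclosure_risk(record):
    """Return a new record with identifiers dropped and detail reduced."""
    # Remove obvious identifiers.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Group significant values / top coding.
    if "age" in out:
        out["age_band"] = band_age(out.pop("age"))
    # Limit geographic detail.
    if "postcode" in out:
        out["area"] = coarsen_postcode(out.pop("postcode"))
    return out
```

The function returns a new dictionary and leaves the original record untouched, so the identifiable master copy can stay in secure storage while only the reduced copy is prepared for sharing.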

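Slides 14 and 15 recommend delimited text for tabular data and a codebook or data dictionary to keep it understandable. A minimal Python sketch of that pairing follows. The variable names, the codebook layout, and the helper functions are assumptions made for illustration; the slides do not prescribe a particular codebook structure.

```python
import csv
import io

# Hypothetical codebook: one entry per variable, answering the questions a
# future user is likely to ask (meaning, units, permitted values).
CODEBOOK = [
    {"variable": "participant_id", "label": "Study-assigned pseudonymous ID",
     "units": "", "values": "integer"},
    {"variable": "age_band", "label": "Age at enrolment, banded",
     "units": "years", "values": "10-year bands, top-coded at 85+"},
]


def write_delimited(rows, fieldnames, handle):
    """Write rows as comma-delimited text with a header line.

    `handle` is any text file object; when opening a real file, pass
    newline="" to open() as the csv module documentation advises.
    """
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


def undocumented_variables(rows, codebook):
    """Return, sorted, the variables present in the data but not in the codebook."""
    documented = {entry["variable"] for entry in codebook}
    present = set()
    for row in rows:
        present.update(row)
    return sorted(present - documented)
```

Running `undocumented_variables` before deposit is one way to catch columns that a user would be unable to interpret; the codebook itself can be written out with the same `write_delimited` helper so that data and documentation travel together in an open format.
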
Editor's notes

  1. Protect living individual’s fundamental rights and freedoms in relation to storage, processing, and disclosure of information held about them. Aims to protect an individual’s fundamental rights and freedoms in respect of personal data processing. Gives individuals the right to access the personal data the School holds on them, to correct it, and to know the purpose for which it is held and to whom it can be disclosed. Only relates to living individuals. Public register of data controllers to which institutions have to add their notification.
  2. If the purpose of the research processing is not measures or decisions targeted at particular individuals and does not cause substantial distress or damage to a data