SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Manging and Sharing
Research Data
Sarah Jones
Digital Curation Centre, Glasgow
sarah.jones@glasgow.ac.uk
Twitter: @sjDCC
Webinar for Diponegoro University, Indonesia, 3rd September 2019
What is the Digital Curation Centre?
“a centre of expertise in digital information curation
with a focus on building capacity, capability and skills
for research data management across the UK's
higher education research community”
www.dcc.ac.uk
Training | Events | Tools | Advocacy | Consultancy | Guidance | Publications | Projects
What is RDM?
Image CC-BY-SA by Janneke Staaks www.flickr.com/photos/jannekestaaks/14411397343
What is Research Data Management?
Create
Document
Use
Store
Share
Preserve
“the active management and
appraisal of data over the
lifecycle of scholarly and
scientific interest”
Data management is part of
good research practice
What is involved in RDM?
• Data Management Planning
• Data creation
• Annotating / documenting data
• Analysis, use, versioning
• Storage and backup
• Publishing papers and data
• Preparing for deposit
• Archiving and sharing
• Licensing
• Citing…
Create
Document
Use
Store
Share
Preserve
Why manage and share data?
Direct benefits for you
• To make your research
easier!
• Stop yourself drowning
in irrelevant stuff
• Make sure you can
understand and reuse
your data again later
• Advance your career –
data is growing in
significance
Research integrity
• To avoid accusations of
fraud or bad science
• Evidence findings and
enable validation of
research methods
• Meet codes of practice
on research conduct
• Many research funders
worldwide now require
Data Management and
Sharing Plans
Potential to share data
• So others can reuse and
build on your data
• To gain credit – several
studies have shown
higher citation rates
when data are shared
• For greater visibility,
impact and new
research collaborations
• Promote innovation and
allow research in your
field to advance faster
Why make data available?
Sharing leads to breakthroughs
www.nytimes.com/2010/08/13/health/research
/13alzheimer.html?pagewanted=all&_r=0
“It was unbelievable. Its not science
the way most of us have practiced in
our careers. But we all realised that
we would never get biomarkers
unless all of us parked our egos and
intellectual property noses outside
the door and agreed that all of our
data would be public immediately.”
Dr John Trojanowski, University of Pennsylvania
...and increases the speed of discovery
Benefits for researchers:
sharing data increases citations!
Want evidence?
• Piwowar, Vision – 9% (microarray data)
• Drachen, Dorch, et al – 25-40%, astronomy
• Gleditch, et al – doubling to trebling (international relations)
Open Data Citation Advantage
http://sparceurope.org/open-data-citation-advantage
Benefits for institutions
“If an institution spent A$10 million on data, what
would be the return? The answer is: more publications;
an increased citation count; more grants; greater
profile; and more collaboration.”
Dr Ross Wilkinson, ANDS
www.ariadne.ac.uk/issue72/oar-2013-rpt
Creating data
Image CC-SA-ND by Bill Dickinson www.flickr.com/photos/skynoir/8270436894
Data creation tips
• Ensure consent forms, licences and agreements
don’t restrict opportunities to share data
• Choose appropriate formats
• Adopt a file naming convention
• Create metadata and documentation as you go
Ask for consent for data sharing
If not, data centres won’t be able to accept the data
– regardless of any conditions on the original grant.
www.data-archive.ac.uk/create-manage/consent-ethics/consent?index=3
Choose appropriate file formats
Different formats are good for different things
- open, lossless formats are more sustainable e.g. rtf, xml, tif, wav
- proprietary and/or compressed formats are less preservable but are often
in widespread use e.g. doc, jpg, mp3
One format for analysis then
convert to a standard format
Data centres may suggest preferred formats for deposit
www.data-archive.ac.uk/create-manage/format/formats-table
BioformatsConverter batch converts a
variety of proprietary microscopy image
formats to the Open Microscopy
Environment format - OME-TIFF
What is metadata?
Metadata
• Standardised
• Structured
• Machine and human
readable
Metadata helps to cite &
disambiguate data
Documentation aids reuse
Metadata
Documentation
Metadata standards
These can be general – such as Dublin Core
Or discipline specific
– Data Documentation Initiative (DDI) – social science
– Ecological Metadata Language (EML) - ecology
– Flexible Image Transport System (FITS) – astronomy
Search for standards in catalogues like:
http://rd-alliance.github.io/metadata-directory
“MTBLS1: A metabolomic study of urinary changes
in type 2 diabetes in……”
Why are ontologies important?
Example courtesy of Ken Haug, European Bioinformatics Institute (EMBL-EBI)
Documentation
Think about what is needed in order to evaluate,
understand, and reuse the data.
• Why was the data created?
• Have you documented what you did and how?
• Did you develop code to run analyses? If so, this
should be kept and shared too.
• Important to provide wider context for trust
ReadMe files
We recommend that a ReadMe be a plain text file containing the following:
• for each filename, a short description of what data it includes,
optionally describing the relationship to the tables, figures, or sections
within the accompanying publication
• for tabular data: definitions of column headings and row labels; data
codes (including missing data); and measurement units
• any data processing steps, especially if not described in the publication,
that may affect interpretation of results
• a description of what associated datasets are stored elsewhere, if
applicable
• whom to contact with questions
http://datadryad.org/pages/readme
Example template: https://www.lib.umn.edu/datamanagement/metadata
Managing data
Image ‘tools’ CC-BY by zzpza www.flickr.com/photos/zzpza/3269784239
Where will you store the data?
• Your own device (laptop, flash drive, server etc.)
– And if you lose it? Or it breaks?
• Departmental drives or university servers
• “Cloud” storage
– Do they care as much about your data as you do?
The decision will be based on how sensitive your data
are, how robust you need the storage to be, and
who needs access to the data and when
How to keep you data secure?
Develop a practical solution that fits your circumstances
– Store your data on managed servers
– Restrict access to collaborators or smaller subset
– Encrypt mobile devices carrying sensitive information
– Keep anti-virus software up-to-date
– Use secure data services for long-term sharing
www.wsj.com/articles/SB10001424052748703843804575534122591921594
Collaborative platforms e.g. OSF
https://osf.io
Data-specific platforms e.g. OMERO
Open Microscopy Environment
http://www.openmicroscopy.org/site/products/omero
Third-party tools for collaboration
ownCloud
• Open source product
with Dropbox-like
functionality
• Used by many unis
and service providers
to offer ‘approved’
solution
https://owncloud.org
Using Dropbox and other cloud
services – LSE guidelines
http://www.lse.ac.uk/intranet/LSEServices/
IMT/guides/softwareGuides/other/usingDr
opboxCloudStorageServices.aspx
University RDM services e.g. Edinburgh
• DataStore
• Compute & Data Facility (HPC)
• DataSync
• Wiki service
• Subversion
• Electronic Lab Notebook
• DataShare repository
• DataVault
• Pure (research info)
• Secure data service
www.ed.ac.uk/information-services/research-support/research-data-service
CCimagebymomboleumonFlickr
One copy = risk of data loss
Who will do the backup?
Use managed services where possible (e.g. University
filestores rather than local or external hard drives), so
backup is done automatically
3… 2… 1… backup!
at least 3 copies of a file
on at least 2 different media
with at least 1 offsite
Ask central IT team for advice
Data sharing
Image CC-BY-NC-ND by talkingplant www.flickr.com/photos/talkingplant/2256485110
How to make data open?
1. Choose your dataset(s)
– What can you may open? You may need to revisit this step if you encounter
problems later.
2. Apply an open license
– Determine what IP exists. Apply a suitable licence e.g. CC-BY
3. Make the data available
– Provide the data in a suitable format. Use repositories.
4. Make it discoverable
– Post on the web, register in catalogues…
https://okfn.org
DCC how-to guide: www.dcc.ac.uk/resources/how-guides/license-research-data
License research data openly
EUDAT licensing tool
Answer questions to determine which licence(s) are
appropriate to use
http://ufal.github.io/lindat-license-selector
Deposit in a data repository
http://databib.org
www.re3data.org
The Re3data catalogue can be searched to find a home for data
www.fosteropenscience.eu
/content/re3data-demo
National / domain repositories
BioSharing portal of
databases in life sciences
www.re3data.org
https://biosharing.org
Zenodo is a multi-disciplinary repository that can be
used for the long-tail of research data
• An OpenAIRE-CERN joint effort
• Multidisciplinary repository accepting
– Multiple data types
– Publications
– Software
• Assigns a Digital Object Identifier (DOI)
• Links funding, publications, data & software
www.zenodo.org
Zenodo
Archiving code in Zenodo
Get a DOI for each release
https://guides.github.com/
activities/citable-code
Citing research data: why?
http://ands.org.au/cite-data
How to cite data
www.dcc.ac.uk/resources/briefing-papers/introduction-
curation/data-citation-and-linking
Key citation elements
• Author
• Publication date
• Title
• Location (= identifier)
• Funder (if applicable)
How do you share data effectively?
• Use appropriate repositories, this
catalogue is a good place to start
http://www.re3data.org
• Document and describe it enough for
others to understand, use and cite
http://www.dcc.ac.uk/resources/how-
guides/cite-datasets
• Licence it so others can reuse
www.dcc.ac.uk/resources/how-guides/license-
research-data
Thanks for listening
For DCC resources see:
www.dcc.ac.uk/resources
Follow DCC us on twitter:
@digitalcuration and #ukdcc

Weitere ähnliche Inhalte

Was ist angesagt?

Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?DATAVERSITY
 
How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?confluent
 
AstraZeneca - The promise of graphs & graph-based learning in drug discovery
AstraZeneca - The promise of graphs & graph-based learning in drug discoveryAstraZeneca - The promise of graphs & graph-based learning in drug discovery
AstraZeneca - The promise of graphs & graph-based learning in drug discoveryNeo4j
 
Metadata based statistics for DSpace
Metadata based statistics for DSpaceMetadata based statistics for DSpace
Metadata based statistics for DSpaceBram Luyten
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake ArchitectureDATAVERSITY
 
Introdução à Neo4j
Introdução à Neo4j Introdução à Neo4j
Introdução à Neo4j Neo4j
 
Applying Network Analytics in KYC
Applying Network Analytics in KYCApplying Network Analytics in KYC
Applying Network Analytics in KYCNeo4j
 
Introduction to Open Science and EOSC
Introduction to Open Science and EOSCIntroduction to Open Science and EOSC
Introduction to Open Science and EOSCSarah Jones
 
Great Expectations Presentation
Great Expectations PresentationGreat Expectations Presentation
Great Expectations PresentationAdam Doyle
 
Data Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaData Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaScyllaDB
 
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...Databricks
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineeringThang Bui (Bob)
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshSion Smith
 
Part 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapsePart 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapseNilesh Gule
 
What is Power BI
What is Power BIWhat is Power BI
What is Power BIDries Vyvey
 
Intro to Data Management Plans
Intro to Data Management PlansIntro to Data Management Plans
Intro to Data Management PlansSarah Jones
 

Was ist angesagt? (20)

Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked data
 
FAIR data
FAIR dataFAIR data
FAIR data
 
How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?
 
AstraZeneca - The promise of graphs & graph-based learning in drug discovery
AstraZeneca - The promise of graphs & graph-based learning in drug discoveryAstraZeneca - The promise of graphs & graph-based learning in drug discovery
AstraZeneca - The promise of graphs & graph-based learning in drug discovery
 
Metadata based statistics for DSpace
Metadata based statistics for DSpaceMetadata based statistics for DSpace
Metadata based statistics for DSpace
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake Architecture
 
Introdução à Neo4j
Introdução à Neo4j Introdução à Neo4j
Introdução à Neo4j
 
Applying Network Analytics in KYC
Applying Network Analytics in KYCApplying Network Analytics in KYC
Applying Network Analytics in KYC
 
Introduction to Open Science and EOSC
Introduction to Open Science and EOSCIntroduction to Open Science and EOSC
Introduction to Open Science and EOSC
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Data engineering design patterns
Data engineering design patternsData engineering design patterns
Data engineering design patterns
 
Great Expectations Presentation
Great Expectations PresentationGreat Expectations Presentation
Great Expectations Presentation
 
Data Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation CriteriaData Platform Architecture Principles and Evaluation Criteria
Data Platform Architecture Principles and Evaluation Criteria
 
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineering
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
 
Part 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapsePart 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure Synapse
 
What is Power BI
What is Power BIWhat is Power BI
What is Power BI
 
Intro to Data Management Plans
Intro to Data Management PlansIntro to Data Management Plans
Intro to Data Management Plans
 

Ähnlich wie Intro to RDM

The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...Projeto RCAAP
 
Data Management and Horizon 2020
Data Management and Horizon 2020Data Management and Horizon 2020
Data Management and Horizon 2020Sarah Jones
 
Managing and sharing data
Managing and sharing dataManaging and sharing data
Managing and sharing dataSarah Jones
 
Managing and sharing data
Managing and sharing dataManaging and sharing data
Managing and sharing dataSarah Jones
 
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATResearch Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATTony Ross-Hellauer
 
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATResearch Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATOpenAIRE
 
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu |
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu | Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu |
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu | EUDAT
 
RDM for Librarians
RDM for LibrariansRDM for Librarians
RDM for LibrariansMarieke Guy
 
Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6ARDC
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Sarah Anna Stewart
 
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...ICPSR
 
DMP health sciences
DMP health sciencesDMP health sciences
DMP health sciencesSarah Jones
 
20160523 23 Research Data Things
20160523 23 Research Data Things20160523 23 Research Data Things
20160523 23 Research Data ThingsKatina Toufexis
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data ManagementJamie Bisset
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environmentphilipdurbin
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycleMarieke Guy
 

Ähnlich wie Intro to RDM (20)

The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...
 
Data Management and Horizon 2020
Data Management and Horizon 2020Data Management and Horizon 2020
Data Management and Horizon 2020
 
Managing and sharing data
Managing and sharing dataManaging and sharing data
Managing and sharing data
 
Managing and sharing data
Managing and sharing dataManaging and sharing data
Managing and sharing data
 
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATResearch Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
 
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATResearch Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
 
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu |
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu | Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu |
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu |
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
 
RDM for Librarians
RDM for LibrariansRDM for Librarians
RDM for Librarians
 
Introduction to RDM for trainee physicians
Introduction to RDM for trainee physiciansIntroduction to RDM for trainee physicians
Introduction to RDM for trainee physicians
 
What is-rdm
What is-rdmWhat is-rdm
What is-rdm
 
Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6Fsci 2018 thursday2_august_am6
Fsci 2018 thursday2_august_am6
 
Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...Research Data (and Software) Management at Imperial: (Everything you need to ...
Research Data (and Software) Management at Imperial: (Everything you need to ...
 
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
 
DMP health sciences
DMP health sciencesDMP health sciences
DMP health sciences
 
20160523 23 Research Data Things
20160523 23 Research Data Things20160523 23 Research Data Things
20160523 23 Research Data Things
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data Management
 
Introduction to Data Management and Sharing
Introduction to Data Management and SharingIntroduction to Data Management and Sharing
Introduction to Data Management and Sharing
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycle
 

Mehr von Sarah Jones

Data training tips and tricks
Data training tips and tricksData training tips and tricks
Data training tips and tricksSarah Jones
 
EOSC and libraries
EOSC and librariesEOSC and libraries
EOSC and librariesSarah Jones
 
EOSC Association priorities and activities
EOSC Association priorities and activitiesEOSC Association priorities and activities
EOSC Association priorities and activitiesSarah Jones
 
Managing and sharing data: lessons from the European context
Managing and sharing data: lessons from the European contextManaging and sharing data: lessons from the European context
Managing and sharing data: lessons from the European contextSarah Jones
 
Reflections on Open Science
Reflections on Open ScienceReflections on Open Science
Reflections on Open ScienceSarah Jones
 
MAR comments analysis
MAR comments analysisMAR comments analysis
MAR comments analysisSarah Jones
 
EOSC-MAR-update.pptx
EOSC-MAR-update.pptxEOSC-MAR-update.pptx
EOSC-MAR-update.pptxSarah Jones
 
Why is EOSC so hard?
Why is EOSC so hard?Why is EOSC so hard?
Why is EOSC so hard?Sarah Jones
 
Data Management Planning for researchers
Data Management Planning for researchersData Management Planning for researchers
Data Management Planning for researchersSarah Jones
 
Is Europe ready for Open Science
Is Europe ready for Open ScienceIs Europe ready for Open Science
Is Europe ready for Open ScienceSarah Jones
 
DMPonline: 10 years, 10 lessons
DMPonline: 10 years, 10 lessonsDMPonline: 10 years, 10 lessons
DMPonline: 10 years, 10 lessonsSarah Jones
 
Do & don't of supporting Open Science
Do & don't of supporting Open ScienceDo & don't of supporting Open Science
Do & don't of supporting Open ScienceSarah Jones
 
Why institutions need to raise their capabilities to support FAIR
Why institutions need to raise their capabilities to support FAIRWhy institutions need to raise their capabilities to support FAIR
Why institutions need to raise their capabilities to support FAIRSarah Jones
 
It takes more than a village: lessons on building global research commons
It takes more than a village: lessons on building global research commonsIt takes more than a village: lessons on building global research commons
It takes more than a village: lessons on building global research commonsSarah Jones
 
DMPTuuli - what's new?
DMPTuuli - what's new?DMPTuuli - what's new?
DMPTuuli - what's new?Sarah Jones
 
DCC and FAIR initiatives
DCC and FAIR initiativesDCC and FAIR initiatives
DCC and FAIR initiativesSarah Jones
 
Reflections on EOSC through the mirror of ARDC
Reflections on EOSC through the mirror of ARDCReflections on EOSC through the mirror of ARDC
Reflections on EOSC through the mirror of ARDCSarah Jones
 
Future EOSC roadmap
Future EOSC roadmapFuture EOSC roadmap
Future EOSC roadmapSarah Jones
 
Global Open Research Commons IG
Global Open Research Commons IGGlobal Open Research Commons IG
Global Open Research Commons IGSarah Jones
 

Mehr von Sarah Jones (20)

Data training tips and tricks
Data training tips and tricksData training tips and tricks
Data training tips and tricks
 
EOSC and libraries
EOSC and librariesEOSC and libraries
EOSC and libraries
 
EOSC Association priorities and activities
EOSC Association priorities and activitiesEOSC Association priorities and activities
EOSC Association priorities and activities
 
Managing and sharing data: lessons from the European context
Managing and sharing data: lessons from the European contextManaging and sharing data: lessons from the European context
Managing and sharing data: lessons from the European context
 
Reflections on Open Science
Reflections on Open ScienceReflections on Open Science
Reflections on Open Science
 
MAR comments analysis
MAR comments analysisMAR comments analysis
MAR comments analysis
 
EOSC-MAR-update.pptx
EOSC-MAR-update.pptxEOSC-MAR-update.pptx
EOSC-MAR-update.pptx
 
Intro-EOSC.pptx
Intro-EOSC.pptxIntro-EOSC.pptx
Intro-EOSC.pptx
 
Why is EOSC so hard?
Why is EOSC so hard?Why is EOSC so hard?
Why is EOSC so hard?
 
Data Management Planning for researchers
Data Management Planning for researchersData Management Planning for researchers
Data Management Planning for researchers
 
Is Europe ready for Open Science
Is Europe ready for Open ScienceIs Europe ready for Open Science
Is Europe ready for Open Science
 
DMPonline: 10 years, 10 lessons
DMPonline: 10 years, 10 lessonsDMPonline: 10 years, 10 lessons
DMPonline: 10 years, 10 lessons
 
Do & don't of supporting Open Science
Do & don't of supporting Open ScienceDo & don't of supporting Open Science
Do & don't of supporting Open Science
 
Why institutions need to raise their capabilities to support FAIR
Why institutions need to raise their capabilities to support FAIRWhy institutions need to raise their capabilities to support FAIR
Why institutions need to raise their capabilities to support FAIR
 
It takes more than a village: lessons on building global research commons
It takes more than a village: lessons on building global research commonsIt takes more than a village: lessons on building global research commons
It takes more than a village: lessons on building global research commons
 
DMPTuuli - what's new?
DMPTuuli - what's new?DMPTuuli - what's new?
DMPTuuli - what's new?
 
DCC and FAIR initiatives
DCC and FAIR initiativesDCC and FAIR initiatives
DCC and FAIR initiatives
 
Reflections on EOSC through the mirror of ARDC
Reflections on EOSC through the mirror of ARDCReflections on EOSC through the mirror of ARDC
Reflections on EOSC through the mirror of ARDC
 
Future EOSC roadmap
Future EOSC roadmapFuture EOSC roadmap
Future EOSC roadmap
 
Global Open Research Commons IG
Global Open Research Commons IGGlobal Open Research Commons IG
Global Open Research Commons IG
 

Kürzlich hochgeladen

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 

Kürzlich hochgeladen (20)

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 

Intro to RDM

  • 1. Manging and Sharing Research Data Sarah Jones Digital Curation Centre, Glasgow sarah.jones@glasgow.ac.uk Twitter: @sjDCC Webinar for Diponegoro University, Indonesia, 3rd September 2019
  • 2. What is the Digital Curation Centre? “a centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community” www.dcc.ac.uk Training | Events | Tools | Advocacy | Consultancy | Guidance | Publications | Projects
  • 3. What is RDM? Image CC-BY-SA by Janneke Staaks www.flickr.com/photos/jannekestaaks/14411397343
  • 4. What is Research Data Management? Create Document Use Store Share Preserve “the active management and appraisal of data over the lifecycle of scholarly and scientific interest” Data management is part of good research practice
  • 5. What is involved in RDM? • Data Management Planning • Data creation • Annotating / documenting data • Analysis, use, versioning • Storage and backup • Publishing papers and data • Preparing for deposit • Archiving and sharing • Licensing • Citing… Create Document Use Store Share Preserve
  • 6. Why manage and share data? Direct benefits for you • To make your research easier! • Stop yourself drowning in irrelevant stuff • Make sure you can understand and reuse your data again later • Advance your career – data is growing in significance Research integrity • To avoid accusations of fraud or bad science • Evidence findings and enable validation of research methods • Meet codes of practice on research conduct • Many research funders worldwide now require Data Management and Sharing Plans Potential to share data • So others can reuse and build on your data • To gain credit – several studies have shown higher citation rates when data are shared • For greater visibility, impact and new research collaborations • Promote innovation and allow research in your field to advance faster
  • 7. Why make data available?
  • 8. Sharing leads to breakthroughs www.nytimes.com/2010/08/13/health/research /13alzheimer.html?pagewanted=all&_r=0 “It was unbelievable. Its not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.” Dr John Trojanowski, University of Pennsylvania ...and increases the speed of discovery
  • 9. Benefits for researchers: sharing data increases citations! Want evidence? • Piwowar, Vision – 9% (microarray data) • Drachen, Dorch, et al – 25-40%, astronomy • Gleditch, et al – doubling to trebling (international relations) Open Data Citation Advantage http://sparceurope.org/open-data-citation-advantage
  • 10. Benefits for institutions “If an institution spent A$10 million on data, what would be the return? The answer is: more publications; an increased citation count; more grants; greater profile; and more collaboration.” Dr Ross Wilkinson, ANDS www.ariadne.ac.uk/issue72/oar-2013-rpt
  • 11. Creating data Image CC-SA-ND by Bill Dickinson www.flickr.com/photos/skynoir/8270436894
  • 12. Data creation tips • Ensure consent forms, licences and agreements don’t restrict opportunities to share data • Choose appropriate formats • Adopt a file naming convention • Create metadata and documentation as you go
  • 13. Ask for consent for data sharing If not, data centres won’t be able to accept the data – regardless of any conditions on the original grant. www.data-archive.ac.uk/create-manage/consent-ethics/consent?index=3
  • 14. Choose appropriate file formats Different formats are good for different things - open, lossless formats are more sustainable e.g. rtf, xml, tif, wav - proprietary and/or compressed formats are less preservable but are often in widespread use e.g. doc, jpg, mp3 One format for analysis then convert to a standard format Data centres may suggest preferred formats for deposit www.data-archive.ac.uk/create-manage/format/formats-table BioformatsConverter batch converts a variety of proprietary microscopy image formats to the Open Microscopy Environment format - OME-TIFF
  • 15. What is metadata? Metadata • Standardised • Structured • Machine and human readable Metadata helps to cite & disambiguate data Documentation aids reuse Metadata Documentation
  • 16. Metadata standards These can be general – such as Dublin Core Or discipline specific – Data Documentation Initiative (DDI) – social science – Ecological Metadata Language (EML) - ecology – Flexible Image Transport System (FITS) – astronomy Search for standards in catalogues like: http://rd-alliance.github.io/metadata-directory
  • 17. “MTBLS1: A metabolomic study of urinary changes in type 2 diabetes in……” Why are ontologies important? Example courtesy of Ken Haug, European Bioinformatics Institute (EMBL-EBI)
  • 18. Documentation Think about what is needed in order to evaluate, understand, and reuse the data. • Why was the data created? • Have you documented what you did and how? • Did you develop code to run analyses? If so, this should be kept and shared too. • Important to provide wider context for trust
  • 19. ReadMe files We recommend that a ReadMe be a plain text file containing the following: • for each filename, a short description of what data it includes, optionally describing the relationship to the tables, figures, or sections within the accompanying publication • for tabular data: definitions of column headings and row labels; data codes (including missing data); and measurement units • any data processing steps, especially if not described in the publication, that may affect interpretation of results • a description of what associated datasets are stored elsewhere, if applicable • whom to contact with questions http://datadryad.org/pages/readme Example template: https://www.lib.umn.edu/datamanagement/metadata
  • 20. Managing data Image ‘tools’ CC-BY by zzpza www.flickr.com/photos/zzpza/3269784239
  • 21. Where will you store the data? • Your own device (laptop, flash drive, server etc.) – And if you lose it? Or it breaks? • Departmental drives or university servers • “Cloud” storage – Do they care as much about your data as you do? The decision will be based on how sensitive your data are, how robust you need the storage to be, and who needs access to the data and when
  • 22. How to keep you data secure? Develop a practical solution that fits your circumstances – Store your data on managed servers – Restrict access to collaborators or smaller subset – Encrypt mobile devices carrying sensitive information – Keep anti-virus software up-to-date – Use secure data services for long-term sharing www.wsj.com/articles/SB10001424052748703843804575534122591921594
  • 23. Collaborative platforms e.g. OSF https://osf.io
  • 24. Data-specific platforms e.g. OMERO Open Microscopy Environment http://www.openmicroscopy.org/site/products/omero
  • 25. Third-party tools for collaboration ownCloud • Open source product with Dropbox-like functionality • Used by many unis and service providers to offer ‘approved’ solution https://owncloud.org Using Dropbox and other cloud services – LSE guidelines http://www.lse.ac.uk/intranet/LSEServices/ IMT/guides/softwareGuides/other/usingDr opboxCloudStorageServices.aspx
  • 26. University RDM services e.g. Edinburgh • DataStore • Compute & Data Facility (HPC) • DataSync • Wiki service • Subversion • Electronic Lab Notebook • DataShare repository • DataVault • Pure (research info) • Secure data service www.ed.ac.uk/information-services/research-support/research-data-service
  • 28. Who will do the backup? Use managed services where possible (e.g. University filestores rather than local or external hard drives), so backup is done automatically 3… 2… 1… backup! at least 3 copies of a file on at least 2 different media with at least 1 offsite Ask central IT team for advice
  • 29. Data sharing Image CC-BY-NC-ND by talkingplant www.flickr.com/photos/talkingplant/2256485110
  • 30. How to make data open? 1. Choose your dataset(s) – What can you may open? You may need to revisit this step if you encounter problems later. 2. Apply an open license – Determine what IP exists. Apply a suitable licence e.g. CC-BY 3. Make the data available – Provide the data in a suitable format. Use repositories. 4. Make it discoverable – Post on the web, register in catalogues… https://okfn.org
  • 31. DCC how-to guide: www.dcc.ac.uk/resources/how-guides/license-research-data License research data openly
  • 32. EUDAT licensing tool Answer questions to determine which licence(s) are appropriate to use http://ufal.github.io/lindat-license-selector
  • 33. Deposit in a data repository http://databib.org www.re3data.org The Re3data catalogue can be searched to find a home for data www.fosteropenscience.eu /content/re3data-demo
  • 34. National / domain repositories BioSharing portal of databases in life sciences www.re3data.org https://biosharing.org
  • 35. Zenodo is a multi-disciplinary repository that can be used for the long-tail of research data • An OpenAIRE-CERN joint effort • Multidisciplinary repository accepting – Multiple data types – Publications – Software • Assigns a Digital Object Identifier (DOI) • Links funding, publications, data & software www.zenodo.org Zenodo
  • 36. Archiving code in Zenodo Get a DOI for each release https://guides.github.com/ activities/citable-code
  • 37. Citing research data: why? http://ands.org.au/cite-data
  • 38. How to cite data www.dcc.ac.uk/resources/briefing-papers/introduction- curation/data-citation-and-linking Key citation elements • Author • Publication date • Title • Location (= identifier) • Funder (if applicable)
  • 39. How do you share data effectively? • Use appropriate repositories, this catalogue is a good place to start http://www.re3data.org • Document and describe it enough for others to understand, use and cite http://www.dcc.ac.uk/resources/how- guides/cite-datasets • Licence it so others can reuse www.dcc.ac.uk/resources/how-guides/license- research-data
  • 40. Thanks for listening For DCC resources see: www.dcc.ac.uk/resources Follow DCC us on twitter: @digitalcuration and #ukdcc

Hinweis der Redaktion

  1. Data as institutional crown jewels, special collections, the new oil...
  2. From Cambridge University workshop, 33 participants Asked what species the study was on and a range of answers denoting humans were provided Humans can see these all mean the same things but computers can’t