SlideShare ist ein Scribd-Unternehmen logo
1 von 38
Downloaden Sie, um offline zu lesen
This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License
S. Venkataraman, PhD
Research Data Specialist
Digital Curation Centre
s.venkataraman@ed.ac.uk
21st October 2019, Open Access Week Webinar
Principles of Research Data
Management
About the DCC
• Established in 2004
• Based in Edinburgh and Glasgow
• Works at national and international levels
• One of leading organisations in the world specialising in
training, consultancy, policy making and advocacy in digital
data management best practice and services provision
• Involved in many international consortia, schools and projects
• (Not involved in actual curation of any data!)
Is there a reproducibility crisis?
Baker, M. (2016)
“1,500 scientists lift
the lid on
reproducibility”,
Nature, 533:7604,
http://www.nature.com/n
ews/1-500-scientists-lift-
the-lid-on-
reproducibility-1.19970
Why make data available?
Who has heard of this before…?
Image CC-BY-SA by SangyaPundir
European perspective…
https://publications.europa.eu/en/publication-detail/-
/publication/7769a148-f1f6-11e8-9982-
01aa75ed71a1/language-en/format-PDF/source-
80611283
Slide CC-BY by Erik Schultes, Leiden UMC
What FAIR means: 15 principles
Comprehensive descriptions can be found at https://www.go-fair.org/fair-
principles/
Common misconceptions
• FAIR data does not have to be open
• The principles do not specify particular technologies or implementations
e.g. semantic web
• FAIR is not a standard to be followed or strict criteria – it’s a spectrum /
continuum
• It doesn’t only apply to the life sciences
All research data
Managed data
FAIR
data
Open
data
the
wild
Increasing that which is FAIR & open
Managed data
FAIR
data
Open
data
the
wild
as open as
possible, as
closed as
necessary
Image: ‘Balancing rocks’ by Viewminder CC-BY-SA-ND
www.flickr.com/photos/light_seeker/7780857224
RDM & the Data LifecycleImage CC-BY-SA by Janneke Staaks www.flickr.com/photos/jannekestaaks/14411397343
What is Research Data Management?
“the active management and
appraisal of data over the lifecycle of
scholarly and scientific interest”
Data management is part of good
research practice
Create
Document
Use
Store
Share
Preserve
Create
Document
Use
Store
Share
Preserve
Data creation tips
• Ensure consent forms, licences and agreements don’t restrict
opportunities to share data
• Choose appropriate formats
• Adopt a file naming convention
• Create metadata and documentation as you go
Ask for consent for data sharing
If not, data centres won’t be able to accept the data – regardless of any
conditions on the original grant.
www.data-archive.ac.uk/create-manage/consent-ethics/consent?index=3
Choose appropriate file formats
Different formats are good for different things
• open, lossless formats are more sustainable e.g. rtf, xml, tif, wav
• proprietary and/or compressed formats are less preservable but are often
in widespread use e.g. doc, jpg, mp3
One format for analysis then
convert to a standard format
Data centres may suggest preferred formats for deposit
https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
Type of data Recommended formats Acceptable formats
Tabular data with extensive metadata
variable labels, code labels, and defined missing values
SPSS portable format (.por)
delimited text and command ('setup') file (SPSS, Stata, SAS, etc.)
structured text or mark-up file of metadata information, e.g.
DDI XML file
proprietary formats of statistical packages: SPSS (.sav), Stata
(.dta), MS Access (.mdb/.accdb)
Tabular data with minimal metadata
column headings, variable names
comma-separated values (.csv)
tab-delimited file (.tab)
delimited text with SQL data definition statements
delimited text (.txt) with characters not present in data used as
delimiters
widely-used formats: MS Excel (.xls/.xlsx), MS Access
(.mdb/.accdb), dBase (.dbf), OpenDocument Spreadsheet (.ods)
Geospatial data
vector and raster data
ESRI Shapefile (.shp, .shx, .dbf, .prj, .sbx, .sbn optional)
geo-referenced TIFF (.tif, .tfw)
CAD data (.dwg)
tabular GIS attribute data
Geography Markup Language (.gml)
ESRI Geodatabase format (.mdb)
MapInfo Interchange Format (.mif) for vector data
Keyhole Mark-up Language (.kml)
Adobe Illustrator (.ai), CAD data (.dxf or .svg)
binary formats of GIS and CAD packages
Textual data Rich Text Format (.rtf)
plain text, ASCII (.txt)
eXtensible Mark-up Language (.xml) text according to an
appropriate Document Type Definition (DTD) or schema
Hypertext Mark-up Language (.html)
widely-used formats: MS Word (.doc/.docx)
some software-specific formats: NUD*IST, NVivo and ATLAS.ti
Image data TIFF 6.0 uncompressed (.tif) JPEG (.jpeg, .jpg, .jp2) if original created in this format
GIF (.gif)
TIFF other versions (.tif, .tiff)
RAW image format (.raw)
Photoshop files (.psd)
BMP (.bmp)
PNG (.png)
Adobe Portable Document Format (PDF/A, PDF) (.pdf)
Audio data Free Lossless Audio Codec (FLAC) (.flac) MPEG-1 Audio Layer 3 (.mp3) if original created in this format
Audio Interchange File Format (.aif)
Waveform Audio Format (.wav)
Video data MPEG-4 (.mp4)
OGG video (.ogv, .ogg)
motion JPEG 2000 (.mj2)
AVCHD video (.avchd)
Documentation and scripts Rich Text Format (.rtf)
PDF/UA, PDF/A or PDF (.pdf)
XHTML or HTML (.xhtml, .htm)
OpenDocument Text (.odt)
plain text (.txt)
widely-used formats: MS Word (.doc/.docx), MS Excel
(.xls/.xlsx)
XML marked-up text (.xml) according to an appropriate DTD or
schema, e.g. XHMTL 1.0
https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
How will you organise your data?
• Keep file and folder names short, but meaningful
• Agree a method for versioning
• Include dates in a set format e.g. YYYYMMDD
• Avoid using non-alphanumeric characters in file names
• Use hyphens or underscores not spaces e.g. day-sheet, day sheet
• Order the elements in the most appropriate way to retrieve the record
Example from ARM Climate Research Facility www.arm.gov/data/docs/plan
Create
Document
Use
Store
Share
Preserve
Documentation
Think about what is needed in order to evaluate, understand, and reuse the
data.
• Why was the data created?
• Have you documented what you did and how?
• Did you develop code to run analyses? If so, this should be kept and
shared too.
• Important to provide wider context for trust
What are metadata?
Metadata
• Standardised
• Structured
• Machine and human readable
Metadata helps to cite &
disambiguate data
Documentation aids reuse
Metadata
Documentation
Metadata standards
These can be general – such as Dublin Core
Or discipline specific
• Data Documentation Initiative (DDI) – social science
• Ecological Metadata Language (EML) - ecology
• Flexible Image Transport System (FITS) – astronomy
Search for standards in catalogues like:
http://rd-alliance.github.io/metadata-directory/
https://rdamsc.dcc.ac.uk/
“MTBLS1: A metabolomic study of urinary changes in type 2 diabetes in……”
Example courtesy of Ken Haug, European Bioinformatics Institute (EMBL-EBI)
Controlled vocabularies
e.g. SNOMED CT (clinical terms) or MeSH
• Defined terms + taxonomy
• Useful for selecting keywords to tag datasets
• You can find many ontologies in the BARTOC catalogue and elsewhere
➢ Organism A
➢ Term A1
➢ Term A2
➢ Term A3
➢ Term B1
➢ Term B2
➢ Term C4
➢ .
➢ .
➢ .
➢ Term n
► Organism B
► Term A1
► Term A2
► Term A3
► Term B1
► Term B2
► Term C4
► .
► .
► .
► Term n
…and ontologies?
Create
Document
Use
Store
Share
Preserve
Where will you store the data?
• Your own device (laptop, flash drive, server etc.)
– And if you lose it? Or it breaks?
• Departmental drives or university servers
• “Cloud” storage
– Do they care as much about your data as you do?
The decision will be based on how sensitive your data
are, how robust you need the storage to be, and
who needs access to the data and when
Third-party tools for collaboration
ownCloud
• Open source product with
Dropbox-like functionality
• Used by many universities and
service providers to offer
‘approved’ solution
https://owncloud.org
Using Dropbox and other cloud
services
Backup and preservation – not the
same thing!
Backups
• Used to take periodic snapshots of data in case the current version is
destroyed or lost
• Backups are copies of files stored for short or near-long-term
• Often performed on a somewhat frequent schedule
Archiving
• Used to preserve data for historical reference or potentially during
disasters
• Archives are usually the final version, stored for long-term, and generally
not copied over
• Often performed at the end of a project or during major milestones
Create
Document
Use
Store
Share
Preserve
Part of How To Attribute Creative Commons Photos by Foter, licensed CC BY SA 3.0
License research data openly
EUDAT licensing tool
Answer questions to determine which licence(s) are appropriate to use
https://ufal.github.io/public-license-selector/
Create
Document
Use
Store
Share
Preserve
Deposit in a data repository
http://databib.org
www.re3data.org
The Re3data catalogue can be searched to find a home for data
www.fosteropenscience.eu/
content/re3data-demo
Criteria for selecting a repository
• Better to use a domain specific repository if available
• Check they match particular data needs e.g. formats accepted, mixture of
Open and Restricted Access.
• Do they assign a persistent and globally unique identifier for sustainable
citations and to links back to particular researchers and grants?
• Look for certification as a ‘Trustworthy Digital Repository’ with an explicit
ambition to keep the data available in long term.
Icons to note open
access, licenses,
PIDs,
certificates…
What is a Persistent Identifier (PID)?
a long-lasting reference to a document, file or other object
• PIDs come in various forms e.g. ORCID, DOI, ISBN...
• Typically they’re actionable i.e. type it into web browser to access
• Many repositories will assign them on deposit
Questions?
Thank you!
For DCC resources see:
www.dcc.ac.uk/resources
Follow us on twitter:
@digitalcuration and #ukdcc

Weitere ähnliche Inhalte

Was ist angesagt?

Glossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data GovernanceGlossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data GovernanceDATAVERSITY
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best PracticesDATAVERSITY
 
Inside open metadata—the deep dive
Inside open metadata—the deep diveInside open metadata—the deep dive
Inside open metadata—the deep diveDataWorks Summit
 
Reference master data management
Reference master data managementReference master data management
Reference master data managementDr. Hamdan Al-Sabri
 
White Paper - Data Warehouse Documentation Roadmap
White Paper -  Data Warehouse Documentation RoadmapWhite Paper -  Data Warehouse Documentation Roadmap
White Paper - Data Warehouse Documentation RoadmapDavid Walker
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata ManagementDATAVERSITY
 
Intro to Data Management Plans
Intro to Data Management PlansIntro to Data Management Plans
Intro to Data Management PlansSarah Jones
 
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...EUDAT
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Building Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks DeltaBuilding Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks DeltaDatabricks
 
White Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project ManagementWhite Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project ManagementDavid Walker
 
Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeDatabricks
 
Data Governance Initiative
Data Governance InitiativeData Governance Initiative
Data Governance InitiativeDataWorks Summit
 
How to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that LastsHow to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that LastsDATAVERSITY
 
Physical architecture of sql server
Physical architecture of sql serverPhysical architecture of sql server
Physical architecture of sql serverDivya Sharma
 
Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsDataWorks Summit/Hadoop Summit
 
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...DATAVERSITY
 

Was ist angesagt? (20)

Linked Data Tutorial
Linked Data TutorialLinked Data Tutorial
Linked Data Tutorial
 
Glossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data GovernanceGlossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data Governance
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best Practices
 
Inside open metadata—the deep dive
Inside open metadata—the deep diveInside open metadata—the deep dive
Inside open metadata—the deep dive
 
Reference master data management
Reference master data managementReference master data management
Reference master data management
 
White Paper - Data Warehouse Documentation Roadmap
White Paper -  Data Warehouse Documentation RoadmapWhite Paper -  Data Warehouse Documentation Roadmap
White Paper - Data Warehouse Documentation Roadmap
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata Management
 
Intro to Data Management Plans
Intro to Data Management PlansIntro to Data Management Plans
Intro to Data Management Plans
 
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...
EUDAT & OpenAIRE Webinar: How to write a Data Management Plan - July 14, 2016...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Building Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks DeltaBuilding Robust Production Data Pipelines with Databricks Delta
Building Robust Production Data Pipelines with Databricks Delta
 
White Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project ManagementWhite Paper - Data Warehouse Project Management
White Paper - Data Warehouse Project Management
 
Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data Lake
 
Data Governance Initiative
Data Governance InitiativeData Governance Initiative
Data Governance Initiative
 
How to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that LastsHow to Make a Data Governance Program that Lasts
How to Make a Data Governance Program that Lasts
 
Physical architecture of sql server
Physical architecture of sql serverPhysical architecture of sql server
Physical architecture of sql server
 
Apache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop componentsApache Atlas: Tracking dataset lineage across Hadoop components
Apache Atlas: Tracking dataset lineage across Hadoop components
 
From Data Warehouse to Lakehouse
From Data Warehouse to LakehouseFrom Data Warehouse to Lakehouse
From Data Warehouse to Lakehouse
 
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
Keeping the Pulse of Your Data – Why You Need Data Observability to Improve D...
 

Ähnlich wie OpenAIRE webinar: Principles of Research Data Management, with S. Venkataraman (DCC)

Basics of Research Data Management
Basics of Research Data ManagementBasics of Research Data Management
Basics of Research Data ManagementOpenAIRE
 
Data Archiving and Sharing
Data Archiving and SharingData Archiving and Sharing
Data Archiving and SharingC. Tobin Magle
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...Projeto RCAAP
 
Data Management Planning for researchers
Data Management Planning for researchersData Management Planning for researchers
Data Management Planning for researchersSarah Jones
 
File Formats for Preservation
File Formats for PreservationFile Formats for Preservation
File Formats for PreservationStephen Gray
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycleMarieke Guy
 
Take control of your PhD journey: Manage your research data according to best...
Take control of your PhD journey: Manage your research data according to best...Take control of your PhD journey: Manage your research data according to best...
Take control of your PhD journey: Manage your research data according to best...Lars Figenschou
 
Research Lifecycles and RDM
Research Lifecycles and RDMResearch Lifecycles and RDM
Research Lifecycles and RDMMarieke Guy
 
Planning for Research Data Management
Planning for Research Data ManagementPlanning for Research Data Management
Planning for Research Data Managementdancrane_open
 
Data Management for Undergraduate Researchers
Data Management for Undergraduate ResearchersData Management for Undergraduate Researchers
Data Management for Undergraduate ResearchersRebekah Cummings
 
Keep Calm and Curate
Keep Calm and CurateKeep Calm and Curate
Keep Calm and CurateGarethKnight
 
Data management for TA's
Data management for TA'sData management for TA's
Data management for TA'saaroncollie
 
Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016IzzyChad
 
FAIR BioData Management
FAIR BioData ManagementFAIR BioData Management
FAIR BioData ManagementUlrike Wittig
 
Research data management workshop april12 2016
Research data management workshop april12 2016 Research data management workshop april12 2016
Research data management workshop april12 2016 Rebecca Raworth, MLIS
 
Research data management workshop April 2016
Research data management workshop April 2016Research data management workshop April 2016
Research data management workshop April 2016Rebecca Raworth, MLIS
 

Ähnlich wie OpenAIRE webinar: Principles of Research Data Management, with S. Venkataraman (DCC) (20)

Basics of Research Data Management
Basics of Research Data ManagementBasics of Research Data Management
Basics of Research Data Management
 
Data Archiving and Sharing
Data Archiving and SharingData Archiving and Sharing
Data Archiving and Sharing
 
Good Practice in Research Data Management
Good Practice in Research Data ManagementGood Practice in Research Data Management
Good Practice in Research Data Management
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...
 
Data Management Planning for researchers
Data Management Planning for researchersData Management Planning for researchers
Data Management Planning for researchers
 
File Formats for Preservation
File Formats for PreservationFile Formats for Preservation
File Formats for Preservation
 
Managing data throughout the research lifecycle
Managing data throughout the research lifecycleManaging data throughout the research lifecycle
Managing data throughout the research lifecycle
 
Take control of your PhD journey: Manage your research data according to best...
Take control of your PhD journey: Manage your research data according to best...Take control of your PhD journey: Manage your research data according to best...
Take control of your PhD journey: Manage your research data according to best...
 
Research Lifecycles and RDM
Research Lifecycles and RDMResearch Lifecycles and RDM
Research Lifecycles and RDM
 
Planning for Research Data Management
Planning for Research Data ManagementPlanning for Research Data Management
Planning for Research Data Management
 
Data Management for Undergraduate Researchers
Data Management for Undergraduate ResearchersData Management for Undergraduate Researchers
Data Management for Undergraduate Researchers
 
Keep Calm and Curate
Keep Calm and CurateKeep Calm and Curate
Keep Calm and Curate
 
Data management for TA's
Data management for TA'sData management for TA's
Data management for TA's
 
Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016Planning for Research Data Management: 26th January 2016
Planning for Research Data Management: 26th January 2016
 
Resources for Research Data Managers - 2014-05-28 - University of Oxford
Resources for Research Data Managers - 2014-05-28 - University of OxfordResources for Research Data Managers - 2014-05-28 - University of Oxford
Resources for Research Data Managers - 2014-05-28 - University of Oxford
 
Data management
Data management Data management
Data management
 
FAIR BioData Management
FAIR BioData ManagementFAIR BioData Management
FAIR BioData Management
 
Introduction to RDM for trainee physicians
Introduction to RDM for trainee physiciansIntroduction to RDM for trainee physicians
Introduction to RDM for trainee physicians
 
Research data management workshop april12 2016
Research data management workshop april12 2016 Research data management workshop april12 2016
Research data management workshop april12 2016
 
Research data management workshop April 2016
Research data management workshop April 2016Research data management workshop April 2016
Research data management workshop April 2016
 

Mehr von OpenAIRE

10th OpenAIRE Content Providers Community Call
10th OpenAIRE Content Providers Community Call10th OpenAIRE Content Providers Community Call
10th OpenAIRE Content Providers Community CallOpenAIRE
 
9th Content Providers Community Call\
9th Content Providers Community Call\9th Content Providers Community Call\
9th Content Providers Community Call\OpenAIRE
 
OpenAIRE in the European Open Science Cloud (EOSC)
OpenAIRE in the European Open Science Cloud (EOSC)OpenAIRE in the European Open Science Cloud (EOSC)
OpenAIRE in the European Open Science Cloud (EOSC)OpenAIRE
 
8th Content Providers Community Call
8th Content Providers Community Call8th Content Providers Community Call
8th Content Providers Community CallOpenAIRE
 
7th Content Providers Community Call
7th Content Providers Community Call7th Content Providers Community Call
7th Content Providers Community CallOpenAIRE
 
OpenAIRE PROVIDE Dashboard for Turkish repository managers
OpenAIRE PROVIDE Dashboard for Turkish repository managersOpenAIRE PROVIDE Dashboard for Turkish repository managers
OpenAIRE PROVIDE Dashboard for Turkish repository managersOpenAIRE
 
What will it cost to manage and share my data?
What will it cost to manage and share my data?What will it cost to manage and share my data?
What will it cost to manage and share my data?OpenAIRE
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)OpenAIRE
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)OpenAIRE
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)OpenAIRE
 
6th Content Providers Community Call
6th Content Providers Community Call6th Content Providers Community Call
6th Content Providers Community CallOpenAIRE
 
20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing DataOpenAIRE
 
20200504_Research Data & the GDPR: How Open is Open?
20200504_Research Data & the GDPR: How Open is Open?20200504_Research Data & the GDPR: How Open is Open?
20200504_Research Data & the GDPR: How Open is Open?OpenAIRE
 
20200504_Data, Data Ownership and Open Science
20200504_Data, Data Ownership and Open Science20200504_Data, Data Ownership and Open Science
20200504_Data, Data Ownership and Open ScienceOpenAIRE
 
20200429_Research Data & the GDPR: How Open is Open? (updated version)
20200429_Research Data & the GDPR: How Open is Open? (updated version)20200429_Research Data & the GDPR: How Open is Open? (updated version)
20200429_Research Data & the GDPR: How Open is Open? (updated version)OpenAIRE
 
20200429_Data, Data Ownership and Open Science
20200429_Data, Data Ownership and Open Science20200429_Data, Data Ownership and Open Science
20200429_Data, Data Ownership and Open ScienceOpenAIRE
 
20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing DataOpenAIRE
 
COVID-19: Activities, tools, best practice and contact points in Greece
 COVID-19: Activities, tools, best practice and contact points in Greece COVID-19: Activities, tools, best practice and contact points in Greece
COVID-19: Activities, tools, best practice and contact points in GreeceOpenAIRE
 
5th Content Providers Community Call
5th Content Providers Community Call5th Content Providers Community Call
5th Content Providers Community CallOpenAIRE
 
4th Content Providers Community Call
4th Content Providers Community Call4th Content Providers Community Call
4th Content Providers Community CallOpenAIRE
 

Mehr von OpenAIRE (20)

10th OpenAIRE Content Providers Community Call
10th OpenAIRE Content Providers Community Call10th OpenAIRE Content Providers Community Call
10th OpenAIRE Content Providers Community Call
 
9th Content Providers Community Call\
9th Content Providers Community Call\9th Content Providers Community Call\
9th Content Providers Community Call\
 
OpenAIRE in the European Open Science Cloud (EOSC)
OpenAIRE in the European Open Science Cloud (EOSC)OpenAIRE in the European Open Science Cloud (EOSC)
OpenAIRE in the European Open Science Cloud (EOSC)
 
8th Content Providers Community Call
8th Content Providers Community Call8th Content Providers Community Call
8th Content Providers Community Call
 
7th Content Providers Community Call
7th Content Providers Community Call7th Content Providers Community Call
7th Content Providers Community Call
 
OpenAIRE PROVIDE Dashboard for Turkish repository managers
OpenAIRE PROVIDE Dashboard for Turkish repository managersOpenAIRE PROVIDE Dashboard for Turkish repository managers
OpenAIRE PROVIDE Dashboard for Turkish repository managers
 
What will it cost to manage and share my data?
What will it cost to manage and share my data?What will it cost to manage and share my data?
What will it cost to manage and share my data?
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
 
6th Content Providers Community Call
6th Content Providers Community Call6th Content Providers Community Call
6th Content Providers Community Call
 
20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200504_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
 
20200504_Research Data & the GDPR: How Open is Open?
20200504_Research Data & the GDPR: How Open is Open?20200504_Research Data & the GDPR: How Open is Open?
20200504_Research Data & the GDPR: How Open is Open?
 
20200504_Data, Data Ownership and Open Science
20200504_Data, Data Ownership and Open Science20200504_Data, Data Ownership and Open Science
20200504_Data, Data Ownership and Open Science
 
20200429_Research Data & the GDPR: How Open is Open? (updated version)
20200429_Research Data & the GDPR: How Open is Open? (updated version)20200429_Research Data & the GDPR: How Open is Open? (updated version)
20200429_Research Data & the GDPR: How Open is Open? (updated version)
 
20200429_Data, Data Ownership and Open Science
20200429_Data, Data Ownership and Open Science20200429_Data, Data Ownership and Open Science
20200429_Data, Data Ownership and Open Science
 
20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
20200429_OpenAIRE Legal Policy Webinar: GDPR and Sharing Data
 
COVID-19: Activities, tools, best practice and contact points in Greece
 COVID-19: Activities, tools, best practice and contact points in Greece COVID-19: Activities, tools, best practice and contact points in Greece
COVID-19: Activities, tools, best practice and contact points in Greece
 
5th Content Providers Community Call
5th Content Providers Community Call5th Content Providers Community Call
5th Content Providers Community Call
 
4th Content Providers Community Call
4th Content Providers Community Call4th Content Providers Community Call
4th Content Providers Community Call
 

Kürzlich hochgeladen

Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxFarihaAbdulRasheed
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptxAlMamun560346
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Monika Rani
 
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Joonhun Lee
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Unit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 oUnit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 oManavSingh202607
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Servicenishacall1
 
IDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicineIDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicinesherlingomez2
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Servicemonikaservice1
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .Poonam Aher Patil
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 

Kürzlich hochgeladen (20)

Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Unit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 oUnit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 o
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
IDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicineIDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicine
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 

OpenAIRE webinar: Principles of Research Data Management, with S. Venkataraman (DCC)

  • 1. This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License S. Venkataraman, PhD Research Data Specialist Digital Curation Centre s.venkataraman@ed.ac.uk 21st October 2019, Open Access Week Webinar Principles of Research Data Management
  • 2. About the DCC • Established in 2004 • Based in Edinburgh and Glasgow • Works at national and international levels • One of leading organisations in the world specialising in training, consultancy, policy making and advocacy in digital data management best practice and services provision • Involved in many international consortia, schools and projects • (Not involved in actual curation of any data!)
  • 3. Is there a reproducibility crisis? Baker, M. (2016) “1,500 scientists lift the lid on reproducibility”, Nature, 533:7604, http://www.nature.com/n ews/1-500-scientists-lift- the-lid-on- reproducibility-1.19970
  • 4. Why make data available?
  • 5. Who has heard of this before…? Image CC-BY-SA by SangyaPundir
  • 7. Slide CC-BY by Erik Schultes, Leiden UMC What FAIR means: 15 principles Comprehensive descriptions can be found at https://www.go-fair.org/fair- principles/
  • 8. Common misconceptions • FAIR data does not have to be open • The principles do not specify particular technologies or implementations e.g. semantic web • FAIR is not a standard to be followed or strict criteria – it’s a spectrum / continuum • It doesn’t only apply to the life sciences
  • 9. All research data Managed data FAIR data Open data the wild
  • 10. Increasing that which is FAIR & open Managed data FAIR data Open data the wild
  • 11. as open as possible, as closed as necessary Image: ‘Balancing rocks’ by Viewminder CC-BY-SA-ND www.flickr.com/photos/light_seeker/7780857224
  • 12. RDM & the Data LifecycleImage CC-BY-SA by Janneke Staaks www.flickr.com/photos/jannekestaaks/14411397343
  • 13. What is Research Data Management? “the active management and appraisal of data over the lifecycle of scholarly and scientific interest” Data management is part of good research practice Create Document Use Store Share Preserve
  • 15. Data creation tips • Ensure consent forms, licences and agreements don’t restrict opportunities to share data • Choose appropriate formats • Adopt a file naming convention • Create metadata and documentation as you go
  • 16. Ask for consent for data sharing If not, data centres won’t be able to accept the data – regardless of any conditions on the original grant. www.data-archive.ac.uk/create-manage/consent-ethics/consent?index=3
  • 17. Choose appropriate file formats Different formats are good for different things • open, lossless formats are more sustainable e.g. rtf, xml, tif, wav • proprietary and/or compressed formats are less preservable but are often in widespread use e.g. doc, jpg, mp3 One format for analysis then convert to a standard format Data centres may suggest preferred formats for deposit https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
  • 18. Type of data Recommended formats Acceptable formats Tabular data with extensive metadata variable labels, code labels, and defined missing values SPSS portable format (.por) delimited text and command ('setup') file (SPSS, Stata, SAS, etc.) structured text or mark-up file of metadata information, e.g. DDI XML file proprietary formats of statistical packages: SPSS (.sav), Stata (.dta), MS Access (.mdb/.accdb) Tabular data with minimal metadata column headings, variable names comma-separated values (.csv) tab-delimited file (.tab) delimited text with SQL data definition statements delimited text (.txt) with characters not present in data used as delimiters widely-used formats: MS Excel (.xls/.xlsx), MS Access (.mdb/.accdb), dBase (.dbf), OpenDocument Spreadsheet (.ods) Geospatial data vector and raster data ESRI Shapefile (.shp, .shx, .dbf, .prj, .sbx, .sbn optional) geo-referenced TIFF (.tif, .tfw) CAD data (.dwg) tabular GIS attribute data Geography Markup Language (.gml) ESRI Geodatabase format (.mdb) MapInfo Interchange Format (.mif) for vector data Keyhole Mark-up Language (.kml) Adobe Illustrator (.ai), CAD data (.dxf or .svg) binary formats of GIS and CAD packages Textual data Rich Text Format (.rtf) plain text, ASCII (.txt) eXtensible Mark-up Language (.xml) text according to an appropriate Document Type Definition (DTD) or schema Hypertext Mark-up Language (.html) widely-used formats: MS Word (.doc/.docx) some software-specific formats: NUD*IST, NVivo and ATLAS.ti Image data TIFF 6.0 uncompressed (.tif) JPEG (.jpeg, .jpg, .jp2) if original created in this format GIF (.gif) TIFF other versions (.tif, .tiff) RAW image format (.raw) Photoshop files (.psd) BMP (.bmp) PNG (.png) Adobe Portable Document Format (PDF/A, PDF) (.pdf) Audio data Free Lossless Audio Codec (FLAC) (.flac) MPEG-1 Audio Layer 3 (.mp3) if original created in this format Audio Interchange File Format (.aif) Waveform Audio Format (.wav) Video data MPEG-4 (.mp4) OGG video (.ogv, .ogg) motion JPEG 2000 (.mj2) AVCHD video (.avchd) Documentation and scripts Rich Text Format (.rtf) PDF/UA, PDF/A or PDF (.pdf) XHTML or HTML (.xhtml, .htm) OpenDocument Text (.odt) plain text (.txt) widely-used formats: MS Word (.doc/.docx), MS Excel (.xls/.xlsx) XML marked-up text (.xml) according to an appropriate DTD or schema, e.g. XHMTL 1.0 https://www.ukdataservice.ac.uk/manage-data/format/recommended-formats
  • 19. How will you organise your data? • Keep file and folder names short, but meaningful • Agree a method for versioning • Include dates in a set format e.g. YYYYMMDD • Avoid using non-alphanumeric characters in file names • Use hyphens or underscores not spaces e.g. day-sheet, day sheet • Order the elements in the most appropriate way to retrieve the record Example from ARM Climate Research Facility www.arm.gov/data/docs/plan
  • 21. Documentation Think about what is needed in order to evaluate, understand, and reuse the data. • Why was the data created? • Have you documented what you did and how? • Did you develop code to run analyses? If so, this should be kept and shared too. • Important to provide wider context for trust
  • 22. What are metadata? Metadata • Standardised • Structured • Machine and human readable Metadata helps to cite & disambiguate data Documentation aids reuse Metadata Documentation
  • 23. Metadata standards These can be general – such as Dublin Core Or discipline specific • Data Documentation Initiative (DDI) – social science • Ecological Metadata Language (EML) - ecology • Flexible Image Transport System (FITS) – astronomy Search for standards in catalogues like: http://rd-alliance.github.io/metadata-directory/ https://rdamsc.dcc.ac.uk/
  • 24. “MTBLS1: A metabolomic study of urinary changes in type 2 diabetes in……” Example courtesy of Ken Haug, European Bioinformatics Institute (EMBL-EBI) Controlled vocabularies
  • 25. e.g. SNOMED CT (clinical terms) or MeSH • Defined terms + taxonomy • Useful for selecting keywords to tag datasets • You can find many ontologies in the BARTOC catalogue and elsewhere ➢ Organism A ➢ Term A1 ➢ Term A2 ➢ Term A3 ➢ Term B1 ➢ Term B2 ➢ Term C4 ➢ . ➢ . ➢ . ➢ Term n ► Organism B ► Term A1 ► Term A2 ► Term A3 ► Term B1 ► Term B2 ► Term C4 ► . ► . ► . ► Term n …and ontologies?
  • 27. Where will you store the data? • Your own device (laptop, flash drive, server etc.) – And if you lose it? Or it breaks? • Departmental drives or university servers • “Cloud” storage – Do they care as much about your data as you do? The decision will be based on how sensitive your data are, how robust you need the storage to be, and who needs access to the data and when
  • 28. Third-party tools for collaboration ownCloud • Open source product with Dropbox-like functionality • Used by many universities and service providers to offer ‘approved’ solution https://owncloud.org Using Dropbox and other cloud services
  • 29. Backup and preservation – not the same thing! Backups • Used to take periodic snapshots of data in case the current version is destroyed or lost • Backups are copies of files stored for short or near-long-term • Often performed on a somewhat frequent schedule Archiving • Used to preserve data for historical reference or potentially during disasters • Archives are usually the final version, stored for long-term, and generally not copied over • Often performed at the end of a project or during major milestones
  • 31. Part of How To Attribute Creative Commons Photos by Foter, licensed CC BY SA 3.0 License research data openly
  • 32. EUDAT licensing tool Answer questions to determine which licence(s) are appropriate to use https://ufal.github.io/public-license-selector/
  • 34. Deposit in a data repository http://databib.org www.re3data.org The Re3data catalogue can be searched to find a home for data www.fosteropenscience.eu/ content/re3data-demo
  • 35. Criteria for selecting a repository • Better to use a domain specific repository if available • Check they match particular data needs e.g. formats accepted, mixture of Open and Restricted Access. • Do they assign a persistent and globally unique identifier for sustainable citations and to links back to particular researchers and grants? • Look for certification as a ‘Trustworthy Digital Repository’ with an explicit ambition to keep the data available in long term. Icons to note open access, licenses, PIDs, certificates…
  • 36. What is a Persistent Identifier (PID)? a long-lasting reference to a document, file or other object • PIDs come in various forms e.g. ORCID, DOI, ISBN... • Typically they’re actionable i.e. type it into web browser to access • Many repositories will assign them on deposit
  • 38. Thank you! For DCC resources see: www.dcc.ac.uk/resources Follow us on twitter: @digitalcuration and #ukdcc