Using Open Science to advance science - advancing open data
Robert Oostenveld
Donders Institute, Radboud University, Nijmegen, NL
Karolinska Institutet, Stockholm, SE
r.oostenveld@donders.ru.nl
FieldTrip toolbox
Open Source MATLAB-based
toolbox for MEG, EEG and iEEG
analysis
Development started around
2004 with the “F.C. Donders
Centre” (now the Donders
Institute)
Estimated 3000 users, 1500
people on the discussion list,
close to 1000 citations per
year
Donders Repository
Brain Imaging Data Structure
Top-down pressure to change
how we do our research
Expectations on publicly funded research
Cartoon by Bas van der Schot
Shifts in research funding
EU: train young researchers for jobs
in society in European Training
Networks.
EU/NL: Public-Private partnerships for
better knowledge transfer and
utilization.
“NWO is of the opinion that research
results paid for by public funds should
be freely accessible worldwide. This
applies to both scientific publications
and other forms of scientific output.”
Bottom-up pressure to change
how we do our research
The problem
many studies with low statistical power
“publish or perish” results in reporting bias
The consequence
overreporting of false positives
overestimation of effect sizes
low reproducibility of results
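The link between low power, reporting bias and inflated effect sizes can be illustrated with a small simulation. This is a minimal sketch that is not taken from the slides: the true effect (0.3 standard deviations), the group size (20), the number of simulated studies and the simplified z-test with known unit variance are all assumptions chosen for illustration.

```python
import random
import statistics

def simulate_studies(true_effect=0.3, n_per_group=20, n_studies=2000, seed=1):
    """Simulate two-group studies and collect the effects that reach p < 0.05.

    Assumes unit variance in both groups, so a simple z-test can be used.
    """
    rng = random.Random(seed)
    se = (2.0 / n_per_group) ** 0.5            # standard error of the group difference
    all_effects, significant = [], []
    for _ in range(n_studies):
        group_a = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(group_a) - statistics.fmean(group_b)
        all_effects.append(diff)
        if abs(diff) / se > 1.96:              # two-sided test at alpha = 0.05
            significant.append(diff)
    return all_effects, significant

all_effects, significant = simulate_studies()
print(f"fraction significant (power): {len(significant) / len(all_effects):.2f}")
print(f"mean effect, all studies: {statistics.fmean(all_effects):.2f}")
print(f"mean effect, significant studies only: {statistics.fmean(significant):.2f}")
```

If only the significant studies were published, the average reported effect would be well above the true effect, because with this group size a difference has to be large to cross the significance threshold at all.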
Poor reproducibility
Open Science Collaboration, Science (2015). DOI: 10.1126/science.aac4716
70 independent teams analyzed the same dataset,
testing the same 9 hypotheses
no two teams chose identical workflows to analyse the
data … resulted in sizeable variation in the results
analytical flexibility can have substantial effects on
scientific conclusions
results emphasize the importance of validating and
sharing complex analysis workflows
Scientific efficiency – pushing the boundaries
Scientific efficiency – not only money
Although hardware might be getting more affordable …
• Patients are not available in abundance
• Effect sizes of interest are getting smaller
• Larger samples needed to boost sensitivity
• Larger datasets needed for machine learning
Scientific efficiency – not only data
• Collaboration and networks (team science) needed
to increase our shared knowledge and understanding
Open Science
Open educational resources
Open access publications
Open peer review
Open methodology
Open source
Open hardware
Open data
Inclusive and ethical
Open Data
Shared data allows for
Improved reproducibility
Small effects that require large group sizes
Data mining, discovery science and generating new hypotheses
Results in methodological opportunities
Improve algorithms
Estimate effect and group size
Make informed decisions on analysis pipeline
Prevent HARKing and p-hacking
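To make "small effects require large group sizes" concrete, the sketch below estimates the group size for a two-group comparison. It is an illustration under stated assumptions, not a method from the slides: it uses the standard normal approximation n = 2 * ((z_(1-alpha/2) + z_power) / d)^2, with a standardized effect size d, alpha = 0.05 and 80% power as default values.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison.

    effect_size is the standardized difference between the groups (Cohen's d).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value of the two-sided test
    z_power = z.inv_cdf(power)           # quantile that corresponds to the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

An effect size estimated from previously shared data can be plugged in as `effect_size` when planning a new study.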
Open Data
Findable
Make your data available in a repository with a persistent identifier (DOI, handle)
and metadata
Accessible
Be explicit about data usage terms (agreement with downloader)
Interoperable
Make your data human and machine readable, e.g. BIDS
Reusable
Make sure you document enough details, e.g. “data descriptor” paper
this can be cited, along with the data itself → measurable impact!
Open Data – challenges with our data
• Neuroimaging data is large
• Many files
• Many GB
• Complex organization (not a simple table)
• Neuroimaging data can be sensitive
• Data from human research participants (not “subjects”)
• Ethical framework – Declaration of Helsinki
• Legal framework – General Data Protection Regulation
Brain Imaging Data Structure
http://bids.neuroimaging.io
What is it?
BIDS is a way to organize your existing raw data
To improve consistent and complete documentation
To facilitate re-use by your future self and others
BIDS is not
A new file format
A search engine
A data sharing tool
BIDS for MRI, MEG, EEG, iEEG …
in the future also PET, eye-tracker, genetics etc.
data/README
data/CHANGES
data/dataset_description.json
data/participants.tsv
data/sub-01/anat/…
data/sub-01/meg/…
data/sub-01/eeg/sub-01_task-auditory_eeg.edf
data/sub-01/eeg/sub-01_task-auditory_eeg.json
data/sub-01/eeg/sub-01_task-auditory_channels.tsv
data/sub-01/eeg/sub-01_task-auditory_events.tsv
data/sub-01/eeg/sub-01_electrodes.tsv
data/sub-01/eeg/sub-01_coordinates.json
(Figure labels: directory structure, metadata, actual EEG data)
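The naming scheme in the listing above can be generated programmatically. The following is a minimal sketch, not an official BIDS tool: the helper names `bids_file` and `create_skeleton`, the dataset name and the version string are assumptions for illustration. It only creates empty placeholder files and does not convert any data.

```python
import json
from pathlib import Path

def bids_file(sub, task, suffix, ext, datatype="eeg"):
    """Return the relative path of a task-based file, e.g. the EEG recording."""
    return Path(f"sub-{sub}") / datatype / f"sub-{sub}_task-{task}_{suffix}{ext}"

def create_skeleton(root, sub="01", task="auditory"):
    """Create the dataset-level files and empty placeholders for one participant."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    (root / "README").write_text("Describe the dataset here.\n")
    (root / "CHANGES").write_text("")
    # "Name" and "BIDSVersion" are required fields; the values here are placeholders
    description = {"Name": "Example dataset", "BIDSVersion": "1.8.0"}
    (root / "dataset_description.json").write_text(json.dumps(description, indent=2))
    (root / "participants.tsv").write_text(f"participant_id\nsub-{sub}\n")
    for suffix, ext in [("eeg", ".json"), ("channels", ".tsv"), ("events", ".tsv")]:
        path = root / bids_file(sub, task, suffix, ext)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()                     # empty placeholder, to be filled with metadata
    return root

print(bids_file("01", "auditory", "eeg", ".edf").as_posix())
```

Calling `create_skeleton("data")` would write the skeleton to a `data` directory. Because every file name is built from the same key-value pairs (`sub-`, `task-`), both humans and software can find the file they need without opening it.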
Open Standard
For all toolboxes
For all researchers
Academic/Industrial
Open/Closed Source
Our current research data
will outlive our current
research tools.
Aim for >10 years.
Data from human participants
General Data Protection Regulation (GDPR)
Challenges:
Explicit and strict protection of personal data
Opportunities:
Less influence of national legislation differences
Learn from each other
Develop best practices
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.L_.2016.119.01.0001.01.ENG
Personal data
name
address
date of birth
phone number
license plate number
IP address
...
Crime Scene Investigation
http://www.abc.net.au/news/2017-09-19/csi/8960590
This is the information the police will first search for.
In case this cannot be found, CSI is called in.
Biometric data
facial details
dental record
fingerprint
genetics
cortical folding pattern
clinical data
gait/movement pattern
…
These are identifying in case they are
sufficiently unique and stable over time.
Personal Data is needed
and should be managed
Required for administration
Contacting your participants
Paying your participants
Follow up incidental findings
Often not required to address the research question
Sometimes used as confound (e.g. age, but not date of birth)
Check whether the sample is representative (e.g. social status)
Possibly required to assess scientific integrity
GDPR – data minimization
Only collect what you need
Only use it for the intended purpose
Delete (contact) data that you do not need any more
Personal Data
Personal data
Name, address, date of birth
Special personal data (“bijzondere persoonsgegevens” in NL)
Race
Religion or beliefs
Health
Sexual activities
Political preference, membership of a union
Criminal record
Indirect personal data – identifies someone … when linked to another database
Fingerprint, DNA, facial details
Anatomical MRI
Specific pattern of data (e.g. answers on a questionnaire or interview)
https://autoriteitpersoonsgegevens.nl/nl/over-privacy/persoonsgegevens/wat-zijn-persoonsgegevens
Organize personal data for deletion
name         | address       | phone      | date of birth | pseudonym | age | gender
John Doe     | 7 Willow road | 918 247462 | 19-7-1984     | sub-01    | 35  | M
Fern Travers |               |            |               | sub-02    |     |
Griffin Mora |               |            |               | sub-03    |     |
Peter Dillon |               |            |               | sub-04    |     |
Kathy Kirk   |               |            |               | sub-05    |     |
…            |               |            |               | …         |     |
Don’t put identifying details in the header of the binary files (e.g. DICOM)
Don’t put it in the file names (e.g. BrainVision *.vhdr/vmrk/eeg)
Delete as soon as requirements fulfilled (e.g. incidental findings procedure)
Don’t delete what needs to be retained (signed informed consent forms)
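One way to organize the table above for later deletion is to store the directly identifying columns in a separate key file, linked to the research data only through the pseudonym. The sketch below assumes this two-table layout; the field lists and the function names `split_records` and `to_tsv` are hypothetical and not part of any toolbox.

```python
import csv
import io

PERSONAL_FIELDS = ["name", "address", "phone", "date_of_birth"]  # keep private, delete when possible
RESEARCH_FIELDS = ["age", "gender"]                              # stays with the research data

def split_records(records):
    """Split each record into a private key-file row and a research row.

    The pseudonym is the only link between the two tables.
    """
    key_rows, research_rows = [], []
    for rec in records:
        key_rows.append({"pseudonym": rec["pseudonym"],
                         **{field: rec[field] for field in PERSONAL_FIELDS}})
        research_rows.append({"participant_id": rec["pseudonym"],
                              **{field: rec[field] for field in RESEARCH_FIELDS}})
    return key_rows, research_rows

def to_tsv(rows):
    """Serialize rows as tab-separated text, e.g. for a participants.tsv file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

records = [{"name": "John Doe", "address": "7 Willow road", "phone": "918 247462",
            "date_of_birth": "19-7-1984", "pseudonym": "sub-01", "age": 35, "gender": "M"}]
key_rows, research_rows = split_records(records)
print(to_tsv(research_rows))
```

Deleting the key file then removes the direct personal data in one step, while the research table remains usable.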
Gradient between personal and research data
personal data (easy): keep private, don’t share, but delete
indirect personal data (hard): ?
a lot of research data (easy): share as it is with others
Limit possible identification
Anonymous
Nobody is able to identify the participant
Pseudonymization
Use a code instead of the participant’s name
De-identification
Remove (indirectly) identifying features
Blur the indirect personal data
Deface anatomical MRI
Age at the time of acquisition instead of date of birth
Use age bins instead of years
Questionnaire outcomes rather than individual item scores
…
Appropriate blurring depends on the situation
… for example the age of the participant: 1-month bins versus 10-year bins
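Replacing the date of birth by the age at acquisition, and the exact age by a bin, can be sketched as follows. The function names and the acquisition date are assumptions for illustration; the appropriate bin width depends on the situation, as stated above.

```python
from datetime import date

def age_at(date_of_birth, acquisition_date):
    """Age in whole years at the time of acquisition."""
    had_birthday = ((acquisition_date.month, acquisition_date.day)
                    >= (date_of_birth.month, date_of_birth.day))
    return acquisition_date.year - date_of_birth.year - (0 if had_birthday else 1)

def age_bin(age, width=10):
    """Replace an exact age by the bin it falls in, e.g. 35 -> '30-39' for 10-year bins."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

# Hypothetical acquisition date, combined with the example date of birth 19-7-1984
age = age_at(date(1984, 7, 19), date(2019, 10, 1))
print(age, age_bin(age, width=10), age_bin(age, width=5))
```

Only the binned value would be shared; the date of birth stays with the private administration or is deleted.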
Personal and research data
personal data: data minimization, pseudonymization → keep safe and private (or delete)
indirect personal data: data minimization, de-identifying, blurring
a lot of research data: share responsibly, with legal constraints on reuse
Legal constraints
Contract between you as researcher
… and the funding agency
… and the ethics committee
… and the participants/patients
… and the publisher of the results
… and the recipient of the data upon sharing
Legal constraints – Data Use Agreement
CC0 - Public Domain
No copyright.
The person who associated a work with this deed
has dedicated the work to the public domain by
waiving all of his or her rights to the work
worldwide under copyright law, including all related
and neighboring rights, to the extent allowed by law.
You can copy, modify, distribute and perform the
work, even for commercial purposes, all without
asking permission.
Donders Institute - Data Use Agreement
for identifiable human data
I will comply with all relevant rules and regulations
imposed by my institution and my government ….
I will not attempt to establish the identity of or attempt
to contact any of the included human subjects. I will not
link this data to any other database in a way that could
provide identifying information ….
I will not redistribute or share the data with others,
including individuals in my research group, unless they
have independently applied and been granted access to
this data.
I will acknowledge the use of the data and data derived
from the data when publicly presenting …
Failure to abide by these guidelines will result in
termination of my privileges to access to these data.
https://creativecommons.org/publicdomain/zero/1.0/
https://data.donders.ru.nl/doc/dua/
https://open-brain-consent.readthedocs.io/
participant → you → recipient
Where to share?
Institutional repository
Donders https://data.donders.ru.nl
Radboud University http://data.ru.nl
In the UK: Oxford, Cambridge, Edinburgh
…
National repository
https://easy.dans.knaw.nl
https://dataverse.nl
https://data.4tu.nl
Domain specific repository
http://openneuro.org
General repository
Zenodo
Harvard dataverse
Commercial publishers
https://datadryad.org
https://figshare.com
Considerations for shared data
• For the ethics board
• Be explicit about sharing, e.g. https://open-brain-consent.readthedocs.io
• For our research participants and the GDPR
• Use pseudonyms
• Remove identifying features (names, dates, faces)
• For the researchers that want to share
• Allow uploading and reorganization of large datasets (1GB-1TB)
• Provide guidelines for structuring the data
• Provide methods to review the data, also for journal editors/reviewers
• Provide versioning of datasets
• For researchers that want to reuse the data
• Allow browsing the data
• Allow selective downloads to get a taste of it
• Allow bulk downloads
Summary
Open Data improves reproducibility and accelerates new research
BIDS helps to organize your data so that it is FAIR and easy to understand
Open Source community is building tools to create and reuse data
There is more to Open Science, also education, open access publications,
methodology and data

Using Open Science to advance science - advancing open data

  • 1. Using Open Science to advance science – advancing open data Robert Oostenveld Donders Institute, Radboud University, Nijmegen, NL Karolinska Institutet, Stockholm, SE r.oostenveld@donders.ru.nl
  • 2. FieldTrip toolbox Open Source MATLAB-based toolbox for MEG, EEG and iEEG analysis Development started around 2004 with the “F.C. Donders Centre” (now the Donders Institute) Estimated 3000 users, 1500 people on the discussion list, close to 1000 citations per year
  • 4. Brain Imaging Data Structure
  • 5.
  • 6. Top-down pressure to change how we do our research
  • 7. Expectations on publicly funded research Cartoon by Bas van der Schot
  • 8. Shifts in research funding EU: train young researchers for jobs in society in European Training Networks. EU/NL: Public-Private partnerships for better knowledge transfer and utilization. “NWO is of the opinion that research results paid for by public funds should be freely accessible worldwide. This applies to both scientific publications and other forms of scientific output.”
  • 9. Bottom-up pressure to change how we do our research
  • 10. The problem many studies with low statistical power publish or perish results in reporting bias The consequence overreporting of false positives overestimates effect size low reproducibility of results
  • 11. Poor reproducibility Open Science Collaboration, Science (2015). DOI: 10.1126/science.aac4716
  • 12. 70 independent teams analyzed the same dataset, testing the same 9 hypotheses no two teams chose identical workflows to analyse the data … resulted in sizeable variation in the results analytical flexibility can have substantial effects on scientific conclusions results emphasize the importance of validating and sharing complex analysis workflows
  • 13. Scientific efficiency – pushing the boundaries
  • 14. Scientific efficiency – not only money Although hardware might be getting more affordable … • Patients are not available in abundance • Effect sizes of interest are getting smaller • Larger samples needed to boost sensitivity • Larger datasets needed for machine learning
  • 15. Scientific efficiency – not only data • Collaboration and networks (team science) needed to increase our shared knowledge and understanding
  • 16. Open Science: open educational resources, open access publications, open peer review, open methodology, open source, open hardware, open data; inclusive and ethical.
  • 17.
  • 18. Open Data. Shared data allows for: improved reproducibility; small effects that require large group sizes; data mining, discovery science and generating new hypotheses. Results in methodological opportunities: improve algorithms; estimate effect and group size; make informed decisions on analysis pipeline; prevent HARKing and p-hacking.
  • 19. Open Data. Findable: make your data available on a repository with a persistent identifier (DOI, handle) and metadata. Accessible: be explicit about data usage terms (agreement with downloader). Interoperable: make your data human and machine readable, e.g. BIDS. Reusable: make sure you document enough details, e.g. in a “data descriptor” paper; this can be cited, along with citing our data -> measurable impact!
  • 20. Open Data – challenges with our data • Neuroimaging data is large • Many files • Many GB • Complex organization (not a simple table) • Neuroimaging data can be sensitive • Data from human research participants (not “subjects”) • Ethical framework – Declaration of Helsinki • Legal framework – General Data Protection Regulation
  • 21. Brain Imaging Data Structure http://bids.neuroimaging.io
  • 22. What is it? BIDS is a way to organize your existing raw data: to improve consistent and complete documentation, and to facilitate re-use by your future self and others. BIDS is not: a new file format, a search engine, or a data sharing tool.
  • 23. BIDS for MRI, MEG, EEG, iEEG … in future also PET, eye-tracker, genetics etc.
      data/README
      data/CHANGES
      data/dataset_description.json
      data/participants.tsv
      data/sub-01/anat/…
      data/sub-01/meg/…
      data/sub-01/eeg/sub-01_task-auditory_eeg.edf
      data/sub-01/eeg/sub-01_task-auditory_eeg.json
      data/sub-01/eeg/sub-01_task-auditory_channels.tsv
      data/sub-01/eeg/sub-01_task-auditory_events.tsv
      data/sub-01/eeg/sub-01_electrodes.tsv
      data/sub-01/eeg/sub-01_coordinates.json
    Legend on the slide: actual EEG data, directory structure, metadata.
  • 24. Open Standard: for all toolboxes, for all researchers, academic/industrial, open/closed source. Our current research data will outlive our current research tools. Aim for >10 years.
  • 25.
  • 26. Data from human participants: General Data Protection Regulation (GDPR). Challenges: explicit and strict protection of personal data. Opportunities: less influence of national legislation differences; learn from each other; develop best practices. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.L_.2016.119.01.0001.01.ENG
  • 27. Personal data: name, address, date of birth, phone number, license plate number, IP address, ... Crime Scene Investigation: http://www.abc.net.au/news/2017-09-19/csi/8960590 This is the information the police will first search for. In case this cannot be found, CSI is called in.
  • 28. Biometric data: facial details, dental record, fingerprint, genetics, cortical folding pattern, clinical data, gait/movement pattern, … These are identifying in case they are sufficiently unique and stable over time.
  • 29. Personal data is needed and should be managed. Required for administration: contacting your participants; paying your participants; follow up incidental findings. Often not required to address the research question: sometimes used as confound (e.g. age, but not date of birth); check whether the sample is representative (e.g. social status). Possibly required to assess scientific integrity. GDPR – data minimization: only collect what you need; only use it for the intended purpose; delete (contact) data that you do not need any more.
  • 30. Personal Data. Personal data: name, address, date of birth. Special personal data (= “bijzondere persoonsgegevens” in NL): race; religion or beliefs; health; sexual activities; political preference, membership of a union; criminal record. Indirect personal data – identifies someone … when linked to another database: fingerprint, DNA, facial details; anatomical MRI; specific pattern of data (e.g. answers on a questionnaire or interview). https://autoriteitpersoonsgegevens.nl/nl/over-privacy/persoonsgegevens/wat-zijn-persoonsgegevens
  • 31. Organize personal data for deletion
      name         | address       | phone      | date of birth | pseudonym | age | gender
      John Doe     | 7 Willow road | 918 247462 | 19-7-1984     | sub-01    | 35  | M
      Fern Travers |               |            |               | sub-02    |     |
      Griffin Mora |               |            |               | sub-03    |     |
      Peter Dillon |               |            |               | sub-04    |     |
      Kathy Kirk   |               |            |               | sub-05    |     |
      …            |               |            |               | …         |     |
    Don’t put identifying details in the header of the binary files (e.g. DICOM). Don’t put it in the file names (e.g. BrainVision *.vhdr/vmrk/eeg). Delete as soon as requirements fulfilled (e.g. incidental findings procedure). Don’t delete what needs to be retained (signed informed consent forms).
  • 32. Gradient between personal and research data. Personal data (easy): keep private, don’t share but delete. Indirect personal data (hard): ? A lot of research data (easy): share as it is with others.
  • 33. Limit possible identification. Anonymous: nobody is able to identify the participant. Pseudonymization: use a code instead of the participant’s name. De-identification: remove (indirectly) identifying features. Blur the indirect personal data: deface anatomical MRI; age at the time of acquisition instead of date of birth; use age bins instead of years; questionnaire outcomes rather than individual item scores; …
  • 34. Appropriate blurring depends on the situation … for example the age of the participant: 1 month bins, 10 year bins.
  • 35. Personal and research data: indirect personal data, personal data, a lot of research data.
  • 36. Personal and research data. Data minimization, pseudonymization. Data minimization, de-identifying, blurring. A lot of research data; personal data; indirect personal data. Share responsibly with legal constraints on reuse. Keep safe and private (or delete).
  • 37. Legal constraints: contract between you as researcher … and the funding agency … and the ethics committee … and the participants/patients … and the publisher of the results … and the recipient of the data upon sharing.
  • 38. Legal constraints – Data Use Agreement. CC0 - Public Domain: No copyright. The person who associated a work with this deed has dedicated the work to the public domain by waiving all of his or her rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. Donders Institute - Data Use Agreement for identifiable human data: I will comply with all relevant rules and regulations imposed by my institution and my government …. I will not attempt to establish the identity of or attempt to contact any of the included human subjects. I will not link this data to any other database in a way that could provide identifying information …. I will not redistribute or share the data with others, including individuals in my research group, unless they have independently applied and been granted access to this data. I will acknowledge the use of the data and data derived from the data when publicly presenting … Failure to abide by these guidelines will result in termination of my privileges to access to these data. Links: https://creativecommons.org/publicdomain/zero/1.0/ https://data.donders.ru.nl/doc/dua/ https://open-brain-consent.readthedocs.io/ Chain of responsibility: participant → you → recipient.
  • 39.
  • 40. Where to share? Institutional repository: Donders https://data.donders.ru.nl ; Radboud University http://data.ru.nl ; in the UK: Oxford, Cambridge, Edinburgh … National repository: https://easy.dans.knaw.nl ; https://dataverse.nl ; https://data.4tu.nl Domain specific repository: http://openneuro.org General repository: Zenodo; Harvard Dataverse. Commercial publishers: https://datadryad.org ; https://figshare.com
  • 41. Considerations for shared data • For the ethics board • Be explicit about sharing, e.g. https://open-brain-consent.readthedocs.io • For our research participants and the GDPR • Use pseudonyms • Remove identifying features (names, dates, faces) • For the researchers that want to share • Allow uploading and reorganization of large datasets (1GB-1TB) • Provide guidelines for structuring the data • Provide methods to review the data, also for journal editors/reviewers • Provide versioning of datasets • For researchers that want to reuse the data • Allow browsing the data • Allow selective downloads to get a taste of it • Allow bulk downloads
  • 42. Summary: Open Data improves reproducibility and accelerates new research. BIDS helps to organize your data FAIR and easy to understand. Open Source community is building tools to create and reuse data. There is more to Open Science: also education, open access publications, methodology and data.
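
To illustrate the BIDS layout on slide 23, here is a minimal Python sketch that builds the relative file names for one EEG recording. The function `bids_eeg_files` and its parameters are hypothetical and not part of any BIDS tool; the names simply follow the pattern listed on the slide. The slide writes the coordinate file as `sub-01_coordinates.json`, which the sketch keeps; the BIDS specification itself uses the suffix `_coordsystem.json`.

```python
from pathlib import PurePosixPath


def bids_eeg_files(subject: str, task: str, extension: str = "edf") -> list[str]:
    """Return the relative paths of one EEG recording, following slide 23.

    Hypothetical helper for illustration only; it does not validate
    the result against the BIDS specification.
    """
    sub = f"sub-{subject}"
    eeg_dir = PurePosixPath(sub) / "eeg"
    recording = f"{sub}_task-{task}"
    names = [
        f"{recording}_eeg.{extension}",  # the actual EEG data
        f"{recording}_eeg.json",         # metadata about the recording
        f"{recording}_channels.tsv",     # metadata: one row per channel
        f"{recording}_events.tsv",       # metadata: one row per event
        f"{sub}_electrodes.tsv",         # electrode positions
        f"{sub}_coordinates.json",       # name as written on the slide
    ]
    return [str(eeg_dir / name) for name in names]


if __name__ == "__main__":
    for path in bids_eeg_files("01", "auditory"):
        print(path)
```

Because the subject and task labels are encoded in every file name, a single file remains interpretable even when it is separated from its directory.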

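The de-identification steps on slides 31, 33 and 34 (a pseudonym instead of the name, age at acquisition instead of date of birth, age bins instead of years) can be sketched in Python as follows. All function names, the sequential `sub-NN` numbering and the acquisition date are assumptions made for illustration; the key table that links names to pseudonyms is itself personal data and would have to be kept private or deleted.

```python
from datetime import date


def pseudonymize(names: list[str]) -> dict[str, str]:
    """Map each participant name to a code such as sub-01 (assumed scheme)."""
    return {name: f"sub-{i:02d}" for i, name in enumerate(names, start=1)}


def age_at(birth: date, acquisition: date) -> int:
    """Age in whole years at the time of acquisition."""
    years = acquisition.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the acquisition year.
    if (acquisition.month, acquisition.day) < (birth.month, birth.day):
        years -= 1
    return years


def age_bin(age: int, width: int = 10) -> str:
    """Blur an age into a bin of the given width in years, e.g. 35 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


if __name__ == "__main__":
    key = pseudonymize(["John Doe", "Fern Travers"])
    # The acquisition date below is hypothetical.
    age = age_at(date(1984, 7, 19), date(2019, 9, 1))
    print(key["John Doe"], age, age_bin(age))
```

The bin width is a parameter because, as slide 34 notes, the appropriate amount of blurring depends on the situation.
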
Editor's notes

  1. https://jmopendata.cbs.nl/#/JM/nl/dataset/71009ned/barv?dl=23D6F Open Access - cOAlition S and Plan S
  2. Review and critical evaluation beyond publication – open methods, tools and data
  3. Ethical framework = consent. Buikhuisen studying biological factors in criminal behavior (end ’70s); Swaab studying differences between homo/heterosexual brains (end ’80s).
  4. Personal data is what the CSI will search for, and if they cannot find it they will look at biometric data.
  5. Commercial publishers: not to be confused with data publications (about data) in journals such as Scientific Data.