OpenAIRE-RDM@healthdata
1. Health Research Data Management
for Open Science
MD-Paedigree final conference, May 22-23 2017, Rome
@openaire_eu
Natalia Manola
OpenAIRE Managing Director
Athena Research & Innovation Centre
2. Outline
• Who we are
• Open Science (and health data)
• Europe’s OA policies
• Research data management practices
3. Question
How to guarantee the ongoing availability of
research data?
From local availability to distributed access.
4. Clinical data in predictive medicine
Researchers following 30,000 Type 2 diabetic patients
for a decade were able to use an analytics model to
predict the risk of developing dementia while suffering
from the endocrine disease
The Lancet Diabetes & Endocrinology journal
Clinical data IS research data
5. Who we are
Open Access experts
• Institutional, national and international perspectives on OA policies & e-Infrastructures
Information & Computer Science experts
• Building efficient e-Infra technologies
• State of the art technologies (big data, linked data)
Legal experts
• Legal & policy recommendations
Data communities
• Best practices for data
• Linking to data infrastructures
A European key e-Infrastructure to
support Open Access and Open
Science
In 24x7 operation since Dec 2010
Nodes in 34 countries
www.openaire.eu
6. Human
Network
Digital
Network
… foster the social and technical links
that enable Open Science in Europe and beyond
National Open Access Desks (NOADs)
34 OA expert nodes across Europe
• (OA) policy alignment
• Technical assistance
• Training
Tools and services to implement OA
At university/organization or national level
Interoperability at all levels: technical, human, legal, policy, organizational
8. Integrated Scientific Information System
20 million unique publications
800 validated data providers
500K publications linked to projects from 12 funders
60K datasets linked to publications or funders
3.5K links to software
All linked together - Linked Open Data
9. Open Science - beyond open data
CC-BY Andreas Neuhold
https://commons.wikimedia.org/wiki/File:Open_Science_-_Prinzipien.png
• Access to research facilities
• Access to processing
capabilities
• Communication at all levels
of research life cycle
Change in how we
• operate
• interact with citizens
• are evaluated
• are valued
• make policy decisions
13. Data and decision flow in health
Traditional view: Doctor, Administrator, Researcher
14. Data and decision flow in health
Current view: Innovator, Doctor, Administrator, Researcher, Patient, Policy Maker, Educator
15. Hospitals are research data producing organizations
… and must guarantee availability of health
data beyond their participation in projects or
studies
16. Open Science requires good research data
management practices
Medical Data Challenges
1. Technical
2. Ethical
3. Legal
Same as other domains –
health is more sensitive
17. Data Flow for precision medicine
[Diagram: data flow for data- & model-based medicine. Labels recoverable from the figure:]
• Directions: TOP-DOWN and BOTTOM-UP
• Raw data from biomarker-based personalized acquisition
• Domain knowledge & assumptions; clinical workflows
• Data preprocessing: curation & profiling, giving transformed & validated data
• Phases: Knowledge Discovery; Model & Verification; Reasoning & decision support
• Data mining: disease signatures; patient grouping & similarity
• Unknown / missing data: predict value of missing variable
• Variable dependencies & causality
• Simulation models: create statistical simulation models
• Personalized model; guided medicine for a particular patient
• Individualized diagnosis, prognosis & treatment plan
Data produced in many stages
Data in many cycles: produced and processed at different stages
Data must be properly stewarded at all stages
Data may be shared at any stage
• At least: open data linked to the publication
20. Data ethics principles
• Privacy-preserving analysis of clinical data for research
• Transparency and explanation of results of opaque algorithms
• Honesty and lack of bias in result presentation
• …
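One way to make the first principle concrete is a k-anonymity check before clinical records leave the hospital. The following is a minimal sketch, not the method used by any specific tool: the field names and records are hypothetical, and it assumes the quasi-identifiers (here age band and truncated postcode) have already been generalised.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    return smallest_group_size(records, quasi_identifiers) >= k

# Hypothetical, already generalised records (illustration only).
records = [
    {"age_band": "60-69", "postcode": "001**", "diagnosis": "T2D"},
    {"age_band": "60-69", "postcode": "001**", "diagnosis": "T2D + dementia"},
    {"age_band": "70-79", "postcode": "002**", "diagnosis": "T2D"},
]

# The single 70-79 record forms a group of one, so the table is not 2-anonymous.
print(is_k_anonymous(records, ["age_band", "postcode"], k=2))
```

A failed check suggests generalising further (wider age bands, shorter postcodes) or suppressing the isolated records before sharing.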
21. Research reproducibility
• Data provenance
• Data cleaning and curation
• Data profiling and validation
• Data transformation for cleaning and curation
• Workflow support for repeating experiments and reproducing
results
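The provenance and workflow bullets can be illustrated with a small sketch that logs a checksum before and after each transformation, so a later run can confirm it reproduces the same data. This is an illustration under stated assumptions, not a prescribed workflow: the step name, the cleaning function and the sample CSV are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest used as a fingerprint of a dataset version."""
    return hashlib.sha256(data).hexdigest()

def run_step(name, func, data: bytes, log: list) -> bytes:
    """Apply one transformation and append a provenance entry to the log."""
    result = func(data)
    log.append({
        "step": name,
        "input_sha256": sha256_of(data),
        "output_sha256": sha256_of(result),
        "finished_at": datetime.now(timezone.utc).isoformat(),
    })
    return result

def drop_blank_lines(data: bytes) -> bytes:
    """Hypothetical cleaning step: remove empty lines from a CSV export."""
    return b"\n".join(line for line in data.splitlines() if line.strip())

raw = b"patient_id,hba1c\n\n001,7.2\n002,6.8\n"  # made-up sample data
provenance = []
cleaned = run_step("drop_blank_lines", drop_blank_lines, raw, provenance)
print(json.dumps(provenance, indent=2))
```

Storing the log next to the data lets anyone repeating the experiment compare checksums step by step.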
22. Sharing data: legal mechanisms
• European legal context
• General Data Protection Regulation (GDPR)
• Gives citizens control of their personal data; applicable from 25 May 2018
• New mechanisms of trust and value-based relationships
between people, hospitals, research centres, businesses
• Smart contracts – blockchain
• Informed consent
24. H2020 Open Access policies
• Publications
• Openly accessible and minable; article processing charges (APCs) are eligible costs.
• Research data
• Openly accessible research data
can typically be accessed, mined,
exploited, reproduced and
disseminated free of charge for the
user.
26. Publications ARE data!
2.5 million EN publications are published every year
Text and data mining: the future way to extract knowledge
30. Data management for organisations -
a few rules of thumb
31. What should be preserved and shared?
• The data needed to validate results in scientific publications
(minimally!).
• The associated metadata: the dataset’s creator, title, year of
publication, repository, identifier etc.
• Follow a metadata standard and be FAIR.
• Documentation: code books, lab journals, informed consent
forms – domain-dependent, and important for understanding the
data and combining them with other data sources.
• Software, hardware, tools, syntax queries, machine
configurations – domain-dependent, and important for using the
data. (Alternative: information about the software etc.)
Basically, everything that is needed to
replicate a study should be available.
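The metadata bullet can be sketched as a small record type with a completeness check. This is an assumption-laden illustration, not a particular metadata standard: the class name, field names and example values (including the identifier) are hypothetical and simply mirror the fields named on the slide.

```python
from dataclasses import dataclass, asdict

@dataclass
class DatasetMetadata:
    """Minimal descriptive metadata, mirroring the fields listed on the slide."""
    creator: str
    title: str
    publication_year: int
    repository: str
    identifier: str  # e.g. a persistent identifier such as a DOI

def missing_fields(meta: DatasetMetadata) -> list:
    """Names of fields that are empty and should be filled in before deposit."""
    return [name for name, value in asdict(meta).items() if value in ("", None)]

# Hypothetical record; every value is a placeholder for illustration.
record = DatasetMetadata(
    creator="Example Hospital Research Unit",
    title="Example cohort dataset",
    publication_year=2017,
    repository="Example repository",
    identifier="",  # not yet assigned
)

print(missing_fields(record))
```

A real deposit would map these fields onto whichever metadata standard the chosen repository follows.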
33. How much does it cost? Who pays?
• What are the costs for making data FAIR in your project?
• Resources needed for long term access
• “Well budgeted data stewardship plans should be made mandatory and
expect that on average about 5% of research expenditure should be spent
on properly managing and stewarding data”
• Who pays? How?
UKDS model http://www.data-archive.ac.uk/create-manage/planning-for-sharing/costing
HLEG report http://ec.europa.eu/research/openscience/pdf/realising_the_european_open_science_cloud_2016.pdf#view=fit&pagemode=none
Hospitals:
Commit to investment for continuous data curation
Consider a resident data steward or data scientist
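As a worked example of the quoted rule of thumb (about 5% of research expenditure), the arithmetic can be sketched as below. The function name and the budget figure are hypothetical; the 5% default is taken from the quote above and is an average, not a fixed requirement.

```python
def stewardship_budget(research_expenditure, share_percent=5):
    """Amount to reserve for data management, as a percentage of total expenditure."""
    if research_expenditure < 0:
        raise ValueError("expenditure must be non-negative")
    return research_expenditure * share_percent / 100

# Hypothetical project with a 2,000,000 EUR budget.
print(stewardship_budget(2_000_000))  # 100000.0
```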
35. For how long?
RDNL Selection criteria: http://www.researchdata.nl/en/services/data-management/selecting-research-data/
DCC How-to guide: http://www.dcc.ac.uk/resources/how-guides/appraise-select-data
• When regenerating data is cheaper than archiving, don’t archive.
Select what data you’ll need and want to retain.
• 10 years is often stated in data policies and academic codes, but
data can be valuable for ages, in climatology, sociology, health
sciences, astronomy, linguistics, …
Health: look beyond minimal retention periods
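The first rule of thumb above ("when regenerating data is cheaper than archiving, don't archive") can be written as a simple cost comparison. This is a rough sketch with hypothetical parameter names and figures; the `can_be_regenerated` flag is an added assumption to cover observational data, such as clinical measurements, that cannot be produced again at any price.

```python
def should_archive(regeneration_cost, annual_storage_cost, retention_years,
                   can_be_regenerated=True):
    """Archive unless regenerating the data would be cheaper than keeping it."""
    if not can_be_regenerated:
        return True  # unique observations cannot be recreated, so keep them
    archiving_cost = annual_storage_cost * retention_years
    return archiving_cost <= regeneration_cost

# Simulation output that is cheap to rerun: 10 years of storage costs more.
print(should_archive(regeneration_cost=500, annual_storage_cost=100, retention_years=10))

# Patient measurements: not reproducible, so archive regardless of cost.
print(should_archive(0, 100, 10, can_be_regenerated=False))
```

A real appraisal would also weigh legal retention periods and expected reuse value, as the selection guides cited above discuss.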
38. How can OpenAIRE help?
Participating in the
European Open Science Cloud initiative
39. Support material
Briefing papers, factsheets,
webinars, workshops, FAQs
Information on
• Open Research Data Pilot
• Creating a data management
plan
• Selecting a data repository
• Personal data
Helpdesk - in 34 countries
Training
40. OpenAIRE services
• Integrated research catalogue (pubs, data, ..)
• Zenodo (@CERN) – repository for all types of publications, data
and software
• Amnesia - an anonymization tool for all
• Data providers – Interoperability Guidelines, validation,…
• Project coordinators – reporting
• Funders and institutions – monitoring
• Research communities – gathering, monitoring all research
MONITORING DASHBOARDS
Be part of the research ecosystem.
Get connected!
In multi-beneficiary projects, specific beneficiaries can also keep their data closed, provided that relevant provisions are made in the consortium agreement and are in line with the reasons for opting out.
Re Software etc: you might also think of virtual machines with the corresponding setup information.
In many cases copyright will prevent the archiving of software and tools. The alternative is a sensible description of configuration settings etc.
It’s no fun to do the exercise by yourself, so use this as a communication opportunity.
[final bullet] Acting on requests from the community, DMPonline will add an ‘export to Zenodo’ feature alongside the other export options. You might want to use this to increase your project’s transparency, to share good practices, or because you write your DMP as a kind of data paper, which is interesting in its own right. At the moment there are a few H2020 DMPs in Zenodo and figshare.