Talk presented at CONFOA 2013 (University of São Paulo, São Paulo, Brazil, 6 to 8 October 2013) in Panel III, "Open science and research data management", by Prof. Dr. Peter Elias (United Kingdom, The Royal Society).
1. Science as an open enterprise: open data for an open science
Peter Elias
4th Luso-Brazilian Conference on Open Access
University of São Paulo
October 6 - 9, 2013
Report: www.royalsociety.org
2. Open communication of data: the source of a scientific revolution and of scientific progress
Henry Oldenburg
3. The challenge to Oldenburg’s principle: a crisis of replicability and credibility?
The data providing the evidence for a published concept MUST
be concurrently published, together with the metadata
6. Working Group
• Professor Geoffrey Boulton FRS FRSE (Chair), Regius Professor of Geology Emeritus at the University of Edinburgh
• Dr Philip Campbell, Editor in Chief, Nature
• Professor Brian Collins CB FREng, Professor of Engineering Policy, University College London
• Professor Peter Elias CBE, Institute for Employment Research, University of Warwick
• Dame Wendy Hall FRS FREng, Professor of Computer Science, University of Southampton
• Professor Graeme Laurie FRSE, Professor of Medical Jurisprudence, University of Edinburgh
• Baroness Onora O’Neill FRS FBA FMedSci, Professor of Philosophy, University of Cambridge
• Sir Michael Rawlins FMedSci, Chairman, National Institute for Health and Clinical Excellence
• Professor Janet Thornton FRS CBE, Director, European Bioinformatics Institute
• Professor Patrick Vallance FMedSci, Senior Vice President, Medicines discovery and development, GlaxoSmithKline
• Sir Mark Walport FRS FMedSci, Director, Wellcome Trust
7. Data, information, knowledge
Data from human or machine observation are numbers,
characters or images that refer to an attribute of a
phenomenon. In order to be interpretable, data usually require
metadata, which are data about the data, for example details
of the context in which the data were collected. Data only
become information when analysed in ways that reveal
patterns in the phenomenon under investigation. Information
yields knowledge when it supports non-trivial, true claims
about a phenomenon.
8. Open data is more than disclosure
To be open, and to be communicated effectively, data on which
scientific knowledge is built must be accessible and readily
located. Data must be intelligible to those who wish to
scrutinise them. They must be assessable so that judgments
can be made about their reliability and the competence of those
who created them. And they must be usable by others. Only
when these four criteria are fulfilled are data properly open.
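The four criteria above can be read as a checklist applied to each deposited dataset. The following is only a minimal sketch of that idea, not a method taken from the report: the `DatasetRecord` class, its field names, and the choice to treat each criterion as a simple "is this field filled in" check are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical description of a deposited dataset (all field names are assumptions)."""
    identifier: str = ""                          # where the data can be located
    metadata: dict = field(default_factory=dict)  # context needed to interpret the data
    provenance: str = ""                          # who produced the data and how
    licence: str = ""                             # terms under which others may reuse the data


def openness_report(record: DatasetRecord) -> dict:
    """Map the four criteria onto presence checks (a deliberate simplification)."""
    return {
        "accessible": bool(record.identifier),
        "intelligible": bool(record.metadata),
        "assessable": bool(record.provenance),
        "usable": bool(record.licence),
    }


def is_properly_open(record: DatasetRecord) -> bool:
    """Data count as properly open only when all four criteria are met."""
    return all(openness_report(record).values())
```

A record that is merely disclosed, for example one with an `identifier` but no metadata, provenance or licence, fails three of the four checks, which is the sense in which open data is more than disclosure.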
9. Gastro-intestinal infection in Hamburg - 2011
• E. coli outbreak spread through
several countries affecting 4000
people
• Strain analysed and genome
released under an open data
license.
• Two dozen reports in a week with
interest from 4 continents
• Crucial information about strain’s
virulence and resistance made
available to public health authorities
12. “Scientific fraud is rife: it's time to stand up for good science”
“Science is broken”
Examples:
psychology academics making up data,
anaesthesiologist Yoshitaka Fujii with 172 faked articles
Nature - rise in biomedical retraction rates overtakes rise in published papers
Cause:
Rewards and pressures promote extreme behaviours, and normalise malpractice
(e.g. selective publication of positive novel findings)
Cures:
Open data for replication
Transparent peer review
Not just personal integrity – but system integrity
13. Openness of data per se has no value.
Open science is more than disclosure
For effective communication, replication and re-purposing
we need intelligent openness. Data and metadata must
be:
• Accessible
• Intelligible
• Assessable
• Re-usable
Only when these four criteria are fulfilled are data
properly open
Scientific data rarely fit neatly into an Excel spreadsheet!
14. But, intelligently open to whom?
To “taxpayers who are paying for that research will want to see
something back. Directly – through open access to results and
data.” Neelie Kroes, Vice President of the European Commission 27.03.13
Effective communication must be audience-sensitive – “data
dumping” is ineffectual – a waste of time and effort.
We must prioritise. How? We need to target public interest
science, not with ex cathedra statements, but with intelligently open
data, arguments, uncertainties and options (climate change,
energy, Earth resources, infectious disease, obesity, novel
technologies, etc.).
The Commission’s policy should be more nuanced.
15. …. and, intelligently open to citizen scientists
Examples:
Collecting the data: professional community working with citizens in a different way
• Galaxy Zoo: Hubble
• Solar Storm Watch
• Old Weather
• Whale FM
• Ancient Lives
• Fold It (creating protein molecules)
• SETI (extraterrestrial intelligence)
Benefits:
• Collaboration
• Scale
• Statistical power
and changing the social dynamics of science?
16. Boundaries of openness?
Openness should be the default position, with
proportional exceptions for:
• Legitimate commercial interests (sectoral variation)
• Privacy (completely anonymised data is impossible)
• Safety and security (impacts contentious)
All these boundaries are fuzzy
17. Recommendations (1)
Open data should be the default, not the exception.
It is part of the professional responsibility of scientists to communicate
data.
Universities and research institutes should support the ability of
scientists to communicate data through investment in skills training and
infrastructure.
Research assessment should include metrics for open data on the same
scale as journal articles. These metrics should recognise those who
maximise usability and good communication of their data.
The costs of preparing data and metadata for curation are part of the
normal cost of the research process.
Data that underpin major claims in a publication should be traceable
and usable from information in the article, within practical limits.
18. Recommendations (2)
Businesses should actively consider opportunities for the use and
commercial exploitation of freely available data and information.
Governments should recognise the potential of open data and open
science to enhance the productivity and excellence of the national
science base.
Appropriate sharing of research data and information should be
recognised to be in the public interest. Restrictions on sharing should
be proportionate and risk based.
Secure practices should be introduced as part of scientists’ training and
codes of conduct as they evolve.
19. The cascade of responsibility
Funders of research:
- mandate intelligent openness
- accept diverse outputs
- cost of open data is a cost of science
Scientists:
- changing the mindset
Learned Societies:
- influencing their communities
Universities/Institutes:
- accept responsibility
- strategies
- management processes
- incentives & promotion criteria
- proactive, not just compliant
Research assessors:
- recognise diversity of contribution
Publishers:
- mandate concurrent open deposition
20. A realizable aspiration: all scientific literature online,
all data online, and for them to interoperate
… but, this is a process, not an event!