Science 2.0: discussing the best
available evidence
David Osimo
Open Evidence
based on a study for DG RTD
24th January 2013

Three stories
• Galaxy Zoo: lets users classify galaxies. 150K
volunteers have already classified more than 10 million
images of galaxies, “as accurate as that done by
astronomers”. 25+ scientific articles by the Galaxy Zoo project
(from 2009)
• Synaptic Leap: aims to find an alternative drug treatment for
schistosomiasis with fewer side effects. All data and
experiments published on an Electronic Lab Notebook; social
network activated. About 30 people, half from industry,
participated. Identified a new process and resolving agent.
• Excel-gate: Reinhart & Rogoff, 2010: “as countries see
debt/GDP going above 90%, growth slows dramatically”.
The paper was used as the main theoretical justification for
austerity. 2013: after getting the original Excel file,
Herndon et al. discovered a coding error, data gaps and
unconventional weighting.

• Science outside academia
• Open data, article mining
• Mass collaboration, crowdsourcing, sensors

Science 2.0: much more than Open Access

• Open access
• Data-intensive
• Citizen science
• Open code
• Open labbooks / workflows
• Pre-print
• Open data
• Alternative reputation systems
• Open annotation
• Scientific blogs
• Collaborative bibliographies

An emerging ecosystem of services and standards
• Data-intensive: Figshare.com
• Citizen science: Sci-starter.com
• Open code: Runmycode.org
• Open labbooks / workflows: Myexperiment.org
• Pre-print: ArXiv
• Open access: Roar.eprints.org
• Open data: Datadryad.org
• Alternative reputation systems: Altmetric.com
• Open annotation: Openannotation.org
• Scientific blogs: Researchgate.com
• Collaborative bibliographies: Mendeley.com

Growing at different speeds
Trend: status; data
• Pre-print: mature; 694,000 articles in arXiv
• Open access: fast growing; exponential growth of OA journals, 8/10% of scientific output is OA
• Data-intensive: fast growing; 52% of science authors deal with datasets larger than 1 GB
• Citizen science: medium growth; 650K Zooniverse users, 500 similar projects on SciStarter
• Open data: medium growth; 20% of scientists share data, 15% of journals require data sharing
• Reference sharing: medium growth; 2 million users of Mendeley reference-sharing tools
• Open code: sketchy growth; 21% of JASA articles make code available, 7% of journals require code
• Open notebook: sketchy growth; isolated projects
Natural sciences outrank social science across all trends

Where The Data Goes Now:
> 50 My Papers
2 M scientists
2 M papers/year
• Majority of data (90%?) is stored on local hard drives
• Some data (8%?) stored in large, generic data repositories: Dryad (7,631 files), Dataverse (0.6 M), Datacite (1.5 M)
• A small portion of data (1-2%?) stored in small, topic-focused data repositories: PDB (88.3 k), PetDB (1.5 k), MiRB (25 k), SedDB (0.6 k), TAIR (72.1 k)
Source: Anita De Waard 2013

Deep implications
• New scientific outputs and players: nanopublications,
data and code; vertical disintegration of the value
chain
• Greater role for inductive methods: everything
becomes a Genome Project
• Scaling serendipity: Big linked data, collaborative
annotation, social networking and knowledge mining
detect unexpected correlations on a massive scale
• Better science: reproducible and truly falsifiable
research findings; earlier uncovering of mistakes
• More productive science: reusing data and products,
crowdsourcing work, reducing time-to-publication

Europe can lead
• European scientific publishers are leading on
experimentation with new kinds of open and data-intensive
services
E.g. “Article of the Future” project and AppsForScience competition
(Elsevier); data integration at Thieme (a small German publisher)

• Home to world-class science 2.0 startups:

Mendeley and ResearchGate are global players in social networking
for scientists; Digital Science recently acquired FigShare.
Mendeley is used by about 2 million researchers, covering 65 million
documents vs 49 million in Thomson Reuters' commercial databases.
Elsevier just bought Mendeley for 50 M euros.

• Home to top citizen science initiatives

(GalaxyZoo was launched in Oxford; ExCiteS group and Citizen
Cyberscience Centre)

• Funding agencies are active in new mandates on openness
(e.g. Wellcome Trust, FP7): open access, open data

BUT the institutional framework is
a bottleneck
• Researchers are reluctant to
share data and code [1], and
to provide open peer review
• Current career mechanisms
are “publish or perish”. No
reward for sharing.
• Publishing data and code
requires additional work
• Publishing intermediate
products can actually hinder
publication/patenting:
sharing is difficult in patent-intensive domains
• Funding mechanisms are too
rigid, roadmap-based and
evaluated on articles and
patents
[1] Wicherts et al., 2011; Research Information Network, 2008; Campbell, 2002

Institutional failure and the case for public intervention
• Contradictions emerge between individuals’ and societal benefits
• Research funders (and publishers) have high leverage on
scientific institutions
BENEFITS (individual researchers / institutions / business / publishers / societal benefits)
• Open access: ++ / + / + / -- / ++
• Open data: -- / -- / -- / + / ++
• Open code: -- / -- / -- / = / ++
• Citizen science: + / = / + / = / +
• Alternative reputation systems: + / - / + / - / +
• Data-intensive: + / + / + / + / ++
• Social media: + / = / = / = / +

How to grasp this opportunity?
• It’s not about adding a science 2.0 top-down
roadmap-based initiative in existing
programmes
• It’s not about simply letting a thousand
flowers flourish bottom-up
• It’s about nudging the right institutional rearrangement (Perez) and the right system of
incentives for the scientific value chain

Towards research policy 2.0
Recommendation: inspiring example
• Adopt more flexible reputation mechanisms for scientists: from 2013, NSF requires PIs to list research “products” rather than “publications”
• Encourage sharing by regulation: Wellcome Trust mandatory data plan
• Cover the costs of sharing intermediate output such as data: Gold access publication costs to be covered in Horizon2020
• Develop innovative infrastructure, tools, methods and standards: alternative reputation system, Openannotation, Datadryad
• Make IPR more flexible: Innocentive.com, Peertopatent.com
• Increase open-ended funding: FET Open, UK Arts Council, inducement prizes
• Collect better evidence: dedicated data-gathering exercise (à la PEW)

Thanks
• Continue the discussion at
science20study.wordpress.com
• Collect evidence and cases at
groups.diigo.com/group/science-20
• Contact david.osimo@tech4i2.com ;
katarzyna.szkuta@tech4i2.com ; @osimod

Backup

Emerging impact:
a) more productive science
– using the same data sets for multiple research projects. 50% of Hubble
papers came from data re-users [1].
– Crowdsourcing work: “thousands recruited in months versus years
and billions of data points per person, potential novel discovery in
the patterns of large data sets, and the possibility of near real-time
testing and application of new medical findings.” [2].
– “cut down the time it takes to go from lab to medicine by 10-15
years with Open Notebook Science”. “because of poor literature
analysis tools 20-25% of the work done in his synthetic chemistry
lab is unnecessary duplication or could be predicted to fail” [3]
– Faster circulation of high-quality ideas: 70% of publications
discussed in blogs are from high-impact journals
– Open research solved one-third of a sample of problems that large
and well-known R&D-intensive firms had been unsuccessful in
solving internally [4]
[1] http://archive.stsci.edu/hst/bibliography/pubstat.html
[2] http://www.jmir.org/2012/2/e46/
[3] http://science.okfn.org/category/pubs/
[4] Lakhani et al., 2007

b) Better science
• Greater falsifiability (Popper): move towards reproducible
science thanks to publishing data + code in addition to
the article
• Rapidly uncover mistaken findings (Climategate 2009 or
microarray-based clinical trials underway at Duke
University)
• Data sharing is associated with greater robustness of
findings [1]. Sharing data and notes applies to failures, as
well as successes
• Especially important for computational science
“Computational science cannot be elevated to a third
branch of the scientific method until it generates routinely
verifiable knowledge” [2]
[1] Wicherts et al., 2011
[2] Donoho, Stodden, et al. 2009

c) Greater role of inductive
methods
• “The end of theory”: “Here’s the evidence, now
what is the hypothesis?”
• All science becomes computational. 38% of
scientists spend more than 1/5 of their time
developing software (Merali, 2010).
• Greater availability of data collection and
datasets increases the utility of inductive
methods. Genome project as new paradigm

d) Scaling serendipity
• From penicillin to theory of relativity, serendipity has
always been a core component of science
• Big linked data, collaborative annotation and knowledge
mining of OA articles make it possible to detect unexpected
correlations on a massive scale. Mendeley manages the
bibliographies of 2 million scientists and uses them to
suggest further reading.
• Emerging evidence that for scholars recommendation is
more important than search for references. Social
networking and recommendation systems allow scientists
to “stumble upon” new evidence
• Successful open-research solvers solved problems at the
boundary or outside of their fields of expertise [1]
[1] Lakhani et al., 2007

e) New outputs and players
#beyondthepdf
• Nanopublications, datasets, code
• Integration of data and code with articles
• Reproducible papers and books

Emerging policies
• Funders and publishers have high leverage on researchers
• Increasing push towards Open Access from funders
• Journals and funding agencies increasingly require data
submission and data management plans
• From 14 January 2013, NSF grant forms require PIs to list
research “products” rather than “publications”
• Alternative metrics emerge such as altmetrics and
download statistics

Towards research policy 2.0
Features
• Simplified proposals
• Rewarding solutions, not proposals
• Multi-stage
• Open priorities
• Flexible and open-ended (allowing for serendipity)
• Peer-selection, reputation-based (funding not to the proposal but to the person)
• Multidisciplinarity by design
• Flexible IPR
• Short project time
• Accepting failure
• Transparency (open monitoring)
• Based on social network analysis

Examples
• Inducement prizes, e.g. http://www.heritagehealthprize.com
• Seed Capital: http://www.ibbt.be/en/istart/our-istart toolbox/iventure
• ERC: http://erc.europa.eu
• SBIR: http://www.sbir.gov
• FET Open: http://cordis.europa.eu/fp7/ict/fetopen/home_en.html
• SME: http://cordis.europa.eu/fetch?CALLER=PROGLINK_PARTNERS&ACTION=D&DOC=1&CAT=PROG&QUERY=012e7c324da6:39b1:49a0957c&RCN=862
• IBBT: www.ibbt.be
• Arts Council: http://www.artscouncil.org.uk/funding/grants arts
• Banca dell’innovazione / Innovation Bank: http://italianvalley.wired.it/news/altri/perche-ci serve-una-banca-nazionale-dell-

• Last year researchers at one biotech firm, Amgen,
found they could reproduce just six of 53
“landmark” studies in cancer research.
• Earlier, a group at Bayer, a drug company,
managed to repeat just a quarter of 67 similarly
important papers.
• A leading computer scientist frets that three-quarters of papers in his subfield are bunk.
• In 2000-10 roughly 80,000 patients took part in
clinical trials based on research that was later
retracted because of mistakes or improprieties.
• Conversely, failures to prove a hypothesis are
rarely even offered for publication, let alone
accepted. “Negative results” now account for
only 14% of published papers, down from 30%
in 1990. Yet knowing what is false is as
important to science as knowing what is true.
The failure to report failures means that
researchers waste money and effort exploring
blind alleys already investigated by other
scientists.

• When a prominent medical journal ran
research past other experts in the field, it
found that most of the reviewers failed to
spot mistakes it had deliberately inserted into
papers, even after being told they were being
tested.

Presentation of science 2.0 at European Astronomical Society

  • 1. Science 2.0: discussing the best available evidence. David Osimo, Open Evidence. Based on a study for DG RTD. 24th January 2013
  • 2. Three stories
      • Galaxy Zoo: lets users classify galaxies. 150K volunteers have already classified more than 10 million images of galaxies, “as accurate as that done by astronomers”. 25+ scientific articles by the Galaxy Zoo project (from 2009).
      • Synaptic Leap: set out to find an alternative drug treatment for schistosomiasis with fewer side effects. All data and experiments were published on an Electronic Lab Notebook and a social network was activated. About 30 people, half from industry, participated. They identified a new process and resolving agent.
      • Excel-gate: Reinhart & Rogoff, 2010: “as countries see debt/GDP going above 90%, growth slows dramatically”. The paper was used as the main theoretical justification for austerity. 2013: after getting the original Excel file, Herndon et al. discover a coding error, data gaps and unconventional weighting.
  • 4. Science 2.0: much more than Open Access [Diagram: Open access]
  • 6. An emerging ecosystem of services and standards
      • Data-intensive: Figshare.com
      • Citizen science: Sci-starter.com
      • Open code: Runmycode.org
      • Pre-print: ArXiv
      • Open labbooks / workflows: Myexperiment.org
      • Open access: Roar.eprints.org
      • Open data: Datadryad.org
      • Alternative reputation systems: Altmetric.com
      • Open annotation: Openannotation.org
      • Scientific blogs: Researchgate.com
      • Collaborative bibliographies: Mendeley.com
  • 7. Growing at different speeds
      Trend | Status | Data
      Pre-print | Mature | 694,000 articles in arXiv
      Open access | Fast growing | Exponential growth of OA journals; 8/10% of scientific output is OA
      Data intensive | Fast growing | 52% of science authors deal with datasets larger than 1 GB
      Citizen scientist | Medium growth | 650K Zooniverse users; 500 similar projects on SciStarter
      Open data | Medium growth | 20% of scientists share data; 15% of journals require data sharing
      Reference sharing | Medium growth | 2 million users of Mendeley reference-sharing tools
      Open code | Sketchy growth | 21% of JASA articles make code available; 7% of journals require code
      Open Notebook | Sketchy growth | Isolated projects
      Natural sciences outrank social sciences across all trends.
  • 8. Where the data goes now
      • > 50 M papers; 2 M scientists; 2 M papers/year
      • Majority of data (90%?) is stored on local hard drives
      • Some data (8%?) stored in large, generic data repositories: Dryad: 7,631 files; Dataverse: 0.6 M; Datacite: 1.5 M
      • A small portion of data (1-2%?) stored in small, topic-focused data repositories: PDB: 88.3 k; PetDB: 1.5 k; MiRB: 25 k; SedDB: 0.6 k; TAIR: 72.1 k
      Source: Anita De Waard 2013
  • 9. Deep implications
      • New scientific outputs and players: nanopublications, data and code; vertical disintegration of the value chain
      • Greater role for inductive methods: everything becomes a Genome Project
      • Scaling serendipity: big linked data, collaborative annotation, social networking and knowledge mining detect unexpected correlations on a massive scale
      • Better science: reproducible and truly falsifiable research findings; earlier uncovering of mistakes
      • More productive science: reusing data and products, crowdsourcing work, reducing time-to-publication
  • 10. Europe can lead
      • European scientific publishers are leading on experimentation with new kinds of open and data-intensive services, e.g. the “Article of the Future” project and the AppsForScience competition (Elsevier), and data integration at Thieme (a small German publisher)
      • Home to world-class science 2.0 startups: Mendeley and ResearchGate are global players in social networking for scientists; Digital Science recently acquired FigShare. Mendeley is used by about 2 million researchers and covers 65 million documents, vs 49 million in the commercial databases of Thomson Reuters. Elsevier just bought Mendeley for 50 M euros.
      • Home to top citizen science initiatives (GalaxyZoo was launched in Oxford; ExCiteS group and Citizen Cyberscience Centre)
      • Funding agencies are active in new mandates on openness (e.g. Wellcome Trust, FP7): open access, open data
  • 11. BUT the institutional framework is a bottleneck
      • Researchers are reluctant to share data and code [1], and to provide open peer review
      • Current career mechanisms are “publish or perish”. No reward for sharing.
      • Publishing data and code requires additional work
      • Publishing intermediate products can actually hinder publication/patenting: sharing is difficult in patent-intensive domains
      • Funding mechanisms are too rigid, roadmap-based and evaluated on articles and patents
      [1] Wicherts et al., 2011; Research Information Network, 2008; Campbell, 2002
  • 12. Institutional failure and the case for public intervention
      • Contradictions emerge between individuals’ and societal benefits
      • Research funders (and publishers) have high leverage on scientific institutions
      Benefits | Individual researchers | Institutions | Business | Publishers | Societal benefits
      Open access | ++ | + | + | -- | ++
      Open data | -- | -- | -- | + | ++
      Open code | -- | -- | -- | = | ++
      Citizen science | + | = | + | = | +
      Alternative reputation systems | + | - | + | - | +
      Data-intensive | + | + | + | + | ++
      Social media | + | = | = | = | +
  • 13. How to grasp this opportunity?
      • It’s not about adding a top-down, roadmap-based science 2.0 initiative to existing programmes
      • It’s not about simply letting a thousand flowers flourish bottom-up
      • It’s about nudging the right institutional rearrangement (Perez) and the right system of incentives for the scientific value chain
  • 14. Towards research policy 2.0
      Recommendation | Inspiring example
      Adopt more flexible reputation mechanisms for scientists | From 2013, NSF requires PIs to list research “products” rather than “publications”
      Encourage sharing by regulation | Wellcome Trust mandatory data plan
      Cover the costs of sharing intermediate output such as data | Gold access publication costs to be covered in Horizon2020
      Develop innovative infrastructure, alternative reputation systems, tools, methods and standards | Openannotation, Datadryad
      Make IPR more flexible | Innocentive.com, Peertopatent.com
      Increase open-ended funding system | FET Open, UK Arts Council, inducement prizes
      Collect better evidence | Dedicated data-gathering exercise (à la Pew)
  • 15. Thanks
      • Continue the discussion at science20study.wordpress.com
      • Collect evidence and cases at groups.diigo.com/group/science-20
      • Contact: david.osimo@tech4i2.com; katarzyna.szkuta@tech4i2.com; @osimod
  • 17. Emerging impact: a) more productive science
      • Using the same data sets for multiple studies: 50% of Hubble papers came from data re-users [1].
      • Crowdsourcing work: “thousands recruited in months versus years and billions of data points per person, potential novel discovery in the patterns of large data sets, and the possibility of near real-time testing and application of new medical findings.” [2]
      • “cut down the time it takes to go from lab to medicine by 10-15 years with Open Notebook Science”; “because of poor literature analysis tools 20-25% of the work done in his synthetic chemistry lab is unnecessary duplication or could be predicted to fail” [3]
      • Faster circulation of high-quality ideas: 70% of publications discussed in blogs are from high-impact journals
      • Open research solved one-third of a sample of problems that large and well-known R&D-intensive firms had been unsuccessful in solving internally [4]
      [1] http://archive.stsci.edu/hst/bibliography/pubstat.html
      [2] http://www.jmir.org/2012/2/e46/
      [3] http://science.okfn.org/category/pubs/
      [4] Lakhani et al., 2007
  • 18. b) Better science
      • Greater falsifiability (Popper): move towards reproducible science thanks to publishing data + code in addition to the article
      • Rapidly uncover mistaken findings (Climategate 2009 or microarray-based clinical trials underway at Duke University)
      • Data sharing is associated with greater robustness of findings [1]. Sharing data and notes applies to failures as well as successes
      • Especially important for computational science: “Computational science cannot be elevated to a third branch of the scientific method until it generates routinely verifiable knowledge” [2]
      [1] Wicherts et al., 2011
      [2] Donoho, Stodden, et al. 2009
  • 19. c) Greater role of inductive methods
      • “The end of theory”: “Here’s the evidence, now what is the hypothesis?”
      • All science becomes computational. 38% of scientists spend more than 1/5 of their time developing software (Merali, 2010).
      • Greater availability of data collection and datasets increases the utility of inductive methods. Genome project as new paradigm
  • 20. d) Scaling serendipity
      • From penicillin to the theory of relativity, serendipity has always been a core component of science
      • Big linked data, collaborative annotation and knowledge mining of OA articles make it possible to detect unexpected correlations on a massive scale. Mendeley manages the bibliographies of 2 million scientists and uses them to suggest further reading.
      • Emerging evidence that, for scholars, recommendation is more important than search for finding references. Social networking and recommendation systems allow scientists to “stumble upon” new evidence
      • Successful open-research solvers solved problems at the boundary of, or outside, their fields of expertise [1]
      [1] Lakhani et al., 2007
  • 21. e) New outputs and players #beyondthepdf
      • Nanopublications, datasets, code
      • Integration of data and code with articles
      • Reproducible papers and books
  • 22. Emerging policies
      • Funders and publishers have high leverage on researchers
      • Increasing push towards Open Access from funders
      • Journals and funding agencies increasingly require data submission and data management plans
      • From 14 January 2013, NSF grant forms require PIs to list research “products” rather than “publications”
      • Alternative metrics emerge, such as altmetrics and download statistics
  • 23. Towards research policy 2.0
      Features
      • Simplified proposals
      • Rewarding solutions, not proposals
      • Multi-stage
      • Open priorities
      • Flexible and open ended (allowing for serendipity)
      • Peer-selection
      • Reputation-based (funding not to the proposal but to the person)
      • Multidisciplinarity by design
      • Flexible IPR
      • Short project time
      • Accepting failure
      • Transparency (open monitoring)
      • Based on social network analysis
      Examples
      • Inducement prizes, e.g. http://www.heritagehealthprize.com
      • Seed Capital: http://www.ibbt.be/en/istart/our-istart toolbox/iventure
      • ERC: http://erc.europa.eu
      • SBIR: http://www.sbir.gov
      • FET OPEN: http://cordis.europa.eu/fp7/ict/fetopen/home_en.html
      • SME: http://cordis.europa.eu/fetch? CALLER=PROGLINK_PARTNERS&ACTION=D&DOC=1&CAT=PROG&QUERY=012e7c32 4da6:39b1:49a0 957c&RCN=862
      • IBBT: www.ibbt.be
      • Arts council: http://www.artscouncil.org.uk/funding/grants arts
      • Banca dell’innovazione / Innovation Bank: http://italianvalley.wired.it/news/altri/perche-ci serve-una-banca-nazionale-dell-
  • 24.
      • Last year researchers at one biotech firm, Amgen, found they could reproduce just six of 53 “landmark” studies in cancer research.
      • Earlier, a group at Bayer, a drug company, managed to repeat just a quarter of 67 similarly important papers.
      • A leading computer scientist frets that three-quarters of papers in his subfield are bunk.
      • In 2000-10 roughly 80,000 patients took part in clinical trials based on research that was later retracted because of mistakes or improprieties.
  • 25. Conversely, failures to prove a hypothesis are rarely even offered for publication, let alone accepted. “Negative results” now account for only 14% of published papers, down from 30% in 1990. Yet knowing what is false is as important to science as knowing what is true. The failure to report failures means that researchers waste money and effort exploring blind alleys already investigated by other scientists.
  • 26. When a prominent medical journal ran research past other experts in the field, it found that most of the reviewers failed to spot mistakes it had deliberately inserted into papers, even after being told they were being tested.