How can we make science more
trustworthy and FAIR? Principled
publishing for more evidence
based research
The perspective of transparency…
Scott Edmunds
IARC, 8th July 2019
IARC (15 years ago)
Scientists: need to convince public + politicians
https://www.newsweek.com/pruitt-trump-asbestos-chemicals-trump-962703
#FakeNews
#MakeCarcinogensGreatAgain
https://www.washingtonpost.com/news/morning-mix/wp/2015/11/09/scientist-falsified-data-for-cancer-research-once-described-as-holy-grail-feds-say
How not to regain trust?
Forensic Bioinformatics: where raw data and reported results
are used to reconstruct what the methods must have been.
https://retractionwatch.com/2011/05/04/the-importance-of-being-reproducible-keith-baggerly-tells-the-anil-potti-story/
How not to regain trust?
http://www.nature.com/news/data-sharing-make-outbreak-research-open-access-1.16966
How not to regain trust?
Data Sharing with Chinese Characteristics
“The 2016 draft HGR Regulation further declares “safeguarding
national security” as one of its legislative purposes, with biosecurity as
a core element of national security.”
“Echoing the repeated warning against illegal seizures of genetic
resources by foreign entities, the drafters identify in particular cross-
border data transfer as a new and covert means of seizure. This
position is in a distinctive contrast with international consensus on
the imperative of genomic data sharing, as recognized under the
Bermuda Principles, the Fort Lauderdale Agreement, and initiatives of
building interoperable rules of sharing, such as the Framework for
Responsible Sharing of Genomic and Health-Related Data of the
GA4GH.”
China: concurring regulation of cross-border genomic data sharing for statist control and individual protection
https://doi.org/10.1007/s00439-018-1903-2
How not to regain trust?
https://doi.org/10.1007/s00439-018-1903-2
https://doi.org/10.1038/nature14659
“Based on the same rationale, the MOST launched nationwide audit campaigns in
2011 and 2013 to identify sino-overseas projects that are unauthorized or
uncompliant with state policies.
It is noteworthy that in February 2018, the CAHGR revoked the licenses granted to
two high-profile collaborative projects, which concern the Comparative Genetic Study
of Psychosis in Han Chinese (between UCLA & SJTU) and the Genetic Foundation of
Depression in Chinese Women (between Oxford Uni and PKU), respectively, and
confiscated the exported genomic data (CAHGR 2018). The revocation was made
pursuant to the Administrative License Law, but no specific reasons were disclosed in
the formal decision.”
Data Sharing with Chinese Characteristics
How not to regain trust?
https://www.nature.com/articles/d41586-018-07222-2
The ministry says genomics giant BGI in Shenzhen and Shanghai’s Huashan Hospital were
caught breaking the rules after they put genetic information online without approval. The
data was part of a large international study on the genetics of depression, which was
published in Nature in 2015. The paper was based on anonymized sequence data from more
than 10,000 Chinese women, which BGI acknowledges it did not have permission to publish
in the paper's supplementary material.
A spokesperson from BGI says the company has destroyed the data, as requested by the
ministry. They say the company has also requested Nature remove the article from its
website.
Open Data saves lives, but kills
candidate gene studies
https://www.theatlantic.com/science/archive/2019/05/waste-1000-studies/589684/
Open Data saves lives, and size matters
+ Open Data
https://doi.org/10.1176/appi.ajp.2018.18070881
(inc. Chinese CONVERGE data)
=
Focusing on unscientific, unreproducible metrics
Incentivising short-term citations
How not to regain trust?
JIFBAIT Network
more
GWAS
GWAS
JIFBAIT NEWS
Arsenic Life forms, will
they take over the planet?
By Melba Ketchum, PhD
Which Overhyped, Unreproducible
Experiment Are You?
Want rapid citations for 2 years only? Carry out this quiz.
You got: STAP Cells
Of course dipping cells in
coffee will make them
pluripotent. Even if the
research gets discredited, it’ll
still get 100s of citations in
two years.
Publish or impoverish: An investigation of the monetary
reward system of science in China (1999-2016)
https://doi.org/10.1108/AJIM-01-2017-0014
http://www.szhrss.gov.cn/xxgk/zcfgjjd/gcjzyrcgl/201708/t20170831_8317284.htm
Scientists: what we are doing instead
Shenzhen "Peacock" "National leading talent scheme":
Science/Nature = ¥3M RMB, JCI Q1 = ¥1.6M RMB (1st & corresponding authors only)
Attempts to “game the peer-review system on an industrial
scale”
http://www.scientificamerican.com/article/for-sale-your-name-here-in-a-prestigious-science-journal/
Companies offering authorship of papers made to order by “paper
mills”.
Guaranteed publication in JIF journal, often using fake referees, ID
theft, etc. JIF 1-2 papers = ~$10,000 USD
In China publication + JIF = money = fraud
Do you want to be author of an IF 5.168 paper (OncoTarget)?
Title: “…meta-analysis to evaluate the long-term efficacy of different ****
drugs in the treatment of pancreatic cancer…”
Scientists: what we are doing instead
http://www.518sci.com/index.php?catid=17&ydzt=0-9999&zdprice=0-9999
http://www.scmp.com/comment/insight-opinion/article/1758662/china-must-restructure-its-academic-incentives-curb-research
Created by skewed incentive systems…
“While we are rightly proud of Hong Kong’s highly regarded and ranked
universities system, we are not immune to the same pressures. While
funders in Europe have moved away from using citation based metrics such
as JIF in their research assessments, the Hong Kong University Grants
Committee states in their Research Assessment Exercise guidelines that they
may informally use it.”
http://goo.gl/zUDEC9
How to regain trust?
Areas we need to tackle to allow citizens to trust us
Open Access - Change incentive systems away from dead tree advertising to FAIR data & reproducibility
Open Science - Increase transparency & fill the data gaps
Citizen Science - Involve the public in the scientific process
Provide evidence not advertising
Transparency or bust
Show me the peer reviews
Give me the data/code/protocols
Let me publish replication studies
Buckheit & Donoho: Scholarly articles are merely advertisement of
scholarship. The actual scholarly artifacts, i.e. the data and
computational methods, which support the scholarship, remain largely
inaccessible.
How to regain trust?
GigaScience Ethos/Policies: ‘Impact’ is subjective. Data is quantitative.
Reward evidence (data), not advertising
• Data
• Software
• Models
• Pipelines
• Reviews
• Re-use…
= Credit
Rewarding open data & code
http://gigasciencejournal.com/
Since July 2012. Publishes “Data Notes” for CC0 data, “Tech Notes” for OSI software.
Integrated GigaDB repository. DataCite DOIs. No size limits, APC covers storage.
http://gigadb.org/
Rewarding open data & code
http://gigasciencejournal.com/blog/shortcut-from-biorxiv-to-gigascience/
The rise of the pre-print
Increase transparency/speed with bioRxiv
GigaScience embraces
Publons + PrePrint.Space
= credit for reviewers' efforts
http://publons.com/
Credit transparency/open peer review
http://preprint.space/byjournal/gigascience
Rewarding & enabling interaction
Building tools (inc. JBrowse for genomes, Sketchfab for 3D images) on top of datasets…
CodeOcean widgets for code, “compute capsule” (data+code+environment) run on AWS
[Insert Widget Here]
Aiding reproducibility of imaging studies
OMERO: providing access to imaging data
Already used by JCB.
View, filter, measure raw images with direct links from journal article.
See all image data, not just cherry-picked examples.
Download and reprocess.
The zoom viewer allows whole-slide images to be explored at cellular resolution in the
context of a web browser, and without need for data download.
This example shows a lymph node section from a breast cancer patient.
These data are available at: http://dx.doi.org/10.5524/100439
The alternative...
...look but don't touch
First journal with deep integration with
Launched 2nd June 2016
Reward better handling of “wet” protocols…
• Create, share, modify forkable protocols in repo.
• Download & run on smartphone app.
• Widgets embedded in GigaDB
• Get discoverability, credit, DOIs for sharing methods.
• Create your own, or let us set up & you claim.
https://www.protocols.io/groups/gigascience-journal
Transparency to the rescue
Example 1
Germany 2011, >50 dead
To maximize its utility to the research community and aid those fighting
the current epidemic, genomic data is released here into the public domain
under a CC0 license. Until the publication of research papers on the
assembly and whole-genome analysis of this isolate we would ask you to
cite this dataset as:
Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang,
J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J;
Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X;
Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the
Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium
(2011)
Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI
Shenzhen. doi:10.5524/100001
http://dx.doi.org/10.5524/100001
Our first DOI:
To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to
Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
Open Data to the rescue…
Downstream consequences:
“Last summer, biologist Andrew Kasarskis was eager to help decipher the genetic origin of the Escherichia coli
strain that infected roughly 4,000 people in Germany between May and July. But he knew that it might take days
for the lawyers at his company — Pacific Biosciences — to parse the agreements governing how his team could
use data collected on the strain. Luckily, one team had released its data under a Creative Commons licence that
allowed free use of the data, allowing Kasarskis and his colleagues to join the international research effort and
publish their work without wasting time on legal wrangling.”
1. Many Citations
2. Therapeutics (primers, antimicrobials)
3. Platform Comparisons
4. Example for faster & more open science
1.3 The power of intelligently open data
The benefits of intelligently open data were powerfully
illustrated by events following an outbreak of a severe gastro-
intestinal infection in Hamburg in Germany in May 2011. This
spread through several European countries and the US,
affecting about 4000 people and resulting in over 50 deaths. All
tested positive for an unusual and little-known Shiga-toxin–
producing E. coli bacterium. The strain was initially analysed by
scientists at BGI-Shenzhen in China, working together with
those in Hamburg, and three days later a draft genome was
released under an open data licence. This generated interest
from bioinformaticians on four continents. 24 hours after the
release of the genome it had been assembled. Within a week
two dozen reports had been filed on an open-source site
dedicated to the analysis of the strain. These analyses
provided crucial information about the strain’s virulence and
resistance genes – how it spreads and which antibiotics are
effective against it. They produced results in time to help
contain the outbreak. By July 2011, scientists published papers
based on this work. By opening up their early sequencing
results to international collaboration, researchers in Hamburg
produced results that were quickly tested by a wide range of
experts, used to produce new knowledge and ultimately to
control a public health emergency.
Transparency to the rescue
Example 2
Oxford Nanopore in the spotlight, Sept 2014. Does it work?
https://doi.org/10.1111/1755-0998.12324
http://omicsomics.blogspot.com/2014/09/oxford-takes-some-flak-fires-back.html
Nanopore MinION E. coli genome released via GigaDB 10-Sep-2014
Curated & converted to ISA-tab, & worked with EBI to get raw data there
Data Note submitted & preprint version out 26-Sept-2014
Peer reviewed & published 20-Oct-2014
http://dx.doi.org/10.5524/100102
Would you trust a Chinese sequencer?
Try before you buy: inspect ALL the data yourselves
https://doi.org/10.1093/gigascience/gix024
• Comparisons with Illumina for PE50, 100 & 150
• Raw sequencing data in NCBI SRA
• FASTQ files in GigaDB
• Raw image files & protocols shared
Would you trust a Chinese sequencer?
Open, transparent and peer reviewed benchmarking
https://doi.org/10.1093/gigascience/gix024
http://dx.doi.org/10.5524/review.100698
http://dx.doi.org/10.5524/review.100699
Open Review
Would you trust a Chinese sequencer?
Transparency to the rescue
Example 3
http://dx.doi.org/10.5524/100034
What can you do with 15TB of open cancer data?
Hong Kong ACRG data in GigaDB
http://dx.doi.org/10.1126/scitranslmed.3006086
After open: FAIR
A mnemonic to remember: FAIR
http://www.nature.com/articles/sdata201618
http://www.datafairport.org/
Require stewardship on top of access
Beyond a mnemonic: FAIR ecosystems
FAIR metrics
https://www.go-fair.org/go-fair-initiative/
Beyond a mnemonic: FAIR Evaluation
Evaluating FAIR-Compliance Through an Objective, Automated, Community-Governed
Framework https://www.biorxiv.org/content/early/2018/09/16/418376
DTL/ELIXIR-NL
“Bring Your Own Data Party”
GigaScience/BGI HK
Metabolomics ISA-TAB-athon
More FAIR mnemonics: “BYODs”
Public:
Open Science, the final frontier:
democratising research for citizens
The Hong Kong example…
The Bauhinia Mystery?
HK Botanical & Afforestation Dept., 1903:
"The mysterious origin of the tree & its magnificent flowers at once arrest the interest. So far, all efforts to identify them with any foreign species have failed"
Courtesy of: Archives des Missions Etrangère de Paris
http://igg.me/at/bauhinia
TEDx EduHK https://youtu.be/RcBzJI8O4j0
As featured on…
Education: reproducible research
Education: sharing FAIR data
http://dx.doi.org/10.5524/100245
http://dx.doi.org/10.5524/100345
Student power (MSc @ CUHK)
Education: teaching people with the data
Transcriptomes assembled & annotated by MSc students
Looked at GO/KEGG & TCM compounds
Looked at parental links (diversity, maternal/paternal)
B. purpurea = mother
B. variegata = father
Open Science = Science
• Science needed more than ever to tackle grave
public health challenges
• Need to escape from our ivory towers & increase
transparency to regain stakeholder trust
• Take science back to standing on the shoulders of
giants, rather than unFAIR practices
• Choose evidence not branding
• Once we have Open Data, we then need FAIR data
stewardship
• New EU funder rules on open science/OA coming –
preempt FAIR assessment
Help GigaScience make it happen
www.gigasciencejournal.com
Give us your data, pipelines & papers
Contact us:
scott@gigasciencejournal.com
editorial@gigasciencejournal.com
database@gigasciencejournal.com
Thanks to:
Laurie Goodman, Editor in Chief
Nicole Nogoy, Editor
Hans Zauner, Assistant Editor
Hongling Zhao, Assistant Editor
Peter Li, Lead Data Manager
Chris Hunter, Lead BioCurator
Chris Armit, Data Scientist
Mary Ann Tuli, Data Editor
Xiao (Jesse) Si Zhe, Database Developer
Chen Qi, Shenzhen Office.
Follow us:
@GigaScience
facebook.com/GigaScience
http://gigasciencejournal.com/blog/
www.gigasciencejournal.com
www.gigadb.org
+ Weibo & WeChat

Weitere ähnliche Inhalte

Was ist angesagt?

RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...
RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...
RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
Ian Foster
 

Was ist angesagt? (20)

Open PHACTS : Linked Data Future Challenges
Open PHACTS : Linked Data Future ChallengesOpen PHACTS : Linked Data Future Challenges
Open PHACTS : Linked Data Future Challenges
 
Reproducibility
ReproducibilityReproducibility
Reproducibility
 
David Tyrpak CV
David Tyrpak CVDavid Tyrpak CV
David Tyrpak CV
 
When pharmaceutical companies publish large datasets an abundance of riches o...
When pharmaceutical companies publish large datasets an abundance of riches o...When pharmaceutical companies publish large datasets an abundance of riches o...
When pharmaceutical companies publish large datasets an abundance of riches o...
 
Building A Community Resource For The Life Sciences
Building A Community Resource For The Life SciencesBuilding A Community Resource For The Life Sciences
Building A Community Resource For The Life Sciences
 
Improving online chemistry one structure at a time
Improving online chemistry one structure at a timeImproving online chemistry one structure at a time
Improving online chemistry one structure at a time
 
RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...
RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...
RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...
 
Enriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentationEnriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentation
 
Vaccination PowerPoint 3b
Vaccination PowerPoint 3bVaccination PowerPoint 3b
Vaccination PowerPoint 3b
 
Vaccination PowerPoint Second Assignment Four Adjustments
Vaccination PowerPoint Second Assignment Four AdjustmentsVaccination PowerPoint Second Assignment Four Adjustments
Vaccination PowerPoint Second Assignment Four Adjustments
 
Nicole Nogoy: GigaScience...how licensing can change the way we do research
Nicole Nogoy: GigaScience...how licensing can change the way we do researchNicole Nogoy: GigaScience...how licensing can change the way we do research
Nicole Nogoy: GigaScience...how licensing can change the way we do research
 
April 2020 top read articles in data mining & knowledge management proces...
April 2020 top read articles in data mining & knowledge management proces...April 2020 top read articles in data mining & knowledge management proces...
April 2020 top read articles in data mining & knowledge management proces...
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
 
A brief history of reaction analytics (CINF 144, ACS National Meeting 2018-08...
A brief history of reaction analytics (CINF 144, ACS National Meeting 2018-08...A brief history of reaction analytics (CINF 144, ACS National Meeting 2018-08...
A brief history of reaction analytics (CINF 144, ACS National Meeting 2018-08...
 
Nick Brown Drug Repositioning Informatics
Nick Brown Drug Repositioning InformaticsNick Brown Drug Repositioning Informatics
Nick Brown Drug Repositioning Informatics
 
Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature
 
Leveraging Publicly Accessible Clinical Trails Data Sharing, Dissemination an...
Leveraging Publicly Accessible Clinical Trails Data Sharing, Dissemination an...Leveraging Publicly Accessible Clinical Trails Data Sharing, Dissemination an...
Leveraging Publicly Accessible Clinical Trails Data Sharing, Dissemination an...
 
BioVariance - Pediatric Pharmacogenomics in Drug Discovery
BioVariance - Pediatric Pharmacogenomics in Drug DiscoveryBioVariance - Pediatric Pharmacogenomics in Drug Discovery
BioVariance - Pediatric Pharmacogenomics in Drug Discovery
 
Aries systems eemug 2021 manuscript eval services panel sci score v2_edits
Aries systems eemug 2021 manuscript eval services panel sci score v2_editsAries systems eemug 2021 manuscript eval services panel sci score v2_edits
Aries systems eemug 2021 manuscript eval services panel sci score v2_edits
 

Ähnlich wie Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR? Principled publishing for more evidence based research

Ähnlich wie Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR? Principled publishing for more evidence based research (20)

Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...
 
Scott Edmunds & Mendel Wong, Citizen Science #101. HKU MPA lecuture
Scott Edmunds & Mendel Wong, Citizen Science #101. HKU MPA lecutureScott Edmunds & Mendel Wong, Citizen Science #101. HKU MPA lecuture
Scott Edmunds & Mendel Wong, Citizen Science #101. HKU MPA lecuture
 
GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.
 
GigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDBGigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDB
 
Nicole Nogoy at the Auckland BMC RoadShow
Nicole Nogoy at the Auckland BMC RoadShowNicole Nogoy at the Auckland BMC RoadShow
Nicole Nogoy at the Auckland BMC RoadShow
 
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
 
From Deadly E. coli to Endangered Polar Bear: GigaScience Provides First Cita...
From Deadly E. coli to Endangered Polar Bear: GigaScience Provides First Cita...From Deadly E. coli to Endangered Polar Bear: GigaScience Provides First Cita...
From Deadly E. coli to Endangered Polar Bear: GigaScience Provides First Cita...
 
HKU Data Curation MLIM7350 Class 7
HKU Data Curation MLIM7350 Class 7HKU Data Curation MLIM7350 Class 7
HKU Data Curation MLIM7350 Class 7
 
Scott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
Scott Edmunds: GigaScience Datacite meeting Rapid Fire TalkScott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
Scott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
 
Big Data, AI, and Pharma
Big Data, AI, and PharmaBig Data, AI, and Pharma
Big Data, AI, and Pharma
 
Scott Edmunds: Publishing in the Open Data Era, talk at Hackerspace.sg
Scott Edmunds: Publishing in the Open Data Era, talk at Hackerspace.sgScott Edmunds: Publishing in the Open Data Era, talk at Hackerspace.sg
Scott Edmunds: Publishing in the Open Data Era, talk at Hackerspace.sg
 
Open Data HK: open science meets open data. A primer from Scott Edmunds
Open Data HK: open science meets open data. A primer from Scott EdmundsOpen Data HK: open science meets open data. A primer from Scott Edmunds
Open Data HK: open science meets open data. A primer from Scott Edmunds
 
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
 
Linked data in industry
Linked data in industryLinked data in industry
Linked data in industry
 
Scott Edmunds Open data examples, from the Science as an Open Enterprise sess...
Scott Edmunds Open data examples, from the Science as an Open Enterprise sess...Scott Edmunds Open data examples, from the Science as an Open Enterprise sess...
Scott Edmunds Open data examples, from the Science as an Open Enterprise sess...
 
International perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataInternational perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research data
 
Scott Edmunds ICIS talk at UC Davis: Open Publishing for the Big Data era
Scott Edmunds ICIS talk at UC Davis: Open Publishing for the Big Data eraScott Edmunds ICIS talk at UC Davis: Open Publishing for the Big Data era
Scott Edmunds ICIS talk at UC Davis: Open Publishing for the Big Data era
 
HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8
 
Thesis Proposal, as presented for dissertation proposal defense
Thesis Proposal, as presented for dissertation proposal defenseThesis Proposal, as presented for dissertation proposal defense
Thesis Proposal, as presented for dissertation proposal defense
 
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScienceScott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
 

Mehr von GigaScience, BGI Hong Kong

Mehr von GigaScience, BGI Hong Kong (20)

IDW2022: A decades experiences in transparent and interactive publication of ...
IDW2022: A decades experiences in transparent and interactive publication of ...IDW2022: A decades experiences in transparent and interactive publication of ...
IDW2022: A decades experiences in transparent and interactive publication of ...
 
Scott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByteScott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByte
 
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
 
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
 
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
 
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
 
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project: Production...
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project:  Production...PAGAsia19 - The Digitalization of Ruili Botanical Garden Project:  Production...
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project: Production...
 
Hong Kong Open Access & GigaScience: CCHK@10
Hong Kong Open Access & GigaScience: CCHK@10Hong Kong Open Access & GigaScience: CCHK@10
Hong Kong Open Access & GigaScience: CCHK@10
 
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU GuixRicardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
 
Anil Thanki at #ICG13: Aequatus: An open-source homology browser
Anil Thanki at #ICG13: Aequatus: An open-source homology browserAnil Thanki at #ICG13: Aequatus: An open-source homology browser
Anil Thanki at #ICG13: Aequatus: An open-source homology browser
 
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
 
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant scienceVenice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
 
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
 
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR? Principled publishing for more evidence based research

  • 1. How can we make science more trustworthy and FAIR? Principled publishing for more evidence based research The perspective of transparency… Scott Edmunds IARC, 8th July 2019
  • 3. Scientists: need to convince public + politicians
  • 4. Scientists: need to convince public + politicians https://www.newsweek.com/pruitt-trump-asbestos-chemicals-trump-962703 #FakeNews #MakeCarcinogensGreatAgain
  • 6. Forensic Bioinformatics: where raw data and reported results are used to reconstruct what the methods must have been. https://retractionwatch.com/2011/05/04/the-importance-of-being-reproducible-keith-baggerly-tells-the-anil-potti-story/ How not to regain trust?
  • 8. Data Sharing with Chinese Characteristics “The 2016 draft HGR Regulation further declares “safeguarding national security” as one of its legislative purposes, with biosecurity as a core element of national security.” “Echoing the repeated warning against illegal seizures of genetic resources by foreign entities, the drafters identify in particular cross-border data transfer as a new and covert means of seizure. This position is in a distinctive contrast with international consensus on the imperative of genomic data sharing, as recognized under the Bermuda Principles, the Fort Lauderdale Agreement, and initiatives of building interoperable rules of sharing, such as the Framework for Responsible Sharing of Genomic and Health-Related Data of the GA4GH.” China: concurring regulation of cross-border genomic data sharing for statist control and individual protection https://doi.org/10.1007/s00439-018-1903-2 How not to regain trust?
  • 9. https://doi.org/10.1007/s00439-018-1903-2 https://doi.org/10.1038/nature14659 “Based on the same rationale, the MOST launched nationwide audit campaigns in 2011 and 2013 to identify sino-overseas projects that are unauthorized or uncompliant with state policies. It is noteworthy that in February 2018, the CAHGR revoked the licenses granted to two high-profile collaborative projects, which concern the Comparative Genetic Study of Psychosis in Han Chinese (between UCLA & SJTU) and the Genetic Foundation of Depression in Chinese Women (between Oxford Uni and PKU), respectively, and confiscated the exported genomic data (CAHGR 2018). The revocation was made pursuant to the Administrative License Law, but no specific reasons were disclosed in the formal decision.” Data Sharing with Chinese Characteristics How not to regain trust?
  • 10. https://www.nature.com/articles/d41586-018-07222-2 The ministry says genomics giant BGI in Shenzhen and Shanghai’s Huashan Hospital were caught breaking the rules after they put genetic information online without approval. The data was part of a large international study on the genetics of depression, which was published in Nature in 2015. The paper was based on anonymized sequence data from more than 10,000 Chinese women, which BGI acknowledges it did not have permission to publish in the paper's supplementary material. A spokesperson from BGI says the company has destroyed the data, as requested by the ministry. They say the company has also requested that Nature remove the article from its website.
  • 11. Open Data saves lives, but kills candidate gene studies https://www.theatlantic.com/science/archive/2019/05/waste-1000-studies/589684/
  • 12. Open Data saves lives, and size matters +Open Data https://doi.org/10.1176/appi.ajp.2018.18070881 (inc. Chinese CONVERGE data) =
  • 13. Focusing on unscientific, unreproducible metrics. Incentivising short-term citations. How not to regain trust?
  • 14. JIFBAIT Network more GWAS GWAS JIFBAIT NEWS Arsenic Life forms, will they take over the planet? By Melba Ketchum, PhD Which Overhyped, Unreproducible Experiment Are You? Want rapid citations for 2 years only? Carry out this quiz. You got: STAP Cells Of course dipping cells in coffee will make them pluripotent. Even if the research gets discredited, it’ll still get 100’s of citations in two years.
  • 15. Publish or impoverish: An investigation of the monetary reward system of science in China (1999-2016) https://doi.org/10.1108/AJIM-01-2017-0014 http://www.szhrss.gov.cn/xxgk/zcfgjjd/gcjzyrcgl/201708/t20170831_8317284.htm Scientists: what we are doing instead. Shenzhen "Peacock" "National leading talent scheme": Science/Nature = ¥3M RMB, JCI Q1 = ¥1.6M RMB (1st & corresponding authors only)
  • 16. Attempts to “game the peer-review system on an industrial scale” http://www.scientificamerican.com/article/for-sale-your-name-here-in-a-prestigious-science-journal/ Companies offering authorship of papers made to order by “paper mills”. Guaranteed publication in JIF journal, often using fake referees, ID theft, etc. JIF 1-2 papers = ~$10,000 USD In China publication + JIF = money = fraud
  • 17. Do you want to be author of an IF 5.168 paper (OncoTarget)? Title: “…meta-analysis to evaluate the long-term efficacy of different **** drugs in the treatment of pancreatic cancer…” Scientists: what we are doing instead http://www.518sci.com/index.php?catid=17&ydzt=0-9999&zdprice=0-9999
  • 18. http://www.scmp.com/comment/insight-opinion/article/1758662/china-must-restructure-its-academic-incentives-curb-research Created by skewed incentive systems… “While we are rightly proud of Hong Kong’s highly regarded and ranked universities system, we are not immune to the same pressures. While funders in Europe have moved away from using citation based metrics such as JIF in their research assessments, the Hong Kong University Grants Committee states in their Research Assessment Exercise guidelines that they may informally use it.”
  • 20. How to regain trust? Areas we need to tackle to allow citizens to trust us Open Access - Change incentive systems away from dead tree advertising to FAIR data & reproducibility Open Science - Increase transparency & fill the data gaps Citizen Science - Involve the public in the scientific process
  • 21. Provide evidence not advertising Transparency or bust Show me the peer reviews Give me the data/code/protocols Let me publish replication studies Buckheit & Donoho: Scholarly articles are merely advertisement of scholarship. The actual scholarly artifacts, i.e. the data and computational methods, which support the scholarship, remain largely inaccessible. How to regain trust?
  • 22. GigaScience Ethos/Policies: 'Impact' is subjective. Data is quantitative. Reward evidence (data), not advertising • Data • Software • Models • Pipelines • Reviews • Re-use… = Credit
  • 23. Rewarding open data & code http://gigasciencejournal.com/ Since July 2012. Publishes “Data Notes” for CC0 data, “Tech Notes” for OSI software.
  • 24. Integrated GigaDB repository. DataCite DOIs. No size limits, APC covers storage. http://gigadb.org/ Rewarding open data & code
  • 25. http://gigasciencejournal.com/blog/shortcut-from-biorxiv-to-gigascience/ The rise of the pre-print. Increase transparency/speed with bioRxiv. GigaScience embraces
  • 26. Publons + PrePrint.Space = credit for reviewers' efforts http://publons.com/ Credit transparency/open peer review http://preprint.space/byjournal/gigascience
  • 27. Rewarding & enabling interaction. Building tools (inc. JBrowse for genomes, Sketchfab for 3D images) on top of datasets… Code Ocean widgets for code, “compute capsule” (data+code+environment) run on AWS [Insert Widget Here]
  • 28. Aiding reproducibility of imaging studies OMERO: providing access to imaging data Already used by JCB. View, filter, measure raw images with direct links from journal article. See all image data, not just cherry picked examples. Download and reprocess.
  • 29. The zoom viewer allows whole-slide images to be explored at cellular resolution in the context of a web browser, and without need for data download. This example shows a lymph node section from a breast cancer patient. These data are available at: http://dx.doi.org/10.5524/100439
  • 31. First journal with deep integration with protocols.io. Launched 2nd June 2016. Reward better handling of “wet” protocols… • Create, share, modify forkable protocols in repo. • Download & run on smartphone app. • Widgets embedded in GigaDB • Get discoverability, credit, DOIs for sharing methods. • Create your own, or let us set up & you claim. https://www.protocols.io/groups/gigascience-journal
  • 32. Transparency to the rescue Example 1
  • 34. To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as: Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011) Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001 http://dx.doi.org/10.5524/100001 Our first DOI: To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China. Open Data to the rescue…
  • 38. Downstream consequences: “Last summer, biologist Andrew Kasarskis was eager to help decipher the genetic origin of the Escherichia coli strain that infected roughly 4,000 people in Germany between May and July. But he knew it might take days for the lawyers at his company, Pacific Biosciences, to parse the agreements governing how his team could use data collected on the strain. Luckily, one team had released its data under a Creative Commons licence that allowed free use of the data, allowing Kasarskis and his colleagues to join the international research effort and publish their work without wasting time on legal wrangling.” 1. Many Citations 2. Therapeutics (primers, antimicrobials) 3. Platform Comparisons 4. Example for faster & more open science
  • 39. 1.3 The power of intelligently open data The benefits of intelligently open data were powerfully illustrated by events following an outbreak of a severe gastrointestinal infection in Hamburg in Germany in May 2011. This spread through several European countries and the US, affecting about 4000 people and resulting in over 50 deaths. All tested positive for an unusual and little-known Shiga-toxin-producing E. coli bacterium. The strain was initially analysed by scientists at BGI-Shenzhen in China, working together with those in Hamburg, and three days later a draft genome was released under an open data licence. This generated interest from bioinformaticians on four continents. 24 hours after the release of the genome it had been assembled. Within a week two dozen reports had been filed on an open-source site dedicated to the analysis of the strain. These analyses provided crucial information about the strain’s virulence and resistance genes – how it spreads and which antibiotics are effective against it. They produced results in time to help contain the outbreak. By July 2011, scientists published papers based on this work. By opening up their early sequencing results to international collaboration, researchers in Hamburg produced results that were quickly tested by a wide range of experts, used to produce new knowledge and ultimately to control a public health emergency.
  • 40. Transparency to the rescue Example 2
  • 41. Oxford Nanopore in the spotlight, Sept 2014. Does it work? https://doi.org/10.1111/1755-0998.12324 http://omicsomics.blogspot.com/2014/09/oxford-takes-some-flak-fires-back.html
  • 42. Nanopore MinION E. coli genome released via GigaDB 10-Sep-2014. Curated & converted to ISA-tab, & worked with EBI to get raw data there. Data Note submitted & preprint version out 26-Sep-2014. Peer reviewed & published 20-Oct-2014. http://dx.doi.org/10.5524/100102
  • 43. Would you trust a Chinese sequencer?
  • 44. Try before you buy: inspect ALL the data yourselves https://doi.org/10.1093/gigascience/gix024 • Comparisons with Illumina for PE50, 100 & 150 • Raw sequencing data in NCBI SRA • FASTQ files in GigaDB • Raw image files & protocols shared Would you trust a Chinese sequencer?
  • 45. Open, transparent and peer reviewed benchmarking https://doi.org/10.1093/gigascience/gix024 Open Review: http://dx.doi.org/10.5524/review.100698 http://dx.doi.org/10.5524/review.100699 Would you trust a Chinese sequencer?
  • 46. Transparency to the rescue Example 3
  • 47. http://dx.doi.org/10.5524/100034 What can you do with 15TB of open cancer data? Hong Kong ACRG data in GigaDB
  • 48. http://dx.doi.org/10.1126/scitranslmed.3006086 What can you do with 15TB of open cancer data? Hong Kong ACRG data in GigaDB
  • 50. A mnemonic to remember: FAIR http://www.nature.com/articles/sdata201618 http://www.datafairport.org/ Require stewardship on top of access
  • 51. A mnemonic to remember: FAIR http://www.nature.com/articles/sdata201618 http://www.datafairport.org/
  • 52. Beyond a mnemonic: FAIR ecosystems FAIR metrics https://www.go-fair.org/go-fair-initiative/
  • 53. Beyond a mnemonic: FAIR Evaluation Evaluating FAIR-Compliance Through an Objective, Automated, Community-Governed Framework https://www.biorxiv.org/content/early/2018/09/16/418376
  • 54. DTL/ELIXIR-NL “Bring Your Own Data Party” GigaScience/BGI HK Metabolomics ISA-TAB athon v More FAIR mnemonics: “BYODs”
  • 56. Open Science, the final frontier: democratising research for citizens The Hong Kong example…
  • 57. The Bauhinia Mystery? HK Botanical & Afforestation Dept., 1903: "The mysterious origin of the tree & its magnificent flowers at once arrest the interest. So far, all efforts to identify them with any foreign species have failed"
  • 58. Courtesy of: Archives des Missions Etrangère de Paris
  • 64. Education: sharing FAIR data http://dx.doi.org/10.5524/100245 http://dx.doi.org/10.5524/100345
  • 65. Student power (MSc @ CUHK) Education: teaching people with the data. Transcriptomes assembled & annotated by MSc students. Looked at GO/KEGG & TCM compounds. Looked at parental links (diversity, maternal/paternal). B. purpurea = mother, B. variegata = father
  • 66. Open Science = Science • Science needed more than ever to tackle grave public health challenges • Need to escape from our ivory towers & increase transparency to regain stakeholder trust • Take science back to standing on the shoulders of giants, rather than unFAIR practices • Choose evidence not branding • Once we have Open Data, we then need FAIR data stewardship • New EU funder rules on open science/OA coming – preempt FAIR assessment
  • 67. Help GigaScience make it happen. Give us your data, pipelines & papers. www.gigasciencejournal.com Contact us: scott@gigasciencejournal.com editorial@gigasciencejournal.com database@gigasciencejournal.com
  • 68. Thanks to: Laurie Goodman, Editor in Chief; Nicole Nogoy, Editor; Hans Zauner, Assistant Editor; Hongling Zhao, Assistant Editor; Peter Li, Lead Data Manager; Chris Hunter, Lead BioCurator; Chris Armit, Data Scientist; Mary Ann Tuli, Data Editor; Xiao (Jesse) Si Zhe, Database Developer; Chen Qi, Shenzhen Office. Follow us: @GigaScience facebook.com/GigaScience http://gigasciencejournal.com/blog/ www.gigasciencejournal.com www.gigadb.org + Weibo & WeChat
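
The dataset DOIs cited in the slides appear in several written forms: bare (10.5524/100001), prefixed with `doi:`, and as older `http://dx.doi.org/` links. A minimal sketch, assuming Python, of how such strings could be normalized to the `https://doi.org/` resolver form before being cited or checked. The helper name `normalize_doi` and the validation pattern are illustrative assumptions, not part of any GigaDB or DataCite tooling.

```python
import re

# Hypothetical helper for illustration only; not a GigaDB or DataCite API.
# Matches the common ways a DOI is prefixed in citations and links.
_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)

# Loose shape check (assumption): "10.", a 4-9 digit registrant code, "/", a suffix.
_DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(raw: str) -> str:
    """Return the https://doi.org/ URL for a DOI written in any common form."""
    doi = _PREFIX.sub("", raw.strip())
    if not _DOI_SHAPE.match(doi):
        raise ValueError(f"not a recognisable DOI: {raw!r}")
    return f"https://doi.org/{doi}"


if __name__ == "__main__":
    # Example strings taken from the slides above.
    for raw in (
        "doi:10.5524/100001",
        "http://dx.doi.org/10.5524/100102",
        "10.1093/gigascience/gix024",
    ):
        print(normalize_doi(raw))
```

The sketch only rewrites the string; it does not contact the resolver, so it says nothing about whether a given DOI is actually registered.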