SlideShare a Scribd company logo
1 of 55
Progress in Delivering Transparency in
Research Data by the National Center for
Computational Toxicology at the US-EPA
Antony J. Williams, Jeff Edwards, Chris Grulke and John Cowden
National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC
August 2018
ACS Fall Meeting, Boston
http://www.orcid.org/0000-0002-2668-4821
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
Disclaimer of Endorsement
• Mention of or referral to commercial
products or services, and/or links to non-
EPA sites does not imply official EPA
endorsement of or responsibility for the
opinions, ideas, data, or products
presented at those locations, or guarantee
the validity of the information provided.
1
• National Center for Computational Toxicology
established in 2005 to integrate:
– High-throughput and high-content technologies
– Modern molecular biology
– Data mining and statistical modeling
– Computational biology and chemistry
• Outputs: a lot of data, models, algorithms,
software applications and publications
• Open Data – we want scientists to interrogate
it, learn from it, develop understanding
National Center for
Computational Toxicology
The CompTox Portal
https://comptox.epa.gov/
3
Downloadable CompTox Data
https://www.epa.gov/chemical-research/downloadable-computational-toxicology-data
4
Deliver Data for Reuse:
DIFFERENT formats
https://www.epa.gov/chemical-research/toxicity-forecaster-toxcasttm-data
Open Data means Reuse
For Science
6
Open Data means Reuse
For Software
7
CompTox Dashboard
https://comptox.epa.gov/dashboard
8
CompTox Dashboard
Chemicals
9
CompTox Dashboard
Products and Use Categories
10
CompTox Dashboard
Assays and Genes
11
Downloadable data in
useful formats
12
EPA FigShare Page
https://epa.figshare.com/
13
Datasets with versioning
14
Measuring Our IMPACT
15
The impact of our paper versus the
impact of our data…187 downloads
16
FAIR Data
17
FAIRsharing.org
18
Examples of our transparency
1. OPERA Prediction Models
19
What are OPERA Models?
20
What are OPERA Models?
Detailed QMRF reports
21
How Transparent in the Publication?
22
On FigShare
23
How Transparent with the Code?
https://github.com/kmansouri/OPERA
24
How widely do we share?
25
How widely do we share?
26
Journals for Data Articles
27
Data Journals Hold Promise
28
Examples of our transparency
2. The Chemical and Products Database
29
Examples of our transparency
2. The Chemical and Products Database
30
What chemicals are in
Arts and Crafts Paint?
31
Examples of our transparency
2. The Chemical and Products Database
32
Are we oversharing?
• We share our datasets
• We share our models
• We share our code
• We share our database schemas
• We share our database dumps
• We are not yet sharing all code under an
application like the CompTox Dashboard..
33
A Recurring Plea for Formats
• Chemical data exchange formats are critical
• We all know it’s imperfect, that there are
efforts afoot, and it’s taking time
34
2015
2018
Use APPROPRIATE formats
• Our databases use chemical structures
• Many journal articles deliver poor data
35
Chemical structures as text
36
How do I extract structures?
• Copy and paste into Excel as a start point
• Assume no loss of formatting!
• Convert SMILES to structures
• But Copy-Paste doesn’t work
37
How do I extract structures?
• Copy and paste into Excel as a start point
• Assume no loss of formatting!
• Convert SMILES to structures
• Copy-Paste doesn’t work
38
Structure Drawings Are Worse
39
Names and CASRNs
are NOT structures
• In our domain most chemicals are text –
chemical names and CAS Numbers
40
And generally problematic…
41
So this is how we publish our
chemical substance data
42
Publish data for your domain
• Push data out in immediately useful formats
– Make multiple formats available as appropriate – SQL
database dumps, Excel files, etc.
– For chemistry - SDF files – includes LAYOUT and data
(but requires cheminformatics tools), Excel files generally
handled at any desktop.
43
So this is how we publish our
chemical substance data
44
When we publish now…
• Add the data as a “list” to our Lists of Chemicals
• Generally store files on our FTP site PLUS
copies in a repository (or two)
• Multiple formats of data as appropriate
– Can be as supplementary data or DOI’ed data files
• DOI’ed data gives altmetrics also..
45
PERSONAL affection for
Open Peer Review
46
Will my data always be available?
• There are no guarantees our data will
always be on all sites. They come and go.
• Some organizations are just as concerned
as you about your data…
47
Future Work
• We have a lot more models to make
available – Open and free web services
• We are planning for embeddable widgets
to access our Open data
• The challenge of versioning – ongoing
curation of data requires version tracking
48
NCCT “InvitroDB_v3”
• The last public release of ToxCast data
(invitroDB_v2) was in 3rd Quarter of 2015
• The next release invitroDB_v3 is Fall 2018
• Data includes new assays, new chemicals,
new pipelining, results of data curation
• Data will also release via CompTox Dashboard
• Data will be available at https://www.epa.gov/chemical-
research/exploring-toxcast-data-downloadable-data
49
NCCT Transcriptomics Data will Deliver
50 Terabytes of Data for Analysis
TruSeq
r2 0.74
TempO-Seq
r2 0.75
Low Coverage
r2 0.83
Currently capable of assigning to >40
MOAs based on transcriptional
responses
MOA Analysis Pipeline
• Large scale screen of 1,000 chemicals (ToxCast I/II)
Additional screens across multiple cell types/lines
• Additional reference chemicals and genetic
perturbations (RNAi/CRISPR/cDNA)
Conclusions
• For publications and for applications deliver
your data in “fit-for-purpose” form
• Multiple formats of data are appropriate
• Consider your audience(s) and “how would
YOU want the data”???
• Not everything can be put in a supplementary
file – use repositories and DOI your data
• Take the benefits of DOIs to measure
altmetrics
51
Conclusion
• NCCT works with transparency in mind – our
data is released for community usage – as data
and in apps
• The CompTox Dashboard is the new application
to surface all data as the architecture expands
• We attempt to deliver data in as transparent a
form as possible for our scientific publications
• We consider long-term access and versioning
in our releases. Lots of work to do…
52
…and let other people use it…
Contact
Antony Williams
US EPA Office of Research and Development
National Center for Computational Toxicology (NCCT)
Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821
54

More Related Content

What's hot

ICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxonICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxonDr. Haxel Consult
 
Public Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingPublic Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingAnita de Waard
 
Publishing data and code openly
Publishing data and code openlyPublishing data and code openly
Publishing data and code openlyFAIRDOM
 
ICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChemICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChemDr. Haxel Consult
 
2014 genome informatics Linked Data
2014 genome informatics Linked Data2014 genome informatics Linked Data
2014 genome informatics Linked DataENCODE-DCC
 
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...Dr. Haxel Consult
 
II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...
II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...
II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...Dr. Haxel Consult
 
OHDSI CDM Presentation
OHDSI CDM PresentationOHDSI CDM Presentation
OHDSI CDM PresentationAmritansh .
 
II-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web Search
II-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web SearchII-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web Search
II-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web SearchDr. Haxel Consult
 
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...DATAVERSITY
 
Building linked data large-scale chemistry platform - challenges, lessons and...
Building linked data large-scale chemistry platform - challenges, lessons and...Building linked data large-scale chemistry platform - challenges, lessons and...
Building linked data large-scale chemistry platform - challenges, lessons and...Valery Tkachenko
 
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBDr. Haxel Consult
 

What's hot (20)

ICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxonICIC 2013 New Product Introductions ChemAxon
ICIC 2013 New Product Introductions ChemAxon
 
Does bigger mean better in the world of chemistry databases?
Does bigger mean better in the world of chemistry databases? Does bigger mean better in the world of chemistry databases?
Does bigger mean better in the world of chemistry databases?
 
Overview of open resources to support automated structure verification and e...
Overview of open resources to support automated structure verification  and e...Overview of open resources to support automated structure verification  and e...
Overview of open resources to support automated structure verification and e...
 
Sharing chemical structures with peer reviewed publications
Sharing chemical structures with peer reviewed publications Sharing chemical structures with peer reviewed publications
Sharing chemical structures with peer reviewed publications
 
Public Identifiers in Scholarly Publishing
Public Identifiers in Scholarly PublishingPublic Identifiers in Scholarly Publishing
Public Identifiers in Scholarly Publishing
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts project
 
Publishing data and code openly
Publishing data and code openlyPublishing data and code openly
Publishing data and code openly
 
ICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChemICIC 2013 New Product Introductions InfoChem
ICIC 2013 New Product Introductions InfoChem
 
How the InChI identifier is used to underpin our online chemistry databases a...
How the InChI identifier is used to underpin our online chemistry databases a...How the InChI identifier is used to underpin our online chemistry databases a...
How the InChI identifier is used to underpin our online chemistry databases a...
 
2014 genome informatics Linked Data
2014 genome informatics Linked Data2014 genome informatics Linked Data
2014 genome informatics Linked Data
 
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
ICIC 2017: Tutorial - Digging bioactive chemistry out of patents using open r...
 
II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...
II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...
II-SDV 2016 Michael Iarrobino - Improving Text Mining Results with Access to ...
 
OHDSI CDM Presentation
OHDSI CDM PresentationOHDSI CDM Presentation
OHDSI CDM Presentation
 
II-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web Search
II-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web SearchII-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web Search
II-SDV 2016 Aleksandar Kapisoda, Klaus Kater - Deep Web Search
 
US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...
US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...
US-EPA CompTox Chemicals Dashboard providing access to experimental and predi...
 
The application of text and data mining to enhance the RSC publication archive
The application of text and data mining to enhance the RSC publication archiveThe application of text and data mining to enhance the RSC publication archive
The application of text and data mining to enhance the RSC publication archive
 
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
 
Building linked data large-scale chemistry platform - challenges, lessons and...
Building linked data large-scale chemistry platform - challenges, lessons and...Building linked data large-scale chemistry platform - challenges, lessons and...
Building linked data large-scale chemistry platform - challenges, lessons and...
 
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
 
Hosting a compound centric community resource for chemistry data
Hosting a compound centric community resource for chemistry dataHosting a compound centric community resource for chemistry data
Hosting a compound centric community resource for chemistry data
 

Similar to Progress in delivering transparency in research data

Quality and noise in big chemistry databases
Quality and noise in big chemistry databasesQuality and noise in big chemistry databases
Quality and noise in big chemistry databasesChris Southan
 

Similar to Progress in delivering transparency in research data (20)

Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
 
Delivering chemical-associated data via EPA web applications
Delivering chemical-associated data via EPA web applicationsDelivering chemical-associated data via EPA web applications
Delivering chemical-associated data via EPA web applications
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
The EPA CompTox Chemistry Dashboard and Underpinning Software Architecture
 
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
 
Chemistry data delivery from the US-EPA to support environmental chemistry
Chemistry data delivery from the US-EPA to support environmental chemistryChemistry data delivery from the US-EPA to support environmental chemistry
Chemistry data delivery from the US-EPA to support environmental chemistry
 
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Chemis...
Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Chemis...Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Chemis...
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Chemis...
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
 
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
 
Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...
 
Providing support for JC Bradleys vision of open science using RSC cheminform...
Providing support for JC Bradleys vision of open science using RSC cheminform...Providing support for JC Bradleys vision of open science using RSC cheminform...
Providing support for JC Bradleys vision of open science using RSC cheminform...
 
Cheminformatics Support for MS Supporting Exposomics
Cheminformatics Support for MS Supporting ExposomicsCheminformatics Support for MS Supporting Exposomics
Cheminformatics Support for MS Supporting Exposomics
 
Quality and noise in big chemistry databases
Quality and noise in big chemistry databasesQuality and noise in big chemistry databases
Quality and noise in big chemistry databases
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
 
Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...
 
Delivering on the promise of a chemistry data repository for the world
Delivering on the promise of a chemistry data repository for the worldDelivering on the promise of a chemistry data repository for the world
Delivering on the promise of a chemistry data repository for the world
 
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
 

Recently uploaded

Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxMohamedFarag457087
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...chandars293
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformationAreesha Ahmad
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedDelhi Call girls
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxSuji236384
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsOrtegaSyrineMay
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professormuralinath2
 

Recently uploaded (20)

Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verifiedConnaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
Connaught Place, Delhi Call girls :8448380779 Model Escorts | 100% verified
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 

Progress in delivering transparency in research data

  • 1. Progress in Delivering Transparency in Research Data by the National Center for Computational Toxicology at the US-EPA Antony J. Williams, Jeff Edwards, Chris Grulke and John Cowden National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC August 2018 ACS Fall Meeting, Boston http://www.orcid.org/0000-0002-2668-4821 The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
  • 2. Disclaimer of Endorsement • Mention of or referral to commercial products or services, and/or links to non- EPA sites does not imply official EPA endorsement of or responsibility for the opinions, ideas, data, or products presented at those locations, or guarantee the validity of the information provided. 1
  • 3. • National Center for Computational Toxicology established in 2005 to integrate: – High-throughput and high-content technologies – Modern molecular biology – Data mining and statistical modeling – Computational biology and chemistry • Outputs: a lot of data, models, algorithms, software applications and publications • Open Data – we want scientists to interrogate it, learn from it, develop understanding National Center for Computational Toxicology
  • 6. Deliver Data for Reuse: DIFFERENT formats https://www.epa.gov/chemical-research/toxicity-forecaster-toxcasttm-data
  • 7. Open Data means Reuse For Science 6
  • 8. Open Data means Reuse For Software 7
  • 11. CompTox Dashboard Products and Use Categories 10
  • 17. The impact of our paper versus the impact of our data…187 downloads 16
  • 20. Examples of our transparency 1. OPERA Prediction Models 19
  • 21. What are OPERA Models? 20
  • 22. What are OPERA Models? Detailed QMRF reports 21
  • 23. How Transparent in the Publication? 22
  • 25. How Transparent with the Code? https://github.com/kmansouri/OPERA 24
  • 26. How widely do we share? 25
  • 27. How widely do we share? 26
  • 28. Journals for Data Articles 27
  • 29. Data Journals Hold Promise 28
  • 30. Examples of our transparency 2. The Chemical and Products Database 29
  • 31. Examples of our transparency 2. The Chemical and Products Database 30
  • 32. What chemicals are in Arts and Crafts Paint? 31
  • 33. Examples of our transparency 2. The Chemical and Products Database 32
  • 34. Are we oversharing? • We share our datasets • We share our models • We share our code • We share our database schemas • We share our database dumps • We are not yet sharing all code under an application like the CompTox Dashboard.. 33
  • 35. A Recurring Plea for Formats • Chemical data exchange formats are critical • We all know it’s imperfect, that there are efforts afoot, and it’s taking time 34 2015 2018
  • 36. Use APPROPRIATE formats • Our databases use chemical structures • Many journal articles deliver poor data 35
  • 38. How do I extract structures? • Copy and paste into Excel as a start point • Assume no loss of formatting! • Convert SMILES to structures • But Copy-Paste doesn’t work 37
  • 39. How do I extract structures? • Copy and paste into Excel as a start point • Assume no loss of formatting! • Convert SMILES to structures • Copy-Paste doesn’t work 38
  • 41. Names and CASRNs are NOT structures • In our domain most chemicals are text – chemical names and CAS Numbers 40
  • 43. So this is how we publish our chemical substance data 42
  • 44. Publish data for your domain • Push data out in immediately useful formats – Make multiple formats available as appropriate – SQL database dumps, Excel files, etc. – For chemistry - SDF files – includes LAYOUT and data (but requires cheminformatics tools), Excel files generally handled at any desktop. 43
  • 45. So this is how we publish our chemical substance data 44
  • 46. When we publish now… • Add the data as a “list” to our Lists of Chemicals • Generally store files on our FTP site PLUS copies in a repository (or two) • Multiple formats of data as appropriate – Can be as supplementary data or DOI’ed data files • DOI’ed data gives altmetrics also.. 45
  • 47. PERSONAL affection for Open Peer Review 46
  • 48. Will my data always be available? • There are no guarantees our data will always be on all sites. They come and go. • Some organizations are just as concerned as you about your data… 47
  • 49. Future Work • We have a lot more models to make available – Open and free web services • We are planning for embeddable widgets to access our Open data • The challenge of versioning – ongoing curation of data requires version tracking 48
  • 50. NCCT “InvitroDB_v3” • The last public release of ToxCast data (invitroDB_v2) was in 3rd Quarter of 2015 • The next release invitroDB_v3 is Fall 2018 • Data includes new assays, new chemicals, new pipelining, results of data curation • Data will also release via CompTox Dashboard • Data will be available at https://www.epa.gov/chemical- research/exploring-toxcast-data-downloadable-data 49
  • 51. NCCT Transcriptomics Data will Deliver 50 Terabytes of Data for Analysis TruSeq r2 0.74 TempO-Seq r2 0.75 Low Coverage r2 0.83 Currently capable of assigning to >40 MOAs based on transcriptional responses MOA Analysis Pipeline • Large scale screen of 1,000 chemicals (ToxCast I/II) Additional screens across multiple cell types/lines • Additional reference chemicals and genetic perturbations (RNAi/CRISPR/cDNA)
  • 52. Conclusions • For publications and for applications deliver your data in “fit-for-purpose” form • Multiple formats of data are appropriate • Consider your audience(s) and “how would YOU want the data”??? • Not everything can be put in a supplementary file – use repositories and DOI your data • Take the benefits of DOIs to measure altmetrics 51
  • 53. Conclusion • NCCT works with transparency in mind – our data is released for community usage – as data and in apps • The CompTox Dashboard is the new application to surface all data as the architecture expands • We attempt to deliver data in as transparent a form as possible for our scientific publications • We consider long-term access and versioning in our releases. Lots of work to do… 52
  • 54. …and let other people use it…
  • 55. Contact Antony Williams US EPA Office of Research and Development National Center for Computational Toxicology (NCCT) Williams.Antony@epa.gov ORCID: https://orcid.org/0000-0002-2668-4821 54

Editor's Notes

  1. Planned – Need a primary screen that comprehensively covers biological space Metabolic competence Fill critical toxicological targets (e.g., thyroid) and organotypic assays that can be used to assess which molecular changes manifest in adverse phenotypic responses