SlideShare ist ein Scribd-Unternehmen logo
1 von 51
Center for Computational Toxicology and Exposure, US-EPA, RTP, NC
http://www.orcid.org/0000-0002-2668-4821
Delivering chemical-associated data
via EPA web applications
The views expressed in this presentation are those of the authors and do not necessarily reflect the views or policies of the U.S. EPA
Antony J. Williams
Cheminformatics Resources of U.S. Governmental Organizations– May 11th 2022
The State of Internet Chemistry…
• The past two decades has seen an explosion in online data
• There are so many resources to choose from…so much data
• In our world there are dominant aggregating resources done well:
PubChem, ChEMBL, eChemPortal, ECHA
• Do we need yet another online chemistry database?
20 Years of Curating Data in Our Team
The Charge for the
CompTox Chemicals Dashboard
• Develop a “first-stop-shop” for environmental chemical data
to support EPA and partner decision making:
– Centralized location for chemical data (DSSTox foundation)
– Chemistry, exposure, hazard and dosimetry
– Combination of existing data and predictive models
– Publicly accessible, periodically updated, curated
• Easy access to data improves efficiency and ultimately
accelerates chemical risk assessment
CompTox Chemicals Dashboard
https://comptox.epa.gov/dashboard
5
SEARCH
TOX DATA
BIOACTIVITY
SIMILARITY
READ-ACROSS
PUBMED
BATCH SEARCH
Detailed Chemical Pages
• Chemical page: Wikipedia snippet when available, intrinsic
properties, structural identifiers, linked substances
“Executive Summary” regarding
chemical toxicity
• Overview of toxicity-
related info
• Quantitative values
• Physchem. and Fate &
Transport
• Adverse Outcome
Pathway links
• In vitro bioactivity
summary plot
Experimental and Predicted Data
• Physchem and Fate & Transport
experimental and predicted data
• Data extracted (somewhat curated)
from literature and databases
• Multiple prediction algorithms used:
OPERA, TEST, ACD/Labs, EPI Suite
• Data can be downloaded as Excel,
TSV and CSV files
Chemical Hazard Data
ToxVal Database
• >50k chemicals
• >770k tox. values
• >30 sources of data
• ~5k journals cited
• ~70k citations
Safety Data – Thank you PubChem!
Bring together resources
Sources of Exposure to Chemicals
A recent focus on PFAS chemicals
Similar Compounds
Simple cheminformatics search
Relationships in the data
Manual curation and mapping
Markush Chemical representations
• PFOS is a member of linear perfluoroalkyl sulfonates
15
…and their Markush Children…
• Linear perfluoroalkyl sulfonates has children…
Bioactivity Data
ToxCast
ToxCast/Tox21 Data
Full transparency in terms of data
• Concentration Response data
Full access to concentration-response curves
Use Models Derived from the Data
Searching Literature
and the Internet
Literature Searching
• Real-time retrieval of data from PubMed ~30
million abstracts and growing)
• Choose from set of pre-defined queries
• Adjust and fine tune queries based on interests
Literature Searching
• “Sifting” of results using
multiple terms
• Frequency counting terms
• Color highlighting of terms
• Download list to Excel
• Send list to PubMed for
downloading ref. file
• Direct link via PubMed ID
What’s the best way to search the
internet for chemical data?
• We know how complex chemicals identifiers are…
• CASRN(s)
• Hundreds of names (maybe)
• SMILES
• InChIs
• EINECS, EC numbers
• What can WE do to help you navigate the internet?
External Links – Also use Identifiers
Names, CASRN, PubChem IDs, InChIs…
27
External Links
• Links to ~90 websites providing access to
additional data on the chemical of interest
Chemical Lists and
Categories
A List of Lists of Chemicals
https://comptox.epa.gov/dashboard/chemical_lists
The OECD List of PFAS
http://www.oecd.org/chemicalsafety/portal-perfluorinated-chemicals/
31
Example PFAS-UVCBs
32
PFAS List Paper
https://doi.org/10.3389/fenvs.2022.850019
• What makes our efforts
different?
• MANUAL curation work
• Building lists, cross-
referencing, mapping
relationships, sourcing
and curating data
Batch Searching
Batch Searching
• Singleton searches are great but…
• …we generally want data on LOTS of chemicals!
• Typical questions
• What are the structures for a set of chemical names? Set of CASRNs?
• Can I get chemical lists in Excel files? As a list of SMILES strings?
Can I get an SDF file?
• Can I include predicted properties? OPERA? TEST?
• Are “these chemicals” screened in Toxcast?
• I need masses and formulae for a list of chemicals
Batch Search
Batch Search – Excel, CSV, SDF file
Batch Search
Open Data
Exchange
Since our data are Open…
• They flow into other systems for benefit …
• ECHA eChemPortal
• ChemSpider
• EBI’s UniChem
• PubChem
Developing
Cheminformatics
“PoC Modules”
Six Modules
• Hazard Comparison Profiling – profile chemicals based on hazard
• Alerts – structure, substructure, SMARTS based alerts and flags
• Predict – batch prediction using WebTEST (100s of structures)
• Search – structure/substructure/similarity searches
• Standardize – convert structures into QSAR/MS-Ready forms
• ToxPrints – generate ToxPrint substructural fragments and profile
Module 1: Hazard Module
Module 2: Alerts
Module 3: WebTEST Batch Prediction
Module 4: Structure/Substructure/Similarity
Summary and Conclusion
• CompTox Chemicals Dashboard - a
central hub for environmental data
• ~900k chemical substances (1.2M soon)
• Integrating property data, hazard data,
exposure data, in vitro bioactivity data
• Interrogation of bioactivity data -
• Multiple types of searches
• Batch search for thousands of chemicals
• Real-time property and toxicity predictions
• Downloadable files – CSV, TSV and Excel
Some Related Publications of Interest
You want to know more…
• Lots of resources available
• Presentations: https://tinyurl.com/w5hqs55
• Communities of Practice Videos: https://rb.gy/qsbno1
• Manual: https://rb.gy/4fgydc
• Latest News: https://comptox.epa.gov/dashboard/news_info
49
Acknowledgments
• Contact: Williams.Antony@epa.gov
• Feedback and follow-up is
welcomed! Your questions help.
• The dashboard is based on the
efforts of many more team
members than us
• Many collaborators provide data
also
50
EPA’s Center for Computational Toxicology and Exposure
THANK YOU
Panel Discussion Ideas…
• Thank you to the organizers for the conference. It was very
beneficial to catch up on what’s going on across the groups
• For the Panel Discussion today…selfish wants 
• How do we limit rework in terms of repeat curation?
• How do we coordinate development of QSAR data sets (because
QSAR modeling is not the hard part) – DILI data, Log Kow, Solubility
• There would be so much benefit to InChI-Markush. It’s doable…how
can we collectively fund it?

Weitere ähnliche Inhalte

Ähnlich wie Delivering chemical-associated data via EPA web applications

The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...Andrew McEachran
 

Ähnlich wie Delivering chemical-associated data via EPA web applications (20)

Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
 
Data delivery from the US-EPA Center for Computational Toxicology and Exposur...
Data delivery from the US-EPA Center for Computational Toxicology and Exposur...Data delivery from the US-EPA Center for Computational Toxicology and Exposur...
Data delivery from the US-EPA Center for Computational Toxicology and Exposur...
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
 
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
 
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Chemis...
Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Chemis...Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Chemis...
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Chemis...
 
How to place your research questions or results into the context of the "Lega...
How to place your research questions or results into the context of the "Lega...How to place your research questions or results into the context of the "Lega...
How to place your research questions or results into the context of the "Lega...
 
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
The EPA CompTox Dashboard as a Data Integration Hub for Environmental Chemist...
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
US EPA CompTox Chemicals Dashboard Data Integration Hub to Support Environmen...
 
New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...New developments in delivering public access to data from the National Center...
New developments in delivering public access to data from the National Center...
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
The EPA Comptox Chemistry Dashboard: A Web-Based Data Integration Hub for Env...
 
Success in decision making data relevance curation
Success in decision making data relevance curationSuccess in decision making data relevance curation
Success in decision making data relevance curation
 
Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...
 
New Approach Methods - What is That?
New Approach Methods - What is That?New Approach Methods - What is That?
New Approach Methods - What is That?
 
Delivering access to chemistry and bioassay data from the National Center for...
Delivering access to chemistry and bioassay data from the National Center for...Delivering access to chemistry and bioassay data from the National Center for...
Delivering access to chemistry and bioassay data from the National Center for...
 

Kürzlich hochgeladen

The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxseri bangash
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIADr. TATHAGAT KHOBRAGADE
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Silpa
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flyPRADYUMMAURYA1
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Stages in the normal growth curve
Stages in the normal growth curveStages in the normal growth curve
Stages in the normal growth curveAreesha Ahmad
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.Silpa
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxMohamedFarag457087
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfSumit Kumar yadav
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...Scintica Instrumentation
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
Introduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxIntroduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxrohankumarsinghrore1
 

Kürzlich hochgeladen (20)

The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Stages in the normal growth curve
Stages in the normal growth curveStages in the normal growth curve
Stages in the normal growth curve
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdf
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Introduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxIntroduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptx
 

Delivering chemical-associated data via EPA web applications

  • 1. Center for Computational Toxicology and Exposure, US-EPA, RTP, NC http://www.orcid.org/0000-0002-2668-4821 Delivering chemical-associated data via EPA web applications The views expressed in this presentation are those of the authors and do not necessarily reflect the views or policies of the U.S. EPA Antony J. Williams Cheminformatics Resources of U.S. Governmental Organizations– May 11th 2022
  • 2. The State of Internet Chemistry… • The past two decades has seen an explosion in online data • There are so many resources to choose from…so much data • In our world there are dominant aggregating resources done well: PubChem, ChEMBL, eChemPortal, ECHA • Do we need yet another online chemistry database?
  • 3. 20 Years of Curating Data in Our Team
  • 4. The Charge for the CompTox Chemicals Dashboard • Develop a “first-stop-shop” for environmental chemical data to support EPA and partner decision making: – Centralized location for chemical data (DSSTox foundation) – Chemistry, exposure, hazard and dosimetry – Combination of existing data and predictive models – Publicly accessible, periodically updated, curated • Easy access to data improves efficiency and ultimately accelerates chemical risk assessment
  • 5. CompTox Chemicals Dashboard https://comptox.epa.gov/dashboard 5 SEARCH TOX DATA BIOACTIVITY SIMILARITY READ-ACROSS PUBMED BATCH SEARCH
  • 6. Detailed Chemical Pages • Chemical page: Wikipedia snippet when available, intrinsic properties, structural identifiers, linked substances
  • 7. “Executive Summary” regarding chemical toxicity • Overview of toxicity- related info • Quantitative values • Physchem. and Fate & Transport • Adverse Outcome Pathway links • In vitro bioactivity summary plot
  • 8. Experimental and Predicted Data • Physchem and Fate & Transport experimental and predicted data • Data extracted (somewhat curated) from literature and databases • Multiple prediction algorithms used: OPERA, TEST, ACD/Labs, EPI Suite • Data can be downloaded as Excel, TSV and CSV files
  • 9. Chemical Hazard Data ToxVal Database • >50k chemicals • >770k tox. values • >30 sources of data • ~5k journals cited • ~70k citations
  • 10. Safety Data – Thank you PubChem! Bring together resources
  • 11. Sources of Exposure to Chemicals
  • 12. A recent focus on PFAS chemicals
  • 14. Relationships in the data Manual curation and mapping
  • 15. Markush Chemical representations • PFOS is a member of linear perfluoroalkyl sulfonates 15
  • 16. …and their Markush Children… • Linear perfluoroalkyl sulfonates has children…
  • 20. Full transparency in terms of data • Concentration Response data
  • 21. Full access to concentration-response curves
  • 22. Use Models Derived from the Data
  • 24. Literature Searching • Real-time retrieval of data from PubMed ~30 million abstracts and growing) • Choose from set of pre-defined queries • Adjust and fine tune queries based on interests
  • 25. Literature Searching • “Sifting” of results using multiple terms • Frequency counting terms • Color highlighting of terms • Download list to Excel • Send list to PubMed for downloading ref. file • Direct link via PubMed ID
  • 26. What’s the best way to search the internet for chemical data? • We know how complex chemicals identifiers are… • CASRN(s) • Hundreds of names (maybe) • SMILES • InChIs • EINECS, EC numbers • What can WE do to help you navigate the internet?
  • 27. External Links – Also use Identifiers Names, CASRN, PubChem IDs, InChIs… 27
  • 28. External Links • Links to ~90 websites providing access to additional data on the chemical of interest
  • 30. A List of Lists of Chemicals https://comptox.epa.gov/dashboard/chemical_lists
  • 31. The OECD List of PFAS http://www.oecd.org/chemicalsafety/portal-perfluorinated-chemicals/ 31
  • 33. PFAS List Paper https://doi.org/10.3389/fenvs.2022.850019 • What makes our efforts different? • MANUAL curation work • Building lists, cross- referencing, mapping relationships, sourcing and curating data
  • 35. Batch Searching • Singleton searches are great but… • …we generally want data on LOTS of chemicals! • Typical questions • What are the structures for a set of chemical names? Set of CASRNs? • Can I get chemical lists in Excel files? As a list of SMILES strings? Can I get an SDF file? • Can I include predicted properties? OPERA? TEST? • Are “these chemicals” screened in Toxcast? • I need masses and formulae for a list of chemicals
  • 37. Batch Search – Excel, CSV, SDF file
  • 40. Since our data are Open… • They flow into other systems for benefit … • ECHA eChemPortal • ChemSpider • EBI’s UniChem • PubChem
  • 42. Six Modules • Hazard Comparison Profiling – profile chemicals based on hazard • Alerts – structure, substructure, SMARTS based alerts and flags • Predict – batch prediction using WebTEST (100s of structures) • Search – structure/substructure/similarity searches • Standardize – convert structures into QSAR/MS-Ready forms • ToxPrints – generate ToxPrint substructural fragments and profile
  • 45. Module 3: WebTEST Batch Prediction
  • 47. Summary and Conclusion • CompTox Chemicals Dashboard - a central hub for environmental data • ~900k chemical substances (1.2M soon) • Integrating property data, hazard data, exposure data, in vitro bioactivity data • Interrogation of bioactivity data - • Multiple types of searches • Batch search for thousands of chemicals • Real-time property and toxicity predictions • Downloadable files – CSV, TSV and Excel
  • 49. You want to know more… • Lots of resources available • Presentations: https://tinyurl.com/w5hqs55 • Communities of Practice Videos: https://rb.gy/qsbno1 • Manual: https://rb.gy/4fgydc • Latest News: https://comptox.epa.gov/dashboard/news_info 49
  • 50. Acknowledgments • Contact: Williams.Antony@epa.gov • Feedback and follow-up is welcomed! Your questions help. • The dashboard is based on the efforts of many more team members than us • Many collaborators provide data also 50 EPA’s Center for Computational Toxicology and Exposure
  • 51. THANK YOU Panel Discussion Ideas… • Thank you to the organizers for the conference. It was very beneficial to catch up on what’s going on across the groups • For the Panel Discussion today…selfish wants  • How do we limit rework in terms of repeat curation? • How do we coordinate development of QSAR data sets (because QSAR modeling is not the hard part) – DILI data, Log Kow, Solubility • There would be so much benefit to InChI-Markush. It’s doable…how can we collectively fund it?