Identification of unknowns in mass spectrometry-based non-targeted analyses (NTA) requires the integration of complementary pieces of data to arrive at a confident, consensus structure. Researchers use chemical reference databases, spectral matching, fragment prediction tools, retention time prediction tools, and a variety of other data to arrive at tentative, probable, and, where possible, confirmed identifications. Drawing on the diverse, robust data contained within the US EPA’s CompTox Chemistry Dashboard (https://comptox.epa.gov), this research aims to identify and implement a harmonized identification tool and workflow using previously generated chemistry data. Data have been compiled from product use, functional use prediction models, environmental media occurrence prediction models, and PubMed references, among other sources. We will report on our development of a visualization tool with which users can view the relative contribution of identification-based metrics across a list of candidate structures and see which candidates are most likely to occur. These data and visualization tools support NTA identification via the Dashboard and demonstrate an open, accessible tool for all users of high-resolution mass spectrometry (HRMS) data. This abstract does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
Structure identification by Mass Spectrometry Non-Targeted Analysis using the US EPA’s CompTox Chemistry Dashboard
1. Structure identification by Mass Spectrometry Non-Targeted Analysis using the US EPA’s CompTox Chemistry Dashboard
Antony Williams (1), Andrew D. McEachran (2), Chris Grulke (1),
Seth Newton (3), Kristin Isaacs (3), Katherine Phillips (3),
Nancy Baker (1) and Jon R. Sobus (3)
1) National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC
2) Oak Ridge Institute for Science and Education (ORISE) Research Participant, Research Triangle Park, NC
3) National Exposure Research Laboratory, U.S. Environmental Protection Agency, RTP, NC
March 2018
ACS Spring Meeting, New Orleans
http://www.orcid.org/0000-0002-2668-4821
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
3. The CompTox Chemistry Dashboard
• A publicly accessible website delivering access to:
– ~760,000 chemicals with related property data
– Experimental and predicted physicochemical property data
– Experimental human and ecological hazard data
– Integration with “biological assay data” for thousands of chemicals
– Information regarding consumer products containing chemicals
– Links to other agency websites and public data resources
– “Literature” searches for chemicals using public resources
– “Batch searching” for thousands of chemicals
– Real-time prediction of physicochemical and toxicity endpoints
– DOWNLOADABLE Open Data for reuse and repurposing
8. What’s under the Data Tabs?
Focus on Hydraulic Fracturing Chemicals
9. Surfacing Lists of Chemicals
• Specific subsets of chemicals, “lists”, can be displayed on the Dashboard
• Chemicals in a list that map to the Dashboard link to existing:
– Property data
– Hazard data
– Exposure data
– In vitro bioassay data
– Documents and Literature
27. Suspect Screening and Non-Targeted Analysis Workflow
[Figure: workflow diagram. Content from J. Sobus, US-EPA-NERL]
Track labels: Suspect Screening; Non-Targeted Analysis
Box labels: DSSTox Chemical Database; “Molecular Features”; Raw Samples; Extracted Samples; Raw Features; Processed Features; Matched Formulas; Mapped Structures; Prioritized Structures (using ToxPi); Confirmed Structures (using ToxCast standards); Prioritized Features; Predicted Formulas; Candidate Structures; Sorted Structures; Top Candidate Structure(s)
Supporting data labels: Predicted Retention Times; Predicted/Observed Functional Use; Predicted Concentrations; Predicted/Observed Media Occurrence; Predicted Mass Spectra; Methodological Concordance
Color Key: Red = Analytical Chemistry; Blue = Data Processing & Analysis; Green = Informatics & Web Services; Purple = Mathematical & QSPR Modeling
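The workflow names several lines of evidence (for example predicted retention times, functional use, and media occurrence) that feed into sorting candidate structures. As a rough illustration only, the sketch below combines a few such metrics into a weighted score and keeps each metric's contribution so it could be plotted per candidate. The metric names, weights, candidate names, and values are all hypothetical; this is not the Dashboard's scoring scheme.

```python
# Hypothetical metric names and weights, chosen only for illustration.
# They are NOT the Dashboard's actual scoring scheme.
WEIGHTS = {
    "data_sources": 0.4,
    "pubmed_refs": 0.2,
    "functional_use": 0.2,
    "media_occurrence": 0.2,
}


def normalize(values):
    """Scale non-negative numbers to the 0-1 range by dividing by the maximum."""
    top = max(values, default=0)
    return [v / top if top > 0 else 0.0 for v in values]


def rank_candidates(candidates):
    """Return (name, total, contributions) tuples sorted by descending total.

    `candidates` maps a candidate name to a dict of raw metric values.
    Missing metrics are treated as 0.
    """
    names = list(candidates)
    contributions = {name: {} for name in names}
    for metric, weight in WEIGHTS.items():
        raw = [candidates[name].get(metric, 0.0) for name in names]
        for name, scaled in zip(names, normalize(raw)):
            contributions[name][metric] = weight * scaled
    ranked = [(n, sum(contributions[n].values()), contributions[n]) for n in names]
    return sorted(ranked, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Made-up values for three made-up candidates sharing one formula.
    demo = {
        "candidate_A": {"data_sources": 50, "pubmed_refs": 10,
                        "functional_use": 0.9, "media_occurrence": 0.5},
        "candidate_B": {"data_sources": 5, "pubmed_refs": 0,
                        "functional_use": 0.3, "media_occurrence": 0.1},
        "candidate_C": {"data_sources": 25, "pubmed_refs": 5,
                        "functional_use": 0.0, "media_occurrence": 0.25},
    }
    for name, total, parts in rank_candidates(demo):
        print(name, round(total, 3), parts)
```

Keeping the per-metric contributions alongside the total is what would let a stacked bar chart show why one candidate outranks another, rather than only the final order.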
28. The Dashboard to Support MS-Analysis
MS-Ready Structures Underpin Analysis
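One common first step in MS-based identification is matching an observed mass to candidate formulas within a mass tolerance. The sketch below is a minimal, assumption-laden illustration: it assumes the observed value is already a neutral monoisotopic mass (adduct correction done elsewhere), handles only simple formulas without brackets or charges, and uses hypothetical function names. It does not call or reproduce any Dashboard service.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope for a few elements.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
    "S": 31.97207100,
    "Cl": 34.96885268,
}


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C8H10N4O2'."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject anything the simple pattern cannot fully account for.
    if not formula or "".join(s + c for s, c in tokens) != formula:
        raise ValueError(f"unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in tokens:
        if symbol not in MONOISOTOPIC:
            raise ValueError(f"element not in table: {symbol}")
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_formulas(observed_mass, formulas, tol_ppm=5.0):
    """Return (formula, mass, ppm) for formulas within tol_ppm, best match first."""
    hits = []
    for formula in formulas:
        mass = monoisotopic_mass(formula)
        err = ppm_error(observed_mass, mass)
        if abs(err) <= tol_ppm:
            hits.append((formula, mass, err))
    return sorted(hits, key=lambda hit: abs(hit[2]))


if __name__ == "__main__":
    # Illustrative candidate list; two formulas share a nominal mass of 194.
    candidates = ["C8H10N4O2", "C10H10O4", "C6H12O6"]
    for formula, mass, err in match_formulas(194.0805, candidates):
        print(f"{formula}: {mass:.4f} u, {err:+.2f} ppm")
```

The 5 ppm default is only a placeholder; an appropriate tolerance depends on the instrument and calibration.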
37. Acknowledgments
• Hydraulic Fracturing Drinking Water Study Team
• The CompTox Chemistry Dashboard team
• Todd Martin (TEST Predictions), US-EPA-NRMRL
• NERL colleagues:
– Jon Sobus, Elin Ulrich, Mark Strynar, Seth Newton (NTA Analysis)
– Katherine Phillips, Kathie Dionisio, Kristin Isaacs (Consumer Products Database)
• Emma Schymanski, Luxembourg Centre for Systems Biomedicine (MS-ready/NTA)
38. Contact
Antony Williams
US EPA Office of Research and Development
National Center for Computational Toxicology (NCCT)
Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821