Semi-Quantitative Non-Targeted Analysis as a Rapid Risk Prioritization Tool: A Proof of Concept Using Activated Carbon Drinking Water Filters
Louis Groff, Hannah Liberatore, Seth Newton, Jon Sobus
Office of Research and Development
Center for Computational Toxicology and Exposure
April 7, 2021
https://orcid.org/0000-0002-6357-0508
Why Does EPA Need Measurement Data?
• Measurement data needed to assess chemical safety
• Regulate chemicals, manage exposures, ensure compliance under several federal statutes
Chemical Monitoring Needs
Exposure Assessment
Dose-Response Assessment
Risk Assessment
Traditional Targeted Analysis
Low Concentration / Mid Concentration / High Concentration
Measure Volatiles in Drinking Water
Calibration Curves Allow Accurate Quantification of Individual Analytes
Purchase standards for quantitation
Different Signal Intensities Across Compounds
Limitations of Targeted Analysis
• Environmental & biological samples are typically highly complex mixtures
• Contain diverse arrays of known and unknown chemicals (100s-1000s per sample)
• Targeted confirmation/quantitation of all compounds-of-interest not remotely feasible
General NTA Workflow
[Figure: two histograms, Frequency vs. Concentration]
Media Sample Extraction & Cleanup Prepared Sample MS Analysis
Concentration
Estimates for
Prepared Samples
Concentration
Estimates for
Media Samples
Semi-Quant. (SQ) NTA is a Multi-Step Process
MS Data
• Current SQ-NTA methods have not sought to estimate media concentrations
• Cannot interpret NTA data in a risk-based context
• Need ways to defensibly approximate media concentration
• Proof-of-concept approach using GC-HRMS of volatiles in tap water
• Brita filters employed to collect media samples
• Large-volume water samples (380 L over lifetime of filter)
• Suitable for low-concentration contaminants
• Allows preconcentration of analytes on filter
• Low shipping costs
SQ NTA:
Need for Rapid Prioritization Methods
• Spiked test filters with mix of standard VOCs + PAHs at 3 concentrations
• 49 volatiles/semi-volatiles + 24 polycyclic aromatic hydrocarbons (PAHs)
• Performed GC-HRMS on neat standards and spiked filter extracts
GC-HRMS Standard Calibrations
[Figure labels: LOW, MID, HIGH; axes: Mean Intensity vs. Concentration]
• Split Injection
• PTV (programmable temperature vaporizing) inlet
• 60 °C, ramped at 10 °C/s to 290 °C for transfer
• TG-5SilMS capillary column
• Oven temperature program
• 35 °C, ramped at 10 °C/min to 295 °C
• Electron ionization (EI) source
• Orbitrap mass analyzer
• Acquisition range: 40-550 m/z
• Volatile range observable by GC
GC-HRMS Instrumental Parameters
• Thermo TraceFinder GC-MS Deconvolution plug-in
• NTA approach to detecting compounds
• Accurate mass tolerance: 5 ppm
• S/N threshold: 10:1
• TIC intensity threshold: 500,000
• Ion overlap: 99%
• Compound identification and RT alignment across samples
• NIST 2017 EI-MS reference library
• Results filtered to include only peaks with assigned mainlib library matches
• Reverse search index (RSI) score: ≥800
• High-resolution filtering (HRF) score: ≥85
• Total score: ≥85
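The score cut-offs listed above can be sketched as a simple filter. This is only an illustration: the record layout and field names (`peak_id`, `library`, `rsi`, `hrf`, `total`) are hypothetical and do not reflect the TraceFinder export format, and the example hits are invented.

```python
from typing import NamedTuple


class LibraryMatch(NamedTuple):
    """One candidate library hit for a deconvolved peak (hypothetical fields)."""
    peak_id: str
    library: str
    rsi: float    # reverse search index score
    hrf: float    # high-resolution filtering score
    total: float  # total score


def passes_filters(match, min_rsi=800.0, min_hrf=85.0, min_total=85.0):
    """Keep only mainlib matches that meet all three score thresholds."""
    return (
        match.library == "mainlib"
        and match.rsi >= min_rsi
        and match.hrf >= min_hrf
        and match.total >= min_total
    )


# Invented example hits, for illustration only.
matches = [
    LibraryMatch("peak_1", "mainlib", rsi=912, hrf=97.0, total=93.5),
    LibraryMatch("peak_2", "mainlib", rsi=765, hrf=99.0, total=90.1),  # RSI too low
    LibraryMatch("peak_3", "replib", rsi=880, hrf=95.0, total=91.0),   # not mainlib
    LibraryMatch("peak_4", "mainlib", rsi=800, hrf=85.0, total=85.0),  # at thresholds
]
kept = [m.peak_id for m in matches if passes_filters(m)]
print(kept)
```

The thresholds are inclusive here because the slide states them with "≥".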
Figure from: Basic Methodological Strategies in Metabolomic Research. (2013). In N. Lutz, J. Sweedler, & R. Wevers (Eds.),
Methodologies for Metabolomics: Experimental Strategies and Techniques (pp. 1-76). Cambridge: Cambridge University Press.
Identifying Chemicals:
NTA Data Processing Workflow
66/73 compound IDs (standards)
35/73 compound IDs (extracts)*
*matrix effects
[Figure: two histograms, Frequency vs. Concentration]
Media Sample Extraction & Cleanup Prepared Sample MS Analysis
Concentration
Estimates for
Prepared Samples
Concentration
Estimates for
Media Samples
SQ NTA is a Multi-Step Process
MS Data
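\[
\text{Response Factor (RF)} = \frac{\text{Known Conc.}_{\text{Surrogate}}}{\text{Obs. Intensity}_{\text{Surrogate}}}
\]
\[
\text{Predicted Conc.}_{\text{Unknown}} = \text{Obs. Intensity}_{\text{Unknown}} \times \text{RF}
\]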
Building a Simple SQ Model Using a Single Surrogate
Response Factor
“Single Surrogate”: known chemical spiked at known conc. with observed intensity
“Unknowns”: tentatively identified chemicals with unknown conc. and observed intensities
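As a minimal sketch of the single-surrogate calculation, the two relationships on this slide reduce to one division and one multiplication. The function names and the numbers are illustrative assumptions, not values from the study.

```python
def response_factor(known_conc_surrogate, obs_intensity_surrogate):
    """RF = known surrogate concentration / observed surrogate intensity."""
    if obs_intensity_surrogate <= 0:
        raise ValueError("surrogate intensity must be positive")
    return known_conc_surrogate / obs_intensity_surrogate


def predict_concentration(obs_intensity_unknown, rf):
    """Predicted concentration of an unknown = observed intensity x RF."""
    return obs_intensity_unknown * rf


# Invented numbers: a surrogate spiked at 10 units gives an intensity of 2e6.
rf = response_factor(known_conc_surrogate=10.0, obs_intensity_surrogate=2.0e6)

# An unknown with twice the surrogate's intensity is predicted at about
# twice the surrogate's concentration (same units as the surrogate).
pred = predict_concentration(obs_intensity_unknown=4.0e6, rf=rf)
print(rf, pred)
```

The prediction inherits the surrogate's concentration units, and it assumes the unknown responds like the surrogate, which is the source of the error discussed on the next slide.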
Prediction Error Using Single Surrogate
Response Factor
• \( \text{Error Ratio} = \dfrac{\text{Predicted Conc.}}{\text{Known Conc.}} \)
• Using a single surrogate results in error ratios that span around two orders of magnitude
• Using this SQ approach, we can underestimate by an order of magnitude or overestimate by an order of magnitude
95%-fold range = 114×
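The error-ratio summary can be sketched as follows. The slide does not define the "95%-fold range"; this sketch assumes it is the ratio of the 97.5th to the 2.5th percentile of the error ratios. The percentile helper and the example data are illustrative, not the authors' code.

```python
def error_ratios(predicted, known):
    """Error ratio = predicted concentration / known concentration, per chemical."""
    return [p / k for p, k in zip(predicted, known)]


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of an ascending list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] + frac * (sorted_vals[upper] - sorted_vals[lower])


def fold_range(ratios, coverage=0.95):
    """Assumed definition: upper percentile / lower percentile of the error ratios."""
    tail = (1.0 - coverage) / 2.0 * 100.0
    vals = sorted(ratios)
    return percentile(vals, 100.0 - tail) / percentile(vals, tail)


# Invented example: predictions off by 10x in both directions.
ratios = error_ratios(predicted=[0.1, 1.0, 10.0], known=[1.0, 1.0, 1.0])
print(fold_range(ratios, coverage=0.95))
```

Under this reading, a fold range of 114× means the central 95% of error ratios spans a factor of 114 from lowest to highest.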
Building a More Complex Model:
Relationship Between Intensity and Retention
Time
[Figure: total ion chromatogram (TIC MS), RT 0.00 to 26.65 min; x-axis: Time (min); NL: 1.31E9]
• Found that Intensity Increases as Retention Time Increases at the same concentration
• Can utilize to improve model predictions
Building a More Complex SQ Model
[Figure: ΔRT between the single surrogate and each of the “unknowns” (ΔRT_Unknown1, ΔRT_Unknown2), with 2.5th and 97.5th percentile bounds]
Use to bound concentrations according to absolute retention time difference
Regression equation for 95% prediction interval:
\[
\log\left(ER_{95\%\,\text{fold}}\right) = m \cdot \log\left|\Delta RT\right| + b
\]
*Can pick any fold range: 95%, 99%, 99.9%, etc.
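One way to obtain m and b is an ordinary least-squares fit in log-log space. The sketch below assumes base-10 logarithms and uses synthetic points generated from a known line; it is not the authors' fitting procedure, and the function names are hypothetical.

```python
import math


def fit_log_log(delta_rt, er_fold):
    """Least-squares fit of log10(ER fold) = m * log10(|delta RT|) + b."""
    xs = [math.log10(abs(d)) for d in delta_rt]
    ys = [math.log10(e) for e in er_fold]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def predict_er_fold(delta_rt, m, b):
    """Expected fold range for a surrogate/unknown pair separated by delta_rt."""
    return 10 ** (m * math.log10(abs(delta_rt)) + b)


# Synthetic points drawn from a line with m = 0.5 and b = 1.0 (illustration only).
rts = [0.1, 1.0, 10.0]
folds = [10 ** (0.5 * math.log10(d) + 1.0) for d in rts]
m, b = fit_log_log(rts, folds)
print(m, b, predict_er_fold(2.0, m, b))
```

A ΔRT of exactly zero has no logarithm, so a surrogate compared against itself would need to be excluded before fitting.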
Implementing the Model for Prediction (Step 1)
\[
C_{\text{bound}} = 10^{\,\log(C_{\text{pred}}) \pm \left(\log(ER_{95\%\,\text{fold}})/2\right)}
\]
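A minimal sketch of the bounding step, assuming the expression above places half of the log fold range on each side of the log predicted concentration, with base-10 logarithms. The function name and the numbers are illustrative.

```python
import math


def concentration_bounds(c_pred, er_fold):
    """Return (lower, upper) bounds centered on c_pred in log10 space.

    Assumes half of log10(er_fold) is applied on each side, so that
    upper / lower equals er_fold.
    """
    half_width = math.log10(er_fold) / 2.0
    center = math.log10(c_pred)
    return 10 ** (center - half_width), 10 ** (center + half_width)


# Invented example: a prediction of 20 units with a 100-fold range
# gives bounds of roughly 2 and 200.
lower, upper = concentration_bounds(c_pred=20.0, er_fold=100.0)
print(lower, upper)
```

Under this reading the bounds are symmetric in log space, so the predicted value is the geometric mean of the lower and upper bounds.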
[Figure: two histograms, Frequency vs. Concentration]
Media Sample Extraction & Cleanup Prepared Sample MS Analysis
Concentration
Estimates for
Prepared Samples
Concentration
Estimates for
Media Samples
SQ NTA is a Multi-Step Process
MS Data
Why is “Recovery” a Critical Parameter?
Example Prioritization Using Tap Water Filters
Prepared Solution Conc.
Media Conc.
Adjust by concentration factor
Compare to EPA
Max Contaminant Levels (MCL)
\[
\text{Conc Factor} = \frac{\text{Vol. Filtered Tap Water}}{\text{Vol. Extract}}
\]
From Brita Extracts
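The adjustment from prepared-solution concentration to media concentration can be sketched as below. Two assumptions are made that the slide does not state: the media concentration is the prepared-solution concentration divided by the concentration factor, and recovery from the filter is complete. The 380 L volume comes from the earlier slide; the extract volume and the concentration are invented.

```python
def concentration_factor(vol_filtered_l, vol_extract_l):
    """Conc factor = volume of filtered tap water / volume of extract (same units)."""
    if vol_extract_l <= 0:
        raise ValueError("extract volume must be positive")
    return vol_filtered_l / vol_extract_l


def media_concentration(prepared_conc, conc_factor):
    """Back-calculate the tap water concentration, assuming complete recovery."""
    return prepared_conc / conc_factor


# 380 L filtered over the filter lifetime (from the slides);
# 0.001 L extract volume is a hypothetical value.
factor = concentration_factor(vol_filtered_l=380.0, vol_extract_l=0.001)

# Hypothetical prepared-solution estimate of 100 units per litre.
c_media = media_concentration(prepared_conc=100.0, conc_factor=factor)
print(factor, c_media)
```

If recovery is below 100%, this calculation underestimates the media concentration, which is presumably why the deck treats recovery as a critical parameter.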
Example Prioritization Using Tap Water Filters
Margin of Recovery (MoR)
Calculated for risk prioritization
\[
\text{MoR} = \frac{\text{Upper } C_{\text{media}}}{\text{MCL}} \times 100
\]
Priority Levels:
Low: MoR < 1%
Moderate: 1% ≤ MoR < 100%
High: MoR ≥ 100%
Moderate & High priority
candidates for targeted analyses
From Brita Extracts
*Simulated examples of high-priority chemicals
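The prioritization rule on this slide can be sketched as a small classifier. The function names are hypothetical, and the concentration and MCL values in the example are invented rather than taken from any regulation.

```python
def margin_of_recovery(upper_c_media, mcl):
    """MoR (%) = upper-bound media concentration / MCL x 100 (same units for both)."""
    if mcl <= 0:
        raise ValueError("MCL must be positive")
    return upper_c_media / mcl * 100.0


def priority_level(mor_percent):
    """Map MoR (%) to the three priority levels given on the slide."""
    if mor_percent < 1.0:
        return "Low"
    if mor_percent < 100.0:
        return "Moderate"
    return "High"


# Invented example: an upper-bound estimate of 0.5 against a limit of 5.0
# (same units) gives an MoR of about 10%, which falls in the Moderate band.
mor = margin_of_recovery(upper_c_media=0.5, mcl=5.0)
print(mor, priority_level(mor))
```

Using the upper concentration bound rather than the point estimate makes the screen conservative: a chemical is only ranked Low if even its upper bound sits below 1% of the limit.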
Conceptual Model for Interpretation
Planned Activities
• Finalize semi-quant models for GC & LC platforms
• Examine platform transferability for semi-quant models
• Apply models to existing data (products & media)
• Develop pipeline from ToxCast AC50 (or other NAM-based hazard metrics) to lower bound media conc.
• Incorporate into EPA NTA WebApp
EPA ORD (cont.)
Kathie Dionisio
Chris Grulke
Kamel Mansouri*
Andrew McEachran*
Ann Richard
Adam Swank
John Wambaugh
Antony Williams
EPA ORD
Seth Newton
Jon Sobus
Hannah Liberatore
Hussein Al-Ghoul*
Alex Chao*
Jarod Grossman*
Kristin Isaacs
Sarah Laughlin*
Charles Lowe
James McCord
Kelsey Miller
Jeff Minucci
Katherine Phillips
Allison Phillips*
Tom Purucker
Randolph Singh*
Mark Strynar
Elin Ulrich
Nelson Yeung*
* = ORISE/ORAU
This work was supported, in part, by ORD’s Pathfinder Innovation Program (PIP) and an ORD EMVL award
Agilent
Jarod Grossman
Andrew McEachran
GDIT
Ilya Balabin
Tom Transue
Tommy Cathey
Contributing Researchers
Questions?
The views expressed in this presentation are those of the
author and do not necessarily represent the views or policies
of the U.S. Environmental Protection Agency.
Groff.Louis@epa.gov


Editor's Notes

  1. It is our job at EPA to assess chemical safety. As part of that effort, chemical risk assessments are mandated through federal statutes including but not limited to TSCA, FIFRA and the Safe Drinking Water Act. The support of these and other federal statutes hinges upon the acquisition of measurement data to support dose-response modeling and exposure assessments, since both of these activities are required for chemical risk assessment.
  2. Targeted analysis methods are the gold standard for generating chemical measurement data from environmental and biological samples. These methods generally utilize sophisticated analytical instruments, such as gas or liquid chromatographs, used for compound separation, and mass spectrometers, used for compound detection. When developing targeted methods, chemical standards are prepared for analytes of interest at multiple dilutions and then carefully measured using analytical instrumentation. The instrument response increases as concentration increases, which defines the compound-specific calibration curves. Once calibration curves are generated, we can examine any unknown sample of interest and accurately estimate the concentration of the target compound in that sample given the observed intensity. Importantly, as shown here, individual chemicals can yield very different signal intensities at a given concentration. Thus, for the purposes of accurate quantitation, it is highly beneficial to utilize compound-specific calibration curves.
  3. With all targeted analyses, chemicals of interest are first selected, standards are then acquired, methods are then developed using the procured standards, and unknown samples are finally examined using the optimized method and generated calibration curve. While this produces high-quality measurement data, this approach is only practical in terms of time and cost for relatively few known compounds with readily available standards. Typical environmental and biological samples contain hundreds to thousands of chemicals per sample, including both known chemicals and unknown chemicals. Further, the number of chemicals with available standards is an exceedingly small percentage of all known chemicals. So, in many cases, an optimized targeted method is not an option for most chemicals in a sample. To reconcile these issues, we look to non-targeted analysis (NTA) methods.
  4. With NTA, chemicals are not selected a priori. Rather, samples of interest are selected and analyzed using highly sophisticated mass spectrometry platforms. High-resolution mass spectrometers, in particular, can generate hundreds of gigabytes of data on unknown chemical features in any given sample set. In this graphic, the unknown chemical features are depicted as individual peaks in a composite extracted ion chromatogram. The job of the analyst is to assign a chemical identity to each feature. This is much easier said than done. Thus, the analyst first prioritizes the full set of observed features based on abundance, detection frequency, a specific chemical signature, or the results of statistical comparisons against control samples. The analyst then uses accurate mass measures and observed isotope patterns to propose chemical formulae, and eventually structures, for priority features. This is a difficult process, as many chemical structures can share a common formula. Once we tentatively assign structures, we attempt to estimate the concentration in the original medium using a variety of semi-quantitative techniques, which is the step we will focus on in this presentation. If concentration estimates suggest a possible health risk, we identify the source(s) of the chemical, postulate the relevant exposure pathway(s), and inform the eventual safety evaluation.
  5. This is an alternate view of an NTA workflow, with specific emphasis on the multiple steps involved in semi-quantitation. Again, NTA begins with samples of interest. In nearly all cases, collected samples must be prepared in the laboratory prior to analysis via mass spectrometry. Sample preparation generally involves an extraction and clean-up step, with only a small portion of the prepared sample ultimately injected into the mass spectrometer. The mass spectrometry data is used to generate tentative chemical assignments for observed features. Once chemical IDs are tentatively assigned, we can begin semi-quantitation procedures. The first step in semi-quantitation is estimating chemical concentration in the prepared sample. This is where most semi-quantitative studies stop. However, for semi-quantitative NTA estimates to be used in support of risk-based decisions, they must be extended to the original sampled media. The procedures that follow will guide us through to generating semi-quantitative concentration estimates first for our prepared samples, and then ultimately for the original sampled media.
  6. Within existing semi-quantitative methods, a clear practical need exists to move from simple estimates in prepared solution to statistically defensible estimates of media concentration. Without this extension into media, we have no means of prioritizing NTA results in a risk-based context since risk metrics are expressed relative to media concentrations. We approached this problem, as a proof-of-concept, by studying volatiles in tap water, as collected on Brita point-of-use filters, and analyzed via gas chromatography-high resolution mass spectrometry. Brita filters are a compelling sampling medium as large volumes of tap water can be concentrated onto a faucet-mountable carbon filter, and later extracted and analyzed via NTA. This method allows for the collection of a diverse array of chemicals while removing the need to sample large volumes of water and pre-concentrate chemicals of interest prior to analysis.
  7. For the proof-of-concept experiment, we spiked blank carbon filters with standard mixtures of volatiles, semi-volatiles, and PAHs at three concentration levels. In total there were 49 volatiles and semi-volatiles, and 24 PAHs. We then performed GC-HRMS on the standard solutions and the carbon filter extracts.
  8. The next two slides are predominantly for those who might be interested in knowing the instrumental parameters of the GC-HRMS experiments, which I won’t detail for now. But please feel free to follow up with me after the presentation if you’re interested in learning more about the GC-HRMS analysis.
  9. Again, for the sake of timing, I’ll skip the parameter detailing here, but the main idea is that we used a commercial method to deconvolve spectral features into individual chemical signals for suspect screening/chemical identification. Using this commercial NTA screening method, we were able to successfully identify 66 of the 73 known compounds in the neat standards, and 35 of the 73 spiked compounds in the carbon filter extracts. The smaller number of compounds identified in the extracts can be ascribed to matrix effects following the extraction and cleanup process.
  10. So, moving on to how we actually perform our semi-quantitative estimates first in prepared samples…
  11. The simplest semi-quant model is based on the use of a single surrogate: that is, one chemical whose concentration we know ahead of time, and whose intensity we have measured on the instrument. We treat the remaining chemicals as unknowns, with observed intensities but unknown concentrations. From the single surrogate's data, we can generate what is called a response factor, the ratio of concentration over intensity, which tells us how much we expect the instrument response to change with a corresponding change of concentration. From there, we can multiply the response factor by the remaining observed intensities to generate semi-quantitative concentration estimates for all tentatively identified chemicals in the prepared sample.
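To make the single-surrogate calculation concrete, here is a minimal sketch. The function names and all numeric values are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def response_factor(known_conc: float, intensity: float) -> float:
    """Response factor of the surrogate: concentration per unit intensity."""
    if intensity <= 0:
        raise ValueError("surrogate intensity must be positive")
    return known_conc / intensity


def estimate_concentrations(rf: float, intensities: dict) -> dict:
    """Multiply each observed intensity by the surrogate's response factor."""
    return {name: rf * value for name, value in intensities.items()}


# Hypothetical surrogate: 10 concentration units observed at 2.0e6 counts.
rf = response_factor(known_conc=10.0, intensity=2.0e6)

# Hypothetical unknowns with observed intensities only.
estimates = estimate_concentrations(
    rf, {"unknown_A": 1.0e6, "unknown_B": 8.0e6}
)
for name, conc in estimates.items():
    print(f"{name}: {conc:.2f}")
```

Every unknown inherits the surrogate's response factor, which is why the estimate is only as good as the assumption that the surrogate and the unknown respond similarly on the instrument.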
  12. The accuracy of these estimates can be assessed using an error ratio, which is the ratio of the predicted concentration over the known concentration. Generally, for the single-surrogate response factor method, error ratios ranged from an order-of-magnitude above to an order-of-magnitude below the true value. Importantly, in a true unknown sample, we would have no idea when a given semi-quantitative estimate was over- or under-estimated. This is clearly an unacceptable outcome if the desire is to use semi-quantitative estimates to support risk-based decisions.
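The error ratio described here is a one-line calculation. The sketch below adds a log10 form and an order-of-magnitude check as conveniences; those two helpers and the example values are assumptions for illustration, not part of the presented method.

```python
import math


def error_ratio(predicted: float, known: float) -> float:
    """Predicted concentration divided by the known concentration."""
    return predicted / known


def log10_error(predicted: float, known: float) -> float:
    """Signed log10 error: positive means over-estimated, negative under."""
    return math.log10(error_ratio(predicted, known))


def within_order_of_magnitude(predicted: float, known: float) -> bool:
    """True when the estimate is within a factor of 10 of the known value."""
    return abs(log10_error(predicted, known)) <= 1.0


# Hypothetical check against a spiked standard with known concentration 5.0.
for predicted in (20.0, 1.0, 500.0):
    print(predicted, error_ratio(predicted, 5.0),
          within_order_of_magnitude(predicted, 5.0))
```

This check is only possible for spiked standards. For a true unknown there is no known concentration to divide by, which is the problem the note goes on to describe.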
  13. To try to improve our prediction accuracy, we considered alternative methods of estimating concentration, and noticed a beneficial relationship between observed intensity and observed retention time. Specifically, at the same concentrations, intensity appears to increase with increasing retention time.
  14. So we sought to model the relationship between the prediction error and the difference in retention time, or delta RT, between neighboring chemicals. We figured that if we chose a surrogate closer in retention time to its neighbors, we could reduce prediction error and get better concentration estimates. Thus, for all possible choices of surrogate, we examined the observed error ratio as a function of delta RT. The middle and right plots illustrate that as the retention time difference between neighboring chemicals increases, our error ratio increases linearly. If delta RT between a surrogate and unknown is under roughly a minute, our prediction error is improved compared to a mid-range surrogate response factor, by up to a factor of 25. An added benefit of this new model is that the regression in the right plot allows for the calculation of statistically defensible error bounds about each concentration prediction. So, if we know the delta RT between the chosen surrogate and the tentatively identified chemical, we can provide bounded concentrations in the prepared solution.
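As a rough illustration of how a regression on delta RT can yield bounded estimates, the sketch below fits an ordinary least-squares line and computes a standard prediction interval. The notes do not give the exact regression form, so this assumes log10(error ratio) is regressed on signed delta RT; the function names and the training numbers are hypothetical.

```python
import numpy as np
from scipy import stats


def fit_error_model(delta_rt, log_err):
    """Ordinary least squares of log10(error ratio) on delta RT (assumed form)."""
    x = np.asarray(delta_rt, dtype=float)
    y = np.asarray(log_err, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    n = x.size
    return {
        "slope": slope,
        "intercept": intercept,
        "s": np.sqrt(np.sum(resid ** 2) / (n - 2)),  # residual std. error
        "n": n,
        "x_mean": x.mean(),
        "sxx": np.sum((x - x.mean()) ** 2),
    }


def prediction_interval(model, x0, level=0.95):
    """Two-sided prediction interval for log10(error ratio) at delta RT = x0."""
    t = stats.t.ppf(0.5 + level / 2.0, model["n"] - 2)
    center = model["intercept"] + model["slope"] * x0
    half = t * model["s"] * np.sqrt(
        1.0 + 1.0 / model["n"] + (x0 - model["x_mean"]) ** 2 / model["sxx"]
    )
    return center - half, center + half


def bounded_concentration(point_estimate, model, delta_rt, level=0.95):
    """Divide the point estimate by the bounds on the error ratio.

    error ratio = predicted / known, so known = predicted / error ratio.
    """
    lo_err, hi_err = prediction_interval(model, delta_rt, level)
    return point_estimate / 10 ** hi_err, point_estimate / 10 ** lo_err


# Hypothetical training pairs: (delta RT in minutes, log10 error ratio).
model = fit_error_model([-2, -1, 0, 1, 2], [-0.5, -0.2, 0.1, 0.2, 0.6])
print(bounded_concentration(10.0, model, delta_rt=0.5))
```

The `level` argument reflects the point made in the next note that any confidence level can be chosen to suit the application.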
  15. Here is an example of using a single surrogate in conjunction with the delta RT model applied to the Brita filter extracts in order to estimate bounded concentration predictions, ranked by their upper bound concentrations, which were calculated using a 95% prediction interval. It is worth noting that any confidence level can be selected based on what is most appropriate for a given application. For this example application, individual spot predictions may be above or below the true concentrations, but are contained within the prediction interval approximately 95% of the time.
  16. With these bounded estimates in hand, the next step is to interpret the results with respect to the original sampled medium.
  17. To do that, we need to know the percent recovery of each chemical off the carbon filter. This is a critical parameter because it can contribute to orders-of-magnitude differences in final concentration estimates. Theoretically, a lower bound on media concentration is given if we recovered 100% of the chemical off the filter. In that case, the media concentration is equivalent to the prepared sample concentration. However, we can’t assume a lower bound on percent recovery, which has dramatic implications for the estimation of the final media concentration. Take, for example, two chemicals both estimated to have an upper bound prepared sample concentration of 120 uM. Chemical A has a recovery of 100% and chemical B a recovery of 1%. Based upon these recoveries, chemical A would have an estimated media concentration of 120 uM and chemical B an estimated media concentration of 12,000 uM. This massive disparity shows the importance of understanding recovery. Since we can’t bound recovery on the low end, we can’t bound media concentration on the high end. This clearly has implications for risk-assessment activities. As such, a different method is needed to link semi-quantitative estimates in prepared solution to the final media concentrations.
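The arithmetic behind the chemical A and chemical B example can be sketched as follows. The function name is hypothetical; the 120 uM, 100%, and 1% values are the ones quoted in the note.

```python
def recovery_corrected_concentration(prepared_conc: float,
                                     percent_recovery: float) -> float:
    """Scale a prepared-sample concentration by the inverse of recovery."""
    if not 0 < percent_recovery <= 100:
        raise ValueError("percent recovery must be in (0, 100]")
    return prepared_conc / (percent_recovery / 100.0)


prepared_upper_bound = 120.0  # uM, from the example in the note
for label, recovery in (("A", 100.0), ("B", 1.0)):
    media = recovery_corrected_concentration(prepared_upper_bound, recovery)
    print(f"chemical {label}: recovery {recovery}% -> {media:,.0f} uM")
```

Because recovery sits in the denominator, the estimate grows without limit as recovery approaches zero, which is why an unbounded recovery leaves the media concentration unbounded on the high end.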
  18. For this proof-of-concept work with Brita filters, to get an initial prioritization of chemicals for further analysis, we first need to adjust the prepared sample concentration estimates by a concentration factor that accounts for the large volume of tap water that was processed through the Brita filter and eventually reduced down to a small volume of extract. With that adjustment, we have upper bound media concentrations that we can compare to existing regulatory levels for initial prioritization. In this case, we chose EPA Maximum Contaminant Levels, or MCLs, since they are a relevant hazard metric for tap water.
  19. We accomplish this prioritization by calculating a new metric called Margin of Recovery (MoR), which is the ratio of upper bound media concentration to the MCL, expressed as a percentage for a given compound (or subset of compounds, in the cases of Xylenes and Trihalomethanes). If we calculate an MoR of 100%, that means the volume-adjusted upper bound estimate from the prepared sample equals the MCL. Thus, a 100% recovery would be needed for these values to match. This is entirely plausible for routine laboratory analyses of drinking water, and the compound would therefore be considered high priority. If we have an MoR of 0.01%, that means that a 0.01% recovery would be needed for the media concentration to reach the MCL. While a recovery of 0.01% is possible, it’s very unlikely, and thus, this compound is considered low priority. The take-away from this analysis is that recovery cannot currently be estimated, nor can it be bounded to enable meaningful estimation of chemical concentration in media. Thus, the MoR approach provides a means to relate upper bound concentration estimates in prepared samples to media levels of interest.
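A minimal sketch of the MoR ranking, combining the volume adjustment with the ratio to the MCL, might look like the following. The 380 L sampled volume comes from the slides; the extract volume, chemical names, concentrations, and MCL values are hypothetical placeholders, and the function names are illustrative only.

```python
def media_upper_bound(prepared_upper: float, extract_volume_l: float,
                      sampled_volume_l: float) -> float:
    """Scale the prepared-sample upper bound back to the sampled water volume.

    This assumes 100% recovery, so it is the value that MoR compares to the MCL.
    """
    return prepared_upper * extract_volume_l / sampled_volume_l


def margin_of_recovery(media_upper: float, mcl: float) -> float:
    """MoR as a percentage: upper bound media concentration over the MCL."""
    return 100.0 * media_upper / mcl


def rank_by_mor(chemicals: dict, extract_volume_l: float,
                sampled_volume_l: float) -> list:
    """Return (name, MoR %) pairs sorted from highest to lowest priority."""
    scored = []
    for name, (prepared_upper, mcl) in chemicals.items():
        media = media_upper_bound(prepared_upper, extract_volume_l,
                                  sampled_volume_l)
        scored.append((name, margin_of_recovery(media, mcl)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs: (prepared upper bound, MCL), both in the same units.
chemicals = {"chem_X": (5000.0, 0.005), "chem_Y": (40.0, 0.1)}
for name, mor in rank_by_mor(chemicals, extract_volume_l=0.001,
                             sampled_volume_l=380.0):
    print(f"{name}: MoR = {mor:.3g}%")
```

A higher MoR means a more plausible recovery would be enough for the media concentration to reach the MCL, so the top of the sorted list is where follow-up targeted analysis would start.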
  20. Clearly, when using NTA to identify contaminants of emerging concern, we won’t have existing MCLs to compare against using the MoR technique. Thus, we have developed a framework for a data analysis pipeline between the exposure assessment side of this effort and the dose-response modeling side of this effort. In lieu of MCLs (since those take years to determine), we could use something like ToxCast AC50s or other NAM-based hazard metrics to work our way down through dose-response modeling to a lower bound media concentration that would pose a hazard for the most sensitive populations. To link this up with the SEEM framework, which is the primary modeling construct of the ExpoCast team, we’d need a model to predict the percent recovery. This is years off at best, and will require a large amount of training data given the various chemical extraction techniques. Thus, the MoR approach is likely the best way to make immediate use of semi-quantitative predictions from NTA experiments. With the MoR technique in hand, we can perform initial prioritizations on various environmental and biological samples, reducing the list of all possible chemicals down to a handful of priority chemicals that require further investigation.
  21. Other future work to achieve these goals includes finalizing our semi-quant models for both gas and liquid chromatography platforms, which require slightly different semi-quantitative methods. We further wish to examine whether the use of different analytical platforms (that is, different instruments from different vendors) affects the accuracy of our predictions. Longer-term objectives include developing the data processing pipeline to link HT hazard and dose-response data to semi-quant NTA predictions, as well as incorporating the semi-quant methods into our NTA Web Application, which is currently used for many NTA studies within ORD.
  22. I’d like to thank everyone at EPA who has contributed to this work thus far, listed here, and I welcome any questions you might have. Thank you for your time and attention!