The Interplay between Chemical Properties
and Screening Data


                    Andy Pope
 Platform Technology & Science, GlaxoSmithKline,
               Collegeville PA, USA

               MipTec 2011, Basel
               Sept. 20-22, 2011
Compound properties aren’t what they used to be…




[Figure: Properties vs Phase*; panels for ClogP and MW]

cLogP (median): Failed candidate = 3.9, Marketed drug = 2.5
MW (median): Failed candidate = 432, Marketed drug = 349

*Adapted from Blake JF, Medicinal Chemistry, 2005, 1, 649-655
And we have known this for a while. …
Drug discovery chemical property space - Some critical factors…

Drug candidates

 - Chemistry methods
 - Chemistry “culture”
 - Hit ID libraries
 - Screening methods
 - SAR data
Drug discovery chemical property space - Some critical factors…

Drug candidates

 - Chemistry methods, Chemistry “culture”
     - Efficiency concepts
     - Property guides/rules
 - Hit ID libraries
     - Rigorous property rules
     - Fragments
     - Lead-like, “Beautiful”
 - Screening methods
 - SAR data
Does assay data influence discovery chemical property
                  space occupancy?

                  (…or vice versa)
Large Scale analysis of High Throughput Screening Data
 HTS at GSK
  330 screens of >500,000 cpds, 2005-2010

  Single concentration primary data (10 uM) re-analysed

  Compound results binned according to simple compound properties

  Meta-data (e.g. target class, screening technology) curated



 Academic screening centers (MLPCN)
  ~100 screens with >250,000 cpds tested & deposited to PubChem BioAssay
   from major NIH funded screening centers (NCGC, Scripps, Broad)

  Single concentration data re-analysed using same methods as GSK data
The GSK HTS Process

 1. Primary Screen (10 uM – singlicate): ~2 million compounds, entire collection (100%)
      - Statistical separation from null effect population
      - Chemical clustering if hit rate >1%

 2. Confirmation (10 uM – duplicate): ~20,000 compounds, potential actives (<1%)
      - Eliminate false positives from primary
      - Chemical clustering for diversity & property sampling

 3. Dose response (11 pt 3-fold dilution): ~2,000 compounds, “real” hits (<0.1%)
HTS Hit Marking Processes

[Figure: two panels of % Compounds vs. RESPONSE (% control), with the 3 x RSD cut-off separating “miss” from “hit”]
  ---- Binned Raw HTS data
  ---- Fit with raw mean and std. deviation
  ---- Fit with robust mean & std. deviation

Normal Distribution:
$$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Raw mean = 1.0, Raw SD = 11.1
Robust mean = -0.3, Robust SD = 5.5

Blue & black curves are normal distribution fits using mean & SD

Plot annotations:
  - weak hits, artefacts, and statistical “noise”
  - potent hits (& artefacts)
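The raw vs. robust fit and the 3 x RSD cut-off on this slide can be sketched in a few lines. This is a minimal illustration, not the GSK implementation: the slides do not say how the robust mean and SD are estimated, so the median and the scaled median absolute deviation (MAD x 1.4826) are assumed here, and the function names and toy data are hypothetical.

```python
import statistics

MAD_TO_SD = 1.4826  # scales the MAD to the SD of a normal distribution


def robust_mean_sd(responses):
    """Assumed robust estimators: median as centre, scaled MAD as spread."""
    center = statistics.median(responses)
    mad = statistics.median([abs(r - center) for r in responses])
    return center, MAD_TO_SD * mad


def mark_hits(responses, n_sd=3.0):
    """Return (cut-off, hit flags) for a robust mean + n_sd x RSD cut-off."""
    center, rsd = robust_mean_sd(responses)
    cutoff = center + n_sd * rsd
    return cutoff, [r > cutoff for r in responses]


# Toy % control values: nine null-like responses and one potent response
responses = [-6, -3, -1, 0, 0, 1, 2, 4, 5, 90]
cutoff, flags = mark_hits(responses)
raw_cutoff = statistics.mean(responses) + 3 * statistics.stdev(responses)
print(f"robust cut-off = {cutoff:.1f}, raw cut-off = {raw_cutoff:.1f}")
```

On this toy set the single potent response inflates the raw SD, which is the same effect the slide shows as Raw SD = 11.1 against Robust SD = 5.5.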
Typical HTS observed data distribution vs. fit

[Figure: Frequency (% cpds/bin) vs. Effect (% control)]
  ---- Binned Raw HTS data
  ---- Robust distribution fit
  ---- Hit cut-off (mean + 3 x RSD)
  ---- Residual (raw – fit)

Note: representative selection of individual screens from ~330 analyzed
Observed data distribution vs. fit – zoom

[Figure: Frequency (% cpds in bin) vs. Effect (% control)]
  ---- Binned Raw HTS data
  ---- Robust distribution fit
  ---- Hit cut-off (mean + 3 x RSD)
  ---- Residual (raw – fit)

Note: representative selection of individual screens from ~330 analyzed
GSK HTS campaigns 2005-2010

[Figure: Screen cut-off (mean + 3 x RSD) vs. Average robust Z’ of assay during HTS production]
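The slide plots a robust Z’ per assay without defining it. As a hedged sketch, the widely used Z’ factor is 1 - 3(SD_high + SD_low)/|mean_high - mean_low| over the control wells, and a robust variant swaps in the median and scaled MAD. Whether GSK computed it exactly this way is an assumption; the function names and control values below are hypothetical.

```python
import statistics

MAD_TO_SD = 1.4826  # scales the MAD to the SD of a normal distribution


def robust_sd(values):
    """Scaled median absolute deviation."""
    med = statistics.median(values)
    return MAD_TO_SD * statistics.median([abs(v - med) for v in values])


def robust_z_prime(high_controls, low_controls):
    """Robust Z' = 1 - 3 * (RSD_high + RSD_low) / |median_high - median_low|."""
    window = abs(statistics.median(high_controls) - statistics.median(low_controls))
    if window == 0:
        raise ValueError("control medians coincide; Z' is undefined")
    return 1.0 - 3.0 * (robust_sd(high_controls) + robust_sd(low_controls)) / window


# Toy control wells in % control units
high = [98, 100, 102, 100, 99, 101]
low = [-1, 0, 1, 0, 2, -2]
z_prime = robust_z_prime(high, low)
print(f"robust Z' = {z_prime:.3f}")
```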
Looking for property trends in the GSK HTS dataset

e.g. Compound total polar surface area

The total polar surface area (tPSA) is defined as the surface sum over all polar atoms
  - < 60 Å2 predicts brain penetration
  - > 140 Å2 predicts poor cell penetration

Aggregate results from all 330 campaigns 2005-2010 with >500K tests

Compounds with tPSA 80-85 Å2
  - 26M measured responses in this bin
  - 485k marked as “hit”
  - Hit rate = 100*(485k/26M) = 1.86%

[Figure: Hit Rate (%) vs. Polar Surface Area (tPSA, Å2); each point is the hit rate for compounds in a specific tPSA bin]
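The per-bin hit rate described on this slide (hits divided by measured responses within a property bin) can be sketched as below. The record layout, function name, and toy data are assumptions for illustration; the `min_records` argument mirrors the later slides' note that only well-populated bins are shown.

```python
from collections import defaultdict


def hit_rate_by_bin(records, bin_width, min_records=1):
    """Hit rate (%) per property bin.

    records: iterable of (property_value, is_hit) pairs, one per measured response.
    Bins are [k*bin_width, (k+1)*bin_width); keys are the lower bin edges.
    """
    tested = defaultdict(int)
    hits = defaultdict(int)
    for value, is_hit in records:
        lower_edge = (value // bin_width) * bin_width
        tested[lower_edge] += 1
        if is_hit:
            hits[lower_edge] += 1
    return {
        edge: 100.0 * hits[edge] / tested[edge]
        for edge in sorted(tested)
        if tested[edge] >= min_records
    }


# Toy data: 100 responses in the tPSA 80-85 bin (3 hits), 10 in the 60-65 bin (no hits)
toy = [(82.0, True)] * 3 + [(83.5, False)] * 97 + [(61.0, False)] * 10
rates = hit_rate_by_bin(toy, bin_width=5.0, min_records=50)
print(rates)
```

Pooling every response from every campaign into one table, as here, weights each campaign by the number of compounds it tested; that matches the aggregate counts quoted on the slide but is a design choice rather than the only option.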
Compound shapeliness and flexibility

fCsp3 captures “shapeliness” of a compound
   - Weak positive correlation with MW
   - More irregular 3D shape → lower hit probability

[Figure: Hit Rate (%) vs. Fraction of carbons that are sp3 (fCsp3)]

Flexibility = Percentage of a compound’s bonds that are rotatable
   - Slight decrease in HR with Flexibility
   - No correlation with MW or ClogP

[Figure: Hit Rate (%) vs. Flexibility]
Compound Size (MW)

HTS hit rates rise significantly with increasing compound MW

[Figure: Hit Rate (%) vs. Molecular Weight (MW); labelled hit rates 1.2%, 1.50%, 2.62%, 4.0%]
[Inset: % Cpds in MW Bin and Cumulative % Cpds vs. MW; middle 80% of cpds lies between MW = 270 and MW = 470]

Overall hit rate rises 1.7-fold across the middle 80% of the screening deck
   i.e. 70% rise in hit rate from MW = 270 to MW = 470

3.3-fold rise across full MW range

- Only bins containing 1M or more records are shown
Compound Lipophilicity (ClogP)

HTS hit rates rise sharply with increasing compound lipophilicity

[Figure: Hit Rate (%) vs. ClogP; labelled hit rates 1.1%, 1.14%, 3.31%, 4.5%]
[Inset: % Cpds in ClogP Bin and Cumulative % Cpds vs. ClogP; middle 80% of cpds lies between ClogP = 1 and ClogP = 5]

Overall hit rate rises 2.9-fold across the middle 80% of the screening deck
   i.e. from ClogP = 1 → 5

4.1-fold rise across full ClogP range

- Only bins containing 1M or more records are shown
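The fold-rise figures on the MW and ClogP slides can be cross-checked against the hit rates labelled on the plots. The sketch below assumes that the inner pair of labels on each plot sits at the edges of the middle 80% of the deck and the outer pair at the ends of the displayed range; that mapping is inferred from the labels and is not stated on the slides.

```python
def fold_rise(low_rate, high_rate):
    """Ratio of the hit rate at the high end of a property range to the low end."""
    return high_rate / low_rate


# Hit rates (%) as labelled on the plots; the (low, high) pairing is an assumption
labelled = {
    "MW": {"middle_80": (1.50, 2.62), "full_range": (1.2, 4.0)},
    "ClogP": {"middle_80": (1.14, 3.31), "full_range": (1.1, 4.5)},
}

summary = {
    prop: {span: round(fold_rise(*rates), 1) for span, rates in spans.items()}
    for prop, spans in labelled.items()
}
for prop, spans in summary.items():
    print(prop, spans)
```

Under that assumption the ratios round to the quoted 1.7-fold and 3.3-fold (MW) and 2.9-fold and 4.1-fold (ClogP).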
Promiscuity v. Molecular Properties

The prevalence of promiscuous compounds rises sharply with size and lipophilicity
  • Hit Frequency Index (HFI) = % of single-shot (SS) HTS campaigns in which a compound gives activity > cut-off
  • “Promiscuous” compound → HFI ≥ 10% (having seen at least 50 campaigns)

[Figure: % of Promiscuous Compounds and % Rise in Promiscuity vs. Molecular Weight]
[Figure: % of Promiscuous Compounds and % Rise in Promiscuity vs. cLogP]

Across the middle 80% of the screening deck …
    • Large compounds are 4-fold more likely to have high HFI than small ones (MW: 270 → 470)
    • Lipophilic compounds are 10-fold more likely to have high HFI than polar ones (cLogP: 1 → 5)
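The HFI definition and the promiscuity threshold on this slide translate directly into code. The sketch below is illustrative only; the function names are hypothetical, and returning `None` for compounds seen in fewer than 50 campaigns is an assumed way of representing "not assessed".

```python
def hit_frequency_index(n_hit_campaigns, n_campaigns):
    """HFI = % of single-shot campaigns in which the compound exceeded the cut-off."""
    if n_campaigns <= 0:
        raise ValueError("compound has not been screened")
    return 100.0 * n_hit_campaigns / n_campaigns


def is_promiscuous(n_hit_campaigns, n_campaigns, hfi_threshold=10.0, min_campaigns=50):
    """True/False once the compound has seen enough campaigns, else None."""
    if n_campaigns < min_campaigns:
        return None  # too few campaigns to classify
    return hit_frequency_index(n_hit_campaigns, n_campaigns) >= hfi_threshold


print(is_promiscuous(6, 60))   # HFI = 10%
print(is_promiscuous(5, 60))   # HFI below 10%
print(is_promiscuous(10, 40))  # fewer than 50 campaigns
```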
Property distributions vs. promiscuity - cLogP

[Figure: cLogP distributions (frequency per bin) in four panels ordered by Inhibition Frequency Index* (%), from compounds hitting ~1 target to compounds hitting >10% of targets]

Note: compounds were required to have been run in at least 50 HTS and to have yielded >50% effect in at least one screen to be included

*Inhibition frequency index (IFI) = % of screens where cpd yielded >50% inhibition, where total screens run ≥ 50
The “Dark” Matter
        – Compounds which have not yielded >50% effect once in >50 screens

[Figure: cLogP vs. Molecular Weight (Da)]
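The "dark matter" criterion (never above 50% effect in more than 50 screens) and the Inhibition Frequency Index used on the neighbouring slides can be sketched as follows. The input is assumed to be a list of one % effect value per screen for a single compound; the function names are hypothetical.

```python
def inhibition_frequency_index(effects, effect_cutoff=50.0):
    """IFI = % of screens in which the compound gave more than effect_cutoff % effect."""
    if not effects:
        raise ValueError("compound has not been screened")
    n_active = sum(1 for e in effects if e > effect_cutoff)
    return 100.0 * n_active / len(effects)


def is_dark_matter(effects, effect_cutoff=50.0, min_screens=50):
    """True if tested in more than min_screens screens and never above the cut-off."""
    return len(effects) > min_screens and all(e <= effect_cutoff for e in effects)


quiet = [5.0] * 51              # 51 screens, never active
one_hit = [5.0] * 50 + [60.0]   # 51 screens, active once
print(is_dark_matter(quiet), is_dark_matter(one_hit))
print(inhibition_frequency_index([60.0] * 3 + [10.0] * 57))
```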
Translation of biases to full-curve follow-up

Property biases in primary HTS hit marking are propagated forward to dose-response follow-up

[Figure: % Compounds Tested vs. cLogP and vs. Molecular Weight; series: SS testing, FC testing, FC – SS differential]

  - Elevated testing of large, lipophilic compounds in the full-curve phase of HTS
  - Reduced testing of small, polar compounds in the full-curve phase of HTS

Note: plots represent data from 402M single-concentration responses & 2.1M full-curve results
Property Trends; translation to dose response

Property effects contribute to hits at all effect levels
    - i.e. not just hits on the statistical margins

Property-dependence decreases through the HTS process

[Figure: % Lift in Hit Rate vs. ClogP and vs. MW; series: Standard 3SD SS Hits, Top 0.1% of SS Responses, % of cpds with IC50 <= 10 uM]

Rise in hit rate     From *ClogP = 1 → 5     From *MW = 270 → 470
3SD                  2.9X                    1.8X
Top 0.1%             2.2X                    1.3X
FC Active            1.5X                    1.2X

*Across the middle 80% of the deck
Property response of individual screens is highly variable
e.g. Screens with largest response to cLogP

[Figure: Hit rate as % of HR at cLogP = 3.5 vs. cLogP]
Property response of individual screens is highly variable
e.g. Screens with smallest response to cLogP

[Figure: Hit rate as % of HR at cLogP = 3.5 vs. cLogP]
Assay Technology

[Figure: Hit rate as % of HR at cLogP = 3.5 vs. cLogP; colored by Hit rate (%)]
Target Class

[Figure: Hit rate as % of HR at cLogP = 3.5 vs. cLogP; colored by Hit rate (%)]
Improving hit marking
- reducing bias towards high cLogP, MW hits

 Virtual partitioning of collection according to property
  - e.g. sub-collections in different cLogP ranges

 Change the hit calling method so that it takes properties as well as % effect into account
   - e.g. calculate hit cut-offs based on BEI/LEI etc.
   - “scalar” methods based on correcting the observed biases

And... improving assays and the collection based on awareness of these biases
Improving hit marking – Property Biasing

[Figure: % Compounds vs. RESPONSE (% control) with the Mean + 3 x RSD cut-off; compounds with more attractive properties are promoted, compounds with less attractive properties are demoted]

[Figure: Hit Rate (%) vs. MW and Hit Rate (%) vs. ClogP; series: Ordinary HTS Hit Marking, Property-biased Hit Marking]
Improving hit marking – Property Binning
Property-biased Hit Marking

Sub-divide screening data into bins of compounds with similar properties
  - apply 3 x RSD hit cut-offs to each bin

Consensus method combines approaches – routinely implemented

[Figure: Hit Rate (%) vs. MW and Hit Rate (%) vs. ClogP; series: Response, Property-Binned stats, Property Consensus; bins: Bin 1 (Low MW, cLogP), Bin 2 (Medium MW, cLogP), Bin 3 (High MW, cLogP)]
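Property binning, as described on this slide, applies a separate 3 x RSD cut-off within each group of compounds with similar properties. The sketch below is a minimal illustration under stated assumptions: a single property with hypothetical bin edges, and the median plus scaled MAD standing in for the robust mean and SD (the slides do not specify the estimator). It does not attempt the consensus method.

```python
import statistics
from collections import defaultdict

MAD_TO_SD = 1.4826  # scales the MAD to the SD of a normal distribution


def robust_cutoff(responses, n_sd=3.0):
    """Median + n_sd x scaled MAD (assumed robust estimators)."""
    med = statistics.median(responses)
    mad = statistics.median([abs(r - med) for r in responses])
    return med + n_sd * MAD_TO_SD * mad


def assign_bin(value, edges):
    """Index of the first edge the value falls below; len(edges) if none."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def property_binned_hits(compounds, edges, n_sd=3.0):
    """compounds: list of (compound_id, property_value, response).

    Returns (hit ids in input order, per-bin cut-offs).
    """
    by_bin = defaultdict(list)
    for _, prop, resp in compounds:
        by_bin[assign_bin(prop, edges)].append(resp)
    cutoffs = {b: robust_cutoff(r, n_sd) for b, r in by_bin.items()}
    hits = [
        cid for cid, prop, resp in compounds
        if resp > cutoffs[assign_bin(prop, edges)]
    ]
    return hits, cutoffs
```

Because the noisier, more lipophilic bin gets its own wider cut-off, a modest response that counts as a hit among polar compounds no longer counts among lipophilic ones, which is the intended correction of the bias.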
Evolving the screening collection to smaller, more polar lead-like space

GSK’s Compound Collection Enhancement (CCE) strategy has biased the HTS deck towards decreased size and lipophilicity with the aim of improving chemical starting points

[Figure: Compounds tested in HTS (% of total compounds in HTS) vs. ClogP; series: 2004, 2010, 2010 <> 2004]
[Figure: % Compounds Exceeding Property Limit (ClogP > 5, MW > 500) vs. Year, with a point labelled “New 2011”]

CCE Acquisition, Property Bounds
  - 2004-05: Lipinski criteria (MW<500, ClogP<5)
  - Most recently: MW<360, ClogP<3
  - Inclusion of DPU lead-op cpds: MW<500, ClogP<5
Property trends in MLPCN Screening Data

Primary data from around 100 Academic HTS campaigns obtained from PubChem BioAssay

Lipophilicity – similar to GSK HTS
[Figure: Hit Rate (%) vs. ClogP; labelled hit rates 1.28% and 3.80%]

Compound size – little effect
[Figure: Hit Rate (%) vs. MW; pretty flat, labelled hit rates 2.14% and 2.27%]

GSK screening deck (>50 HTSs, 2.01M cpds)
    ClogP = 0.00835*MW – 0.058, R2 = 0.18
PubChem Compounds (405k)
    ClogP = 0.00554*MW + 0.97, R2 = 0.09
MLPCN Screening Data – Property Trends

Example Individual screen responses to cLogP

[Figure: 3 x RSD hit rate (%) vs. cLogP; trellis by individual screens]
Small Beautiful Set Screening

SBS = Subset of the HTS deck which spans the gap between HTS and fragments

HTS collection (2M)

      Filtered on;
        - size and lipophilicity
           • 10 ≤ HAC ≤ 28 and -2 ≤ ClogP ≤ 3, bounded
        - “promiscuity” – frequent-hitters are eliminated
           • IFI ≤ 3% (IFI = Inhibition Frequency Index, 3SD hit cutoff)
        - hit explosion opportunity
           • Near Neighbor Count ≥ 20 (in GSK registry)
        - “shapeliness”
           • fCsp3 ≥ 0.3 (i.e. ≥ 30% of carbon atoms must be sp3)
        - acquisition sub-structural filters
        - “greedy” diversity selection (no compounds >0.9 similar)

[Figure: ClogP vs. MW]

SBS2 = ~75,000 compounds

Tested at higher concentration (e.g. 100-200 uM)
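The numeric SBS criteria listed on this slide can be expressed as a simple property filter. The sketch below is illustrative: the `Compound` record and its field names are hypothetical, and the sub-structural filters and “greedy” diversity selection are deliberately left out because the slide gives no details for them.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    cpd_id: str
    hac: int             # heavy atom count
    clogp: float
    ifi: float           # Inhibition Frequency Index, in %
    near_neighbors: int  # near neighbor count in the registry
    fcsp3: float         # fraction of carbons that are sp3


def passes_sbs_property_filters(c):
    """Apply only the numeric thresholds quoted on the slide."""
    return (
        10 <= c.hac <= 28
        and -2.0 <= c.clogp <= 3.0
        and c.ifi <= 3.0
        and c.near_neighbors >= 20
        and c.fcsp3 >= 0.3
    )


deck = [
    Compound("a", hac=20, clogp=1.5, ifi=1.0, near_neighbors=35, fcsp3=0.40),
    Compound("b", hac=32, clogp=1.5, ifi=1.0, near_neighbors=35, fcsp3=0.40),  # too large
    Compound("c", hac=20, clogp=1.5, ifi=1.0, near_neighbors=35, fcsp3=0.20),  # too flat
]
selected = [c.cpd_id for c in deck if passes_sbs_property_filters(c)]
print(selected)
```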
Conclusions
 Standard HTS processes favor the selection of larger, more lipophilic compounds

 There are no clear trends between this behavior and assay technology or target class

 Methods have been developed which (to some extent) compensate for property biases to ensure that attractive lead-like molecules are selected
    - Overall hit rate in relation to downstream triage capacity is also critical
    - Aspire to a hit rate as close to the “authentic pharmacology” rate as possible

 Changing the trajectory of discovery chemical space requires an interplay between the composition of chemical libraries, assay practice, hit analysis and downstream Hit to Lead and Lead to Candidate chemistry practice
Acknowledgements

Pat Brady                             Tony Jurewicz        James Chan
Darren Green                          Glenn Hofmann        Snehal Bhatt
Stephen Pickett                       Stan Martens         Amy Quinn
Sunny Hung                            Jeff Gross           Geoff Quinique
Subhas Chakravorty                    Zining Wu            Bob Hertzberg
Nicola Richmond                       Mehu Patel
Jesus Herranz                         Emilio Diez
Gonzalo Colmeranjo-Sanchez            Julio Martin-Plaza

   …and numerous others who contributed to the 300+ HTS
  campaigns run by GSK 2005-2010…..




     Screening & Compound Profiling
Backups
Year of Screen

[Figure: Hit rate as % of HR at cLogP = 3.5 vs. cLogP; colored by Hit rate (%)]
Promiscuity v. Molecular Properties – Molecular weight

[Figure: Molecular Weight (Da) distributions (frequency per bin) in four panels ordered by Inhibition Frequency Index* (%), from compounds hitting ~1 target to compounds hitting >10% of targets]

Note: compounds were required to have been run in at least 50 HTS and to have yielded >50% effect in at least one screen to be included

*Inhibition frequency index (IFI) = % of screens where cpd yielded >50% inhibition, where total screens run ≥ 50
GSK HTS campaigns 2005-2010

[Figure: Hit cut-off (% effect @ 10 uM); Number of Screens vs. Mean + 3 * RSD of sample data (% control)]
[Figure: Hit rate (% of compounds) > cut-off; Number of Screens vs. % compounds with effect > mean + 3 * RSD]
Validation and robustness methods cannot detect property biases

Compound sets used to test robustness of assays and validate the screening process reflect current compound acquisition practice, not the collection as tested

[Figure: cLogP vs. MW]
Dose Response Data – Property Trends

Is the observed size & lipophilicity bias in HTS single-shot testing an artifact of false positives, e.g. experimental “noise”?

[Figure: % of Tests Yielding pXC50 ≥ 5 and % Rise in Active Rate vs. Molecular Weight and vs. cLogP]

No, size and lipophilicity dependence is still observed in the rate of identifying compounds with activity of 10 uM or better
Molecular Property Correlations in GSKscreen

The table below shows the squared correlation coefficients (R2) between particular molecular properties and MW/ClogP, along with whether the correlation is positive or negative (i.e. the sign of the slope in a linear regression).
The data are computed using the 2.09M compounds comprising GSKscreen.

Property        R2, ± vs MW    R2, ± vs ClogP
MW              1, +           0.21, +
ClogP           0.21, +        1.0, +
HAC             0.92, +        0.19, +
fCsp3           0.15, +        0.00
RotBonds        0.36, +        0.04, +
tPSA            0.16, +        0.08, -
Chiral          0.02, +        0.00
HetAtmRatio     0.02, -        0.34, -
Complexity      0.31, +        0.02, +
Flexibility     0.02, +        0.00
AromRings       0.22, +        0.16, +
HBA             0.11, +        0.10, -
HBD             0.01, +        0.02, -

Interplay between screenng_data_and _properties_pope

  • 1. The Interplay between Chemical Properties and Screening Data Andy Pope Platform Technology & Science, GlaxoSmithKline, Collegeville PA, USA MipTec 2011, Basel Sept. 20-22, 2011
  • 2. Compound properties aren’t what they used to be… Properties vs Phase* (plots of ClogP and MW). cLogP (median): Failed candidate = 3.9, Marketed drug = 2.5. MW (median): Failed candidate = 432, Marketed drug = 349. *Adapted from Blake JF, Medicinal Chemistry, 2005, 1, 649-655
  • 3. And we have known this for a while. …
  • 4. Drug discovery chemical property space - Some critical factors… Around drug candidates: - Chemistry methods - Chemistry “culture” - Hit ID libraries - Screening methods - SAR data
  • 5. Drug discovery chemical property space - Some critical factors… Around drug candidates: - Chemistry methods and chemistry “culture” (efficiency concepts; property guides/rules) - Hit ID libraries (rigorous property rules; fragments; lead-like, “Beautiful”) - Screening methods - SAR data
  • 6. Does assay data influence discovery chemical property space occupancy? (…or vice versa)
  • 7. Large Scale analysis of High Throughput Screening Data. HTS at GSK: - 330 screens of >500,000 cpds, 2005-2010 - Single concentration primary data (10 uM) re-analysed - Compound results binned according to simple compound properties - Meta-data (e.g. target class, screening technology) curated. Academic screening centers (MLPCN): - ~100 screens with >250,000 cpds tested & deposited to PubChem BioAssay from major NIH funded screening centers (NCGC, Scripps, Broad) - Single concentration data re-analysed using same methods as GSK data
  • 8. The GSK HTS Process. Primary Screen (10 uM – singlicate): entire collection (100%), ~2 million; statistical separation from null effect population; chemical clustering if hit rate >1%. Confirmation (10 uM – duplicate): potential actives (<1%), ~20,000; eliminate false positives from primary; chemical clustering for diversity & property sampling. Dose response (11 pt 3-fold dilution): “real” hits (<0.1%), ~2,000
  • 9. HTS Hit Marking Processes. Legend: binned raw HTS data; fit with raw mean and std. deviation; fit with robust mean & std. deviation. Normal distribution: $P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$. Hit cut-off at 3 x RSD (“miss” below, “hit” above). Raw mean = 1.0, Raw SD = 11.1; Robust mean = -0.3, Robust SD = 5.5. Blue & black curves are normal distribution fits using mean & SD. Tails of the distribution: weak hits, artefacts, and statistical “noise”; potent hits (& artefacts). Axes: % Compounds vs. RESPONSE (% control)
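The robust hit cut-off described on this slide can be sketched as follows. The slide does not say which robust estimators were used, so this sketch assumes the median as the robust mean and 1.4826 x MAD as the robust SD; the function names `robust_cutoff` and `mark_hits` are hypothetical.

```python
import statistics

def robust_cutoff(responses, n_sd=3.0):
    """Return (robust_mean, robust_sd, cutoff) for % control responses.

    Assumption (not stated on the slide): robust mean = median,
    robust SD = 1.4826 * median absolute deviation (MAD).
    """
    center = statistics.median(responses)
    mad = statistics.median([abs(r - center) for r in responses])
    robust_sd = 1.4826 * mad
    return center, robust_sd, center + n_sd * robust_sd

def mark_hits(responses, n_sd=3.0):
    """Flag each response as hit (True) or miss (False) against the robust cut-off."""
    _, _, cutoff = robust_cutoff(responses, n_sd)
    return [r > cutoff for r in responses]

# Toy data: five inactive wells and one potent well.
toy = [-2, -1, 0, 1, 2, 80]
flags = mark_hits(toy)
# For comparison, a cut-off from the raw mean and SD is inflated by the outlier.
raw_cutoff = statistics.mean(toy) + 3 * statistics.stdev(toy)
```

With these toy values the potent well is flagged by the robust cut-off but falls below the raw mean + 3 x SD cut-off, which is the reason the slide contrasts the two fits.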
  • 10. Typical HTS observed data distribution vs. fit. Axes: Frequency (% cpds/bin) vs. Effect (% control). Legend: Binned Raw HTS data; Robust distribution fit; Hit cut-off (mean + 3 x RSD); Residual (raw – fit). Note: representative selection of individual screens from ~330 analyzed
  • 11. Observed data distribution vs. fit – zoom. Axes: Frequency (% cpds in bin) vs. Effect (% control). Legend: Binned Raw HTS data; Robust distribution fit; Hit cut-off (mean + 3 x RSD); Residual (raw – fit). Note: representative selection of individual screens from ~330 analyzed
  • 12. GSK HTS campaigns 2005-2010. Plot: Screen cut-off (mean + 3 x RSD) against Average robust Z’ of assay during HTS production
  • 13. Looking for property trends in the GSK HTS dataset, e.g. compound total polar surface area. The total polar surface area (tPSA) is defined as the surface sum over all polar atoms; < 60 Å2 predicts brain penetration; > 140 Å2 predicts poor cell penetration. Aggregate results from all 330 campaigns 2005-2010 with >500K tests. Example bin, compounds with tPSA 80-85 Å2: 26M measured responses in this bin, 485k marked as “hit”; Hit rate = 100*(485k/26M) = 1.86%. Plot: Hit Rate (%) for compounds in specific tPSA bin vs. Polar Surface Area (tPSA, Å2)
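The per-bin hit rate calculation above (hits in a property bin divided by measured responses in that bin) can be sketched in a few lines. This is a minimal illustration, not the GSK pipeline; the function name `hit_rate_by_bin`, the record layout and the 5 Å2 bin width are assumptions chosen to match the 80-85 Å2 example.

```python
from collections import defaultdict

def hit_rate_by_bin(records, bin_width=5.0):
    """Aggregate hit rates per property bin.

    records: iterable of (property_value, is_hit) pairs, one per measured response.
    Returns {bin_lower_edge: hit_rate_percent}.
    """
    tested = defaultdict(int)
    hits = defaultdict(int)
    for value, is_hit in records:
        lower = (value // bin_width) * bin_width  # e.g. 83.0 -> 80.0
        tested[lower] += 1
        hits[lower] += int(is_hit)
    return {b: 100.0 * hits[b] / tested[b] for b in sorted(tested)}

# Toy responses: (tPSA, marked as hit?)
toy_records = [(81.0, True), (84.9, False), (80.0, False), (83.0, False),
               (86.0, True), (89.0, False)]
rates = hit_rate_by_bin(toy_records)
```

Because each record is one measured response rather than one compound, a compound screened in many campaigns contributes many records, as in the 26M-response bin on the slide.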
  • 14. Compound shapeliness and flexibility. fCsp3 captures “shapeliness” of a compound: - Weak positive correlation with MW - More irregular 3D shape → lower hit probability. Flexibility = Percentage of a compound’s bonds that are rotatable: - Slight decrease in HR with Flexibility - No correlation with MW or ClogP. Plots: Hit Rate (%) vs. Fraction of carbons that are sp3 (fCsp3); Hit Rate (%) vs. Flexibility
  • 15. Compound Size (MW). HTS hit rate rises significantly with increasing compound MW. Overall hit rate rises 1.7-fold across the middle 80% of the screening deck, i.e. 70% rise in hit rate from MW = 270 to MW = 470. 3.3-fold rise across full MW range. Only bins containing 1M or more records are shown. Plot: Hit Rate (%), % Cpds in MW Bin and Cumulative % Cpds vs. Molecular Weight (MW), with the middle 80% of cpds spanning MW 270 to 470; hit-rate values labelled on the plot: 1.2%, 1.50%, 2.62%, 4.0%
  • 16. Compound Lipophilicity (ClogP). HTS hit rate rises sharply with increasing compound lipophilicity. Overall hit rate rises 2.9-fold across the middle 80% of the screening deck, i.e. from ClogP = 1 → 5. 4.1-fold rise across full ClogP range. Only bins containing 1M or more records are shown. Plot: Hit Rate (%), % Cpds in ClogP Bin and Cumulative % Cpds vs. ClogP, with the middle 80% of cpds spanning ClogP 1 to 5; hit-rate values labelled on the plot: 1.1%, 1.14%, 3.31%, 4.5%
  • 17. Promiscuity v. Molecular Properties. The prevalence of promiscuous compounds rises sharply with size and lipophilicity. Hit Frequency Index (HFI) = % of SS HTS campaigns in which a compound gives activity > cut-off. “Promiscuous” compound: HFI ≥ 10% (having seen at least 50 campaigns). Across the middle 80% of the screening deck: large compounds are 4-fold more likely to have high HFI than small ones (MW: 270 → 470); lipophilic compounds are 10-fold more likely to have high HFI than polar ones (cLogP: 1 → 5). Plots: % of Promiscuous Compounds and % Rise in Promiscuity vs. Molecular Weight and cLogP
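The HFI definition and the promiscuity threshold on this slide translate directly into code. The sketch below uses hypothetical function names; returning `None` for compounds that have seen fewer than 50 campaigns is an illustrative choice, since the slide only says such compounds are not classified.

```python
def hit_frequency_index(n_hits, n_campaigns):
    """HFI = % of single-shot HTS campaigns in which the compound exceeded the cut-off."""
    return 100.0 * n_hits / n_campaigns

def is_promiscuous(n_hits, n_campaigns, hfi_threshold=10.0, min_campaigns=50):
    """Apply the slide's rule: HFI >= 10% after at least 50 campaigns.

    Returns None when the compound has seen too few campaigns to classify
    (an assumption made for this sketch).
    """
    if n_campaigns < min_campaigns:
        return None
    return hit_frequency_index(n_hits, n_campaigns) >= hfi_threshold
```

For example, a compound active in 6 of 60 campaigns has HFI = 10% and is classed as promiscuous, while one active in 2 of 100 is not.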
  • 18. Property distributions vs. promiscuity - cLogP. Panels of cLogP distributions, trellised by Inhibition frequency Index* (%), from compounds hitting ~1 target to compounds hitting >10% of targets. Note: Compounds required to have been run in 50 HTS and yielded > 50% effect in a single screen to be included. *Inhibition frequency index (IFI) = % of screens where cpd yielded >50% inhibition, where total screens run ≥ 50
  • 19. The “Dark” Matter – Compounds which have not yielded >50% effect once in >50 screens. Plots: distributions vs. Molecular Weight (Da) and cLogP
  • 20. Translation of biases to full-curve follow-up. Property biases in primary HTS hit marking are propagated forward to dose-response follow-up: elevated testing of large, lipophilic compounds and reduced testing of small, polar compounds in the full-curve phase of HTS. Plots: % Compounds Tested in SS (single-shot) testing, FC (full-curve) testing and the FC – SS differential vs. cLogP and Molecular Weight. Note: Plots represent data from 402M single-concentration responses & 2.1M full-curve results
  • 21. Property Trends; translation to dose response. Property effects contribute to hits at all effect levels, i.e. not just hits on the statistical margins. Property-dependence decreases through the HTS process. From ClogP = 1 → 5*: 3SD: 2.9X rise in Hit Rate; Top 0.1%: 2.2X rise; FC Active: 1.5X rise. From MW = 270 → 470*: 3SD: 1.8X rise in Hit Rate; Top 0.1%: 1.3X rise; FC Active: 1.2X rise. *Across the middle 80% of the deck. Plots: % Lift in Hit Rate vs. ClogP and MW for Standard 3SD SS Hits, Top 0.1% of SS Responses, and % of cpds with IC50 <= 10 uM
  • 22. Property response of individual screens is highly variable, e.g. screens with largest response to cLogP. Plot: Hit rate as % of HR at cLogP = 3.5 vs. cLogP
  • 23. Property response of individual screens is highly variable, e.g. screens with smallest response to cLogP. Plot: Hit rate as % of HR at cLogP = 3.5 vs. cLogP
  • 24. Assay Technology. Plot: Hit rate as % of HR at cLogP = 3.5 vs. cLogP, colored by Hit rate (%)
  • 25. Target Class. Plot: Hit rate as % of HR at cLogP = 3.5 vs. cLogP, colored by Hit rate (%)
  • 26. Improving hit marking - reducing bias towards high cLogP, MW hits. Virtual partitioning of collection according to property, e.g. sub-collections in different cLogP ranges. Change the hit calling method so that it takes properties as well as % effect into account, e.g. calculate hit cut-offs based on BEI/LEI etc., or “scalar” methods based on correcting the observed biases. And: improving assays and the collection based on awareness of these biases
  • 27. Improving hit marking – Property Biasing. Relative to the Mean + 3 x RSD cut-off: more attractive properties - promote; less attractive properties - demote. Plots: % Compounds vs. RESPONSE (% control); Hit Rate (%) vs. MW and vs. ClogP for Ordinary HTS Hit Marking and Property-biased Hit Marking
  • 28. Improving hit marking – Property Binning. Sub-divide screening data into bins of compounds with similar properties and apply 3 x RSD hit cut-offs to each bin (Bin 1: low MW, cLogP; Bin 2: medium MW, cLogP; Bin 3: high MW, cLogP). Consensus method combines approaches – routinely implemented. Plots: Response distributions per bin; Hit Rate (%) vs. MW and vs. ClogP for Property-Binned stats and Property Consensus
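Property-binned hit marking, as described on this slide, can be sketched by grouping compounds on a property and computing a separate robust cut-off per bin. The estimators (median and 1.4826 x MAD), the bin-edge convention and all names below are assumptions for illustration; the slide does not describe how the consensus method combines approaches, so it is not attempted here.

```python
import statistics
from collections import defaultdict

def robust_cutoff(values, n_sd=3.0):
    """Cut-off = robust mean + n_sd * robust SD (assumed: median and 1.4826 * MAD)."""
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    return center + n_sd * 1.4826 * mad

def property_binned_hits(compounds, bin_edges, n_sd=3.0):
    """Mark hits using a separate cut-off for each property bin.

    compounds: list of (compound_id, property_value, response) tuples.
    bin_edges: ascending upper edges; values above the last edge fall in a final bin.
    Returns the set of compound ids marked as hits.
    """
    bins = defaultdict(list)
    for cid, prop, resp in compounds:
        idx = sum(prop > edge for edge in bin_edges)  # index of the bin for this value
        bins[idx].append((cid, resp))
    hits = set()
    for members in bins.values():
        cutoff = robust_cutoff([r for _, r in members], n_sd)
        hits.update(cid for cid, r in members if r > cutoff)
    return hits

# Toy screen: polar compounds (ClogP <= 3) respond weakly, lipophilic ones are noisy.
low = [("a%d" % i, 1.0, r) for i, r in enumerate([0, 1, 2, 1, 0, 30], start=1)]
high = [("b%d" % i, 4.5, r) for i, r in enumerate([10, 20, 30, 20, 10, 35], start=1)]
binned = property_binned_hits(low + high, bin_edges=[3.0])
global_cut = robust_cutoff([r for _, _, r in low + high])
```

In this toy data the polar compound `a6` (30% response) is marked only when the cut-off is computed within its own bin; a single global cut-off, inflated by the noisier lipophilic bin, would miss it.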
  • 29. Evolving the screening collection to smaller, more polar lead-like space. GSK’s Compound Collection Enhancement (CCE) strategy has biased the HTS deck towards decreased size and lipophilicity with the aim of improving chemical starting points. CCE Acquisition, Property Bounds: 2004-05: Lipinski criteria (MW<500, ClogP<5); Most recently: MW<360, ClogP<3; Inclusion of DPU lead-op cpds: MW<500, ClogP<5. Plots: Compounds tested in HTS (% of total compounds in HTS) vs. ClogP for 2004, 2010, 2010 <> 2004 and New 2011; % Compounds Exceeding Property Limit (ClogP > 5, MW > 500) vs. Year
  • 30. Property trends in MLPCN Screening Data. Primary data from around 100 Academic HTS campaigns obtained from PubChem BioAssay. Lipophilicity: similar to GSK HTS. Compound size: little effect (pretty flat). Hit-rate values labelled on the plots: 3.80%, 2.27%, 2.14%, 1.28%. ClogP vs. MW regressions: GSK screening deck (>50 HTSs, 2.01M cpds): ClogP = 0.00835*MW – 0.058, R2 = 0.18; PubChem Compounds (405k): ClogP = 0.00554*MW + 0.97, R2 = 0.09
  • 31. MLPCN Screening Data – Property Trends. Example individual screen responses to cLogP. Plot: 3 x RSD hit rate (%) vs. cLogP, trellised by individual screens
  • 32. Small Beautiful Set Screening. SBS = Subset of the HTS deck which spans the gap between HTS and fragments. HTS collection (2M) filtered on: - size and lipophilicity: 10 ≤ HAC ≤ 28 and -2 ≤ ClogP ≤ 3, bounded (MW) - “promiscuity”: frequent-hitters are eliminated, IFI ≤ 3% (IFI = Inhibition Frequency Index, 3SD hit cutoff) - hit explosion opportunity: Near Neighbor Count ≥ 20 (in GSK registry) - “shapeliness”: fCsp3 ≥ 0.3 (i.e. ≥ 30% of carbon atoms must be sp3) - acquisition sub-structural filters - “greedy” diversity selection (no compounds >0.9 similar). SBS2 = ~75,000 compounds. Tested at higher concentration (e.g. 100-200 uM)
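The numeric SBS criteria listed on this slide can be expressed as a simple filter. This is only a sketch of the property thresholds: the `Compound` record and its field names are hypothetical, and the sub-structural filters and the “greedy” diversity selection are left out because the slide gives no detail on them.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    hac: int               # heavy atom count
    clogp: float           # calculated logP
    ifi_percent: float     # Inhibition Frequency Index, %
    near_neighbors: int    # near neighbor count in the registry
    fcsp3: float           # fraction of carbons that are sp3

def passes_sbs_property_filters(c):
    """True if the compound meets the numeric thresholds quoted on the slide."""
    return (
        10 <= c.hac <= 28            # size
        and -2 <= c.clogp <= 3       # lipophilicity
        and c.ifi_percent <= 3       # not a frequent hitter
        and c.near_neighbors >= 20   # hit explosion opportunity
        and c.fcsp3 >= 0.3           # "shapeliness"
    )

small_polar = Compound(hac=20, clogp=1.5, ifi_percent=1.0, near_neighbors=35, fcsp3=0.4)
too_lipophilic = Compound(hac=20, clogp=4.2, ifi_percent=1.0, near_neighbors=35, fcsp3=0.4)
```

Here `small_polar` passes every threshold while `too_lipophilic` fails only on ClogP.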
  • 33. Conclusions. Standard HTS processes favor the selection of larger, more lipophilic compounds. There are no clear trends between this behavior and assay technology or target class. Methods have been developed which (to some extent) compensate for property biases to ensure that attractive lead-like molecules are selected: - Overall hit rate in relation to downstream triage capacity is also critical - Aspire to a hit rate as close to the “authentic pharmacology” rate as possible. Changing the trajectory of discovery chemical space requires an interplay between the composition of chemical libraries, assay practice, hit analysis and downstream Hit to Lead and Lead to Candidate chemistry practice
  • 34. Acknowledgements Pat Brady Tony Jurewicz James Chan Darren Green Glenn Hofmann Snehal Bhatt Stephen Pickett Stan Martens Amy Quinn Sunny Hung Jeff Gross Geoff Quinique Subhas Chakravorty Zining Wu Bob Hertzberg Nicola Richmond Mehu Patel Jesus Herranz Emilio Diez Gonzalo Colmeranjo-Sanchez Julio Martin-Plaza …and numerous others who contributed to the 300+ HTS campaigns run by GSK 2005-2010….. Screening & Compound Profiling
  • 36. Year of Screen. Plot: Hit rate as % of HR at cLogP = 3.5 vs. cLogP, colored by Hit rate (%)
  • 37. Promiscuity v. Molecular Properties – Molecular weight. Panels of Molecular Weight (Da) distributions, trellised by Inhibition frequency Index* (%), from compounds hitting ~1 target to compounds hitting >10% of targets. Note: Compounds required to have been run in 50 HTS and yielded > 50% effect in a single screen to be included. *Inhibition frequency index (IFI) = % of screens where cpd yielded >50% inhibition, where total screens run ≥ 50
  • 38. GSK HTS campaigns 2005-2010. Histograms (Number of Screens): Hit cut-off (% effect @ 10 uM), i.e. mean + 3*RSD of sample data (% control); Hit rate (% of compounds > cut-off), i.e. % compounds with effect > mean + 3*RSD
  • 39. Validation and robustness methods cannot detect property biases. Compound sets used to test robustness of assays and validate the screening process reflect current compound acquisition practice, not the collection as tested. Plots: cLogP and MW distributions
  • 40. Dose Response Data – Property Trends. Is the observed size & lipophilicity bias in HTS single-shot testing an artifact of false positives, e.g. experimental “noise”? No, size and lipophilicity dependence is still observed in the rate of identifying compounds at 10 uM activity or better. Plots: % of Tests Yielding pXC50 ≥ 5 and % Rise in Active Rate vs. Molecular Weight and cLogP
  • 41. Molecular Property Correlations in GSKscreen. The table below shows the correlation coefficients (R2) between particular molecular properties and MW/ClogP, along with whether the correlation is positive or negative (i.e. the sign of the slope in a linear regression). This data is computed using the 2.09M compounds comprising GSKscreen.
        Property       R2, ± vs MW    R2, ± vs ClogP
        MW             1, +           0.21, +
        ClogP          0.21, +        1.0, +
        HAC            0.92, +        0.19, +
        fCsp3          0.15, +        0.00
        RotBonds       0.36, +        0.04, +
        tPSA           0.16, +        0.08, -
        Chiral         0.02, +        0.00
        HetAtmRatio    0.02, -        0.34, -
        Complexity     0.31, +        0.02, +
        Flexibility    0.02, +        0.00
        AromRings      0.22, +        0.16, +
        HBA            0.11, +        0.10, -
        HBD            0.01, +        0.02, -