SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Real World Applications of
Proteochemometric Modeling
The Design of Enzyme Inhibitors and
Ligands of G-Protein Coupled Receptors
Contents
• Our current approach to Proteochemometric Modeling
• Part I: PCM applied to non-nucleoside reverse
  transcriptase inhibitors and HIV mutants
• Part II: PCM applied to small molecules and the
  Adenosine receptors
• Conclusions
What is PCM ?
 • Proteochemometric modeling needs both a ligand
   descriptor and a target descriptor
 • Descriptors need to be compatible with each other and
   need to be compatible with machine learning
   technique...




                                                      Bio-Informatics



GJP van Westen, JK Wegner et al. MedChemComm (2011),16-30, 10.1039/C0MD00165A
What is PCM ?
 • Proteochemometric modeling needs both a ligand
   descriptor and a target descriptor
 • Descriptors need to be compatible with each other and
   need to be compatible with machine learning
   technique...




                                                      Bio-Informatics



GJP van Westen, JK Wegner et al. MedChemComm (2011),16-30, 10.1039/C0MD00165A
What is PCM ?
 • Proteochemometric modeling needs both a ligand
   descriptor and a target descriptor
 • Descriptors need to be compatible with each other and
   need to be compatible with machine learning
   technique...




                                                      Bio-Informatics



GJP van Westen, JK Wegner et al. MedChemComm (2011),16-30, 10.1039/C0MD00165A
Ligand Descriptors
• Scitegic Circular Fingerprints
  ▫ Circular, substructure based
    fingerprints
  ▫ Maximal diameter of 3 bonds from
    central atom
  ▫ Each substructure is converted to a
    molecular feature
Target Descriptors
• Select binding site residues from full protein
  sequence
• Each unique hashed feature represents one
  amino acid type (comparable with circular
  fingerprints)
Machine Learning
• Using R-statistics as integrated with Pipeline Pilot
  ▫ Version 2.11.1 (64-bits)
• Sampled several machine learning techniques
  ▫ SVM
     Final method of choice
  ▫ PLS
  ▫ Random Forest
Real World Applications of PCM
• Part I: PCM of NNRTIs (analog series) on 14 mutants
  ▫ Output variable: pEC50
  ▫ Data set provided by Tibotec
  ▫ Prospectively validated
• Part II: PCM of small molecules on the Adenosine
  receptors
  ▫   Output variable pKi
  ▫   ChEMBL_04 / StarLite
  ▫   Both human and rat data combined
  ▫   Prospectively validated
Part I: PCM applied to NNRTIs
Which inhibitor(s) show(s) the best activity spectrum and
can proceed in drug development?
• 451 HIV Reverse Transcriptase   Sequence
                                             Mean    StdDev
                                                              n
                                             pEC50   pEC50
  (RT) inhibitors                  1 (wt)     8.3      0.6    451
• 14 HIV RT sequences                2        6.9      0.7    259
                                     3        7.6      0.6    444
  ▫ Between zero and 13 point        4        7.5      0.7    443
    mutations (at NNRTI              5        7.4      0.8    429
                                     6        6.0      0.6    316
    binding site)                    7        6.5      0.6    99
  ▫ Large differences in             8        6.9      0.7    147
    compound activity on             9        8.3      0.6    222
                                    10        7.9      0.7    252
    different sequences              11       7.5      0.7    257
                                    12        8.0      0.6    242
                                    13        7.4      0.8    244
                                    14        8.2      0.8    220
Binding Site
• Selected binding site based on point mutations present
  in the different strains
• 24 residues were selected
Used model to predict missing values

C
o
m
p
o
u
n
d
s



         Mutants

    Original Dataset   Completed with model
Prospective Validation
• Compounds have been
  experimentally validated
  ▫ Predictions where pEC50
    differs two sd from
    compound average
     (69 compound outliers)
  ▫ Predictions where pEC50
    differs two sd from
    sequence average
     (61 sequence outliers)
• Assay validation
                               Completed with model
Prospective Validation

• Model:
  ▫ R02 = 0.69
  ▫ RMSE = 0.62 log units

• Assay Validation
  ▫ R02 = 0.88
  ▫ RMSE = 0.50 log units
The Applicability Domain Concept Still
Holds in Target Space
• Prediction error similarity shows a direct correlation with
  average sequence similarity to training set

                                             R022
                                              R0     RMSE
                 1                                                              1


               0.8                                                              0.8

        R0 2   0.6                                                              0.6
                                                                                       RMSE
               0.4                                                              0.4


               0.2                                                              0.2


                 0                                                              0


               -0.2                                                             -0.2
                      0.5    0.6          0.7          0.8          0.9     1

                            Average Sequence Similarity with Training Set
The Applicability Domain Concept Still
Holds in Target Space
• Prediction error similarity shows a direct correlation with
  average sequence similarity to training set

                                             R022
                                              R0     RMSE
                 1                                                              1


               0.8                                                              0.8

        R0 2   0.6                                                              0.6
                                                                                       RMSE
               0.4                                                              0.4


               0.2                                                              0.2


                 0                                                              0


               -0.2                                                             -0.2
                      0.5    0.6          0.7          0.8          0.9     1

                            Average Sequence Similarity with Training Set
The Applicability Domain Concept Still
Holds in Target Space
• Prediction error similarity shows a direct correlation with
  average sequence similarity to training set

                                             R022
                                              R0     RMSE
                 1                                                              1


               0.8                                                              0.8

        R0 2   0.6                                                              0.6
                                                                                       RMSE
               0.4                                                              0.4


               0.2                                                              0.2


                 0                                                              0


               -0.2                                                             -0.2
                      0.5    0.6          0.7          0.8          0.9     1

                            Average Sequence Similarity with Training Set
Does PCM outperform scaling and QSAR?
• PCM outperforms QSAR models trained with identical
  descriptors on the same set
• When considering outliers, PCM outperforms scaling
• PCM can be applied to previously unseen mutants

     Validation                     pEC50            10-NN     10-NN     10-NN
                     Assay   PCM              QSAR
    Experiment                      scaling          (both)   (target)   (cmpd)


   R02 (Full plot)   0.88    0.69    0.69     0.31   0.41      0.21       0.28

   R02 (Outliers)    0.88    0.61    0.59     0.36   0.34      0.32       0.18




  RMSE (Full plot)   0.50    0.62    0.57     0.96   0.90      1.29       1.16

  RMSE (Outliers)    0.50    0.52    0.58     1.06   0.72      1.39       1.29
Model Interpretation (Sequences)
• Effect of mutation presence on compound pEC50
• High impact mutations are K101P, V179I and V179F
Model Interpretation (Compounds)
• Effect of substructure presence on compound pEC50
Model Interpretation (Compounds)
• Example of positively correlated substructure and
  negatively correlated substructure
Conclusions

• PCM can guide inhibitor design by predicting bioactivity
  profiles, as applied here to NNRTIs

• We have shown prospectively that the performance of
  PCM approaches assay reproducibility (RMSE 0.62 vs
  0.50)

• Interpretation allows selection between preferred
  chemical substructures and substructures to be avoided
Part II: PCM applied to the Adenosine
Receptors
• Model based on public data (ChEMBL_04)
• Included:
  ▫ Human receptor data
  ▫ (Historic) Rat receptor data
• Defined a single binding site (including ELs)
  ▫ Based on crystal structure 3EML and translated selected
    residues through MSA to other receptors
• Looking for novel A2A receptor ligands taking SAR
  information from other adenosine receptor subtypes
  into account
Selected Binding Site
Adenosine Receptor Data Set
• Little overlap between species
• Validation set consists of 4556 decoys and 43 known
  actives
                                                  External
Receptor   Human   Rat    Overlap   Range (pKi)                Decoy
                                                  Validation

  A1       1635    2216    147       4.5 - 9.7       130       1139


  A2A      1526    2051    215      4.5 - 10.5       57        1139


  A2B       780    803      79       4.5 - 9.7       11        1139


  A3       1661    327      82      4.5 - 10.0       255       1139
In-silico validation
• External validation on in
  house compound collection
  ▫ Lower quality data set leads
    to less predictive model
  ▫ Inclusion of Rat data
    improves model (RMSE 0.82
    vs 0.87)
• Our final model is able to
  separate actives from
  decoys
  ▫ 33 of the 43 known actives
    were in the top 50
Prospective Validation
• Scanned ChemDiv supplier database ( > 790,000 cmpds)
• Selected 55 compounds with focus on diverse chemistry
  ▫ Compounds were tested in-vitro
Conclusions

• We have found novel compounds active (in the
  nanomolar range) on the A2A receptor
  ▫ Hit rate ~11 %

• PCM models benefit from addition of similar targets from
  other species (RMSE improves from 0.87 to 0.82)

• PCM models can make robust predictions, even when
  trained on data from different labs
Further discussion
• Poster # 47 A. Hendriks, G.J.P. van Westen et al.
  ▫ Proteochemometric Modeling as a Tool to Predict Clinical
    Response to Antiretroviral Therapy Based on the Dominant
    Patient HIV Genotype

• Poster # 51 E.B. Lenselink, G.J.P. van Westen et al.
  ▫ A Global Class A GPCR Proteochemometric Model: A
    Prospective Validation

• Poster # 54 R.F. Swier, G.J.P. van Westen et al.
  ▫ 3D-neighbourhood Protein Descriptors for Proteochemometric
    Modeling
Acknowledgements


•   Prof. Ad IJzerman     •   Prof. Herman van Vlijmen
•   Andreas Bender        •   Joerg Wegner
•   Olaf van den Hoven    •   Anik Peeters
•   Rianne van der Pijl   •   Peggy Geluykens
•   Thea Mulder           •   Leen Kwanten
•   Henk de Vries         •   Inge Vereycken
•   Alwin Hendriks
•   Bart Lenselink
•   Remco Swier
Real World Applications of
Proteochemometric Modeling
The Design of Enzyme Inhibitors and
Ligands of G-Protein Coupled Receptors
Leave One Sequence Out
• By leaving out one sequence in training and validating a
  trained model on that sequence, model performance on
  novel mutants is emulated
Best performing compounds

Sequence Compound with highest    Activity      Full Model      Difference
               pEC50              (pEC50)        (pEC50)       (Activity and
                                                                 Model)
  All             326            8.39(± 0.61)   8.53(± 0.73)        0.14
   1              365                9.16           9.55           0.39
   2              221                8.19           8.38            0.19
   3               79                8.71           8.81            0.10
   4              321                8.83           8.79           0.04
   5              321                9.12           8.73           0.39
   6              221                8.01           7.93           0.08
   7              364              untested         7.50            n/a
   8              221              untested         8.42            n/a
   9              365              untested         9.43            n/a
  10              326              untested         9.23            n/a
  11              151                9.05           8.86            0.19
  12              321              untested         9.29            n/a
  13              100                9.06           8.87            0.19
  14               79                9.51           9.62            0.11
                                                  Average           0.18
Worst performing compounds

Sequence Compound with Lowest    Activity     Full Model     Difference
               pEC50             (pEC50)       (pEC50)      (Activity and
                                                              Model)
  All            109            5.85(±0.54)   5.82(±0.66)       0.03
   1             248                6.09          6.01          0.08
   2             109              untested        4.87           n/a
   3             422              untested        5.78           n/a
   4              84                5.84          5.67           0.17
   5              84                5.65          5.54           0.11
   6             109                4.60         4.06           0.54
   7             439                5.01          5.20           0.19
   8              84                4.74          5.20          0.46
   9             248              untested        5.96           n/a
  10             181                5.82          6.01           0.19
  11             181                5.42          5.61           0.19
  12             109                5.90         6.09            0.19
  13             181                5.11          5.29           0.18
  14             181                5.62          5.81           0.19
                                                Average          0.21

Weitere ähnliche Inhalte

Andere mochten auch (6)

Review of Basic Statistics and Terminology
Review of Basic Statistics and TerminologyReview of Basic Statistics and Terminology
Review of Basic Statistics and Terminology
 
Introductory Lecture to Applied Mathematics Stream
Introductory Lecture to Applied Mathematics StreamIntroductory Lecture to Applied Mathematics Stream
Introductory Lecture to Applied Mathematics Stream
 
Applied Statistics - Introduction
Applied Statistics - IntroductionApplied Statistics - Introduction
Applied Statistics - Introduction
 
Problems statistics 1
Problems statistics 1Problems statistics 1
Problems statistics 1
 
Applied Statistics : Sampling method & central limit theorem
Applied Statistics : Sampling method & central limit theoremApplied Statistics : Sampling method & central limit theorem
Applied Statistics : Sampling method & central limit theorem
 
Role of Statistics in Scientific Research
Role of Statistics in Scientific ResearchRole of Statistics in Scientific Research
Role of Statistics in Scientific Research
 

Ähnlich wie 9th ICCS Noordwijkerhout

Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...
Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...
Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...
pmaloney1
 
Mirthe mar12
Mirthe mar12Mirthe mar12
Mirthe mar12
jwzweck
 
Review solar prediction iea 07-06
Review solar prediction iea 07-06Review solar prediction iea 07-06
Review solar prediction iea 07-06
IrSOLaV Pomares
 
Wikipedia ws
Wikipedia wsWikipedia ws
Wikipedia ws
Yu Suzuki
 
Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...
Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...
Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...
TERN Australia
 
Paper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculatorPaper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculator
Sérgio Sacani
 
Shape contexts
Shape contextsShape contexts
Shape contexts
huebesao
 
adc converter basics
adc converter basicsadc converter basics
adc converter basics
hacker1500
 
Faster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware ClassificationFaster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware Classification
Silvio Cesare
 

Ähnlich wie 9th ICCS Noordwijkerhout (20)

Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...
Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...
Objective Determination Of Minimum Engine Mapping Requirements For Optimal SI...
 
Towards Probabilistic Assessment of Modularity
Towards Probabilistic Assessment of ModularityTowards Probabilistic Assessment of Modularity
Towards Probabilistic Assessment of Modularity
 
Machine learning projects with r
Machine learning projects with rMachine learning projects with r
Machine learning projects with r
 
Keynote HotSWUp 2012
Keynote HotSWUp 2012Keynote HotSWUp 2012
Keynote HotSWUp 2012
 
Mirthe mar12
Mirthe mar12Mirthe mar12
Mirthe mar12
 
Gregoire H&N
Gregoire H&NGregoire H&N
Gregoire H&N
 
Review solar prediction iea 07-06
Review solar prediction iea 07-06Review solar prediction iea 07-06
Review solar prediction iea 07-06
 
Wikipedia ws
Wikipedia wsWikipedia ws
Wikipedia ws
 
WCCI 2008 Tutorial on Computational Intelligence and Games, part 2 of 3
WCCI 2008 Tutorial on Computational Intelligence and Games, part 2 of 3WCCI 2008 Tutorial on Computational Intelligence and Games, part 2 of 3
WCCI 2008 Tutorial on Computational Intelligence and Games, part 2 of 3
 
Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...
Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...
Natalia Restrepo-Coupe_Remotely-sensed photosynthetic phenology and ecosystem...
 
Paper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculatorPaper and pencil_cosmological_calculator
Paper and pencil_cosmological_calculator
 
Shape contexts
Shape contextsShape contexts
Shape contexts
 
Wikimedia Conference 2009 presentation
Wikimedia Conference 2009 presentationWikimedia Conference 2009 presentation
Wikimedia Conference 2009 presentation
 
NumXL 1.55 LYNX release notes
NumXL 1.55 LYNX release notesNumXL 1.55 LYNX release notes
NumXL 1.55 LYNX release notes
 
Quality by Design : Design Space
Quality by Design :  Design SpaceQuality by Design :  Design Space
Quality by Design : Design Space
 
adc converter basics
adc converter basicsadc converter basics
adc converter basics
 
Faster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware ClassificationFaster, More Effective Flowgraph-based Malware Classification
Faster, More Effective Flowgraph-based Malware Classification
 
DCT_TR802
DCT_TR802DCT_TR802
DCT_TR802
 
DCT_TR802
DCT_TR802DCT_TR802
DCT_TR802
 
DCT_TR802
DCT_TR802DCT_TR802
DCT_TR802
 

Kürzlich hochgeladen

Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
adilkhan87451
 

Kürzlich hochgeladen (20)

Call Girls Hosur Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Hosur Just Call 9630942363 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Hosur Just Call 9630942363 Top Class Call Girl Service Available
 
Top Rated Bangalore Call Girls Majestic ⟟ 9332606886 ⟟ Call Me For Genuine S...
Top Rated Bangalore Call Girls Majestic ⟟  9332606886 ⟟ Call Me For Genuine S...Top Rated Bangalore Call Girls Majestic ⟟  9332606886 ⟟ Call Me For Genuine S...
Top Rated Bangalore Call Girls Majestic ⟟ 9332606886 ⟟ Call Me For Genuine S...
 
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
Best Rate (Guwahati ) Call Girls Guwahati ⟟ 8617370543 ⟟ High Class Call Girl...
 
Call Girls Vadodara Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Vadodara Just Call 8617370543 Top Class Call Girl Service AvailableCall Girls Vadodara Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Vadodara Just Call 8617370543 Top Class Call Girl Service Available
 
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
Best Rate (Patna ) Call Girls Patna ⟟ 8617370543 ⟟ High Class Call Girl In 5 ...
 
The Most Attractive Hyderabad Call Girls Kothapet 𖠋 9332606886 𖠋 Will You Mis...
The Most Attractive Hyderabad Call Girls Kothapet 𖠋 9332606886 𖠋 Will You Mis...The Most Attractive Hyderabad Call Girls Kothapet 𖠋 9332606886 𖠋 Will You Mis...
The Most Attractive Hyderabad Call Girls Kothapet 𖠋 9332606886 𖠋 Will You Mis...
 
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
Night 7k to 12k Navi Mumbai Call Girl Photo 👉 BOOK NOW 9833363713 👈 ♀️ night ...
 
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟ 9332606886 ⟟ Call Me For G...
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟  9332606886 ⟟ Call Me For G...Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟  9332606886 ⟟ Call Me For G...
Top Rated Bangalore Call Girls Ramamurthy Nagar ⟟ 9332606886 ⟟ Call Me For G...
 
Call Girls Gwalior Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Gwalior Just Call 8617370543 Top Class Call Girl Service AvailableCall Girls Gwalior Just Call 8617370543 Top Class Call Girl Service Available
Call Girls Gwalior Just Call 8617370543 Top Class Call Girl Service Available
 
Model Call Girls In Chennai WhatsApp Booking 7427069034 call girl service 24 ...
Model Call Girls In Chennai WhatsApp Booking 7427069034 call girl service 24 ...Model Call Girls In Chennai WhatsApp Booking 7427069034 call girl service 24 ...
Model Call Girls In Chennai WhatsApp Booking 7427069034 call girl service 24 ...
 
Call Girls Guntur Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Guntur  Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Guntur  Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Guntur Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
 
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
 
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
Night 7k to 12k Chennai City Center Call Girls 👉👉 7427069034⭐⭐ 100% Genuine E...
 
Manyata Tech Park ( Call Girls ) Bangalore ✔ 6297143586 ✔ Hot Model With Sexy...
Manyata Tech Park ( Call Girls ) Bangalore ✔ 6297143586 ✔ Hot Model With Sexy...Manyata Tech Park ( Call Girls ) Bangalore ✔ 6297143586 ✔ Hot Model With Sexy...
Manyata Tech Park ( Call Girls ) Bangalore ✔ 6297143586 ✔ Hot Model With Sexy...
 
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any TimeTop Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
Top Quality Call Girl Service Kalyanpur 6378878445 Available Call Girls Any Time
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
 
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
 
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
 
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappMost Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
 

9th ICCS Noordwijkerhout

  • 1. Real World Applications of Proteochemometric Modeling The Design of Enzyme Inhibitors and Ligands of G-Protein Coupled Receptors
  • 2. Contents • Our current approach to Proteochemometric Modeling • Part I: PCM applied to non-nucleoside reverse transcriptase inhibitors and HIV mutants • Part II: PCM applied to small molecules and the Adenosine receptors • Conclusions
  • 3. What is PCM ? • Proteochemometric modeling needs both a ligand descriptor and a target descriptor • Descriptors need to be compatible with each other and need to be compatible with machine learning technique... Bio-Informatics GJP van Westen, JK Wegner et al. MedChemComm (2011),16-30, 10.1039/C0MD00165A
  • 4. What is PCM ? • Proteochemometric modeling needs both a ligand descriptor and a target descriptor • Descriptors need to be compatible with each other and need to be compatible with machine learning technique... Bio-Informatics GJP van Westen, JK Wegner et al. MedChemComm (2011),16-30, 10.1039/C0MD00165A
  • 5. What is PCM ? • Proteochemometric modeling needs both a ligand descriptor and a target descriptor • Descriptors need to be compatible with each other and need to be compatible with machine learning technique... Bio-Informatics GJP van Westen, JK Wegner et al. MedChemComm (2011),16-30, 10.1039/C0MD00165A
  • 6. Ligand Descriptors • Scitegic Circular Fingerprints ▫ Circular, substructure based fingerprints ▫ Maximal diameter of 3 bonds from central atom ▫ Each substructure is converted to a molecular feature
  • 7. Target Descriptors • Select binding site residues from full protein sequence • Each unique hashed feature represents one amino acid type (comparable with circular fingerprints)
  • 8. Machine Learning • Using R-statistics as integrated with Pipeline Pilot ▫ Version 2.11.1 (64-bits) • Sampled several machine learning techniques ▫ SVM  Final method of choice ▫ PLS ▫ Random Forest
  • 9. Real World Applications of PCM • Part I: PCM of NNRTIs (analog series) on 14 mutants ▫ Output variable: pEC50 ▫ Data set provided by Tibotec ▫ Prospectively validated • Part II: PCM of small molecules on the Adenosine receptors ▫ Output variable pKi ▫ ChEMBL_04 / StarLite ▫ Both human and rat data combined ▫ Prospectively validated
  • 10. Part I: PCM applied to NNRTIs Which inhibitor(s) show(s) the best activity spectrum and can proceed in drug development? • 451 HIV Reverse Transcriptase Sequence Mean StdDev n pEC50 pEC50 (RT) inhibitors 1 (wt) 8.3 0.6 451 • 14 HIV RT sequences 2 6.9 0.7 259 3 7.6 0.6 444 ▫ Between zero and 13 point 4 7.5 0.7 443 mutations (at NNRTI 5 7.4 0.8 429 6 6.0 0.6 316 binding site) 7 6.5 0.6 99 ▫ Large differences in 8 6.9 0.7 147 compound activity on 9 8.3 0.6 222 10 7.9 0.7 252 different sequences 11 7.5 0.7 257 12 8.0 0.6 242 13 7.4 0.8 244 14 8.2 0.8 220
  • 11. Binding Site • Selected binding site based on point mutations present in the different strains • 24 residues were selected
  • 12. Used model to predict missing values C o m p o u n d s Mutants Original Dataset Completed with model
  • 13. Prospective Validation • Compounds have been experimentally validated ▫ Predictions where pEC50 differs two sd from compound average  (69 compound outliers) ▫ Predictions where pEC50 differs two sd from sequence average  (61 sequence outliers) • Assay validation Completed with model
  • 14. Prospective Validation • Model: ▫ R02 = 0.69 ▫ RMSE = 0.62 log units • Assay Validation ▫ R02 = 0.88 ▫ RMSE = 0.50 log units
  • 15. The Applicability Domain Concept Still Holds in Target Space • Prediction error similarity shows a direct correlation with average sequence similarity to training set R022 R0 RMSE 1 1 0.8 0.8 R0 2 0.6 0.6 RMSE 0.4 0.4 0.2 0.2 0 0 -0.2 -0.2 0.5 0.6 0.7 0.8 0.9 1 Average Sequence Similarity with Training Set
  • 16. The Applicability Domain Concept Still Holds in Target Space • Prediction error similarity shows a direct correlation with average sequence similarity to training set R022 R0 RMSE 1 1 0.8 0.8 R0 2 0.6 0.6 RMSE 0.4 0.4 0.2 0.2 0 0 -0.2 -0.2 0.5 0.6 0.7 0.8 0.9 1 Average Sequence Similarity with Training Set
  • 17. The Applicability Domain Concept Still Holds in Target Space • Prediction error similarity shows a direct correlation with average sequence similarity to training set R022 R0 RMSE 1 1 0.8 0.8 R0 2 0.6 0.6 RMSE 0.4 0.4 0.2 0.2 0 0 -0.2 -0.2 0.5 0.6 0.7 0.8 0.9 1 Average Sequence Similarity with Training Set
  • 18. Does PCM outperform scaling and QSAR? • PCM outperforms QSAR models trained with identical descriptors on the same set • When considering outliers, PCM outperforms scaling • PCM can be applied to previously unseen mutants Validation pEC50 10-NN 10-NN 10-NN Assay PCM QSAR Experiment scaling (both) (target) (cmpd) R02 (Full plot) 0.88 0.69 0.69 0.31 0.41 0.21 0.28 R02 (Outliers) 0.88 0.61 0.59 0.36 0.34 0.32 0.18 RMSE (Full plot) 0.50 0.62 0.57 0.96 0.90 1.29 1.16 RMSE (Outliers) 0.50 0.52 0.58 1.06 0.72 1.39 1.29
  • 19. Model Interpretation (Sequences) • Effect of mutation presence on compound pEC50 • High impact mutations are K101P, V179I and V179F
  • 20. Model Interpretation (Compounds) • Effect of substructure presence on compound pEC50
  • 21. Model Interpretation (Compounds) • Example of positively correlated substructure and negatively correlated substructure
  • 22. Conclusions • PCM can guide inhibitor design by predicting bioactivity profiles, as applied here to NNRTIs • We have shown prospectively that the performance of PCM approaches assay reproducibility (RMSE 0.62 vs 0.50) • Interpretation allows selection between preferred chemical substructures and substructures to be avoided
  • 23. Part II: PCM applied to the Adenosine Receptors • Model based on public data (ChEMBL_04) • Included: ▫ Human receptor data ▫ (Historic) Rat receptor data • Defined a single binding site (including ELs) ▫ Based on crystal structure 3EML and translated selected residues through MSA to other receptors • Looking for novel A2A receptor ligands taking SAR information from other adenosine receptor subtypes into account
  • 25. Adenosine Receptor Data Set • Little overlap between species • Validation set consists of 4556 decoys and 43 known actives External Receptor Human Rat Overlap Range (pKi) Decoy Validation A1 1635 2216 147 4.5 - 9.7 130 1139 A2A 1526 2051 215 4.5 - 10.5 57 1139 A2B 780 803 79 4.5 - 9.7 11 1139 A3 1661 327 82 4.5 - 10.0 255 1139
  • 26. In-silico validation • External validation on in house compound collection ▫ Lower quality data set leads to less predictive model ▫ Inclusion of Rat data improves model (RMSE 0.82 vs 0.87) • Our final model is able to separate actives from decoys ▫ 33 of the 43 known actives were in the top 50
  • 27. Prospective Validation • Scanned ChemDiv supplier database ( > 790,000 cmpds) • Selected 55 compounds with focus on diverse chemistry ▫ Compounds were tested in-vitro
  • 28. Conclusions • We have found novel compounds active (in the nanomolar range) on the A2A receptor ▫ Hit rate ~11 % • PCM models benefit from addition of similar targets from other species (RMSE improves from 0.87 to 0.82) • PCM models can make robust predictions, even when trained on data from different labs
  • 29. Further discussion • Poster # 47 A. Hendriks, G.J.P. van Westen et al. ▫ Proteochemometric Modeling as a Tool to Predict Clinical Response to Antiretroviral Therapy Based on the Dominant Patient HIV Genotype • Poster # 51 E.B. Lenselink, G.J.P. van Westen et al. ▫ A Global Class A GPCR Proteochemometric Model: A Prospective Validation • Poster # 54 R.F. Swier, G.J.P. van Westen et al. ▫ 3D-neighbourhood Protein Descriptors for Proteochemometric Modeling
  • 30. Acknowledgements • Prof. Ad IJzerman • Prof. Herman van Vlijmen • Andreas Bender • Joerg Wegner • Olaf van den Hoven • Anik Peeters • Rianne van der Pijl • Peggy Geluykens • Thea Mulder • Leen Kwanten • Henk de Vries • Inge Vereycken • Alwin Hendriks • Bart Lenselink • Remco Swier
  • 31. Real World Applications of Proteochemometric Modeling The Design of Enzyme Inhibitors and Ligands of G-Protein Coupled Receptors
  • 32.
  • 33. Leave One Sequence Out • By leaving out one sequence in training and validating a trained model on that sequence, model performance on novel mutants is emulated
  • 34. Best performing compounds Sequence Compound with highest Activity Full Model Difference pEC50 (pEC50) (pEC50) (Activity and Model) All 326 8.39(± 0.61) 8.53(± 0.73) 0.14 1 365 9.16 9.55 0.39 2 221 8.19 8.38 0.19 3 79 8.71 8.81 0.10 4 321 8.83 8.79 0.04 5 321 9.12 8.73 0.39 6 221 8.01 7.93 0.08 7 364 untested 7.50 n/a 8 221 untested 8.42 n/a 9 365 untested 9.43 n/a 10 326 untested 9.23 n/a 11 151 9.05 8.86 0.19 12 321 untested 9.29 n/a 13 100 9.06 8.87 0.19 14 79 9.51 9.62 0.11 Average 0.18
  • 35. Worst performing compounds Sequence Compound with Lowest Activity Full Model Difference pEC50 (pEC50) (pEC50) (Activity and Model) All 109 5.85(±0.54) 5.82(±0.66) 0.03 1 248 6.09 6.01 0.08 2 109 untested 4.87 n/a 3 422 untested 5.78 n/a 4 84 5.84 5.67 0.17 5 84 5.65 5.54 0.11 6 109 4.60 4.06 0.54 7 439 5.01 5.20 0.19 8 84 4.74 5.20 0.46 9 248 untested 5.96 n/a 10 181 5.82 6.01 0.19 11 181 5.42 5.61 0.19 12 109 5.90 6.09 0.19 13 181 5.11 5.29 0.18 14 181 5.62 5.81 0.19 Average 0.21