Importance of PROCESS is not less than PRODUCT 5/27/2014
Computer Aided Drug Design:
QSAR Related Methods
Jahan B Ghasemi
DDSLab K N Toosi Univ of Tech.
Tehran, Iran
Topics in this Talk:
General Introduction
Some of These QSAR Steps:
Data Pre-Processing: Normalization, Standardization, Variable Selection, Subset Selection, Outlier Detection
Multivariate Analysis: MLR, PCA, PLS, SVM, ANN, CART
Molecular Descriptors: Constitutional, Electronic, Geometrical, Hydrophobic, Lipophilicity, Solubility, Steric, Quantum Chemical, Topological
Molecular Structures: OC1=CC=CC=C1 (SMILES string), 1D, 2D, 3D
Statistical Evaluation: R, R², Q², MSE, RMSE, PRESS
"Well begun is half done." (Aristotle)
René Descartes, 1619: Quantitative Measurement in Science
Research Types: Inductive Approach, Deductive Approach, Abductive Approach
General
Introduction
Deductive: Theory → Hypothesis → Observation → Confirmation
Inductive: Observation → Pattern → Hypothesis → Theory
Induction is usually described as moving from the specific to the general, while deduction begins
with the general and ends with the specific.
Arguments based on laws, rules and accepted principles are generally used for Deductive
Reasoning. Observations tend to be used for Inductive Arguments.
The -metrics disciplines, as soft-computing or soft-modeling methods, are inductive research approaches, so their conclusions carry uncertainty.
Are humans natural logic reasoners? No!
What Do We Need to Know for Successful QSAR Modeling as a Drug Design Tool?
I- Math-Science or Informatique or Informatics Aspect
Linear Algebra: vectors, matrices, tensors; homogeneous and regular linear and nonlinear simultaneous equations
Graph Theory: maximal subgraph, clique detection
Multivariate Statistical Analysis: column space, row space; MLR, PCA, PLS
Pattern Recognition: (dis)similarity; distance metrics (Euclidean, Manhattan, Mahalanobis); fingerprints (Tanimoto, Jaccard); supervised and unsupervised pattern recognition; clustering, agglomerative (bottom-up) and divisive (top-down)
Optimization: selection of the most informative variables (GA); selection of the most representative objects (KS); function minimization (Newton, Gauss-Newton, Levenberg-Marquardt)
Computer: computer graphics, HPC
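The (dis)similarity measures named in this slide can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the function names are hypothetical, and fingerprints are assumed to be stored as sets of "on" bit positions.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance between two descriptor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0

a, b = (1.0, 2.0), (4.0, 6.0)
print(euclidean(a, b))                      # 5.0
print(manhattan(a, b))                      # 7.0
print(tanimoto({1, 4, 7, 9}, {1, 4, 8}))    # 2 shared bits / 5 bits in the union
```

The Mahalanobis distance is left out because it needs the inverse covariance matrix of the descriptors, which is easier to obtain with a linear algebra library.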
II- Bio-Science Aspect
Chemistry: Organic Chemistry; Quantum/Molecular Mechanics (force field, conformer, bioactive conformer); Medicinal Chemistry
Biology: Molecular Biology, Systems Biology
Pharmacology: Pharmacokinetics, Pharmacodynamics, Toxicity, ADMET
Combination of I and II
OMICS: Bioinformatics, Proteomics, Metabolomics, Genomics
Metrics: Biometrics, Chemometrics, Technometrics, Chem(o)informatics
QSAR is related to most of the -OMICS and -METRICS routines.
The Bio-Science Part Starts Here:
Chemical Space
(Gathering Information from All Involved Species)
Macromolecules: Protein, Receptor, Host
Small Molecules: Guest, Ligand, Inhibitor
Aggregation: Host-Guest Complex, Receptor-Inhibitor Complex
Chemical Space
Chemical Information:
Information due to macromolecule structure
Information due to aggregation structure
Information due to small molecule structure
To have and use Chemical Space: extract and convert chemical information to numerical values.
We call these numerical values Molecular Descriptors.
Descriptors should have the following desirable features:
Easy Interpretation
Show Correlation with a Property
Discrimination of Isomers
Independence
Simplicity
Not to be based on experimental (physicochemical) properties
Not to be trivially related to other descriptors
Allow for efficient construction
Use familiar structural concepts
Show gradual change with gradual change in structures
End Points to Be Modeled: Chemical Properties
Boiling point
Retention time
Dielectric constant
Diffusion coefficient
Dissociation constant
Melting point
Reactivity
Solubility
Stability
Thermodynamic properties
Viscosity
End Points to Be Modeled: Biological Properties
Bioconcentration
Biodegradation
Carcinogenicity
Drug metabolism and clearance
Inhibition constant
Mutagenicity
Permeability (blood-brain barrier, skin)
Pharmacokinetics
Receptor binding
There are more than 5500 molecular descriptors. But why do we need more and more of them?
Each molecular descriptor takes into account a small part of the whole chemical information contained in the real molecule and, as a consequence, the number of descriptors is continuously increasing with the growing demand for deeper investigations of chemical and biological systems.
Different descriptors have independent methods or
perspectives to view a molecule, taking into account
the various features of chemical structure. Molecular
descriptors have now become some of the most
important variables used in molecular modeling,
and, consequently, managed by statistics,
chemometrics, and chemoinformatics.
Molecular Descriptors
Cost to Generate: Cheap → Expensive
Molecular Descriptors
How to Calculate Molecular Descriptors?
By Hand!
By Software: Dragon, SYBYL, PaDEL-Descriptor, AdrianaCode
Molecular Descriptors: Classes!
Different classes? Yes.
How many? Many classes.
What are the bases of classification?
Based on dimensionality: 0D-4D
Geometric, Constitutional, Topological, Quantum Chemical, etc.
Based on origin: Theoretical, Experimental, or both!
Molecular Descriptors
Do they have equal importance?
0D < 1D < 2D < 2.5D < 3D < 4D < ... < nD
Low Information Content → High Information Content
Now we have Molecular Descriptors and a Chemical, Molecular or Information Space.
But first define and introduce:
Objects = Molecules
Variables = Descriptors
Object-to-variable ratio ≥ 4
Why? Least squares needs it!
The Math-Science Part Starts Here: Using a Very Efficient Way to Show Chemical Information: Matrix-Vector
[Figure: a data matrix with objects as rows (1 ... n) and variables as columns (1 ... m), next to a second column with objects as rows (1 ... n)]
Preprocessing
On the End Point Vector y: nM unit; log transformation, to linearize the variation and to have an LFER interpretation
On the Molecular Descriptors Matrix X: mean centering (has its general purpose); autoscaling (has its general purpose)
Outlier Detection, AD
Dimensionality Reduction: PCA
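The preprocessing steps on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: the end point is taken to be a concentration in nM (for example an IC50), the function names are hypothetical, and autoscaling uses the sample standard deviation (n - 1).

```python
import math
import statistics

def p_activity(conc_nM):
    """Log-transform a concentration given in nM into -log10(molar)."""
    return -math.log10(conc_nM * 1e-9)

def mean_center(column):
    """Subtract the column mean so the descriptor is centered at zero."""
    mu = statistics.fmean(column)
    return [v - mu for v in column]

def autoscale(column):
    """Mean-center and divide by the standard deviation (unit variance)."""
    mu = statistics.fmean(column)
    sd = statistics.stdev(column)  # sample standard deviation (n - 1)
    return [(v - mu) / sd for v in column]

ic50_nM = [10.0, 100.0, 1000.0, 10000.0]   # made-up activities spanning 3 orders of magnitude
y = [p_activity(c) for c in ic50_nM]       # roughly 8, 7, 6, 5: evenly spaced after the log
descriptor = [2.0, 4.0, 6.0, 8.0]          # one made-up column of X
print(y)
print(mean_center(descriptor))             # [-3.0, -1.0, 1.0, 3.0]
print(autoscale(descriptor))
```

Autoscaling gives every descriptor the same weight regardless of its unit, which matters for distance-based and latent-variable methods.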
Geometrical Interpretation of the Information Matrix
Spaces: Row Space; Column Space (Object Map)
Metrics: distances, Euclidean and others
Classes, Clusters, Groups
Row Space!
Is it informative? How? What does it mean? How can we use it?
[Figure: points plotted against axes O1, O2, ..., On]
Each point is a vector!
m-dimensional space Sm
n-point pattern Pn
Column Space
Object map: what scientists (chemists, biologists, ...) are interested in!
Is it informative? How? What does it mean? How can we use it?
[Figure: points plotted against axes V1, V2, ..., Vn, grouped into Class I (Group I) and Class II (Group II)]
Each point is a vector!
n-dimensional space Sn
m-point pattern Pm
QSAR Model Building
Based on Molecular Geometry
2D-QSAR, 2.5D-QSAR, 3D-QSAR
QSAR Model Building
Type of Mapping Function: A Crucial Decision
Linear: MLR, kNN, PLS
Nonlinear: ANN, SVM
Linear + Nonlinear: DT and other tree and ensemble methods
QSAR Model Building
Object Selection, Data Splitting, Training and Test Sets
Goal: sets that are (1) representative and (2) diverse
y-Based Methods: random, even
X-Based Methods: random selection; kNN selection; similarity principle (KS, SOM, LMD, Duplex, MDC)
QSAR Model Building
Variable Selection
Filters (Subjective): Uninformative Variable Elimination (UVE), Correlation Ranking (CR)
Wrappers (Objective): GA-PLS
Embedded (Selection + Mapping Integrated): Stepwise Selection; RM, ERM, FFD
QSAR Model Building
Model Validation: there are different criteria in the literature
Residual Analysis
Analysis of Variance
Applicability Domain: residual; leverage (good leverage, bad leverage); Q residuals; Hotelling T²
Model Precision (confidence intervals of model parameters): bootstrap resampling, jackknife resampling
Model Accuracy (prediction error):
Internal Validation: cross-validation (leave-one-out, leave-many-out); scrambling (X-randomization, y-randomization)
External Validation: external and fully unseen or independent data set
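Leave-one-out cross-validation from this slide can be sketched as below. This is a minimal illustration, not the speaker's procedure: it assumes a one-descriptor least-squares model, made-up data, hypothetical function names, and the convention that Q² is referenced to the mean of all y values.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a0 + a1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    return mean_y - a1 * mean_x, a1

def r2_and_rmse(x, y):
    """Fit on all objects; return (R2, RMSE) of the fitted values."""
    a0, a1 = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y))

def loo_press_and_q2(x, y):
    """Leave each object out once, refit, and predict it; return (PRESS, Q2)."""
    press = 0.0
    for i in range(len(x)):
        a0, a1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a0 + a1 * x[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return press, 1.0 - press / ss_tot

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]          # made-up descriptor values
y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]          # made-up activities
print(r2_and_rmse(x, y))
print(loo_press_and_q2(x, y))
```

For a least-squares model Q² cannot exceed R², because each left-out residual is at least as large as the fitted one. A large gap between the two is a common warning sign of overfitting.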
Final word on validation: an external, independent, unseen data set is mandatory for a successful QSAR model. Do you know why?
Local versus global: inductive research has uncertainty.
Purposes of QSAR:
Rational identification of new leads with pharmacological, biocidal or pesticidal activity.
Optimization of new leads with pharmacological, biocidal or pesticidal activity.
The rational design of surface-active agents, perfumes, dyes, and fine chemicals.
The selection of compounds with optimal pharmacokinetic properties.
The prediction of a variety of physico-chemical properties of molecules.
The prediction of the fate of molecules.
The rationalization and prediction of the combined effects of molecules.
The identification of hazardous compounds at early stages.
The designing out of toxicity and side-effects in new compounds.
The prediction of toxicity of compounds to humans.
The prediction of toxicity to environmental species.
The Most Rigorous and Currently Accepted QSAR Methodology
Original Data Set → Curated Dataset → Split into training, test and external validation sets
Multiple Training Sets → Combi-QSAR modeling (with Y-randomization) → Multiple Test Sets → Activity Prediction
Only retain models that pass both internal and external accuracy filters
Validated predictive models with high internal and external accuracy
External validation using applicability domain → Virtual screening using applicability domain → Experimental validation
A Small Question!
Why is QSAR alive in spite of the existence of very strong rivals like docking, MD, pharmacophore, SB and LB methods?
Modeling and taking into account all pharmacological phenomena is nearly or totally impossible, even in high-level and advanced research laboratories.
Thank You All!
KSA Graphical Algorithm
[Figure: selected points 1 and 2 and candidate points a, b, c, d, with the distances 1a, 2a, 1b, 2b, 1c, 2c, 1d, 2d drawn]
Which one would be the third point: a, b, c or d?
Points 1 and 2 have the largest distance, so they are selected first. Then the distances between all unselected points and all selected points are calculated.
Calculate distances 1a and 2a; then min(1a, 2a) = 2a.
Calculate distances 1b and 2b; then min(1b, 2b) = 2b.
Calculate distances 1c and 2c; then min(1c, 2c) = 1c.
Calculate distances 1d and 2d; then min(1d, 2d) = 1d.
max(min(1a, 2a), min(1b, 2b), min(1c, 2c), min(1d, 2d)) = 1d
Then point d is selected as the third point, and so on.
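The max-min rule described in this slide can be sketched as follows. This is a minimal illustration assuming Euclidean distance and at least two requested points; the function name and the coordinates are hypothetical.

```python
import math

def kennard_stone(points, k):
    """Return the indices of k points chosen by the Kennard-Stone max-min rule (k >= 2)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    # Step 1: start with the two points that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: d[pair[0]][pair[1]],
    )
    selected = [first, second]
    # Step 2: repeatedly add the candidate whose nearest selected point is farthest away.
    while len(selected) < k:
        remaining = [i for i in range(n) if i not in selected]
        nxt = max(remaining, key=lambda i: min(d[i][s] for s in selected))
        selected.append(nxt)
    return selected

pts = [(0.0, 0.0), (10.0, 0.0), (1.0, 1.0), (9.0, 1.0), (5.0, 4.0), (4.0, 0.5)]
print(kennard_stone(pts, 3))  # [0, 1, 4]
```

Points 0 and 1 are the most distant pair. Point 4 is picked third because its nearest selected neighbor is farther away than that of any other candidate.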
Applicability Domain
Q Residuals and Hotelling T²
[Figure: original data (axis 0 to 12000) and their log values (axis -1 to 4.5) for objects 1 to 11]
Activity Descr 1 Descr 2 … Descr m
Y1 X11 X12 … X1m
Y2 X21 X22 … X2m
… … … … …
Yn Xn1 Xn2 … Xnm
Yi = a0 + a1 Xi1 + a2 Xi2 +…+ am Xim
Does not consider nonlinear effects
Multiple Linear Regression (MLR)
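The MLR equation above can be fitted by least squares. The sketch below is a minimal illustration that solves the normal equations in pure Python. The function names and the data are hypothetical, and the approach assumes the descriptors are not collinear, otherwise the system is singular.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_mlr(X, y):
    """Return [a0, a1, ..., am] minimizing the squared error of Yi = a0 + sum(aj * Xij)."""
    Z = [[1.0] + list(row) for row in X]           # prepend a column of ones for a0
    p = len(Z[0])
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve(ZtZ, Zty)

# Eight made-up objects and two descriptors (object-to-variable ratio of 4).
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0),
     (5.0, 6.0), (6.0, 5.0), (7.0, 8.0), (8.0, 7.0)]
y = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in X]   # noise-free synthetic activities
coeffs = fit_mlr(X, y)
print(coeffs)  # close to [1.0, 2.0, -0.5]
```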
Partial Least Squares (PLS)
$Y = t_1 q_1 + t_2 q_2 + \dots + t_n q_n + F$
• t: latent variables or scores
• q: loading vectors
Robust with respect to collinear descriptors
Only one model optimization parameter (the number of LVs)
Fast computation
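A single latent variable of PLS can be sketched as below. This is a minimal illustration of the first term, t1 q1, for one response on mean-centered data. The function name and data are hypothetical, and further components would need deflation of X and y, which is omitted here.

```python
import math

def pls1_one_component(X, y):
    """Return (weights w, scores t, y-loading q) of the first PLS latent variable."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [yi - y_mean for yi in y]
    # Weight vector: the direction in descriptor space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of every object onto w.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    # y-loading: regression of centered y on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, t, q

# Two perfectly collinear descriptors (column 2 = 2 * column 1), made-up values.
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
y = [3.0, 6.0, 9.0, 12.0]
w, t, q = pls1_one_component(X, y)
y_mean = sum(y) / len(y)
print([y_mean + ti * q for ti in t])  # reproduces y up to rounding
```

The two descriptors here are exactly collinear, so the normal equations of MLR would be singular, while the latent variable is still well defined. This is the robustness that the slide refers to.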
The k-Nearest Neighbor Method (kNN)
Works on the similarity principle.
A compound is placed in descriptor space close to its k nearest neighbor compounds from the training set, and the method predicts the activity class that is most highly represented among these neighbors.
The k-NN scheme is sensitive to: (1) the distance metric; (2) the number of training compounds; (3) k, which can be optimized to yield the best results.
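The majority vote among the nearest neighbors can be sketched as follows. This is a minimal illustration assuming Euclidean distance and k = 3; the function name, descriptor values and class labels are made up.

```python
import math
from collections import Counter

def knn_predict(train_X, train_labels, query, k=3):
    """Predict the class most represented among the k training compounds nearest to query."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["inactive", "inactive", "inactive", "active", "active", "active"]
print(knn_predict(train_X, train_labels, (4.5, 4.7)))  # active
print(knn_predict(train_X, train_labels, (1.1, 0.9)))  # inactive
```

Using an odd k avoids tied votes in a two-class problem. The descriptors should be autoscaled first so that no single descriptor dominates the distance.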
Artificial Neural Network (ANN)
[Figure: network layers, from the descriptors or original space, through the nonlinear or hidden space, to the properties being predicted]
Support Vector Regression (SVR)
Support Vector Classification (SVC)
ε-Insensitive Loss Function
$L_\varepsilon(y, f(x)) = \begin{cases} 0 & \text{if } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon & \text{otherwise} \end{cases}$
Only the points outside the ε-tube are penalized, in a linear fashion.
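The ε-insensitive loss used in SVR is short enough to write out directly. This is a minimal sketch; the function name and the default tube width of 0.1 are assumptions for illustration.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero inside the epsilon-tube, growing linearly with the error outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

print(eps_insensitive_loss(5.0, 5.05))  # inside the tube: 0.0
print(eps_insensitive_loss(5.0, 5.5))   # outside the tube: roughly 0.4
```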
Non-linear SVMs
• Datasets that are linearly separable with some noise work out great.
• But what are we going to do if the dataset is just too hard?
• How about mapping the data to a higher-dimensional space?
[Figure: one-dimensional data on the x axis, then the same data plotted against x and x²]
Non-linear SVMs: Feature spaces
• General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable:
Φ: x → φ(x)
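The mapping idea can be tried on a toy one-dimensional data set. This is a minimal sketch under stated assumptions: the map φ(x) = (x, x²), the data, and the hand-picked linear rule are all illustrative and not fitted by an SVM.

```python
def phi(x):
    """Map a 1D input to the 2D feature space (x, x^2)."""
    return (x, x * x)

def linear_rule(z, w=(0.0, 1.0), b=-2.5):
    """A linear decision rule in feature space: sign(w . z + b)."""
    return 1 if w[0] * z[0] + w[1] * z[1] + b > 0 else -1

# On the x axis the +1 class sits on both sides of the -1 class,
# so no single threshold on x separates them.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, -1, -1, -1, 1, 1]

predictions = [linear_rule(phi(x)) for x in xs]
print(predictions == labels)  # True
```

In feature space the horizontal line x² = 2.5 separates the two classes. A kernel SVM obtains the same effect without computing φ explicitly.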
Decision Trees as a Greedy Algorithm
CART: Classification and Regression Tree
Binary recursive partitioning tree
• Best first
• Left, right
• Up, down
• Here the variable is used to classify the audience! The first variable is "Biologist or not?" Why? We are in the Bio-Dept.
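One greedy step of binary recursive partitioning can be sketched as below. This is a minimal illustration assuming the Gini impurity as the split criterion and a single numeric variable; the function names and the data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Greedy search for the threshold that minimizes the weighted Gini impurity."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2  # candidate threshold between neighboring values
        left = [l for v, l in zip(values, labels) if v <= thr]
        right = [l for v, l in zip(values, labels) if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (thr, score)
    return best

logp = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]   # made-up descriptor values
active = [0, 0, 0, 1, 1, 1]             # made-up class labels
print(best_split(logp, active))          # (2.25, 0.0)
```

A full tree repeats this search over every variable, takes the best one first, and then recurses into the left and right child nodes.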
3D-QSAR
Notes
Advantages over 2D-QSAR
No reliance on experimental values
Can be applied to molecules with unusual substituents
Not restricted to molecules of the same structural class (in the pharmacophore 3D-QSAR case)
Predictive capability
No experimental constants or measurements are involved
Properties are known as ‘Fields’
Steric field - defines the size and shape of the molecule
Electrostatic field - defines electron rich/poor regions of molecule
3D-QSAR
Method
Comparative molecular field analysis (CoMFA) - Tripos
Build each molecule using modelling software
Identify the active conformation for each molecule
Identify the pharmacophore
[Figure: example structure (labels NHCH3, OH, HO, HO): build 3D model → active conformation → define pharmacophore]
3D-QSAR
•Place the pharmacophore into a lattice of grid points
Method
•Each grid point defines a point in space
Grid points
•Position molecule to match the pharmacophore
3D-QSAR
•A probe atom is placed at each grid point in turn
Method
•Probe atom = a proton or sp3 hybridised carbocation
•Measure the steric or electrostatic interaction of the probe atom
with the molecule at each grid point
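The probe calculation can be sketched as below. This is a rough illustration, not the Tripos implementation: it assumes a Lennard-Jones 6-12 term for the steric field and a Coulomb term for the electrostatic field, with made-up atom positions, charges and parameters, and an assumed truncation of extreme energies at 30 kcal/mol.

```python
import math

COULOMB_CONST = 332.06  # kcal*Å/(mol*e^2): converts charge product over distance to kcal/mol
CUTOFF = 30.0           # kcal/mol, assumed truncation value for extreme energies

def probe_fields(grid_point, atoms, probe_charge=1.0, well_depth=0.1, r_min=3.4):
    """Return (steric, electrostatic) probe energies at one grid point.

    atoms is a list of ((x, y, z), partial_charge); well_depth and r_min are
    illustrative Lennard-Jones parameters shared by all atom pairs.
    """
    steric = 0.0
    electrostatic = 0.0
    for position, charge in atoms:
        r = max(math.dist(grid_point, position), 1e-3)   # guard against division by zero
        ratio = (r_min / r) ** 6
        steric += well_depth * (ratio * ratio - 2.0 * ratio)          # Lennard-Jones 6-12
        electrostatic += COULOMB_CONST * probe_charge * charge / r    # Coulomb
    steric = min(steric, CUTOFF)
    electrostatic = max(-CUTOFF, min(electrostatic, CUTOFF))
    return steric, electrostatic

# Hypothetical two-atom "molecule" and a one-dimensional slice of the lattice.
atoms = [((0.0, 0.0, 0.0), -0.5), ((1.5, 0.0, 0.0), 0.3)]
grid = [(x, 0.0, 0.0) for x in (-4.0, -2.0, 2.0, 4.0, 6.0)]
for point in grid:
    print(point, probe_fields(point, atoms))
```

Each grid point contributes one steric and one electrostatic column to the descriptor table, which is why the number of variables runs into the thousands and PLS is used for the regression.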
3D-QSAR
Method
Tabulate fields for each compound at each grid point

Compound  Biological  Steric fields (S)               Electrostatic fields (E)
          activity    at grid points (001-998)        at grid points (001-998)
                      S001 S002 S003 S004 S005 etc.   E001 E002 E003 E004 E005 etc.
1         5.1
2         6.8
3         5.3
4         6.4
5         6.1

Partial least squares analysis (PLS)
QSAR equation: Activity = aS001 + bS002 + ... + mS998 + nE001 + ... + yE998 + z
3D-QSAR
•Define fields using contour maps round a representative molecule
Method
2.5D-QSAR or GRIND methodology
A procedure based on the information included in the MIF (molecular interaction fields), generating a handful of informative variables that are independent of the location of the molecules within the grid.
Two main steps of the transformation procedure:
• Field filtering
• Maximum auto- and cross-correlation (MACC2) encoding. The "2" refers to the distance between two points in space.
MACC2 transform
• The MACC2 transform keeps the maximum value of the product of the two field values, i and j, found at each distance r_ij.
• Here the colors represent the activity of the compounds (blue: inactive, red: active).
• "33" means the energy products produced by two N1 probes.
• "8" means the 8th variable of auto-correlogram 33.
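The encoding step can be sketched as follows. This is a minimal illustration, not the GRIND software: it assumes already filtered nodes given as (position, energy) pairs, a hypothetical bin width of 1.0 distance units, and made-up node values.

```python
import math

def macc2(nodes_i, nodes_j, bin_width=1.0, n_bins=10):
    """For every distance bin keep the largest product of two node energies.

    Pass the same list twice for an auto-correlogram, or the nodes of two
    different probes for a cross-correlogram.
    """
    correlogram = [0.0] * n_bins
    for a in nodes_i:
        for b in nodes_j:
            if a is b:
                continue  # skip pairing a node with itself
            (pos_a, e_a), (pos_b, e_b) = a, b
            bin_index = int(math.dist(pos_a, pos_b) / bin_width)
            if bin_index < n_bins:
                correlogram[bin_index] = max(correlogram[bin_index], e_a * e_b)
    return correlogram

# Three made-up nodes of one probe with favorable (negative) energies.
nodes = [((0.0, 0.0, 0.0), -2.0), ((3.0, 0.0, 0.0), -1.5), ((0.0, 4.0, 0.0), -1.0)]
print(macc2(nodes, nodes))
# [0.0, 0.0, 0.0, 3.0, 2.0, 1.5, 0.0, 0.0, 0.0, 0.0]
```

Because only distances between nodes enter the result, the variables do not depend on how the molecule is placed in the grid. Remembering which node pair produced each maximum would allow the trace-back described later in the talk.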
GRID interaction fields calculated using the N1 probe: positive (yellow) values describe unfavorable interactions and negative (blue) values describe favorable interactions.
The nodes kept should meet two criteria:
• they should have low energy values (representing highly favorable interactions);
• they should be as far as possible from each other.
Each number corresponds to a specific distance between the fields.
One of the unique features of the MACC
transform is that it is possible to trace back the
variables that generated this "most intense"
interaction.
5/27/2014 Importance of PROCESS is not less than PRODUCT 73
VRS

Weitere ähnliche Inhalte

Was ist angesagt?

conformational search used in Pharmacophore mapping
conformational search used in Pharmacophore mappingconformational search used in Pharmacophore mapping
conformational search used in Pharmacophore mapping
Vishakha Giradkar
 

Was ist angesagt? (20)

MOLECULAR DOCKING AND RELATED DRUG DESIGN ACHIEVEMENTS
MOLECULAR DOCKING AND RELATED DRUG DESIGN ACHIEVEMENTS MOLECULAR DOCKING AND RELATED DRUG DESIGN ACHIEVEMENTS
MOLECULAR DOCKING AND RELATED DRUG DESIGN ACHIEVEMENTS
 
The quest for novel chemical matter ~de novo drug design by sumiran
The quest for novel chemical matter ~de novo drug design by sumiranThe quest for novel chemical matter ~de novo drug design by sumiran
The quest for novel chemical matter ~de novo drug design by sumiran
 
Fbdd
FbddFbdd
Fbdd
 
Hansch and Free-Wilson QSAR Models
Hansch and Free-Wilson QSAR ModelsHansch and Free-Wilson QSAR Models
Hansch and Free-Wilson QSAR Models
 
Rational drug design method
Rational drug design methodRational drug design method
Rational drug design method
 
MOLECULAR DOCKING
MOLECULAR DOCKINGMOLECULAR DOCKING
MOLECULAR DOCKING
 
DENOVO DRUG DESIGN AS PER PCI SYLLABUS M.PHARM
DENOVO DRUG DESIGN AS PER PCI SYLLABUS M.PHARMDENOVO DRUG DESIGN AS PER PCI SYLLABUS M.PHARM
DENOVO DRUG DESIGN AS PER PCI SYLLABUS M.PHARM
 
De novo drug design
De novo drug designDe novo drug design
De novo drug design
 
Rational drug design
Rational drug designRational drug design
Rational drug design
 
conformational search used in Pharmacophore mapping
conformational search used in Pharmacophore mappingconformational search used in Pharmacophore mapping
conformational search used in Pharmacophore mapping
 
Computer aided Drug designing (CADD)
Computer aided Drug designing (CADD)Computer aided Drug designing (CADD)
Computer aided Drug designing (CADD)
 
(Kartik Tiwari) Denovo Drug Design.pptx
(Kartik Tiwari) Denovo Drug Design.pptx(Kartik Tiwari) Denovo Drug Design.pptx
(Kartik Tiwari) Denovo Drug Design.pptx
 
Quantitative structure activity relationships
Quantitative structure  activity relationshipsQuantitative structure  activity relationships
Quantitative structure activity relationships
 
In Silico methods for ADMET prediction of new molecules
 In Silico methods for ADMET prediction of new molecules In Silico methods for ADMET prediction of new molecules
In Silico methods for ADMET prediction of new molecules
 
Basics Of Molecular Docking
Basics Of Molecular DockingBasics Of Molecular Docking
Basics Of Molecular Docking
 
In silico drug design/Molecular docking
In silico drug design/Molecular dockingIn silico drug design/Molecular docking
In silico drug design/Molecular docking
 
Virtual Screening in Drug Discovery
Virtual Screening in Drug DiscoveryVirtual Screening in Drug Discovery
Virtual Screening in Drug Discovery
 
De Novo Drug Design
De Novo Drug DesignDe Novo Drug Design
De Novo Drug Design
 
MOLECULAR DOCKING.pptx
MOLECULAR DOCKING.pptxMOLECULAR DOCKING.pptx
MOLECULAR DOCKING.pptx
 
MOLECULAR DOCKING
MOLECULAR DOCKINGMOLECULAR DOCKING
MOLECULAR DOCKING
 

Ähnlich wie Computer Aided Drug Design QSAR Related Methods

Analytical QBD -CPHI 25-27 July R00
Analytical QBD  -CPHI 25-27 July R00Analytical QBD  -CPHI 25-27 July R00
Analytical QBD -CPHI 25-27 July R00
Vijay Dhonde
 
How predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinarHow predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinar
Ann-Marie Roche
 
An Ontology-underpinned Decision-Support System for Wastewater management
An Ontology-underpinned Decision-Support System for Wastewater managementAn Ontology-underpinned Decision-Support System for Wastewater management
An Ontology-underpinned Decision-Support System for Wastewater management
Luigi Ceccaroni
 
Idss for evaluating & selecting is project hepu deng santoso
Idss for evaluating & selecting is project  hepu deng santosoIdss for evaluating & selecting is project  hepu deng santoso
Idss for evaluating & selecting is project hepu deng santoso
Anita Carollin
 
Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...
Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...
Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...
MelisaRubio1
 
Final presentation construction collaboration
Final presentation construction collaborationFinal presentation construction collaboration
Final presentation construction collaboration
ShivamDwivedi14770
 

Ähnlich wie Computer Aided Drug Design QSAR Related Methods (20)

Multivariate Chemical Space
Multivariate Chemical SpaceMultivariate Chemical Space
Multivariate Chemical Space
 
Mining the LET Performance in Generating Prediction Models for OTDSS
Mining the LET Performance in Generating Prediction Models for OTDSSMining the LET Performance in Generating Prediction Models for OTDSS
Mining the LET Performance in Generating Prediction Models for OTDSS
 
2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar2020.04.07 automated molecular design and the bradshaw platform webinar
2020.04.07 automated molecular design and the bradshaw platform webinar
 
Basics of QSAR Modeling
Basics of QSAR ModelingBasics of QSAR Modeling
Basics of QSAR Modeling
 
Analytical QBD -CPHI 25-27 July R00
Analytical QBD  -CPHI 25-27 July R00Analytical QBD  -CPHI 25-27 July R00
Analytical QBD -CPHI 25-27 July R00
 
39584792-Optimization-of-Chemical-Processes-Edgar-Himmelblau.pdf
39584792-Optimization-of-Chemical-Processes-Edgar-Himmelblau.pdf39584792-Optimization-of-Chemical-Processes-Edgar-Himmelblau.pdf
39584792-Optimization-of-Chemical-Processes-Edgar-Himmelblau.pdf
 
Analytical QbD
Analytical QbDAnalytical QbD
Analytical QbD
 
Analytical QbD
Analytical QbDAnalytical QbD
Analytical QbD
 
Analytical QbD
Analytical QbDAnalytical QbD
Analytical QbD
 
How predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinarHow predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinar
 
An Ontology-underpinned Decision-Support System for Wastewater management
An Ontology-underpinned Decision-Support System for Wastewater managementAn Ontology-underpinned Decision-Support System for Wastewater management
An Ontology-underpinned Decision-Support System for Wastewater management
 
Establishing weightages of Criteria and Key Aspects for Quality Assessment of...
Establishing weightages of Criteria and Key Aspects for Quality Assessment of...Establishing weightages of Criteria and Key Aspects for Quality Assessment of...
Establishing weightages of Criteria and Key Aspects for Quality Assessment of...
 
A KPI-based process monitoring and fault detection framework for large-scale ...
A KPI-based process monitoring and fault detection framework for large-scale ...A KPI-based process monitoring and fault detection framework for large-scale ...
A KPI-based process monitoring and fault detection framework for large-scale ...
 
Idss for evaluating & selecting is project hepu deng santoso
Idss for evaluating & selecting is project  hepu deng santosoIdss for evaluating & selecting is project  hepu deng santoso
Idss for evaluating & selecting is project hepu deng santoso
 
Survey on Feature Selection and Dimensionality Reduction Techniques
Survey on Feature Selection and Dimensionality Reduction TechniquesSurvey on Feature Selection and Dimensionality Reduction Techniques
Survey on Feature Selection and Dimensionality Reduction Techniques
 
7th sem open elective list and syllabus
7th sem open elective list and syllabus7th sem open elective list and syllabus
7th sem open elective list and syllabus
 
Feature selection in multimodal
Feature selection in multimodalFeature selection in multimodal
Feature selection in multimodal
 
2021 ME.pdf
2021 ME.pdf2021 ME.pdf
2021 ME.pdf
 
Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...
Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...
Microeconometrics. Methods and applications by A. Colin Cameron, Pravin K. Tr...
 
Final presentation construction collaboration
Final presentation construction collaborationFinal presentation construction collaboration
Final presentation construction collaboration
 

Kürzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 

Computer Aided Drug Design QSAR Related Methods

  • 1. Importance of PROCESS is not less than PRODUCT (5/27/2014)
  • 2. Computer Aided Drug Design: QSAR Related Methods. Jahan B Ghasemi, DDSLab, K N Toosi Univ of Tech., Tehran, Iran
  • 3. Topics in this Talk: General Introduction; Some of These QSAR Steps. Data Pre-Processing: Normalization, Standardization, Variable Selection, Subset Selection, Outlier Detection. Multivariate Analysis: MLR, PCA, PLS, SVM, ANN, CART. Molecular Descriptors: Constitutional, Electronic, Geometrical, Hydrophobic, Lipophilicity, Solubility, Steric, Quantum Chemical, Topological. Molecular Structures: OC1=CC=CC=C1; 1D, 2D, 3D. Statistical Evaluation: R, R2, Q2, MSE, RMSE, PRESS.
  • 4. General Introduction. "Well begun is half done" (Aristotle). René Descartes in 1619: Quantitative Measurement in Science. Research Types: Inductive Approach, Deductive Approach, Abductive Approach.
  • 5. Deductive: Theory → Hypothesis → Observation → Confirmation. Inductive: Observation → Pattern → Hypothesis → Theory. Induction is usually described as moving from the specific to the general, while deduction begins with the general and ends with the specific. Arguments based on laws, rules and accepted principles are generally used for Deductive Reasoning. Observations tend to be used for Inductive Arguments. The -Metrics, as soft-computing or soft-modeling, are Inductive Research Approaches and therefore carry Uncertainty. Are humans natural logic reasoners? No!!!
  • 6. What Do We Need to Know for Successful QSAR Modeling as a Drug Design Tool?
  • 7. I- Math-Science or Informatique or Informatics Aspect. Linear Algebra: Vectors, Matrices, Tensors…; Homogeneous and regular linear and nonlinear simultaneous equations. Graph Theory: Maximal Subgraph, Clique Detection. Multivariate Statistical Analysis: Column Space, Row Space. Pattern Recognition: (Dis)Similarity; Distance Metrics (Euclidean, Manhattan, Mahalanobis); Fingerprints (Tanimoto, Jaccard); Supervised and Unsupervised Pattern Recognition; Clustering, Agglomerative (bottom up) and Divisive (top down); MLR, PCA, PLS. Optimization: Selection of the most informative variables (GA); Selection of the most representative objects (KS); Function minimization (Newton, Gauss-Newton, Levenberg-Marquardt). Computer: Computer Graphics, HPC.
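The (dis)similarity measures named on slide 7 can be sketched in a few lines. This is a minimal illustration rather than the speaker's code: descriptor vectors are assumed to be plain lists of floats, and fingerprints are assumed to be sets of "on" bit positions (both representations are hypothetical, chosen for brevity). Mahalanobis distance is left out because it needs a covariance matrix.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(fp_a & fp_b) / union
```

Euclidean and Manhattan are distances (0 means identical), while Tanimoto is a similarity (1 means identical), so 1 - tanimoto(...) is the usual way to put it on a distance scale.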
  • 8. II- Bio-Science Aspect. Chemistry: Organic Chemistry; Quantum/Molecular Mechanics (Forcefield, Conformer, Bioactive Conformer); Medicinal Chemistry. Biology: Molecular Biology, Systems Biology. Pharmacology: Pharmacokinetics, Pharmacodynamics, Toxicity, ADMET.
  • 9. Combination of I and II. OMICS: Bioinformatics, Proteomics, Metabolomics, Genomics. Metrics: Biometrics, Chemometrics, Technometrics, Chem(o)informatics. QSAR is related to most of the -OMICS and -METRICS routines.
  • 10. Bio-Science Part Starts Here:
  • 11. Chemical Space (Gathering Information from All Involved Species). Macromolecules: Protein, Receptor, Host. Small Molecules: Guest, Ligand, Inhibitor. Aggregation: Host-Guest Complex, Receptor-Inhibitor Complex.
  • 12. Chemical Space, Chemical Information: Information due to Macromolecule Structure; Information due to Aggregation Structure; Information due to Small Molecule Structure.
  • 13. To have and use Chemical Space: Extract and Convert Chemical Information to Numerical Values. We Call These Numerical Values Molecular Descriptors.
  • 14. Descriptors should be associated with the following desirable features: Easy Interpretation; Show Correlation with a Property; Discrimination of Isomers; Independence; Simplicity; Not to be based on properties; Not to be trivially related to other descriptors; Allow for efficient construction; Use familiar structural concepts; Show gradual change with gradual change in structures.
  • 15. End Points to Be Modeled, Chemical Properties: Boiling point, Retention time, Dielectric constant, Diffusion coefficient, Dissociation constant, Melting point, Reactivity, Solubility, Stability, Thermodynamic properties, Viscosity.
  • 16. End Points to Be Modeled, Biological Properties: Bioconcentration, Biodegradation, Carcinogenicity, Drug metabolism and clearance, Inhibition constant, Mutagenicity, Permeability (Blood brain barrier, Skin), Pharmacokinetics, Receptor binding.
  • 17. There are more than 5500 Molecular Descriptors, BUT! Why do we need more and more Molecular Descriptors? Each molecular descriptor takes into account a small part of the whole chemical information contained in the real molecule and, as a consequence, the number of descriptors keeps increasing with the growing demand for deeper investigations of chemical and biological systems. Different descriptors view a molecule through independent methods or perspectives, taking into account the various features of chemical structure. Molecular descriptors have now become some of the most important variables used in molecular modeling and are, consequently, managed by statistics, chemometrics, and chemoinformatics.
  • 18. Molecular Descriptors. Cost to Generate: from Cheap to Expensive.
  • 19. Molecular Descriptors. How to Calculate Molecular Descriptors? By Hand! By Software: Dragon, SYBYL, PaDEL-Descriptor, AdrianaCode.
  • 20. Molecular Descriptors Classes! Different Classes? Yes. How many? Many classes. What are the bases of Classification? Based on Dimensionality: 0D-4D; Geometric, Constitutional, Topological, Quantum Chemical, etc. Based on Origin: Theoretical, Experimental, Both!
  • 21. Molecular Descriptors. Do they have equal importance? 0D < 1D < 2D < 2.5D < 3D < 4D … < nD, running from Low Information Content to High Information Content.
  • 22. Now We Have Molecular Descriptors and Chemical, Molecular or Information Space. But first define and introduce: Objects = Molecules; Variables = Descriptors; Object to Variable ratio ≥ 4. Why? Least-Squares Needs it!
  • 23. Math-Science Part Starts Here: Using a Very Efficient Way to Show Chemical Information: Matrix-Vector
  • 24. [Figure: a matrix with objects as rows (1, 2, 3, …, n) and variables as columns (1, 2, 3, …, m), beside a vector with objects as rows (1, 2, 3, …, n)]
  • 25. Preprocessing. On End Point Vector y: nM unit; log Transformation (To Linearize the Variation; To Have LFER Interpretation); Mean Centering; Autoscaling. On Molecular Descriptors Matrix X: Mean Centering (has its general purpose); Autoscaling (has its general purpose); Outlier Detection (AD); Dimensionality Reduction (PCA).
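The preprocessing steps on slide 25 can be illustrated with a short standard-library sketch. It rests on assumptions the slide does not state: the end point is taken to be a concentration in nM that is converted to -log10 of the molar value (the usual pIC50-style scale), and autoscaling is taken to mean centering each column and dividing by its sample standard deviation. The function names are hypothetical.

```python
import math

def to_p_scale(values_nm):
    """Convert concentrations in nM to -log10(molar), e.g. IC50 -> pIC50.

    1 nM = 1e-9 M, so -log10(v * 1e-9) = 9 - log10(v).
    """
    return [9.0 - math.log10(v) for v in values_nm]

def mean_center(X):
    """Subtract the column mean from every entry of the descriptor matrix X."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in X]

def autoscale(X):
    """Mean-center each column, then divide by its sample standard deviation.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    n, m = len(X), len(X[0])
    centered = mean_center(X)
    stds = [math.sqrt(sum(row[j] ** 2 for row in centered) / (n - 1)) for j in range(m)]
    return [[row[j] / stds[j] for j in range(m)] for row in centered]
```

After autoscaling, every descriptor has zero mean and unit variance, so descriptors measured in very different units contribute on a comparable scale.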
  • 26. Geometrical Interpretation of Information Matrix Spaces. Row Space. Column Space: Object Map. Metrics: Distances, Euclidean and…. Classes, Clusters, Groups.
  • 27. Row Space! Is it informative? How? What does it mean? How can we use it? Axes: O1, O2, …, On. Each Point is a Vector! m-dimensional space S^m; n-point pattern P^n.
  • 28. Column Space: Objects Map. Scientists (Chemists, Biologists, …) are interested in it!!! Is it informative? How? What does it mean? How can we use it? Axes: V1, V2, …, Vn. Class I or Group I; Class II or Group II. Each Point is a Vector! n-dimensional space S^n; m-point pattern P^m.
  • 29. QSAR Model Building Based on Molecular Geometry: 2D-QSAR, 2.5D-QSAR, 3D-QSAR.
  • 30. QSAR Model Building. Type of Mapping Function: A Crucial Decision. Linear: MLR, kNN, PLS. Nonlinear: ANN, SVM. Linear + Nonlinear: DT and other Tree and Ensemble Methods.
  • 31. QSAR Model Building. Object Selection (Data Splitting into Train and Test Sets), to have good 1- Representativeness and 2- Diversity. y-Based Methods: Randomly, Evenly. X-Based Methods: Random Selection; kNN Selection; Similarity Principle; KS, SOM, LMD, Duplex, MDC.
  • 32. QSAR Model Building. Variable Selection. Filters (Subjective): Uninformative Variable Elimination (UVE), Correlation Ranking (CR). Wrappers (Objective): GA-PLS. Embedded (Selection + Mapping Integrated): Stepwise Selection; RM, ERM, FFD.
  • 33. QSAR Model Building. Model Validation: there are different criteria in the literature. Residual Analysis: Analysis of Variance. Applicability Domain: Residual; Leverage (Good Leverage, Bad Leverage); Q Residual; Hotelling T2. Model Precision (Confidence Intervals of Model Parameters): Bootstrap Resampling, Jackknife Resampling. Model Accuracy (Prediction Error): Internal Validation, by Cross Validation (Leave One Out, Leave Many Out) and Scrambling (X-randomization, y-randomization); External Validation, with an External and Fully Unseen or Independent Data Set. Final word on Validation: the External Independent Unseen Data Set Is Mandatory for a Successful QSAR Model. Do you know why? Local-X-Global or Induction Research has Uncertainty.
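Two of the internal validation tools listed on slide 33, leave-one-out cross validation and y-randomization, can be sketched as follows. This is an illustrative sketch under stated assumptions: the model is a one-descriptor least-squares line (a stand-in for whatever mapping function is actually used), and Q2 is computed as 1 - PRESS / SS with SS taken around the overall mean of y, which is one of several conventions in the literature. All function names are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def loo_q2(xs, ys):
    """Leave-one-out Q2 = 1 - PRESS / SS, refitting the line without each object."""
    n = len(xs)
    press = 0.0
    for i in range(n):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    my = sum(ys) / n
    ss = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss

def y_randomization_q2(xs, ys, n_trials=20, seed=0):
    """Q2 values obtained after shuffling y; a sound model should beat these clearly."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        scores.append(loo_q2(xs, shuffled))
    return scores
```

If the Q2 of the real model is not clearly higher than the scrambled-y values, the apparent fit is likely a chance correlation. Neither check replaces the external, unseen data set that the slide calls mandatory.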
  • 34. Purposes of QSAR: Rational Identification of New Leads with Pharmacological, Biocidal or Pesticidal Activity. Optimization of New Leads with Pharmacological, Biocidal or Pesticidal Activity. The Rational Design of Surface-active agents, Perfumes, Dyes, and Fine Chemicals.
  • 35. Purposes of QSAR: The Selection of Compounds with Optimal Pharmacokinetic Properties. The Prediction of a variety of Physicochemical Properties of Molecules. The Prediction of the Fate of Molecules. The Rationalization and Prediction of the Combined Effects of Molecules.
  • 36. Purposes of QSAR: The Identification of Hazardous Compounds at Early Stages. The Designing out of Toxicity and Side-Effects in New Compounds. The Prediction of Toxicity of Compounds to Humans. The Prediction of Toxicity to Environmental Species.
  • 37. The Most Rigorous and Currently Accepted QSAR Methodology: Original Data Set; Curated Dataset; Split into training, test and external validation sets; Multiple Training Sets; Y-Randomization; Combi-QSAR modeling; Multiple Test Sets; Activity Prediction; Only retain models that pass both internal and external accuracy filters; Validated Predictive Models with High Internal and External Accuracy; External Validation Using Applicability Domain; Virtual Screening Using Applicability Domain; Experimental Validation.
  • 38. A Small Question!!! Why is QSAR alive in spite of the existence of very strong rivals like Docking, MDs, Pharmacophore, SB and LB methods? Modeling and taking into account all pharmacological phenomena is nearly or totally impossible, even in high-level and advanced research laboratories.
  • 39. Thank You All!
  • 40. KSA Graphical Algorithm. Points: 1, 2, a, b, c, d. Which one would be the third point: a, b, c or d? Points 1 and 2 have the largest distance, so they are selected first. Then the distances between all unselected points and all selected points are calculated. Calculate distances 1a and 2a, then min(1a,2a) = 2a. Calculate distances 1b and 2b, then min(1b,2b) = 2b. Calculate distances 1c and 2c, then min(1c,2c) = 1c. Calculate distances 1d and 2d, then min(1d,2d) = 1d. max(min(1a,2a), min(1b,2b), min(1c,2c), min(1d,2d)) = 1d. So point d is selected as the third point, and so on…
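The max-min selection rule walked through on slide 40 can be written compactly. This is a minimal sketch of the Kennard-Stone idea, assuming Euclidean distance and objects given as lists of coordinates; the function name and the returned index list are illustrative choices, not part of the slides.

```python
import math

def kennard_stone(X, n_select):
    """Return indices of n_select objects chosen by the Kennard-Stone max-min rule."""
    n = len(X)
    # Step 1: start with the two objects that are farthest apart.
    best_d, first, second = -1.0, 0, 1
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(X[i], X[j])
            if d > best_d:
                best_d, first, second = d, i, j
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    # Step 2: repeatedly add the candidate whose nearest selected object is farthest away.
    while len(selected) < n_select and remaining:
        nxt = max(remaining,
                  key=lambda k: min(math.dist(X[k], X[s]) for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```

For example, with points [[0, 0], [10, 0], [5, 5], [1, 0], [9, 0]] the first two picks are the end points, and the third is [5, 5], because its smallest distance to an already selected point is the largest among the candidates. The selected objects typically form the training set and the rest the test set.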
  • 41. Applicability Domain
  • 42. Q Residuals and Hotelling T2
  • 44. [Figure: Original Data versus log Values]
  • 45. Multiple Linear Regression (MLR). Data table:
      Activity | Descr 1 | Descr 2 | … | Descr m
      Y1       | X11     | X12     | … | X1m
      Y2       | X21     | X22     | … | X2m
      …        | …       | …       | … | …
      Yn       | Xn1     | Xn2     | … | Xnm
    Model: $Y_i = a_0 + a_1 X_{i1} + a_2 X_{i2} + \dots + a_m X_{im}$. MLR does not consider nonlinearity effects.
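The MLR model on slide 45 is fitted by least squares. The sketch below solves the normal equations with a small Gaussian elimination routine so that it needs only the standard library. It is an illustration under that choice, not a recommendation: a numerical library with a QR- or SVD-based solver is preferable in practice, especially when descriptors are collinear. The function names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_mlr(X, y):
    """Return [a0, a1, ..., am] minimizing the sum of squared residuals."""
    Z = [[1.0] + list(row) for row in X]  # prepend a column of ones for a0
    p = len(Z[0])
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve_linear(ZtZ, Zty)

def predict_mlr(coef, row):
    """Evaluate a0 + a1*x1 + ... + am*xm for one object."""
    return coef[0] + sum(a * x for a, x in zip(coef[1:], row))
```

The normal equations only have a unique solution when there are more objects than coefficients and the descriptors are not collinear, which is one reason the talk asks for an object to variable ratio of at least 4.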
  • 46. Partial Least Squares (PLS): $Y = t_1 q_1 + t_2 q_2 + \dots + t_n q_n + F$, where the $t$ are the latent variables or scores, the $q$ are the loading vectors, and $F$ is the residual. Robust with respect to collinear descriptors. Only one model optimization parameter (the number of LVs). Computationally fast.
  • 47. The k-Nearest Neighbor Method (kNN). Works on the Similarity Principle: a compound is placed in the space close to its k nearest neighbor compounds from the training set, and the method predicts the activity class that is most highly represented among these neighbors. The kNN scheme is sensitive to: 1- the Distance Metric; 2- the Number of training compounds; 3- k, which can be optimized to yield the best results.
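The majority-vote rule described on slide 47 can be sketched in a few lines. The assumptions here are Euclidean distance and class labels given as strings; both are illustrative choices, and the slide itself stresses that the distance metric and k should be tuned.

```python
import math
from collections import Counter

def knn_predict(train_X, train_labels, query, k=3):
    """Predict the class most represented among the k nearest training compounds."""
    # Sort training compounds by distance to the query, keeping their labels.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]
```

Using an odd k for a two-class problem avoids tied votes; with ties, the result here depends on the order in which Counter saw the labels.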
  • 48. Artificial Neural Network (ANN). Layers: Descriptors or Original Space; Nonlinear or Hidden Space; Properties Being Predicted.
  • 49. Support Vector Regression (SVR) and Support Vector Classification (SVC). ε-Insensitive Loss Function: $L_\varepsilon(y, f(x)) = \begin{cases} 0 & \text{if } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon & \text{otherwise} \end{cases}$. Only the points outside the ε-tube are penalized, in a linear fashion.
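The ε-insensitive loss from slide 49 is small enough to state directly in code. This sketch uses the standard textbook form of the loss (zero inside the tube, linear outside); the default value of eps is an arbitrary illustrative choice.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero inside the eps-tube; grows linearly with the error outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)
```

A prediction that misses by 0.05 with eps = 0.1 costs nothing, while one that misses by 0.5 costs about 0.4. Because of this, only points on or outside the tube (the support vectors) shape the regression function.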
  • 50. Non-linear SVMs. Datasets that are linearly separable with some noise work out great. But what are we going to do if the dataset is just too hard? How about mapping the data to a higher-dimensional space? [Figure: points on the x axis, then the same points mapped onto the (x, x^2) plane]
  • 51. Non-linear SVMs: Feature spaces. General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable: Φ: x → φ(x)
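The mapping idea on slides 50 and 51 can be made concrete with a toy example. The data, the map φ(x) = (x, x^2) and the threshold below are all hypothetical choices for illustration; a real SVM would learn the separating hyperplane and would normally use a kernel instead of computing φ explicitly.

```python
def phi(x):
    """Map a 1D input to the 2D feature space (x, x^2)."""
    return (x, x * x)

def classify(x, threshold=2.5):
    """Linear rule in feature space: a horizontal line on the x^2 axis."""
    return 1 if phi(x)[1] > threshold else -1

# Toy data: outer points are one class, inner points the other.
# No single cut on the x axis separates them, but x^2 > 2.5 does.
xs = [-3, -2, -1, 0, 1, 2, 3]
labels = [1, 1, -1, -1, -1, 1, 1]
```

In the original 1D space the positive class sits on both sides of the negative class, so no threshold on x works; after the mapping, a straight line in the (x, x^2) plane separates the two classes.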
  • 52. Decision Trees as a Greedy Algorithm. CART: Classification And Regression Tree, a binary recursive partitioning tree. Best first; left, right; up, down. Here the variable is used to classify the audience! Here the first variable is "Biologist or Not?" Why? We are in the Bio-Dept.
  • 53. 3D-QSAR Notes. Advantages over 2D-QSAR: No reliance on experimental values; Can be applied to molecules with unusual substituents; Not restricted to molecules of the same structural class (in the Pharmacophore 3D-QSAR case); Predictive capability. No experimental constants or measurements are involved. Properties are known as 'Fields'. Steric field: defines the size and shape of the molecule. Electrostatic field: defines electron rich/poor regions of the molecule.
  • 54. 3D-QSAR Method. Comparative molecular field analysis (CoMFA), Tripos. Build each molecule using modelling software. Identify the active conformation for each molecule. Identify the pharmacophore. [Figure: Build 3D model → Active conformation → Define pharmacophore]
  • 56. 3D-QSAR Method. Place the pharmacophore into a lattice of grid points. Each grid point defines a point in space.
  • 57. 3D-QSAR Method. Each grid point defines a point in space. Position the molecule to match the pharmacophore.
  • 58. 3D-QSAR Method. A probe atom is placed at each grid point in turn. Probe atom = a proton or sp3 hybridised carbocation.
  • 59. 3D-QSAR Method. A probe atom is placed at each grid point in turn. Measure the steric or electrostatic interaction of the probe atom with the molecule at each grid point.
  • 60. 3D-QSAR Method. Tabulate fields for each compound at each grid point:
      Compound | Biological activity | Steric fields (S) at grid points (001-998): S001 S002 S003 S004 S005 etc | Electrostatic fields (E) at grid points (001-998): E001 E002 E003 E004 E005 etc
      1        | 5.1                 |                                                                          |
      2        | 6.8                 |                                                                          |
      3        | 5.3                 |                                                                          |
      4        | 6.4                 |                                                                          |
      5        | 6.1                 |                                                                          |
    Partial least squares analysis (PLS) then gives the QSAR equation: Activity = aS001 + bS002 + … + mS998 + nE001 + … + yE998 + z
  • 61. 3D-QSAR Method. Define fields using contour maps round a representative molecule.
  • 62. 2.5D-QSAR or GRIND methodology. A procedure based on the information included in the MIF, generating a handful of informative variables that are independent of the location of the molecules within the grid. The transformation has two main steps: Field filtering; Maximum auto- and cross-correlation (MACC2) encoding. The 2 refers to the distance between two points in space.
  • 63. MACC2 transform. The MACC transform keeps the maximum value of the products of the two field values i and j found at each different distance rij. In the figure, the colors represent the activity of the compounds (blue inactive, red active); 33 means the energy products produced by two N1 probes; 8 means the 8th variable of the auto-correlogram.
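The auto-correlation case of the MACC2 encoding on slide 63 can be sketched as follows. This is a simplified illustration, not the published GRIND implementation: the nodes are assumed to be the filtered field points of a single probe, given as (coordinates, energy) pairs, and the bin width and number of bins are hypothetical parameters.

```python
import math

def macc2_auto(nodes, bin_width=1.0, n_bins=5):
    """For each distance bin, keep the largest product of the two node energies.

    nodes: list of ((x, y, z), energy) pairs from one probe after field filtering.
    Returns a correlogram: one value per distance bin.
    """
    correlogram = [0.0] * n_bins
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            (pos_i, e_i), (pos_j, e_j) = nodes[i], nodes[j]
            b = int(math.dist(pos_i, pos_j) / bin_width)  # distance bin index
            if b < n_bins:
                correlogram[b] = max(correlogram[b], e_i * e_j)
    return correlogram
```

Because only distances between nodes enter the encoding, the resulting variables do not change when the molecule is translated or rotated inside the grid, which is the alignment independence the GRIND slides emphasize. Cross-correlograms follow the same pattern with nodes taken from two different probes.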
  • 64. GRID interaction fields calculated using the N1 probe: positive (yellow) interactions are unfavorable and negative (blue) interactions are favorable. The nodes kept by field filtering should have low energy values (representing highly favorable interactions) and should be as far as possible from each other.
  • 66. Each number corresponds to a specific distance between the fields.
  • 71. One of the unique features of the MACC transform is that it is possible to trace back the variables that generated this "most intense" interaction.