Chemoinformatics
1. Chemoinformatics: a Hot Topic in Distance Education
Zarrin Es'haghi
Department of Chemistry, Faculty of Sciences
Payame Noor University, Mashhad, Iran
E-Mail: z_eshaghi@pnu.ac.ir
3. Chemoinformatics
• Chem(o)informatics is a generic term that encompasses the design, creation, organization, management, analysis, visualization and use of chemical information.
• In fact, chemoinformatics is the application of informatics methods to solve chemical problems.
4. What is Chemoinformatics?
Chemoinformatics, Cheminformatics, Chemical Informatics, Computational Chemistry, ...
"the set of computer algorithms and tools to store and analyse chemical data in the context of drug discovery and design projects etc..."
5. What is Chemoinformatics?
"the mixing of information resources to transform data into information and information into knowledge, for the intended purpose of making better decisions faster in the arena of drug lead identification and optimization"
6. What is Chemoinformatics?
"chemoinformatics encompasses the design, creation, organisation, management, retrieval, analysis, dissemination, visualization and use of chemical information"
8. Why do we need Chemoinformatics?
9.
• To handle large amounts of information
• To move chemistry into the computer age
• To move from data to knowledge
10. Why do we need Chemoinformatics?
And last but not least:
• To get funding (bioinformatics is doing well currently, whereas computational chemistry seems to be lagging behind).
• Data → information → knowledge
• Measurements/calculations
11. How do we learn?
Inductive learning vs. deductive learning
12. Inductive learning vs. Deductive learning
Deductive learning:
A fundamental theory exists which allows us to calculate properties and predict the behavior of molecules.
The fundamental theory for chemistry is quantum mechanics.
13. Inductive learning vs. Deductive learning
Inductive learning = Learning from examples
14. General scheme for inductive learning
15. The fundamental tasks of a chemist
Property prediction, synthesis, design, reaction prediction, and structure elucidation
16. The realm of Chemoinformatics
a) Representing chemical compounds
b) Searching chemical structures
c) Similarity searches
d) Relating structure to properties with models
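As an illustration of item c), a similarity search can be sketched with the Tanimoto coefficient, a widely used measure that divides the number of features two compounds share by the number of features present in either. The slides do not name a specific measure, so this choice is an assumption. The fingerprints below are hypothetical sets of integers, each integer standing for one structural feature, and the compound names are made up.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient between two sets of feature identifiers."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def similarity_search(query, database, threshold=0.5):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints: each integer stands for one structural feature.
database = {
    "compound_A": {1, 2, 3, 4},
    "compound_B": {2, 3, 4, 5, 6},
    "compound_C": {7, 8, 9},
}
query = {1, 2, 3, 5}
print(similarity_search(query, database))
```

Real systems derive fingerprints from the chemical structure with a toolkit; the set arithmetic stays the same.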
17. Machine Learning Methods
• Important role in chemoinformatics
  - For example, it is usually difficult to predict which types of descriptors are most suitable for a given search or classification task.
• Therefore, machine learning techniques are often used to facilitate descriptor selection.
18. Machine Learning Methods
- Genetic algorithms
  • Different parameters and model solutions to a given problem are encoded in a chromosome and subjected to random variation, thus generating a population.
  • Solutions provided by these chromosomes are evaluated by a fitness function that assigns high scores to desired results.
  • Chromosomes yielding the best intermediate solutions are subjected to mutation and crossover operations that correspond to random genetic mutations and gene recombination events.
  • The resulting modified chromosomes represent the next generation, and the process is continued until the results meet a satisfactory convergence criterion.
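The four steps above can be sketched as a small genetic algorithm. This is a minimal illustration, not a production method: the chromosome is a list of bits (for instance, one bit per descriptor, switched on or off), and the function name `run_ga`, the population size, the mutation rate and the toy fitness function are all assumptions made for the example. The best chromosome is copied unchanged into each new generation (elitism), which is a design choice of this sketch.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=50, target=None, seed=0):
    """Evolve bit-string chromosomes; return (best chromosome, best score per generation)."""
    rng = random.Random(seed)
    # Random variation generates the initial population.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Evaluate every chromosome with the fitness function.
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Convergence criterion: stop once the target score is reached.
        if target is not None and history[-1] >= target:
            break
        parents = ranked[: pop_size // 2]  # best intermediate solutions
        next_gen = [ranked[0][:]]          # elitism: keep the current best
        while len(next_gen) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)   # crossover point
            child = mum[:cut] + dad[cut:]    # recombination
            if rng.random() < 0.2:           # occasional random mutation
                i = rng.randrange(n_bits)
                child[i] = 1 - child[i]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness), history


# Toy fitness: number of positions that agree with a hypothetical ideal selection.
wanted = [1, 0, 1, 1, 0, 0, 1, 0]


def score(chromosome):
    return sum(1 for a, b in zip(chromosome, wanted) if a == b)


best, history = run_ga(score, len(wanted), target=len(wanted))
print(best, history)
```

In a descriptor-selection setting the fitness function would instead score how well a model built on the selected descriptors performs.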
19. Quantitative Structure Activity Relationship Analysis (QSAR)
Goal: Evaluation of molecular features that determine biological activity and the prediction of compound potency as a function of structural modification
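The idea of predicting potency from structure can be illustrated with the simplest possible model: a least-squares line relating one descriptor to a measured activity. This is only a sketch under strong assumptions. Practical QSAR models typically use many descriptors and more elaborate methods, and the descriptor values and activities below are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical training set: one descriptor value and one activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(descriptor, activity)

# Predict the activity of a structurally modified compound (descriptor = 2.5).
print(slope * 2.5 + intercept)
```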
20. Virtual Screening and Compound Filtering
VS (Virtual Screening)
- The process of screening large databases on the computer for molecules having desired properties and biological activity.
A major application of VS techniques is the identification of novel active molecules in large compound databases.
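A compound filter of this kind can be sketched as a pass over a list of records. The example below uses Lipinski's rule of five (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) as the filter; the slides do not name a specific filter, so this choice, the `Compound` record and the library entries are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float
    log_p: float
    h_donors: int
    h_acceptors: int


def rule_of_five_violations(c):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def screen(library, max_violations=0):
    """Keep compounds with at most `max_violations` rule-of-five violations."""
    return [c for c in library if rule_of_five_violations(c) <= max_violations]


# Hypothetical library entries with precomputed properties.
library = [
    Compound("cmpd_1", 320.4, 2.1, 2, 5),
    Compound("cmpd_2", 612.8, 6.3, 4, 9),
    Compound("cmpd_3", 480.0, 5.4, 1, 7),
]

print([c.name for c in screen(library)])
print([c.name for c in screen(library, max_violations=1)])
```

The strict default of zero violations is a choice made here; the threshold is exposed as a parameter because filters are often applied more leniently.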
21. Impact of new technology on drug discovery
• The last few years have seen a number of "revolutionary" new technologies:
  - Gene chips, genomics and HGP
  - Bioinformatics & molecular biology
  - More protein structures
  - High-throughput screening & assays
  - Virtual screening and library design
  - Combinatorial chemistry
  - Other computational methods
• How do we make it all work for us?
22. How Chemoinformatics can help out
• Producing and managing information for metrics to reduce risk, e.g.
  - Virtual screening
  - Library design
  - Docking
  - Cost/benefit analysis
• Making information available at the right time and the right place
• Needs to be integrated into processes
23. Software relevance: Bridge between computation & science
Tools: clustering, similarity searching, activity models, scaffold detection, docking, logP calculation
Chemoinformatics tasks: "doing a cluster analysis", "identifying activity-related fragments"
Science tasks: work out a chemical synthesis, choose good reagents, try and document some reactions
Science goals: e.g. produce compounds that have high biological activity
25. Overview
From Chemical Information to Chemoinformatics
- Integration with techniques from molecular modeling
- Developments in computer hardware and software
- Data explosion arising from developments in combinatorial chemistry and high-throughput screening
26. Molecular Modelling
• Positioning of a putative ligand into a protein's active site, first attempted by the DOCK program (UCSF, 1982)
• Initially restricted to rigid ligands and rigid proteins; current programs permit some degree of flexibility
• Use in structure-based design
  - Move from docking a single ligand to sequential docking of large datasets
28. Graph Theory
• Graph theory is a branch of mathematics that considers sets of objects, called nodes, and the relationships, called edges, between pairs of these objects
• The definition is completely general, allowing graphs to be used in many different application domains as long as an appropriate representation can be derived
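In chemistry the natural representation treats atoms as nodes and bonds as edges. The sketch below stores a molecule as a list of bonded atom pairs and uses breadth-first search to count the bonds on the shortest path between two atoms. The atom labels and the function name `topological_distance` are assumptions for the example, and hydrogens are left out for brevity.

```python
from collections import deque


def topological_distance(bonds, start, goal):
    """Number of bonds on the shortest path between two atoms, or None if unconnected."""
    # Build an undirected adjacency map: atom -> set of bonded atoms.
    neighbours = {}
    for a, b in bonds:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search visits atoms in order of increasing bond count.
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        atom, dist = queue.popleft()
        if atom == goal:
            return dist
        for nxt in neighbours.get(atom, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Ethanol, heavy atoms only: C1 - C2 - O1
ethanol = [("C1", "C2"), ("C2", "O1")]
print(topological_distance(ethanol, "C1", "O1"))
```

The same adjacency structure underlies substructure and similarity searches, which operate on the molecular graph instead of on a drawing or a name.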
30. Proposed courses for a Distance Learning Program
• Chemoinformatics Virtual Classroom
31.
• At present there are no specific software tools for chemical information training in Iran.
• A number of commercial software products used in the pharmaceutical and biotechnology industry are either too expensive or of limited utility for training in either academic or business settings.
• By employing distance learning through a web delivery system, the training software will provide an effective, low-cost solution for academic institutions, whether they are offering a single course to students in a remote setting or an entire program in cheminformatics.
32.
• In addition, such training tools will be very useful in industry settings with local area networks, where individuals in a multidisciplinary setting need to receive training on the concepts employed by industrial chemoinformatics software.
33. Chemoinformatics: aims
• Develop an awareness of Informatics Management techniques used in the design and implementation of chemoinformatics systems
• Enable students to demonstrate skills learned by carrying out a small-scale, industrially relevant chemoinformatics research project
• Basic structure
  - Three semesters of taught modules
  - One semester dissertation working at the site of one of the companies supporting the programme
34. Proposed Courses
• An introduction to chemoinformatics
  - Chemoinformatics (Fundamental)
  - Information Systems Modelling
  - Information Storage and Retrieval
  - Foundations of Object-Oriented Programming
35.
• Chemoinformatics (Advanced; more programming)
• Database Design
• Research Methods and Dissertation Preparation
• Two from a range of elective modules, including Molecular Modelling (Chemistry), Healthcare Information, etc.
36. Conclusions
Distance learning is becoming increasingly accepted by the professional bodies. Its image still needs to be improved, however: the concept has to be presented as something new, modern and completely different from the old-style correspondence courses.
Chemoinformatics can step in to assist in this effort, and it can do so in all fields of chemistry: inorganic, analytical, organic, physical, medicinal, and biochemistry. It can also reach beyond chemistry to provide methods and information that can be used in biology, medicine, and physics.