SlideShare ist ein Scribd-Unternehmen logo
1 von 27
Downloaden Sie, um offline zu lesen
Efficient Searching and Similarity
of Unmapped Reactions:
Application to ELN Analysis
Roger Sayle, Ed Griffen, Thierry Kogej and David Drake
NextMove Software, Cambridge, UK
AstraZeneca, Alderley Park, UK
AstraZeneca, Molndal, Sweden
ACS National Meeting, San Diego, USA 27th March 2012
motivation
• Better understanding of chemical reactions can be
used to improve the design-test-make cycles in the
pharmaceutical industry, both for in-house synthesis
and outsourcing to CROs.
• Although small molecule informatics is well
supported, the computer representation, storage,
manipulation, analysis, QSAR and property
prediction of chemical reactions is relatively
unexplored.
ACS National Meeting, San Diego, USA 27th March 2012
Motivating example from gsk
• Stephen D. Pickett, Darren V.S. Green, David L. Hunt,
David A. Pardoe and Ian Hughes, “Automated Lead
Optimization of MMP-12 Inhibitors using a Genetic
Algorithm”, ACS Medicinal Chemistry Letters, Vol. 2,
pp. 28-33, 2011.
50 R1 x 50 R2 = 2500 cmpds
ACS National Meeting, San Diego, USA 27th March 2012
Libraries in practice
• Objective was to be diverse complete, so variations of
standard route and conditions were required to avoid
compromising diversity.
• Despite best efforts, 566 not made due to poor
reactivity, unanticipated side reactions or product
stability compared to 26 not assayed, 28 assay failed
and 176 inactive, giving 1704 assay results.
• Interestingly, R1=21&R2=7 and R1=31&R2=25 were the
two most active compounds (pIC50=8), R1=21&R2=25
had a pIC50=7.8 and R1=31&R2=7 was never made!
ACS National Meeting, San Diego, USA 27th March 2012
Transforms vs. reactions
Importance of Reaction Mechanism
Example: Ullman-type Coupling Reactions
SMIRKS: [H][N:1].[Cl][c:2]>>[N:1][c:2]
The SMIRKS transform alone is insufficient
to predict the products and by-products in
this example. A measure of nucleophility
is desirable for each atom in a molecule.
Without this software may be misled into
believing that protecting groups are required.
ACS National Meeting, San Diego, USA 27th March 2012
Beyond drug guru
ACS National Meeting, San Diego, USA 27th March 2012
sildenafil (viagra) vardenafil (levitra)
Proposed solution
• Analyze the hundreds of thousands of reaction
examples described in in-house Electronic Laboratory
Notebooks (ELNs).
• A typical use case would be to plot reaction yield for
related reactions against catalysts and/or solvents
used to determine the likely best or poor conditions.
• ELNs, unlike the literature, are strongly biased to the
types of reactions/chemistries used in pharma.
• Most importantly, one of the only sources of
negative data (reactions that have failed).
ACS National Meeting, San Diego, USA 27th March 2012
The challenges
1. Export of the data from the ELN.
2. High fidelity conversion to other file formats.
3. Reaction normalization/standardization.
4. Reaction identity (canonicalization).
5. Improved reaction depiction.
6. Intuitive reaction searching.
7. Reaction similarity.
8. Reaction classification/clustering.
ACS National Meeting, San Diego, USA 27th March 2012
Database export
• Typically, ELNs are implemented as complex schemas
within relational databases (Oracle), supporting
transactions, auditing and security privileges.
• Not uncommonly the vendor provided functionality
or APIs for data export are slow and/or buggy.
• In addition to reactions and structures, there is often
a requirement to export all associated data, including
textual and numeric data, tables, even LCMS and
NMR spectra.
ACS National Meeting, San Diego, USA 27th March 2012
typical eln fields in RD format
$DTYPE REACTION:REACTION.CONDITIONS:TEMPERATURE:VALUE
$DATUM 120 °C
$DTYPE REACTION:REACTION.CONDITIONS:TEMPERATURE:MINVAL
$DATUM 393.15
$DTYPE REACTION:REACTION.CONDITIONS:TEMPERATURE:MAXVAL
$DATUM 393.15
$DTYPE REACTION:REACTION.CONDITIONS:PRESSURE:VALUE
$DATUM 5 bar
$DTYPE REACTION:REACTANTS(1):NAME
$DATUM nicotinoyl chloride
$DTYPE REACTION:REACTANTS(1):CHEMICAL.STRUCTURE
$DATUM c1cc(cnc1)C(=O)Cl
$DTYPE REACTION:REACTANTS(1):FORMULA.MASS:VALUE
$DATUM 141.56
$DTYPE REACTION:REACTANTS(2):NAME
$DATUM (2R,3S)-3-methylpentan-2-ol
$DTYPE REACTION:REACTANTS(2):CHEMICAL.STRUCTURE
$DATUM CC[C@@H](C)[C@H](C)O
$DTYPE REACTION:REACTANTS(2):FORMULA.MASS:VALUE
$DATUM 102.17
ACS National Meeting, San Diego, USA 27th March 2012
$DTYPE REACTION:PRODUCTS(1):CHEMICAL.STRUCTURE
$DATUM CC[C@@H](C)[C@H](C)OC(=O)c1cccnc1
$DTYPE REACTION:PRODUCTS(1):NAME
$DATUM (2R,3S)-3-methylpentan-2-yl nicotinate
$DTYPE REACTION:PRODUCTS(1):MOLECULAR.WEIGHT:VALUE
$DATUM 207.27
$DTYPE REACTION:PRODUCTS(1):MOLECULAR.FORMULA
$DATUM C12H17NO2
$DTYPE REACTION:PRODUCTS(1):ACTUAL.MOLES:VALUE
$DATUM 2.16 mmol
$DTYPE REACTION:SOLVENTS(1):NAME
$DATUM 1-pentanol
$DTYPE REACTION:SOLVENTS(1):VOLUME:VALUE
$DATUM 5 mL
$DTYPE REACTION:SOLVENTS(1):R.VALUES
$DATUM R10,R20
File format conversion
• The source format for many reactions is typically a
sketch, in either CDX, CDXML, ISIS Sketch or Marvin
file format.
• For data processing reactions are much easier to
handle as reaction SMILES, MDL RXN or RD files and
possibly even variants of MOL and SD file formats.
• Alas handling of reaction file formats is generally
poorly handled by many cheminformatics tools.
• Additionally, reaction file formats can rarely encode
all of the same information.
ACS National Meeting, San Diego, USA 27th March 2012
Reaction standardization
• An often overlooked aspect of ELNs is the need to
enforce “business rules” to consistently represent a
reaction, in the same way that normalized molecules
are stored in registration systems.
• Pharmaceutical ELNs contain structures where nitros
are represented arbitrarily, and even cases where
azide representations differ on each side of an arrow.
• Unfortunately, the rules used for molecules (such as
InChI) may be inappropriate for reactions, where
metal co-ordination and radicals play a major role.
ACS National Meeting, San Diego, USA 27th March 2012
Reaction identity
• Continuing the notion of standardization, one might
ask when are two reactions the same (duplicates).
• Two reactions that have matching agents (catalysts
and solvents), reactants and products are considered
identical, yet two that have the same reactants and
products are considered the “same”.
• Whether a component is a reactant, catalyst, solvent
or reagent may be consistently defined by atom-
mapping; reactants contribute atoms to the product.
ACS National Meeting, San Diego, USA 27th March 2012
Improved reaction depiction
• The 2D display of reactions can be improved by
suitably aligning the reactants and products.
– Molecules should be rotated or flipped (adjusting chirality)
so that the substructures are consistent across an arrow.
– The ordering reactants in coupling reactions should match
their ordering in the product.
– Ideally, the configuration of angles, torsions and chiral
bonds should be consistent across a reaction arrow.
• Agents, solvents, salts and even protecting groups
may be easier to interpret textually (superatoms).
ACS National Meeting, San Diego, USA 27th March 2012
Reaction depiction
ACS National Meeting, San Diego, USA 27th March 2012
Reaction depiction
ACS National Meeting, San Diego, USA 27th March 2012
Rxn searching: make or Break
• In order to simplify the learning curve for exploiting
reaction databases, an easy to use query mechanism
has been developed for frequent use-cases.
• Query: Retrieve reactions that make indoles.
• Traditional interfaces provide either substructure
search or require reaction centers/atom mapping.
• A more convenient (robust) mechanism is to check
whether (or count) that a drawn substructure
appears in the products but not in either the
reactants or agents.
ACS National Meeting, San Diego, USA 27th March 2012
Avoiding uninteresting results
ACS National Meeting, San Diego, USA 27th March 2012
Atom mapping ambiguity
• Mechanistic atom mappings depend on mechanism.
ACS National Meeting, San Diego, USA 27th March 2012
Atom mapping ambiguity
• Mechanistic atom mappings depend on mechanism.
ACS National Meeting, San Diego, USA 27th March 2012
Atom mapping ambiguity
• Mechanistic atom mappings depend on mechanism.
ACS National Meeting, San Diego, USA 27th March 2012
bond changes in indole synthesis
Synthesis A B C D
Baeyer-Emmerling M
Bartoli M M
Bischler-Möhlau M C M
Fischer M C M
Fukuyama M
Hemetsberger M
Larock M C M
Mandelung M
Nenitzescu M M
Reissert M M
Larock M C M
ACS National Meeting, San Diego, USA 27th March 2012
Reaction similarity
• To define the similarity between two reactions, we
employ an ECFP variation of Daylight’s “difference”
fingerprints.
• Conceptually, a difference fingerprint encodes the
differences between the fingerprint of the reactants
and fingerprint of the products.
• We use differences in counted Scitegic/Accelrys ECFP
fingerprints to capture the local environment of the
reaction centre without requiring atom mapping.
ACS National Meeting, San Diego, USA 27th March 2012
Reaction classification
• Although reaction similarity may be used to cluster
and group mechanistically related reactions together,
it is also convenient for them to be categorized by
the type of reaction.
• Manually assignment into the RSC’s RXNO ontology.
• InfoChem’s ICCLASSIFY and ICNAMERXN tools.
• Algorithmic assignment of reaction class (compound
purchase, purification, chiral separation) or
recognized (named) reaction mechanism using
SMIRKS transformations.
ACS National Meeting, San Diego, USA 27th March 2012
categorization of reactions
J. Carey, D. Laffan, C. Thomson, M. Williams, Org. Biomol. Chem. 2337, 2006.
S. Roughley and A. Jordan, J. Med. Chem. 54:3451-3479, 2011.
23.1
22.4
11.5
18.0
8.2
5.6
5.6
3.1
1.5 1.1 Heteroatom alkylation and arylation
Acylation and related processes
C-C bond formation
Deprotections
Heterocycle formation
Reductions
Functional group conversion
Protections
Oxidations
Functional group addition
ACS National Meeting, San Diego, USA 27th March 2012
conclusions
• In an attempt to better understand and hopefully
improve the productivity of synthetic chemists,
new computational methods have been developed
to process “real world” organic reaction data.
• The fruits of this work now enable medicinal
chemists and informaticians to make greater use of
the wealth of information in their in-house ELNs.
ACS National Meeting, San Diego, USA 27th March 2012
acknowledgements
• Daniel Lowe, NextMove Software.
• Plamen Petrov and Mikko Vainio, AstraZeneca.
• Richard Bolton and Andrew Wooster, GSK.
• Peter Loew and Hans Kraut, InfoChem.
• Ethan Hoff, Abbott Laboratories.
• Stephen Roughley, Vernalis.
• Thank you for your time.
ACS National Meeting, San Diego, USA 27th March 2012

Weitere ähnliche Inhalte

Ähnlich wie Efficient Searching and Similarity of Unmapped Reactions: Application to ELN Analysis

EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...
EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...
EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...ChemAxon
 
Evaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping AlgorithmsEvaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping AlgorithmsNextMove Software
 
Pharmacophore Modeling and Docking Techniques.ppt
Pharmacophore Modeling and Docking Techniques.pptPharmacophore Modeling and Docking Techniques.ppt
Pharmacophore Modeling and Docking Techniques.pptDrVivekChauhan1
 
Evaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping AlgorithmsEvaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping Algorithmsdan2097
 
Sandrogreco Lithium Diisopropylamide Solution Kinetics And
Sandrogreco Lithium Diisopropylamide   Solution Kinetics AndSandrogreco Lithium Diisopropylamide   Solution Kinetics And
Sandrogreco Lithium Diisopropylamide Solution Kinetics AndProfª Cristiana Passinato
 
Cadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.PharmCadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.PharmShikha Popali
 
Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...NextMove Software
 
Essential Biology: Making ATP Workbook (SL Core only)
Essential Biology: Making ATP Workbook (SL Core only)Essential Biology: Making ATP Workbook (SL Core only)
Essential Biology: Making ATP Workbook (SL Core only)Stephen Taylor
 
43_EMIJ-06-00212.pdf
43_EMIJ-06-00212.pdf43_EMIJ-06-00212.pdf
43_EMIJ-06-00212.pdfUmeshYadava1
 
SF and PE CTR-IN 2016 Poster_FInal
SF and PE CTR-IN 2016 Poster_FInalSF and PE CTR-IN 2016 Poster_FInal
SF and PE CTR-IN 2016 Poster_FInalSteve Flynn
 
Molecular docking
Molecular dockingMolecular docking
Molecular dockingRahul B S
 
Basics Of Molecular Docking
Basics Of Molecular DockingBasics Of Molecular Docking
Basics Of Molecular DockingSatarupa Deb
 
Drug design based on bioinformatic tools
Drug design based on bioinformatic toolsDrug design based on bioinformatic tools
Drug design based on bioinformatic toolsSujeethKrishnan
 
Implementing Life Cycle Assessment to Understand the Environmental and Energy...
Implementing Life Cycle Assessment to Understand the Environmental and Energy...Implementing Life Cycle Assessment to Understand the Environmental and Energy...
Implementing Life Cycle Assessment to Understand the Environmental and Energy...barkercarlock
 
Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...
Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...
Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...Ed Griffen
 

Ähnlich wie Efficient Searching and Similarity of Unmapped Reactions: Application to ELN Analysis (20)

EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...
EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...
EUGM15 - Alexandre Varnek (Université de Strasbourg): Towards an expert syste...
 
Docking
DockingDocking
Docking
 
Computer aided Drug designing (CADD)
Computer aided Drug designing (CADD)Computer aided Drug designing (CADD)
Computer aided Drug designing (CADD)
 
Evaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping AlgorithmsEvaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping Algorithms
 
Pharmacophore Modeling and Docking Techniques.ppt
Pharmacophore Modeling and Docking Techniques.pptPharmacophore Modeling and Docking Techniques.ppt
Pharmacophore Modeling and Docking Techniques.ppt
 
Evaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping AlgorithmsEvaluating the Quality and Performance of Automatic Atom Mapping Algorithms
Evaluating the Quality and Performance of Automatic Atom Mapping Algorithms
 
Assignment 105B.pptx
Assignment 105B.pptxAssignment 105B.pptx
Assignment 105B.pptx
 
Sandrogreco Lithium Diisopropylamide Solution Kinetics And
Sandrogreco Lithium Diisopropylamide   Solution Kinetics AndSandrogreco Lithium Diisopropylamide   Solution Kinetics And
Sandrogreco Lithium Diisopropylamide Solution Kinetics And
 
Cadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.PharmCadd and molecular modeling for M.Pharm
Cadd and molecular modeling for M.Pharm
 
Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...
 
Essential Biology: Making ATP Workbook (SL Core only)
Essential Biology: Making ATP Workbook (SL Core only)Essential Biology: Making ATP Workbook (SL Core only)
Essential Biology: Making ATP Workbook (SL Core only)
 
43_EMIJ-06-00212.pdf
43_EMIJ-06-00212.pdf43_EMIJ-06-00212.pdf
43_EMIJ-06-00212.pdf
 
SF and PE CTR-IN 2016 Poster_FInal
SF and PE CTR-IN 2016 Poster_FInalSF and PE CTR-IN 2016 Poster_FInal
SF and PE CTR-IN 2016 Poster_FInal
 
Molecular docking
Molecular dockingMolecular docking
Molecular docking
 
Basics Of Molecular Docking
Basics Of Molecular DockingBasics Of Molecular Docking
Basics Of Molecular Docking
 
Drug design based on bioinformatic tools
Drug design based on bioinformatic toolsDrug design based on bioinformatic tools
Drug design based on bioinformatic tools
 
Year 9 rubric eg
Year 9 rubric egYear 9 rubric eg
Year 9 rubric eg
 
Implementing Life Cycle Assessment to Understand the Environmental and Energy...
Implementing Life Cycle Assessment to Understand the Environmental and Energy...Implementing Life Cycle Assessment to Understand the Environmental and Energy...
Implementing Life Cycle Assessment to Understand the Environmental and Energy...
 
Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...
Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...
Extracting medicinal chemistry knowledge by a secured Matched Molecular Pair ...
 
ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...
 

Mehr von NextMove Software

CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...NextMove Software
 
Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...NextMove Software
 
CINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedCINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedNextMove Software
 
A de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESA de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESNextMove Software
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionNextMove Software
 
Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...NextMove Software
 
Comparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsComparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsNextMove Software
 
Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...NextMove Software
 
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...NextMove Software
 
Recent improvements to the RDKit
Recent improvements to the RDKitRecent improvements to the RDKit
Recent improvements to the RDKitNextMove Software
 
Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...NextMove Software
 
Digital Chemical Representations
Digital Chemical RepresentationsDigital Chemical Representations
Digital Chemical RepresentationsNextMove Software
 
Challenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsChallenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsNextMove Software
 
PubChem as a Biologics Database
PubChem as a Biologics DatabasePubChem as a Biologics Database
PubChem as a Biologics DatabaseNextMove Software
 
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...NextMove Software
 
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesCINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesNextMove Software
 
Building on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesBuilding on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesNextMove Software
 
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...NextMove Software
 
Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)NextMove Software
 

Mehr von NextMove Software (20)

DeepSMILES
DeepSMILESDeepSMILES
DeepSMILES
 
CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...
 
Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...
 
CINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedCINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speed
 
A de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESA de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILES
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
 
Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...
 
Comparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsComparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule Implementations
 
Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...
 
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
 
Recent improvements to the RDKit
Recent improvements to the RDKitRecent improvements to the RDKit
Recent improvements to the RDKit
 
Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...
 
Digital Chemical Representations
Digital Chemical RepresentationsDigital Chemical Representations
Digital Chemical Representations
 
Challenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsChallenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptions
 
PubChem as a Biologics Database
PubChem as a Biologics DatabasePubChem as a Biologics Database
PubChem as a Biologics Database
 
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
 
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesCINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
 
Building on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesBuilding on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfiles
 
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
 
Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)
 

Kürzlich hochgeladen

Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...RKavithamani
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 

Kürzlich hochgeladen (20)

Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 

Efficient Searching and Similarity of Unmapped Reactions: Application to ELN Analysis

  • 1. Efficient Searching and Similarity of Unmapped Reactions: Application to ELN Analysis Roger Sayle, Ed Griffen, Thierry Kogej and David Drake NextMove Software, Cambridge, UK AstraZeneca, Alderley Park, UK AstraZeneca, Molndal, Sweden ACS National Meeting, San Diego, USA 27th March 2012
  • 2. motivation • Better understanding of chemical reactions can be used to improve the design-test-make cycles in the pharmaceutical industry, both for in-house synthesis and outsourcing to CROs. • Although small molecule informatics is well supported, the computer representation, storage, manipulation, analysis, QSAR and property prediction of chemical reactions is relatively unexplored. ACS National Meeting, San Diego, USA 27th March 2012
  • 3. Motivating example from gsk • Stephen D. Pickett, Darren V.S. Green, David L. Hunt, David A. Pardoe and Ian Hughes, “Automated Lead Optimization of MMP-12 Inhibitors using a Genetic Algorithm”, ACS Medicinal Chemistry Letters, Vol. 2, pp. 28-33, 2011. 50 R1 x 50 R2 = 2500 cmpds ACS National Meeting, San Diego, USA 27th March 2012
  • 4. Libraries in practice • Objective was to be diverse complete, so variations of standard route and conditions were required to avoid compromising diversity. • Despite best efforts, 566 not made due to poor reactivity, unanticipated side reactions or product stability compared to 26 not assayed, 28 assay failed and 176 inactive, giving 1704 assay results. • Interestingly, R1=21&R2=7 and R1=31&R2=25 were the two most active compounds (pIC50=8), R1=21&R2=25 had a pIC50=7.8 and R1=31&R2=7 was never made! ACS National Meeting, San Diego, USA 27th March 2012
  • 5. Transforms vs. reactions Importance of Reaction Mechanism Example: Ullman-type Coupling Reactions SMIRKS: [H][N:1].[Cl][c:2]>>[N:1][c:2] The SMIRKS transform alone is insufficient to predict the products and by-products in this example. A measure of nucleophility is desirable for each atom in a molecule. Without this software may be misled into believing that protecting groups are required. ACS National Meeting, San Diego, USA 27th March 2012
  • 6. Beyond drug guru ACS National Meeting, San Diego, USA 27th March 2012 sildenafil (viagra) vardenafil (levitra)
  • 7. Proposed solution • Analyze the hundreds of thousands of reaction examples described in in-house Electronic Laboratory Notebooks (ELNs). • A typical use case would be to plot reaction yield for related reactions against catalysts and/or solvents used to determine the likely best or poor conditions. • ELNs, unlike the literature, are strongly biased to the types of reactions/chemistries used in pharma. • Most importantly, one of the only sources of negative data (reactions that have failed). ACS National Meeting, San Diego, USA 27th March 2012
  • 8. The challenges 1. Export of the data from the ELN. 2. High fidelity conversion to other file formats. 3. Reaction normalization/standardization. 4. Reaction identity (canonicalization). 5. Improved reaction depiction. 6. Intuitive reaction searching. 7. Reaction similarity. 8. Reaction classification/clustering. ACS National Meeting, San Diego, USA 27th March 2012
  • 9. Database export • Typically, ELNs are implemented as complex schemas within relational databases (Oracle), supporting transactions, auditing and security privileges. • Not uncommonly the vendor provided functionality or APIs for data export are slow and/or buggy. • In addition to reactions and structures, there is often a requirement to export all associated data, including textual and numeric data, tables, even LCMS and NMR spectra. ACS National Meeting, San Diego, USA 27th March 2012
  • 10. typical eln fields in RD format $DTYPE REACTION:REACTION.CONDITIONS:TEMPERATURE:VALUE $DATUM 120 °C $DTYPE REACTION:REACTION.CONDITIONS:TEMPERATURE:MINVAL $DATUM 393.15 $DTYPE REACTION:REACTION.CONDITIONS:TEMPERATURE:MAXVAL $DATUM 393.15 $DTYPE REACTION:REACTION.CONDITIONS:PRESSURE:VALUE $DATUM 5 bar $DTYPE REACTION:REACTANTS(1):NAME $DATUM nicotinoyl chloride $DTYPE REACTION:REACTANTS(1):CHEMICAL.STRUCTURE $DATUM c1cc(cnc1)C(=O)Cl $DTYPE REACTION:REACTANTS(1):FORMULA.MASS:VALUE $DATUM 141.56 $DTYPE REACTION:REACTANTS(2):NAME $DATUM (2R,3S)-3-methylpentan-2-ol $DTYPE REACTION:REACTANTS(2):CHEMICAL.STRUCTURE $DATUM CC[C@@H](C)[C@H](C)O $DTYPE REACTION:REACTANTS(2):FORMULA.MASS:VALUE $DATUM 102.17 ACS National Meeting, San Diego, USA 27th March 2012 $DTYPE REACTION:PRODUCTS(1):CHEMICAL.STRUCTURE $DATUM CC[C@@H](C)[C@H](C)OC(=O)c1cccnc1 $DTYPE REACTION:PRODUCTS(1):NAME $DATUM (2R,3S)-3-methylpentan-2-yl nicotinate $DTYPE REACTION:PRODUCTS(1):MOLECULAR.WEIGHT:VALUE $DATUM 207.27 $DTYPE REACTION:PRODUCTS(1):MOLECULAR.FORMULA $DATUM C12H17NO2 $DTYPE REACTION:PRODUCTS(1):ACTUAL.MOLES:VALUE $DATUM 2.16 mmol $DTYPE REACTION:SOLVENTS(1):NAME $DATUM 1-pentanol $DTYPE REACTION:SOLVENTS(1):VOLUME:VALUE $DATUM 5 mL $DTYPE REACTION:SOLVENTS(1):R.VALUES $DATUM R10,R20
  • 11. File format conversion • The source format for many reactions is typically a sketch, in either CDX, CDXML, ISIS Sketch or Marvin file format. • For data processing reactions are much easier to handle as reaction SMILES, MDL RXN or RD files and possibly even variants of MOL and SD file formats. • Alas handling of reaction file formats is generally poorly handled by many cheminformatics tools. • Additionally, reaction file formats can rarely encode all of the same information. ACS National Meeting, San Diego, USA 27th March 2012
  • 12. Reaction standardization • An often overlooked aspect of ELNs is the need to enforce “business rules” to consistently represent a reaction, in the same way that normalized molecules are stored in registration systems. • Pharmaceutical ELNs contain structures where nitros are represented arbitrarily, and even cases where azide representations differ on each side of an arrow. • Unfortunately, the rules used for molecules (such as InChI) may be inappropriate for reactions, where metal co-ordination and radicals play a major role. ACS National Meeting, San Diego, USA 27th March 2012
  • 13. Reaction identity • Continuing the notion of standardization, one might ask when are two reactions the same (duplicates). • Two reactions that have matching agents (catalysts and solvents), reactants and products are considered identical, yet two that have the same reactants and products are considered the “same”. • Whether a component is a reactant, catalyst, solvent or reagent may be consistently defined by atom- mapping; reactants contribute atoms to the product. ACS National Meeting, San Diego, USA 27th March 2012
  • 14. Improved reaction depiction • The 2D display of reactions can be improved by suitably aligning the reactants and products. – Molecules should be rotated or flipped (adjusting chirality) so that the substructures are consistent across an arrow. – The ordering reactants in coupling reactions should match their ordering in the product. – Ideally, the configuration of angles, torsions and chiral bonds should be consistent across a reaction arrow. • Agents, solvents, salts and even protecting groups may be easier to interpret textually (superatoms). ACS National Meeting, San Diego, USA 27th March 2012
  • 15. Reaction depiction ACS National Meeting, San Diego, USA 27th March 2012
  • 16. Reaction depiction ACS National Meeting, San Diego, USA 27th March 2012
  • 17. Rxn searching: make or Break • In order to simplify the learning curve for exploiting reaction databases, an easy to use query mechanism has been developed for frequent use-cases. • Query: Retrieve reactions that make indoles. • Traditional interfaces provide either substructure search or require reaction centers/atom mapping. • A more convenient (robust) mechanism is to check whether (or count) that a drawn substructure appears in the products but not in either the reactants or agents. ACS National Meeting, San Diego, USA 27th March 2012
  • 18. Avoiding uninteresting results ACS National Meeting, San Diego, USA 27th March 2012
  • 19. Atom mapping ambiguity • Mechanistic atom mappings depend on mechanism. ACS National Meeting, San Diego, USA 27th March 2012
  • 20. Atom mapping ambiguity • Mechanistic atom mappings depend on mechanism. ACS National Meeting, San Diego, USA 27th March 2012
  • 21. Atom mapping ambiguity • Mechanistic atom mappings depend on mechanism. ACS National Meeting, San Diego, USA 27th March 2012
  • 22. bond changes in indole synthesis Synthesis A B C D Baeyer-Emmerling M Bartoli M M Bischler-Möhlau M C M Fischer M C M Fukuyama M Hemetsberger M Larock M C M Mandelung M Nenitzescu M M Reissert M M Larock M C M ACS National Meeting, San Diego, USA 27th March 2012
  • 23. Reaction similarity • To define the similarity between two reactions, we employ an ECFP variation of Daylight’s “difference” fingerprints. • Conceptually, a difference fingerprint encodes the differences between the fingerprint of the reactants and fingerprint of the products. • We use differences in counted Scitegic/Accelrys ECFP fingerprints to capture the local environment of the reaction centre without requiring atom mapping. ACS National Meeting, San Diego, USA 27th March 2012
  • 24. Reaction classification • Although reaction similarity may be used to cluster and group mechanistically related reactions together, it is also convenient for them to be categorized by the type of reaction. • Manually assignment into the RSC’s RXNO ontology. • InfoChem’s ICCLASSIFY and ICNAMERXN tools. • Algorithmic assignment of reaction class (compound purchase, purification, chiral separation) or recognized (named) reaction mechanism using SMIRKS transformations. ACS National Meeting, San Diego, USA 27th March 2012
  • 25. categorization of reactions J. Carey, D. Laffan, C. Thomson, M. Williams, Org. Biomol. Chem. 2337, 2006. S. Roughley and A. Jordan, J. Med. Chem. 54:3451-3479, 2011. 23.1 22.4 11.5 18.0 8.2 5.6 5.6 3.1 1.5 1.1 Heteroatom alkylation and arylation Acylation and related processes C-C bond formation Deprotections Heterocycle formation Reductions Functional group conversion Protections Oxidations Functional group addition ACS National Meeting, San Diego, USA 27th March 2012
  • 26. conclusions • In an attempt to better understand and hopefully improve the productivity of synthetic chemists, new computational methods have been developed to process “real world” organic reaction data. • The fruits of this work now enable medicinal chemists and informaticians to make greater use of the wealth of information in their in-house ELNs. ACS National Meeting, San Diego, USA 27th March 2012
  • 27. acknowledgements • Daniel Lowe, NextMove Software. • Plamen Petrov and Mikko Vainio, AstraZeneca. • Richard Bolton and Andrew Wooster, GSK. • Peter Loew and Hans Kraut, InfoChem. • Ethan Hoff, Abbott Laboratories. • Stephen Roughley, Vernalis. • Thank you for your time. ACS National Meeting, San Diego, USA 27th March 2012