SlideShare ist ein Scribd-Unternehmen logo
1 von 1
Downloaden Sie, um offline zu lesen
extraction, analysis, atom mapping, classification
                                                         and naming of reactions from pharmaceutical elns
                                                                                                                                                      Roger Sayle1, Daniel Lowe1, Noel O’Boyle1, Mick Kappler2,
                                                                                                                                                    Anna Paola Pelliccioli3, Nick Tomkinson4 and Daniel Stoffler5
                                                         1   NextMove Software Ltd, Cambridge, UK. 2 Hoffmann-La Roche, Nutley, USA. 3 Novartis NIBR, Basel, Switzerland.
                                                                                       4 AstraZeneca R&D, Alderley Park, UK. 5 F. Hoffmann-La Roche, Basel, Switzerland.



      Abstract
   1. Overview                                                                                                                                           6. NameRXN: Reaction naming and classification
   Electronic Laboratory Notebooks (ELNs) are widely used in the pharmaceutical                                                                          One useful form of reaction analysis is to recognize each experiment in the ELN as
   industry for recording the details of chemical synthesis experiments. The primary                                                                     an instance of a named or known reaction, such as a Diels –Alder cycloaddition,
   use of this information is often for the capture of intellectual property for future                                                                  Suzuki coupling or chiral separation. The NameRXN program uses a dictionary of
   patent filings, however this data can also be used in a number of additional                                                                          SMIRKS-like transformations to annotate reactions in RD, SD, RXN or reaction
   applications, including synthetic accessibility calculations, reaction planning and                                                                   SMILES format with a name, a reaction category and where applicable an
   reaction yield/optimization. Not only does a pharmaceutical ELN capture those                                                                         identifier into the Royal Society of Chemistry’s RXNO ontology.
   classes of reactions suitable for small scale medicinal chemistry, but it is also
   uniquely a source of information on failed and poor yield reactions; an important                                                                     The top five reactions in a major pharmaceutical company’s ELN were:
   source of data rarely found in the scientific literature or commercial reaction
                                                                                                                                                              ID      Reaction Name                                                                                 Reaction Category                        RXNO #
   databases.
                                                                                                                                                         #1   1.3.1   Buchwald-Hartwig amination                                                                    N-arylation with Ar-X                    0000192
   2. Method overview (workflow)                                                                                                                         #2   7.1     Nitro to amino                                                                                Nitro to amine reduction                 0000337
   In this work we describe the use of a suite of programs for extracting reaction                                                                       #3   1.7.6   Williamson ether synthesis                                                                    O-substitution                           0000090
   information and associated textual and numeric data from the PKI/CambridgeSoft
   eNotebook ELN, converting and normalizing it to “open” flat files that can be read                                                                    #4   2.1.2   Carboxylic acid + amine reaction                                                              N-acylation to amide
   by third party applications, such as Accelrys/MDL RD files or reaction SMILES.                                                                        #5   2.2.3   Sulfonamide Shotten-Baumann                                                                   N-sulfonylation                          0000165
                                                                                                                                Accelrys
                                                                                                                                Pipeline Pilot           Using the reaction classification categories of Carey et al.[1] the contents of the
                                                                                                                                (AstraZeneca, AbbVie
                                                                                                                                & Hoffmann-La Roche)     ELN may be presented as a pie-chart of the kinds of transformations it contains.
                                                                                                                                ChemAxon
                                                                                                                                JChem Cartridge
                                                                                                                                (GlaxoSmithKline
                                              HazELNut        Filbert        NameRXN         Cobnut                             & Novartis)


Perkin Elmer Informatics   Oracle Server                     Microsoft Windows or Linux                                         Elsevier Reaxys
(formerly CambridgeSoft)   version 10 or 11                                                                                     (Hoffmann-La Roche)
eNotebook v9, v11 or v13



    3. HazELNut: Reaction export from PKI/CambridgeSoft eNotebook ELN
   3. HazELNut: Reaction export from PKI/CambridgeSoft eNotebook ELN
   The primary step in exporting a reaction database from an ELN is performed by
   HazELNut, which interprets the chemist’s hand-drawn sketch into a connection
   table. For RD and SD formats, this includes writing the textual, numeric, tabular
   and molecular data as tagged fields in the output file. The process can work
   either from XML files exported from the client, or more typically by querying the
   Oracle server via OCI. A large pharmaceutical ELN can be exported overnight (or                                                                       7. Third-party atom mapping
                                                                                                                                                                        Atom Mapping
   transatlantic over a weekend), though incremental export allows a day’s or week’s                                                                     One way to investigate the quality of the reaction sketches in an ELN is to apply an
   experiments to be written in a few minutes. Experimental write-ups (and other                                                                         automatic atom-atom mapping algorithm, and assess the fraction of unmapped
   rich text) are converted from RTF to text or HTML, superatoms are expanded,                                                                           product atoms and average number of carbon-carbon bonds broken. At the Fall
   labels preserved, and so on. The images below show an ELN reaction, and some                                                                          2012 ACS meeting in Philadelphia, we presented an initial comparison of third-
   of the exported fields as they appear in the corresponding RD file format output.                                                                     party atom mapping software for the purpose of identifying incorrectly drawn
                                                                           $DTYPE HEADER:EXPERIMENT.HEADER:CREATION.DATE
                                                                           $DATUM 22-Oct-2010
                                                                                                                                                         reactions.                      100

                                                                           $DTYPE HEADER:EXPERIMENT.HEADER:CREATION.TIME
                                                                                                                                                                                                                                              90
                                                                           $DATUM 19:58:18 -0500
                                                                                                                                                         Here we summarize recent
                                                                                                                                                                                         Percent of reactions with all product atoms mapped




                                                                           $DTYPE DISCOVERY.CHEMISTRY:REACTANTS(1):CHEMICAL.STRUCTURE                                                                                                                                                                        Marvin 5.12
                                                                           $DATUM c1ccc(cc1)C(=O)O                                                                                                                                            80
                                                                                                                                                                                                                                                                                                             ChemDraw 12
                                                                           $DTYPE DISCOVERY.CHEMISTRY:REACTANTS(1):NAME
                                                                           $DATUM benzoic acid
                                                                           $DTYPE DISCOVERY.CHEMISTRY:REACTANTS(1):MOLECULAR.WEIGHT:VALUE
                                                                                                                                                         improvements to this                                                                 70                                                             Indigo 1.1
                                                                           $DATUM 122.12
                                                                           $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):NAME                                   approach, using consensus                                                            60
                                                                                                                                                                                                                                                                                                             Indigo 1.1 (lenient)
                                                                           $DATUM N-benzyl-N-ethylbenzamide
                                                                           $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):CHEMICAL.STRUCTURE
                                                                           $DATUM CCN(Cc1ccccc1)C(=O)c2ccccc2
                                                                                                                                                         methods to combine the                                                                                                                              ICMap 5.10
                                                                                                                                                                                                                                                                                                             PipelinePilot
                                                                                                                                                                                                                                              50
                                                                           $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):MOLECULAR.FORMULA
                                                                           $DATUM C16H17NO
                                                                           $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):MOLECULAR.WEIGHT:VALUE
                                                                                                                                                         results of multiple atom                                                                                                                            MDL Cheshire
                                                                                                                                                                                                                                              40
                                                                           $DATUM 239.31
                                                                           $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):YIELD:VALUE                            mapping algorithms, and                                                                                                    Verified/Recognised by
                                                                                                                                                                                                                                                                                           NameRXN
                                                                                                                                                                                                                                                                                                             Consensus Result

                                                                           $DATUM 89.8%                                                                                                                                                       30
                                                                           $DTYPE DISCOVERY.CHEMISTRY:PREPARATION
                                                                           $DATUM In a 5 mL round-bottomed flask, benzoic acid (200 mg, 1.64 mmol
                                                                                                                                                         combining atom-mapping                                                                                                              (62%)
                                                                                                                                                                                                                                              20
                                                                           N-benzylethanamine (266 mg, 1.97 mmol, Eq: 1.2) and HATU (747 mg, 1.97
                                                                           mmol, Eq: 1.2) were combined with DMF (5 ml) to give a light brown
                                                                           solution. Hunig's Base (423 mg, 572 μl, 3.28 mmol, Eq: 2.0) was added.
                                                                                                                                                         with reaction naming for
                                                                                                                                                                                                                                              10
                                                                           The reaction mixture was heated to 50 oC (Pressure: 1012 mbar, Rxn
                                                                           Molarity: 328 mM, Reaction Time: 12 min, Molarity Entered?: false) and        ELN quality assurance.
                                                                           stirred. The crude material was purified by flash chromatography                                                                                                   0
                                                                                                                                                                                                                                                   Atom mapping algorithms alone   Combined with NameRXN
  4. Filbert: Reaction file format conversion
  4. Filbert: Reaction file format conversion
  Whilst HazELNut can export reactions in a variety of file formats, its goal is to                                                                      8. Summary and future work
                                                                                                                                                            Conclusions
  “dump” the raw data; processing and file format conversion is performed by                                                                             The use of the HazELNut suite of tools allows the synthetic chemistry data locked
  Filbert. This program interconverts MDL RXN, RD and SD file formats, ISIS Sketch,                                                                      up in corporate ELNs to be exploited in scientifically novel ways. Current efforts
  CDXML and reaction SMILES. The roles of agents, catalysts and solvents drawn                                                                           include optimization of reaction conditions by plotting yields against catalyst and
  above and below a reaction arrow can be preserved using ChemAxon’s RXN file                                                                            solvent in SpotFire, and research to reduce the number of steps and application of
  extensions, stripped or treated as reactants. Co-ordinates can optionally be                                                                           low yield reaction strategies by identifying/reusing in-house insights/expertise.
  centered, rescaled or regenerated algorithmically. Atom maps can be removed.                                                                           9. Bibliography
  Molecules from the reactant/agent tables can be added to the sketch if missing.                                                                        1. John S. Carey, David Laffan, Colin Thomson and Mike T. Williams, “Analysis of
  5. Cobnut: MDL/Accelrys RDF file format tag stripping/renaming
               MDL/Accelrys RDF file format tag stripping/renaming                                                                                          the Reactions used for the Preparation of Drug Candidate Molecules”,
  Typically, ELNs are heavily customized with custom experiment types to capture                                                                            Organic & Biomolecular Chemistry, Vol. 4, pp. 2337-2347, 2006.
  the data needs of their target scientists. The raw export of this data results in RD                                                                   2. Stephen D. Roughley and Allan M. Jordan, “The Medicinal Chemist's Toolbox:
  files where the field/tag names for yields, volumes, chemist’s names,                                                                                     An Analysis of Reactions Used in the Pursuit of Drug Candidates”, Journal of
  experimental write-ups, etc. vary from reaction to reaction. The utility Cobnut                                                                           Medicinal Chemistry, Vol. 54, 3451-3479, 2011.
  was implemented to simplify and speed-up the process of normalizing (renaming)                                                                         3. Mikko J. Vainio, Thierry Kogej and Florian Raubacher, “Automated Recycling of
  RD tags from different data sources, and stripping out those that aren’t required.                                                                        Chemistry for Virtual Screening and Library Design”, Journal of Chemical
                                                                                                                                                            Information and Modeling (JCIM), Vol. 52, No. 7, pp. 1777-1786, June 2012.


                                                                                                                                                                                                                                                                                    NextMove Software Limited
                                                                                                                                                                                                                                                                                    Innovation Centre (Unit 23)
                                                                                                                             www.nextmovesoftware.co.uk
                                                                                                                                                                                                                                                                                       Cambridge Science Park
                                                                                                                             www.nextmovesoftware.com
                                                                                                                                                                                                                                                                                       Milton Road, Cambridge
                                                                                                                                                                                                                                                                                          England, UK CB4 0EY

Weitere ähnliche Inhalte

Andere mochten auch

wordpress training | wordpress certification | wordpress training course | wo...
wordpress training | wordpress certification | wordpress training course | wo...wordpress training | wordpress certification | wordpress training course | wo...
wordpress training | wordpress certification | wordpress training course | wo...Nancy Thomas
 
Though The Lens of an iPhone: Cartagena, Colombia
Though The Lens of an iPhone: Cartagena, ColombiaThough The Lens of an iPhone: Cartagena, Colombia
Though The Lens of an iPhone: Cartagena, ColombiaPaul Brown
 
Styling packaging and_displaying_your_product
Styling packaging and_displaying_your_productStyling packaging and_displaying_your_product
Styling packaging and_displaying_your_productFred Schechter
 
How to leverage existing investments ver1.1
How to leverage existing investments ver1.1How to leverage existing investments ver1.1
How to leverage existing investments ver1.1Kannathasan Meipporul
 
8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...
8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...
8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...Sociedad Española de Cardiología
 
En i̇yi̇ kas yapici besi̇nler
En i̇yi̇ kas yapici besi̇nlerEn i̇yi̇ kas yapici besi̇nler
En i̇yi̇ kas yapici besi̇nlerRAMAZAN AKDAĞCIK
 
Using Metrics to Guide Your Customer Service Adventure
Using Metrics to Guide Your Customer Service AdventureUsing Metrics to Guide Your Customer Service Adventure
Using Metrics to Guide Your Customer Service AdventureAaron Wheeler
 
8 Ways to Avoid a Midday Productivity Crash
8 Ways to Avoid a Midday Productivity Crash8 Ways to Avoid a Midday Productivity Crash
8 Ways to Avoid a Midday Productivity CrashPGi
 
8 Signs that you are born entrepreneur
8 Signs that you are born entrepreneur8 Signs that you are born entrepreneur
8 Signs that you are born entrepreneurPrashanth Vishwanath
 
YouTube Basics For Small Business
YouTube Basics For Small BusinessYouTube Basics For Small Business
YouTube Basics For Small BusinessTexpoco
 
98791866 proyecto-de-innovacion-computacion-20105
98791866 proyecto-de-innovacion-computacion-2010598791866 proyecto-de-innovacion-computacion-20105
98791866 proyecto-de-innovacion-computacion-20105Elmer Leon Berrocal
 
901 install instant wordpress new
901 install instant wordpress new901 install instant wordpress new
901 install instant wordpress newSatoru Hoshiba
 
Enseñanza basada en ticc’s para el aprendizaje de la Microbiología
Enseñanza basada en ticc’s para el aprendizaje de la MicrobiologíaEnseñanza basada en ticc’s para el aprendizaje de la Microbiología
Enseñanza basada en ticc’s para el aprendizaje de la MicrobiologíaOfelia Candolfi Arballo
 

Andere mochten auch (17)

wordpress training | wordpress certification | wordpress training course | wo...
wordpress training | wordpress certification | wordpress training course | wo...wordpress training | wordpress certification | wordpress training course | wo...
wordpress training | wordpress certification | wordpress training course | wo...
 
Though The Lens of an iPhone: Cartagena, Colombia
Though The Lens of an iPhone: Cartagena, ColombiaThough The Lens of an iPhone: Cartagena, Colombia
Though The Lens of an iPhone: Cartagena, Colombia
 
Styling packaging and_displaying_your_product
Styling packaging and_displaying_your_productStyling packaging and_displaying_your_product
Styling packaging and_displaying_your_product
 
How to leverage existing investments ver1.1
How to leverage existing investments ver1.1How to leverage existing investments ver1.1
How to leverage existing investments ver1.1
 
La Educación con sentido del humor
La Educación con sentido del humorLa Educación con sentido del humor
La Educación con sentido del humor
 
8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...
8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...
8 preguntas que generan debate en antiagregación - Dr. José Ramón González-Ju...
 
En i̇yi̇ kas yapici besi̇nler
En i̇yi̇ kas yapici besi̇nlerEn i̇yi̇ kas yapici besi̇nler
En i̇yi̇ kas yapici besi̇nler
 
Presentacion mariela
Presentacion marielaPresentacion mariela
Presentacion mariela
 
Using Metrics to Guide Your Customer Service Adventure
Using Metrics to Guide Your Customer Service AdventureUsing Metrics to Guide Your Customer Service Adventure
Using Metrics to Guide Your Customer Service Adventure
 
El alma de america vista en un calabazo
El alma de america vista en un calabazoEl alma de america vista en un calabazo
El alma de america vista en un calabazo
 
8 Ways to Avoid a Midday Productivity Crash
8 Ways to Avoid a Midday Productivity Crash8 Ways to Avoid a Midday Productivity Crash
8 Ways to Avoid a Midday Productivity Crash
 
8 Signs that you are born entrepreneur
8 Signs that you are born entrepreneur8 Signs that you are born entrepreneur
8 Signs that you are born entrepreneur
 
YouTube Basics For Small Business
YouTube Basics For Small BusinessYouTube Basics For Small Business
YouTube Basics For Small Business
 
Ansible
AnsibleAnsible
Ansible
 
98791866 proyecto-de-innovacion-computacion-20105
98791866 proyecto-de-innovacion-computacion-2010598791866 proyecto-de-innovacion-computacion-20105
98791866 proyecto-de-innovacion-computacion-20105
 
901 install instant wordpress new
901 install instant wordpress new901 install instant wordpress new
901 install instant wordpress new
 
Enseñanza basada en ticc’s para el aprendizaje de la Microbiología
Enseñanza basada en ticc’s para el aprendizaje de la MicrobiologíaEnseñanza basada en ticc’s para el aprendizaje de la Microbiología
Enseñanza basada en ticc’s para el aprendizaje de la Microbiología
 

Ähnlich wie Extraction, analysis, atom mapping, classification and naming of reactions from pharmaceutical ELNs

Virtual Reaction Service Using Chem Axon Reactor July06
Virtual Reaction Service Using Chem Axon Reactor July06Virtual Reaction Service Using Chem Axon Reactor July06
Virtual Reaction Service Using Chem Axon Reactor July06DanielSButler
 
Reactive Chemical Hazard Alerting in Pharmaceutical Notebooks
Reactive Chemical Hazard Alerting in Pharmaceutical NotebooksReactive Chemical Hazard Alerting in Pharmaceutical Notebooks
Reactive Chemical Hazard Alerting in Pharmaceutical NotebooksNextMove Software
 
AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...
AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...
AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...Richard West
 
Chemiluminescence Immunoassay (CLIA) Technique
Chemiluminescence Immunoassay (CLIA) TechniqueChemiluminescence Immunoassay (CLIA) Technique
Chemiluminescence Immunoassay (CLIA) TechniqueTapeshwar Yadav
 
Open-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesOpen-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesGreg Landrum
 
Screening Assays For Gpc Rs 3
Screening Assays For Gpc Rs 3Screening Assays For Gpc Rs 3
Screening Assays For Gpc Rs 3Shirley Pullan
 
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesCINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesNextMove Software
 
Hydroxylamine Cleaning Chemistries
Hydroxylamine Cleaning ChemistriesHydroxylamine Cleaning Chemistries
Hydroxylamine Cleaning Chemistrieswaimunlee
 
Recent Advances Webinar Part 8
Recent Advances Webinar Part 8Recent Advances Webinar Part 8
Recent Advances Webinar Part 8dominev
 

Ähnlich wie Extraction, analysis, atom mapping, classification and naming of reactions from pharmaceutical ELNs (11)

15 N Performance Validation Ss
15 N Performance Validation Ss15 N Performance Validation Ss
15 N Performance Validation Ss
 
Virtual Reaction Service Using Chem Axon Reactor July06
Virtual Reaction Service Using Chem Axon Reactor July06Virtual Reaction Service Using Chem Axon Reactor July06
Virtual Reaction Service Using Chem Axon Reactor July06
 
Reactive Chemical Hazard Alerting in Pharmaceutical Notebooks
Reactive Chemical Hazard Alerting in Pharmaceutical NotebooksReactive Chemical Hazard Alerting in Pharmaceutical Notebooks
Reactive Chemical Hazard Alerting in Pharmaceutical Notebooks
 
AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...
AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...
AIChE 2011 - Automatic Reaction Mechanism Generation with Group Additive Kine...
 
Chemiluminescence Immunoassay (CLIA) Technique
Chemiluminescence Immunoassay (CLIA) TechniqueChemiluminescence Immunoassay (CLIA) Technique
Chemiluminescence Immunoassay (CLIA) Technique
 
Open-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesOpen-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databases
 
Screening Assays For Gpc Rs 3
Screening Assays For Gpc Rs 3Screening Assays For Gpc Rs 3
Screening Assays For Gpc Rs 3
 
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesCINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
 
Hydroxylamine Cleaning Chemistries
Hydroxylamine Cleaning ChemistriesHydroxylamine Cleaning Chemistries
Hydroxylamine Cleaning Chemistries
 
Validating the ChemSpider Open Spectral Database NMR Collection using ACD/Lab...
Validating the ChemSpider Open Spectral Database NMR Collection using ACD/Lab...Validating the ChemSpider Open Spectral Database NMR Collection using ACD/Lab...
Validating the ChemSpider Open Spectral Database NMR Collection using ACD/Lab...
 
Recent Advances Webinar Part 8
Recent Advances Webinar Part 8Recent Advances Webinar Part 8
Recent Advances Webinar Part 8
 

Mehr von NextMove Software

CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...NextMove Software
 
Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...NextMove Software
 
CINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedCINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedNextMove Software
 
A de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESA de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESNextMove Software
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionNextMove Software
 
Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...NextMove Software
 
Comparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsComparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsNextMove Software
 
Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...NextMove Software
 
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...NextMove Software
 
Recent improvements to the RDKit
Recent improvements to the RDKitRecent improvements to the RDKit
Recent improvements to the RDKitNextMove Software
 
Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...NextMove Software
 
Digital Chemical Representations
Digital Chemical RepresentationsDigital Chemical Representations
Digital Chemical RepresentationsNextMove Software
 
Challenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsChallenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsNextMove Software
 
PubChem as a Biologics Database
PubChem as a Biologics DatabasePubChem as a Biologics Database
PubChem as a Biologics DatabaseNextMove Software
 
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...NextMove Software
 
Building on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesBuilding on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesNextMove Software
 
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...NextMove Software
 
Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)NextMove Software
 
Challenges in Chemical Information Exchange
Challenges in Chemical Information ExchangeChallenges in Chemical Information Exchange
Challenges in Chemical Information ExchangeNextMove Software
 

Mehr von NextMove Software (20)

DeepSMILES
DeepSMILESDeepSMILES
DeepSMILES
 
CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...
 
Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...
 
CINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedCINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speed
 
A de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESA de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILES
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
 
Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...
 
Comparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsComparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule Implementations
 
Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...
 
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
 
Recent improvements to the RDKit
Recent improvements to the RDKitRecent improvements to the RDKit
Recent improvements to the RDKit
 
Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...
 
Digital Chemical Representations
Digital Chemical RepresentationsDigital Chemical Representations
Digital Chemical Representations
 
Challenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsChallenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptions
 
PubChem as a Biologics Database
PubChem as a Biologics DatabasePubChem as a Biologics Database
PubChem as a Biologics Database
 
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
 
Building on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesBuilding on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfiles
 
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
 
Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)
 
Challenges in Chemical Information Exchange
Challenges in Chemical Information ExchangeChallenges in Chemical Information Exchange
Challenges in Chemical Information Exchange
 

Kürzlich hochgeladen

Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.EnglishCEIPdeSigeiro
 
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxSlides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxCapitolTechU
 
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptxSandy Millin
 
How to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using CodeHow to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using CodeCeline George
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...CaraSkikne1
 
3.21.24 The Origins of Black Power.pptx
3.21.24  The Origins of Black Power.pptx3.21.24  The Origins of Black Power.pptx
3.21.24 The Origins of Black Power.pptxmary850239
 
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfP4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfYu Kanazawa / Osaka University
 
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya
 
Ultra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxUltra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxDr. Asif Anas
 
Optical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptxOptical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptxPurva Nikam
 
Department of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdfDepartment of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdfMohonDas
 
Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfMohonDas
 
Vani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational TrustVani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational TrustSavipriya Raghavendra
 
How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17Celine George
 
Protein Structure - threading Protein modelling pptx
Protein Structure - threading Protein modelling pptxProtein Structure - threading Protein modelling pptx
Protein Structure - threading Protein modelling pptxvidhisharma994099
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17Celine George
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...raviapr7
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfMohonDas
 
Quality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICEQuality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICESayali Powar
 

Kürzlich hochgeladen (20)

Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.Easter in the USA presentation by Chloe.
Easter in the USA presentation by Chloe.
 
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptxSlides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
Slides CapTechTalks Webinar March 2024 Joshua Sinai.pptx
 
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
2024.03.23 What do successful readers do - Sandy Millin for PARK.pptx
 
How to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using CodeHow to Send Emails From Odoo 17 Using Code
How to Send Emails From Odoo 17 Using Code
 
5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...5 charts on South Africa as a source country for international student recrui...
5 charts on South Africa as a source country for international student recrui...
 
3.21.24 The Origins of Black Power.pptx
3.21.24  The Origins of Black Power.pptx3.21.24  The Origins of Black Power.pptx
3.21.24 The Origins of Black Power.pptx
 
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdfP4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
P4C x ELT = P4ELT: Its Theoretical Background (Kanazawa, 2024 March).pdf
 
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
 
Ultra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptxUltra structure and life cycle of Plasmodium.pptx
Ultra structure and life cycle of Plasmodium.pptx
 
Optical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptxOptical Fibre and It's Applications.pptx
Optical Fibre and It's Applications.pptx
 
Department of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdfDepartment of Health Compounder Question ‍Solution 2022.pdf
Department of Health Compounder Question ‍Solution 2022.pdf
 
Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdf
 
Vani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational TrustVani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
Vani Magazine - Quarterly Magazine of Seshadripuram Educational Trust
 
How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17How to Add a New Field in Existing Kanban View in Odoo 17
How to Add a New Field in Existing Kanban View in Odoo 17
 
Protein Structure - threading Protein modelling pptx
Protein Structure - threading Protein modelling pptxProtein Structure - threading Protein modelling pptx
Protein Structure - threading Protein modelling pptx
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdf
 
March 2024 Directors Meeting, Division of Student Affairs and Academic Support
March 2024 Directors Meeting, Division of Student Affairs and Academic SupportMarch 2024 Directors Meeting, Division of Student Affairs and Academic Support
March 2024 Directors Meeting, Division of Student Affairs and Academic Support
 
Quality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICEQuality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICE
 

Extraction, analysis, atom mapping, classification and naming of reactions from pharmaceutical ELNs

  • 1. extraction, analysis, atom mapping, classification and naming of reactions from pharmaceutical elns Roger Sayle1, Daniel Lowe1, Noel O’Boyle1, Mick Kappler2, Anna Paola Pelliccioli3, Nick Tomkinson4 and Daniel Stoffler5 1 NextMove Software Ltd, Cambridge, UK. 2 Hoffmann-La Roche, Nutley, USA. 3 Novartis NIBR, Basel, Switzerland. 4 AstraZeneca R&D, Alderley Park, UK. 5 F. Hoffmann-La Roche, Basel, Switzerland. Abstract 1. Overview 6. NameRXN: Reaction naming and classification Electronic Laboratory Notebooks (ELNs) are widely used in the pharmaceutical One useful form of reaction analysis is to recognize each experiment in the ELN as industry for recording the details of chemical synthesis experiments. The primary an instance of a named or known reaction, such as a Diels –Alder cycloaddition, use of this information is often for the capture of intellectual property for future Suzuki coupling or chiral separation. The NameRXN program uses a dictionary of patent filings, however this data can also be used in a number of additional SMIRKS-like transformations to annotate reactions in RD, SD, RXN or reaction applications, including synthetic accessibility calculations, reaction planning and SMILES format with a name, a reaction category and where applicable an reaction yield/optimization. Not only does a pharmaceutical ELN capture those identifier into the Royal Society of Chemistry’s RXNO ontology. classes of reactions suitable for small scale medicinal chemistry, but it is also uniquely a source of information on failed and poor yield reactions; an important The top five reactions in a major pharmaceutical company’s ELN were: source of data rarely found in the scientific literature or commercial reaction ID Reaction Name Reaction Category RXNO # databases. #1 1.3.1 Buchwald-Hartwig amination N-arylation with Ar-X 0000192 2. Method overview (workflow) #2 7.1 Nitro to amino Nitro to amine reduction 0000337 In this work we describe the use of a suite of programs for extracting reaction #3 1.7.6 Williamson ether synthesis O-substitution 0000090 information and associated textual and numeric data from the PKI/CambridgeSoft eNotebook ELN, converting and normalizing it to “open” flat files that can be read #4 2.1.2 Carboxylic acid + amine reaction N-acylation to amide by third party applications, such as Accelrys/MDL RD files or reaction SMILES. #5 2.2.3 Sulfonamide Shotten-Baumann N-sulfonylation 0000165 Accelrys Pipeline Pilot Using the reaction classification categories of Carey et al.[1] the contents of the (AstraZeneca, AbbVie & Hoffmann-La Roche) ELN may be presented as a pie-chart of the kinds of transformations it contains. ChemAxon JChem Cartridge (GlaxoSmithKline HazELNut Filbert NameRXN Cobnut & Novartis) Perkin Elmer Informatics Oracle Server Microsoft Windows or Linux Elsevier Reaxys (formerly CambridgeSoft) version 10 or 11 (Hoffmann-La Roche) eNotebook v9, v11 or v13 3. HazELNut: Reaction export from PKI/CambridgeSoft eNotebook ELN 3. HazELNut: Reaction export from PKI/CambridgeSoft eNotebook ELN The primary step in exporting a reaction database from an ELN is performed by HazELNut, which interprets the chemist’s hand-drawn sketch into a connection table. For RD and SD formats, this includes writing the textual, numeric, tabular and molecular data as tagged fields in the output file. The process can work either from XML files exported from the client, or more typically by querying the Oracle server via OCI. A large pharmaceutical ELN can be exported overnight (or 7. Third-party atom mapping Atom Mapping transatlantic over a weekend), though incremental export allows a day’s or week’s One way to investigate the quality of the reaction sketches in an ELN is to apply an experiments to be written in a few minutes. Experimental write-ups (and other automatic atom-atom mapping algorithm, and assess the fraction of unmapped rich text) are converted from RTF to text or HTML, superatoms are expanded, product atoms and average number of carbon-carbon bonds broken. At the Fall labels preserved, and so on. The images below show an ELN reaction, and some 2012 ACS meeting in Philadelphia, we presented an initial comparison of third- of the exported fields as they appear in the corresponding RD file format output. party atom mapping software for the purpose of identifying incorrectly drawn $DTYPE HEADER:EXPERIMENT.HEADER:CREATION.DATE $DATUM 22-Oct-2010 reactions. 100 $DTYPE HEADER:EXPERIMENT.HEADER:CREATION.TIME 90 $DATUM 19:58:18 -0500 Here we summarize recent Percent of reactions with all product atoms mapped $DTYPE DISCOVERY.CHEMISTRY:REACTANTS(1):CHEMICAL.STRUCTURE Marvin 5.12 $DATUM c1ccc(cc1)C(=O)O 80 ChemDraw 12 $DTYPE DISCOVERY.CHEMISTRY:REACTANTS(1):NAME $DATUM benzoic acid $DTYPE DISCOVERY.CHEMISTRY:REACTANTS(1):MOLECULAR.WEIGHT:VALUE improvements to this 70 Indigo 1.1 $DATUM 122.12 $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):NAME approach, using consensus 60 Indigo 1.1 (lenient) $DATUM N-benzyl-N-ethylbenzamide $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):CHEMICAL.STRUCTURE $DATUM CCN(Cc1ccccc1)C(=O)c2ccccc2 methods to combine the ICMap 5.10 PipelinePilot 50 $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):MOLECULAR.FORMULA $DATUM C16H17NO $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):MOLECULAR.WEIGHT:VALUE results of multiple atom MDL Cheshire 40 $DATUM 239.31 $DTYPE DISCOVERY.CHEMISTRY:PRODUCTS(1):YIELD:VALUE mapping algorithms, and Verified/Recognised by NameRXN Consensus Result $DATUM 89.8% 30 $DTYPE DISCOVERY.CHEMISTRY:PREPARATION $DATUM In a 5 mL round-bottomed flask, benzoic acid (200 mg, 1.64 mmol combining atom-mapping (62%) 20 N-benzylethanamine (266 mg, 1.97 mmol, Eq: 1.2) and HATU (747 mg, 1.97 mmol, Eq: 1.2) were combined with DMF (5 ml) to give a light brown solution. Hunig's Base (423 mg, 572 μl, 3.28 mmol, Eq: 2.0) was added. with reaction naming for 10 The reaction mixture was heated to 50 oC (Pressure: 1012 mbar, Rxn Molarity: 328 mM, Reaction Time: 12 min, Molarity Entered?: false) and ELN quality assurance. stirred. The crude material was purified by flash chromatography 0 Atom mapping algorithms alone Combined with NameRXN 4. Filbert: Reaction file format conversion 4. Filbert: Reaction file format conversion Whilst HazELNut can export reactions in a variety of file formats, its goal is to 8. Summary and future work Conclusions “dump” the raw data; processing and file format conversion is performed by The use of the HazELNut suite of tools allows the synthetic chemistry data locked Filbert. This program interconverts MDL RXN, RD and SD file formats, ISIS Sketch, up in corporate ELNs to be exploited in scientifically novel ways. Current efforts CDXML and reaction SMILES. The roles of agents, catalysts and solvents drawn include optimization of reaction conditions by plotting yields against catalyst and above and below a reaction arrow can be preserved using ChemAxon’s RXN file solvent in SpotFire, and research to reduce the number of steps and application of extensions, stripped or treated as reactants. Co-ordinates can optionally be low yield reaction strategies by identifying/reusing in-house insights/expertise. centered, rescaled or regenerated algorithmically. Atom maps can be removed. 9. Bibliography Molecules from the reactant/agent tables can be added to the sketch if missing. 1. John S. Carey, David Laffan, Colin Thomson and Mike T. Williams, “Analysis of 5. Cobnut: MDL/Accelrys RDF file format tag stripping/renaming MDL/Accelrys RDF file format tag stripping/renaming the Reactions used for the Preparation of Drug Candidate Molecules”, Typically, ELNs are heavily customized with custom experiment types to capture Organic & Biomolecular Chemistry, Vol. 4, pp. 2337-2347, 2006. the data needs of their target scientists. The raw export of this data results in RD 2. Stephen D. Roughley and Allan M. Jordan, “The Medicinal Chemist's Toolbox: files where the field/tag names for yields, volumes, chemist’s names, An Analysis of Reactions Used in the Pursuit of Drug Candidates”, Journal of experimental write-ups, etc. vary from reaction to reaction. The utility Cobnut Medicinal Chemistry, Vol. 54, 3451-3479, 2011. was implemented to simplify and speed-up the process of normalizing (renaming) 3. Mikko J. Vainio, Thierry Kogej and Florian Raubacher, “Automated Recycling of RD tags from different data sources, and stripping out those that aren’t required. Chemistry for Virtual Screening and Library Design”, Journal of Chemical Information and Modeling (JCIM), Vol. 52, No. 7, pp. 1777-1786, June 2012. NextMove Software Limited Innovation Centre (Unit 23) www.nextmovesoftware.co.uk Cambridge Science Park www.nextmovesoftware.com Milton Road, Cambridge England, UK CB4 0EY