SlideShare ist ein Scribd-Unternehmen logo
1 von 24
Downloaden Sie, um offline zu lesen
Six Not-So-Easy Pieces: Tautomers,
Nucleic Acids, Inorganics, Reactions,
Polymers and SMIRKS (in RDKit).
Roger Sayle, John Mayfield, Noel O’Boyle
NextMove Software, Cambridge, UK
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Quick overview
1. Tautomers
• Tautomer matching without enumeration
2. Nucleic Acids
• RDKit support for RNA and DNA in PDB, HELM and FASTA.
3. Inorganics
• Biovia’s valence table revisions
4. Reactions
• Some comments on expert rule sets
5. Polymers
• InChI’s recent support for polymers
6. SMIRKS
• Standardizing interpretation/semantics of SMIRKS
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
1. tautomers
• Tautomers are molecular isomers that easily
interconvert by migration of hydrogen atoms1.
1. R. Sayle, “So you think you understand tautomerism?”, JCAMD, 24(6-7):485-496, June 2010.
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
InChI=1S/C16H12N20/c19-16-11-10-15(13-8-4-5-9-14(13)16)18-17-12-6-2-1-3-7-12/h1-11,19H
InChI=1S/C16H12N20/c19-16-11-10-15(13-8-4-5-9-14(13)16)18-17-12-6-2-1-3-7-12/h1-11,17H
Friends of tautomers:
mesomers & protomers
InChI=1S/C2H4OS/c1-2(3)4/h1H3,(H,3,4)
InChI=1S/C2H4OS/c1-2(3)4/h1H3,(H,3,4)/p-1
Paolo’s solution: enumeration
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Enumeration vs. unumeration
• Given a generator function F, which is a one-to-many
mapping, many types of problem ask if F(X) == Y.
• Frequently, an efficient solution is to find the inverse
many-to-one mapping, and ask if F-1(Y) == X.
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
X
F (X)1
F (X)2
F (X)3
Y=
Alternative approach
• Automatically transform
– NC(=N)c1ccc(cc1)C(=O)O
• Into the bond order, formal charge and polar
hydrogen count suppressed SMARTS pattern
– [#7D1]~[#6H0D3](~[#7D1])~[#6H0D3]1~[#6H1D2]~[#6H1D
2]~[#6H0D3](~[#6H1D2]~[#6H1D2]~1)~[#6H0D3](~[#8D1])
~[#8D1]
• Plus additional constraints on the total formal charge
and the total polar hydrogen count.
– polar_hcount – total_charge = 4
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
2. Nucleic acids
• Support for DNA and RNA sequences in PDB, HELM
and FASTA will shortly be contributed to RDKit.
• Only a minor set of changes to RDKit’s APIs:
– RDKit::SequenceToMol(string,sanitize,flavor)
– RDKit::FASTAToMol(string,sanitize,flavor)
– RDKit::PDBBlockToMol(string)
– RDKit::HELMToMol(string)
– RDKit::MolToSequence(mol)
– RDKit::MolToFASTA(mol)
– RDKit::MolToPDBBlock(mol)
– RDKit::MolToHELM(mol)
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Biovia vs. pistoia helm issues
• RDKit’s implementation doesn’t have Biovia’s bugs.
• The monomer phase problem in RNA registration:
– Sequence: GATTACA
– Interpretation: 5’-G-A-T-T-A-C-A-3’
– Implied phosphates: 5’-G-P-A-P-T-P-T-P-A-P-C-P-A-3’
– IUMB/PDB/Biovia: G-PA-PT-PT-PA-PC-PA
– Pistoia HELM: GP-AP-TP-TP-AP-CP-A
• This discrepancy complicates RNA registration and
leads to serious bugs in Biovia’s HELM support.
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Rna bugs in biovia helm support
• From Sequence
– “ACGT” drawn in Biovia has 4 phosphates, but the HELM
editor interprets this sequence as having three.
– Biovia’s Mol file for this sequence places the phosphate at
the 5’ end, but the HELM string it generates has it at the 3’.
• HELM interpretation
– RNA1{R(A).P.R(C)P.R(T)P.R(G)}$$$$ is exactly the same
molecule as RNA1{R(A).PR(C).PR(T).PR(G)}$$$$ in HELM.
– These strings have very different behaviour in Biovia,
neither of which has the correct connection table.
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Expanded SCSR file
select molfile(mol('RNA1{R(A)P.R(T)P.R(C)P.R(G)}$$$$')) from dual;
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Rdkit helm implementation
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
No Caps
Flavor =2
5’-Cap
Flavor = 3
3’-Cap
Flavor = 4
Both Caps
Flavor = 5
3. inorganics
• Sanitization failures: Insane or insanitary?
• Great recent advances in RDKit.
– Hexafluorophosphate
– Dess-Martin Periodinane
• Perhaps exceptions for remaining “platypi”.
– Chlorine Dioxide O=[Cl]=O
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Mdl molfile-ageDdon
• Biovia 2017 changes the interpretation of MDL files.
• This affects over 213097 CIDs in PubChem!
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Heavy metals (it goes beyond 111)
Atomic RDKit RDKit Toolkit Biovia
Number Symbol IUPAC Ptable SMILES Heaviest 2017 SMARTS
103 Lr Y Y Indigo Lr
104 Rf Y Y ‘Rf’
105 Db Y ‘Db’
106 Sg Y ‘Sg’
107 Bh Y ‘Bh’
108 Hs Y ‘Hs’
109 Mt Y ChemAxon ‘Mt’
110 Ds Y ‘Ds’
111 Rg Y OEChem ‘Rg’
112 Cn Y OpenBabel ‘Cn’
113 Nh 2016? N&h
114 Fl ‘Fl’
115 Mc 2016?
116 Lv CDK ‘Lv’
117 Ts 2016?
118 Og 2016?
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
4. reactions
• Virtual library enumerations are overly optimistic.
– SAVI, SCUBIDOO, Enamine, PLC, Evotec, Novartis?
• Transformations vs. Reactions is a major challenge.
– Which reactions to use?
– Reactions that do happen!
– Reactions that don’t happen!
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Which reactions?
• NCI/LHASA SAVI (14+6 reactions)
– Suzuki Coupling 36262 out of 1.2M USPTO examples
– Sulfonamide Schotten-Baumann 14348 out of 1.2M USPTO examples
– Buchwald-Harwig 6040 out of 1.2M USPTO examples
– Hiyama coupling 458 out of 1.2M USPTO examples
– Fukuyama coupling 2 out of 1.2M USPTO examples
– Liebeskind-Srogl coupling 0 out of 1.2M USPTO examples
• Hartenfeller (58 reactions)
– #1 Pictet-Spengler reaction 7 out of 1.2M USPTO examples
– #10+ #11 Azide-nitrile Huisgen-cycloaddition 5 out of 1.2M USPTO examples
– #17 Pyridone synthesis 2 out of 1.2M USPTO examples
– #20 Phthalazinone synthesis 16 out of 1.2M USPTO examples
– #24 Friedlander quinoline synthesis 30 out of 1.2M USPTO examples
• Enamine REAL (43 reactions)
– Thiourea to guanidine (14 out of 160M examples).
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Reactions that do happen!
• Chloro Sonogashira Couplings (@ NCI)
– The LHASA ‘CHMTRN’ rules for Transform 2267, Sonogashira couplings,
(presented by Marc Nicklaus at Sheffield) states “Iodides are usually
more reactive than bromide. Chlorides do not react”.
Code Name Count Yield
3.3.2 Bromo Sonogashira coupling 3717 49.3%
3.3.3 Chloro Sonogashira coupling 429 44.2%
3.3.4 Iodo Sonogashira coupling 2721 64.9%
• Isotopically-labelled Compounds (@Eli Lilly)
– The LAAR reactions used to construct Lilly’s PLC (Nicolaou et al. 2016)
forbid the presence of isotopic labels in reactants.
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Reactions that don’t happen!
• Paal-Knorr Pyrrole Synthesis vs. Aldehydes/Ketones
Of the 430 examples of
Paal-Knorr pyrrole
synthesis reported in
US patent applications
2001-2012, exactly
zero have more than
the two reacting
ketones/aldehydes.
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
5. polymers
• The InChI Trust has added polymer support to InChI.
• Or has it?
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
InChI=1B/C2H4O/c1-2-3-1/h1-2H2/z101-1-
3(1,2,1,3,2,3)
InChIKey=IAYPIBMASNFSPL-GCGQHNKHBA-N
InChI=1B/C4H8O2/c1-2-6-4-3-5-1/h1-4H2/z101-1-
6(1,2,1,5,2,6,3,4,3,5,4,6)
InChIKey=RYHBNJHYFVUHQT-UEXHMUNTBA-N
Experimental Beta Option -Polymers
INChI PolymerS
capping Group Frame Shift
InChI=1B/C4H8N2O3/c5-1-3(7)6-2-4(8)9/h1-
2,5H2,(H,6,7)(H,8,9)/z101-1,3,6-7(2-6,5-1)
InChIKey=YMAWOPBAYDPSLA-NPFRMHEKBA-N
InChI=1B/C4H8N2O3/c5-1-3(7)6-2-4(8)9/h1-
2,5H2,(H,6,7)(H,8,9)/z101-2-3,6-7(1-3,4-2)
InChIKey=YMAWOPBAYDPSLA-LCMLVUGIBA-N
InChI=1B/C4H8N2O3/c5-1-3(7)6-2-4(8)9/h1-
2,5H2,(H,6,7)(H,8,9)/z101-2,4,6,8(3-6,9-4)
InChIKey=YMAWOPBAYDPSLA-YYSJEQPSBA-N
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
6. smirks
• SMIRKS is Daylight’s Reaction Transform Language.
• SMILES is nearly portable, SMARTS is improving, but...
• The RSC’s CVSP platform contains rules such as:
[N-,P-:1][C:2]=[N+,P+:3]>>[N,P:1]=[C:2][N,P:3]
1. Daylight SMIRKS requires SMILES on the RHS, but allows
SMARTS on the LHS.
2. In SMIRKS, a property is matched/modified if specified.
[N-,P-:1][C:2]=[N+,P+:3]>>[*+0:1]=[C:2][*+0:3]
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
Need an Opensmirks standard?
• Product Primitives
– SMILES Primitives
• #N Atomic Numbers
• N Isotopic Mass
• +N/-N Formal Charge
– SMARTS Primitives
• hN/HN (Implicit) Hydrogen Count
• A/a Aliphatic/Aromatic
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
acknowledgements
• The RDKit Hackers at Novartis (and T5)
– Greg Landrum
– Nadine Schneider
– Joann Prescott-Roy
• The team at NextMove Software
– John Mayfield
– Noel O’Boyle
– Daniel Lowe
• And many thanks for your time!
5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016

Weitere ähnliche Inhalte

Was ist angesagt?

Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...NextMove Software
 
Unlocking chemical information from tables and legacy articles
Unlocking chemical information from tables and legacy articlesUnlocking chemical information from tables and legacy articles
Unlocking chemical information from tables and legacy articlesNextMove Software
 
Chemical structure representation in PubChem
Chemical structure representation in PubChemChemical structure representation in PubChem
Chemical structure representation in PubChemNextMove Software
 
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...NextMove Software
 
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesCINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesNextMove Software
 
CINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedCINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedNextMove Software
 
Automated Extraction of Reactions from the Patent Literature
Automated Extraction of Reactions from the Patent LiteratureAutomated Extraction of Reactions from the Patent Literature
Automated Extraction of Reactions from the Patent Literaturedan2097
 
Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...
Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...
Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...NextMove Software
 
Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...NextMove Software
 
Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...NextMove Software
 
CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...NextMove Software
 
Xu_et_al-2015-Angewandte_Chemie_International_Edition
Xu_et_al-2015-Angewandte_Chemie_International_EditionXu_et_al-2015-Angewandte_Chemie_International_Edition
Xu_et_al-2015-Angewandte_Chemie_International_EditionThomas Bobinski
 
Critical Assessment of Function Annotation, 2005
Critical Assessment of Function Annotation, 2005Critical Assessment of Function Annotation, 2005
Critical Assessment of Function Annotation, 2005Iddo
 
Acs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvspAcs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvspKen Karapetyan
 
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...NextMove Software
 
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...Ken Karapetyan
 
Cv2 Michael D Johnson
Cv2 Michael D JohnsonCv2 Michael D Johnson
Cv2 Michael D Johnsonguestbf40739
 
Sketchy sketches hiding chemistry in plain sight
Sketchy sketches hiding chemistry in plain sightSketchy sketches hiding chemistry in plain sight
Sketchy sketches hiding chemistry in plain sightNextMove Software
 
RNA editing as a drug target in tryp. development of a high throughput sceeni...
RNA editing as a drug target in tryp. development of a high throughput sceeni...RNA editing as a drug target in tryp. development of a high throughput sceeni...
RNA editing as a drug target in tryp. development of a high throughput sceeni...Laurence Dawkins-Hall
 

Was ist angesagt? (20)

Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...Pharmaceutical industry best practices in lessons learned: ELN implementation...
Pharmaceutical industry best practices in lessons learned: ELN implementation...
 
Unlocking chemical information from tables and legacy articles
Unlocking chemical information from tables and legacy articlesUnlocking chemical information from tables and legacy articles
Unlocking chemical information from tables and legacy articles
 
Chemical structure representation in PubChem
Chemical structure representation in PubChemChemical structure representation in PubChem
Chemical structure representation in PubChem
 
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
 
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction DatabasesCINF 13: Pistachio - Search and Faceting of Large Reaction Databases
CINF 13: Pistachio - Search and Faceting of Large Reaction Databases
 
CINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speedCINF 35: Structure searching for patent information: The need for speed
CINF 35: Structure searching for patent information: The need for speed
 
Automated Extraction of Reactions from the Patent Literature
Automated Extraction of Reactions from the Patent LiteratureAutomated Extraction of Reactions from the Patent Literature
Automated Extraction of Reactions from the Patent Literature
 
Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...
Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...
Extraction, Analysis, Atom Mapping, Classification and Naming of Reactions fr...
 
Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...Standardized Representations of ELN Reactions for Categorization and Duplicat...
Standardized Representations of ELN Reactions for Categorization and Duplicat...
 
Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...Eugene Garfield: the father of chemical text mining and artificial intelligen...
Eugene Garfield: the father of chemical text mining and artificial intelligen...
 
CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...CINF 170: Regioselectivity: An application of expert systems and ontologies t...
CINF 170: Regioselectivity: An application of expert systems and ontologies t...
 
Xu_et_al-2015-Angewandte_Chemie_International_Edition
Xu_et_al-2015-Angewandte_Chemie_International_EditionXu_et_al-2015-Angewandte_Chemie_International_Edition
Xu_et_al-2015-Angewandte_Chemie_International_Edition
 
Critical Assessment of Function Annotation, 2005
Critical Assessment of Function Annotation, 2005Critical Assessment of Function Annotation, 2005
Critical Assessment of Function Annotation, 2005
 
Acs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvspAcs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvsp
 
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
Chemical similarity using multi-terabyte graph databases: 68 billion nodes an...
 
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
 
Cv2 Michael D Johnson
Cv2 Michael D JohnsonCv2 Michael D Johnson
Cv2 Michael D Johnson
 
Sketchy sketches hiding chemistry in plain sight
Sketchy sketches hiding chemistry in plain sightSketchy sketches hiding chemistry in plain sight
Sketchy sketches hiding chemistry in plain sight
 
RNA editing as a drug target in tryp. development of a high throughput sceeni...
RNA editing as a drug target in tryp. development of a high throughput sceeni...RNA editing as a drug target in tryp. development of a high throughput sceeni...
RNA editing as a drug target in tryp. development of a high throughput sceeni...
 
GPCRs_HouseLA
GPCRs_HouseLAGPCRs_HouseLA
GPCRs_HouseLA
 

Andere mochten auch

RDKit UGM 2016: Higher Quality Chemical Depictions
RDKit UGM 2016: Higher Quality Chemical DepictionsRDKit UGM 2016: Higher Quality Chemical Depictions
RDKit UGM 2016: Higher Quality Chemical DepictionsNextMove Software
 
Which is the best fingerprint for medicinal chemistry?
Which is the best fingerprint for medicinal chemistry?Which is the best fingerprint for medicinal chemistry?
Which is the best fingerprint for medicinal chemistry?NextMove Software
 
Cheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspectiveCheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspectiveNextMove Software
 
Pistoia Alliance European Conference 2015 - Stefan Klostermann / Roche
Pistoia Alliance European Conference 2015 - Stefan Klostermann / RochePistoia Alliance European Conference 2015 - Stefan Klostermann / Roche
Pistoia Alliance European Conference 2015 - Stefan Klostermann / RochePistoia Alliance
 
ICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxonICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxonDr. Haxel Consult
 
CINF 18: Wikipedia and Wiktionary as resources for chemical text mining
CINF 18: Wikipedia and Wiktionary as resources for chemical text miningCINF 18: Wikipedia and Wiktionary as resources for chemical text mining
CINF 18: Wikipedia and Wiktionary as resources for chemical text miningNextMove Software
 
Classification, representation and analysis of cyclic peptides and peptide-li...
Classification, representation and analysis of cyclic peptides and peptide-li...Classification, representation and analysis of cyclic peptides and peptide-li...
Classification, representation and analysis of cyclic peptides and peptide-li...NextMove Software
 
Cae technologies for efficient vibro acoustic vehicle
Cae technologies for efficient vibro acoustic vehicleCae technologies for efficient vibro acoustic vehicle
Cae technologies for efficient vibro acoustic vehicleCec deMille
 
CAE Managed Services - Token Based Support
CAE Managed Services - Token Based SupportCAE Managed Services - Token Based Support
CAE Managed Services - Token Based SupportMartin Rowland
 
Insights and ideas to drive association success
Insights and ideas to drive association successInsights and ideas to drive association success
Insights and ideas to drive association successGreg Melia, CAE
 
Substructure Search Face-off
Substructure Search Face-offSubstructure Search Face-off
Substructure Search Face-offNextMove Software
 
Software Technology Trends
Software Technology TrendsSoftware Technology Trends
Software Technology TrendsKMS Technology
 
Sub floor code
Sub floor codeSub floor code
Sub floor codejbusse
 
Software Technology Trends in 2013-2014
Software Technology Trends in 2013-2014Software Technology Trends in 2013-2014
Software Technology Trends in 2013-2014KMS Technology
 
Peptide line notations for biologics registration and patent filings
Peptide line notations for biologics registration and patent filingsPeptide line notations for biologics registration and patent filings
Peptide line notations for biologics registration and patent filingsNextMove Software
 
Roundtripping between small-molecule and biopolymer representations
Roundtripping between small-molecule and biopolymer representationsRoundtripping between small-molecule and biopolymer representations
Roundtripping between small-molecule and biopolymer representationsNextMove Software
 
Alfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossierAlfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossierAlfresco Software
 
CAE for Digital Development
CAE for Digital DevelopmentCAE for Digital Development
CAE for Digital DevelopmentAltair
 

Andere mochten auch (20)

RDKit UGM 2016: Higher Quality Chemical Depictions
RDKit UGM 2016: Higher Quality Chemical DepictionsRDKit UGM 2016: Higher Quality Chemical Depictions
RDKit UGM 2016: Higher Quality Chemical Depictions
 
Which is the best fingerprint for medicinal chemistry?
Which is the best fingerprint for medicinal chemistry?Which is the best fingerprint for medicinal chemistry?
Which is the best fingerprint for medicinal chemistry?
 
Cheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspectiveCheminformatics toolkits: a personal perspective
Cheminformatics toolkits: a personal perspective
 
Pistoia Alliance European Conference 2015 - Stefan Klostermann / Roche
Pistoia Alliance European Conference 2015 - Stefan Klostermann / RochePistoia Alliance European Conference 2015 - Stefan Klostermann / Roche
Pistoia Alliance European Conference 2015 - Stefan Klostermann / Roche
 
ICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxonICIC 2013 Conference Proceedings Roland Knispel ChemAxon
ICIC 2013 Conference Proceedings Roland Knispel ChemAxon
 
CINF 18: Wikipedia and Wiktionary as resources for chemical text mining
CINF 18: Wikipedia and Wiktionary as resources for chemical text miningCINF 18: Wikipedia and Wiktionary as resources for chemical text mining
CINF 18: Wikipedia and Wiktionary as resources for chemical text mining
 
Classification, representation and analysis of cyclic peptides and peptide-li...
Classification, representation and analysis of cyclic peptides and peptide-li...Classification, representation and analysis of cyclic peptides and peptide-li...
Classification, representation and analysis of cyclic peptides and peptide-li...
 
Is 20TB really Big Data?
Is 20TB really Big Data?Is 20TB really Big Data?
Is 20TB really Big Data?
 
Cae technologies for efficient vibro acoustic vehicle
Cae technologies for efficient vibro acoustic vehicleCae technologies for efficient vibro acoustic vehicle
Cae technologies for efficient vibro acoustic vehicle
 
CAE Managed Services - Token Based Support
CAE Managed Services - Token Based SupportCAE Managed Services - Token Based Support
CAE Managed Services - Token Based Support
 
Insights and ideas to drive association success
Insights and ideas to drive association successInsights and ideas to drive association success
Insights and ideas to drive association success
 
Substructure Search Face-off
Substructure Search Face-offSubstructure Search Face-off
Substructure Search Face-off
 
Software Technology Trends
Software Technology TrendsSoftware Technology Trends
Software Technology Trends
 
Sub floor code
Sub floor codeSub floor code
Sub floor code
 
Software Technology Trends in 2013-2014
Software Technology Trends in 2013-2014Software Technology Trends in 2013-2014
Software Technology Trends in 2013-2014
 
Peptide line notations for biologics registration and patent filings
Peptide line notations for biologics registration and patent filingsPeptide line notations for biologics registration and patent filings
Peptide line notations for biologics registration and patent filings
 
Roundtripping between small-molecule and biopolymer representations
Roundtripping between small-molecule and biopolymer representationsRoundtripping between small-molecule and biopolymer representations
Roundtripping between small-molecule and biopolymer representations
 
Cad cam cae
Cad cam caeCad cam cae
Cad cam cae
 
Alfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossierAlfresco Day Benelux Inholland studentendossier
Alfresco Day Benelux Inholland studentendossier
 
CAE for Digital Development
CAE for Digital DevelopmentCAE for Digital Development
CAE for Digital Development
 

Ähnlich wie RDKit: Six Not-So-Easy Pieces [RDKit UGM 2016]

Bio partnering 2006_final
Bio partnering 2006_finalBio partnering 2006_final
Bio partnering 2006_finalAVIVE, INC.
 
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...Dr. Haxel Consult
 
Semi SMC 2010 - Botsford Presentation
Semi SMC 2010 - Botsford PresentationSemi SMC 2010 - Botsford Presentation
Semi SMC 2010 - Botsford Presentationkbots
 
Open-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesOpen-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesGreg Landrum
 
2015 EUROPACAT ZEOLITE APPLICATIONS
2015 EUROPACAT ZEOLITE APPLICATIONS2015 EUROPACAT ZEOLITE APPLICATIONS
2015 EUROPACAT ZEOLITE APPLICATIONSIacovos Vasalos
 
Ko process-r-d-online-hi-jan2021b
Ko process-r-d-online-hi-jan2021bKo process-r-d-online-hi-jan2021b
Ko process-r-d-online-hi-jan2021bSteve Brough
 
Chemical Information Retrieval Class 1
Chemical Information Retrieval Class 1Chemical Information Retrieval Class 1
Chemical Information Retrieval Class 1Jean-Claude Bradley
 
Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...
Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...
Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...Pistoia Alliance
 
Cell Culture Techniques and Best Practices
Cell Culture Techniques and Best PracticesCell Culture Techniques and Best Practices
Cell Culture Techniques and Best PracticesThermo Fisher Scientific
 
13 update on nea tdb activities and phase vi priorities zavarin llnl-pres-731735
13 update on nea tdb activities and phase vi priorities zavarin llnl-pres-73173513 update on nea tdb activities and phase vi priorities zavarin llnl-pres-731735
13 update on nea tdb activities and phase vi priorities zavarin llnl-pres-731735leann_mays
 
Kenoye_Muzan_Ekpelu_CV_April_2016
Kenoye_Muzan_Ekpelu_CV_April_2016Kenoye_Muzan_Ekpelu_CV_April_2016
Kenoye_Muzan_Ekpelu_CV_April_2016Kenoye Muzan Ekpelu
 
Omdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital AgeOmdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital Agepetermurrayrust
 
Debecker europacat 2017 mesoporous sn si aerosol ethyl lactate
Debecker europacat 2017 mesoporous sn si aerosol ethyl lactateDebecker europacat 2017 mesoporous sn si aerosol ethyl lactate
Debecker europacat 2017 mesoporous sn si aerosol ethyl lactateDamien Debecker
 
Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...
Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...
Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...Hitesh Patel
 
Waste conversion of the future, operating facility in France
Waste conversion of the future, operating facility in FranceWaste conversion of the future, operating facility in France
Waste conversion of the future, operating facility in FranceSandy Gutner
 

Ähnlich wie RDKit: Six Not-So-Easy Pieces [RDKit UGM 2016] (20)

Bio partnering 2006_final
Bio partnering 2006_finalBio partnering 2006_final
Bio partnering 2006_final
 
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...ICIC 2016: Mind the Gap:  The novel benefits of human-curated substance locat...
ICIC 2016: Mind the Gap: The novel benefits of human-curated substance locat...
 
Semi SMC 2010 - Botsford Presentation
Semi SMC 2010 - Botsford PresentationSemi SMC 2010 - Botsford Presentation
Semi SMC 2010 - Botsford Presentation
 
Open-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databasesOpen-source tools for querying and organizing large reaction databases
Open-source tools for querying and organizing large reaction databases
 
2015 EUROPACAT ZEOLITE APPLICATIONS
2015 EUROPACAT ZEOLITE APPLICATIONS2015 EUROPACAT ZEOLITE APPLICATIONS
2015 EUROPACAT ZEOLITE APPLICATIONS
 
Ko process-r-d-online-hi-jan2021b
Ko process-r-d-online-hi-jan2021bKo process-r-d-online-hi-jan2021b
Ko process-r-d-online-hi-jan2021b
 
Chemical Information Retrieval Class 1
Chemical Information Retrieval Class 1Chemical Information Retrieval Class 1
Chemical Information Retrieval Class 1
 
Wellington Laboratories Certified Reference Standards & Materials
Wellington Laboratories Certified Reference Standards & Materials Wellington Laboratories Certified Reference Standards & Materials
Wellington Laboratories Certified Reference Standards & Materials
 
Chemical Process Conception
Chemical Process ConceptionChemical Process Conception
Chemical Process Conception
 
Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...
Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...
Pistoia Alliance US Conference 2015 - 1.2.3 Innovation in other industries - ...
 
Cell Culture Techniques and Best Practices
Cell Culture Techniques and Best PracticesCell Culture Techniques and Best Practices
Cell Culture Techniques and Best Practices
 
13 update on nea tdb activities and phase vi priorities zavarin llnl-pres-731735
13 update on nea tdb activities and phase vi priorities zavarin llnl-pres-73173513 update on nea tdb activities and phase vi priorities zavarin llnl-pres-731735
13 update on nea tdb activities and phase vi priorities zavarin llnl-pres-731735
 
M4 sf 18sn010303061 8th us german 020918 lac reduced sand2018-1339r
M4 sf 18sn010303061 8th us german 020918 lac reduced sand2018-1339rM4 sf 18sn010303061 8th us german 020918 lac reduced sand2018-1339r
M4 sf 18sn010303061 8th us german 020918 lac reduced sand2018-1339r
 
ReidN_PhD_Defense
ReidN_PhD_DefenseReidN_PhD_Defense
ReidN_PhD_Defense
 
Kenoye_Muzan_Ekpelu_CV_April_2016
Kenoye_Muzan_Ekpelu_CV_April_2016Kenoye_Muzan_Ekpelu_CV_April_2016
Kenoye_Muzan_Ekpelu_CV_April_2016
 
Omdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital AgeOmdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital Age
 
Debecker europacat 2017 mesoporous sn si aerosol ethyl lactate
Debecker europacat 2017 mesoporous sn si aerosol ethyl lactateDebecker europacat 2017 mesoporous sn si aerosol ethyl lactate
Debecker europacat 2017 mesoporous sn si aerosol ethyl lactate
 
Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...
Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...
Synthetically Accessible Virtual Inventory (SAVI) : Reaction generation and h...
 
Waste conversion of the future, operating facility in France
Waste conversion of the future, operating facility in FranceWaste conversion of the future, operating facility in France
Waste conversion of the future, operating facility in France
 
QbR---.pptx
QbR---.pptxQbR---.pptx
QbR---.pptx
 

Mehr von NextMove Software

Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...NextMove Software
 
A de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESA de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESNextMove Software
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionNextMove Software
 
Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...NextMove Software
 
Comparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsComparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsNextMove Software
 
Recent improvements to the RDKit
Recent improvements to the RDKitRecent improvements to the RDKit
Recent improvements to the RDKitNextMove Software
 
Digital Chemical Representations
Digital Chemical RepresentationsDigital Chemical Representations
Digital Chemical RepresentationsNextMove Software
 
Challenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsChallenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsNextMove Software
 
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...NextMove Software
 
Building on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesBuilding on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesNextMove Software
 
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...NextMove Software
 
Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)NextMove Software
 
Automatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patentsAutomatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patentsNextMove Software
 
CINF 29: Visualization and manipulation of Matched Molecular Series for decis...
CINF 29: Visualization and manipulation of Matched Molecular Series for decis...CINF 29: Visualization and manipulation of Matched Molecular Series for decis...
CINF 29: Visualization and manipulation of Matched Molecular Series for decis...NextMove Software
 

Mehr von NextMove Software (15)

DeepSMILES
DeepSMILESDeepSMILES
DeepSMILES
 
Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...Building a bridge between human-readable and machine-readable representations...
Building a bridge between human-readable and machine-readable representations...
 
A de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILESA de facto standard or a free-for-all? A benchmark for reading SMILES
A de facto standard or a free-for-all? A benchmark for reading SMILES
 
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs RevolutionRecent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
Recent Advances in Chemical & Biological Search Systems: Evolution vs Revolution
 
Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...Can we agree on the structure represented by a SMILES string? A benchmark dat...
Can we agree on the structure represented by a SMILES string? A benchmark dat...
 
Comparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule ImplementationsComparing Cahn-Ingold-Prelog Rule Implementations
Comparing Cahn-Ingold-Prelog Rule Implementations
 
Recent improvements to the RDKit
Recent improvements to the RDKitRecent improvements to the RDKit
Recent improvements to the RDKit
 
Digital Chemical Representations
Digital Chemical RepresentationsDigital Chemical Representations
Digital Chemical Representations
 
Challenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptionsChallenges and successes in machine interpretation of Markush descriptions
Challenges and successes in machine interpretation of Markush descriptions
 
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
CINF 17: Comparing Cahn-Ingold-Prelog Rule Implementations: The need for an o...
 
Building on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfilesBuilding on Sand: Standard InChIs on non-standard molfiles
Building on Sand: Standard InChIs on non-standard molfiles
 
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
Chemical Structure Representation of Inorganic Salts and Mixtures of Gases: A...
 
Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)Advanced grammars for state-of-the-art named entity recognition (NER)
Advanced grammars for state-of-the-art named entity recognition (NER)
 
Automatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patentsAutomatic extraction of bioactivity data from patents
Automatic extraction of bioactivity data from patents
 
CINF 29: Visualization and manipulation of Matched Molecular Series for decis...
CINF 29: Visualization and manipulation of Matched Molecular Series for decis...CINF 29: Visualization and manipulation of Matched Molecular Series for decis...
CINF 29: Visualization and manipulation of Matched Molecular Series for decis...
 

Kürzlich hochgeladen

Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformationAreesha Ahmad
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxSuji236384
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICEayushi9330
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Youngkajalvid75
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Silpa
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professormuralinath2
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsOrtegaSyrineMay
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...chandars293
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxseri bangash
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 

Kürzlich hochgeladen (20)

Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai YoungDubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
Dubai Call Girls Beauty Face Teen O525547819 Call Girls Dubai Young
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 

RDKit: Six Not-So-Easy Pieces [RDKit UGM 2016]

  • 1. Six Not-So-Easy Pieces: Tautomers, Nucleic Acids, Inorganics, Reactions, Polymers and SMIRKS (in RDKit). Roger Sayle, John Mayfield, Noel O’Boyle NextMove Software, Cambridge, UK 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 2. Quick overview 1. Tautomers • Tautomer matching without enumeration 2. Nucleic Acids • RDKit support for RNA and DNA in PDB, HELM and FASTA. 3. Inorganics • Biovia’s valence table revisions 4. Reactions • Some comments on expert rule sets 5. Polymers • InChI’s recent support for polymers 6. SMIRKS • Standardizing interpretation/semantics of SMIRKS 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 3. 1. tautomers • Tautomers are molecular isomers that easily interconvert by migration of hydrogen atoms1. 1. R. Sayle, “So you think you understand tautomerism?”, JCAMD, 24(6-7):485-496, June 2010. 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016 InChI=1S/C16H12N20/c19-16-11-10-15(13-8-4-5-9-14(13)16)18-17-12-6-2-1-3-7-12/h1-11,19H InChI=1S/C16H12N20/c19-16-11-10-15(13-8-4-5-9-14(13)16)18-17-12-6-2-1-3-7-12/h1-11,17H
  • 4. Friends of tautomers: mesomers & protomers InChI=1S/C2H4OS/c1-2(3)4/h1H3,(H,3,4) InChI=1S/C2H4OS/c1-2(3)4/h1H3,(H,3,4)/p-1
  • 5. Paolo’s solution: enumeration 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 6. Enumeration vs. unumeration • Given a generator function F, which is a one-to-many mapping, many types of problem ask if F(X) == Y. • Frequently, an efficient solution is to find the inverse many-to-one mapping, and ask if F-1(Y) == X. 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016 X F (X)1 F (X)2 F (X)3 Y=
  • 7. Alternative approach • Automatically transform – NC(=N)c1ccc(cc1)C(=O)O • Into the bond order, formal charge and polar hydrogen count suppressed SMARTS pattern – [#7D1]~[#6H0D3](~[#7D1])~[#6H0D3]1~[#6H1D2]~[#6H1D 2]~[#6H0D3](~[#6H1D2]~[#6H1D2]~1)~[#6H0D3](~[#8D1]) ~[#8D1] • Plus additional constraints on the total formal charge and the total polar hydrogen count. – polar_hcount – total_charge = 4 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 8. 2. Nucleic acids • Support for DNA and RNA sequences in PDB, HELM and FASTA will shortly be contributed to RDKit. • Only a minor set of changes to RDKit’s APIs: – RDKit::SequenceToMol(string,sanitize,flavor) – RDKit::FASTAToMol(string,sanitize,flavor) – RDKit::PDBBlockToMol(string) – RDKit::HELMToMol(string) – RDKit::MolToSequence(mol) – RDKit::MolToFASTA(mol) – RDKit::MolToPDBBlock(mol) – RDKit::MolToHELM(mol) 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 9. Biovia vs. pistoia helm issues • RDKit’s implementation doesn’t have Biovia’s bugs. • The monomer phase problem in RNA registration: – Sequence: GATTACA – Interpretation: 5’-G-A-T-T-A-C-A-3’ – Implied phosphates: 5’-G-P-A-P-T-P-T-P-A-P-C-P-A-3’ – IUMB/PDB/Biovia: G-PA-PT-PT-PA-PC-PA – Pistoia HELM: GP-AP-TP-TP-AP-CP-A • This discrepancy complicates RNA registration and leads to serious bugs in Biovia’s HELM support. 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 10. Rna bugs in biovia helm support • From Sequence – “ACGT” drawn in Biovia has 4 phosphates, but the HELM editor interprets this sequence as having three. – Biovia’s Mol file for this sequence places the phosphate at the 5’ end, but the HELM string it generates has it at the 3’. • HELM interpretation – RNA1{R(A).P.R(C)P.R(T)P.R(G)}$$$$ is exactly the same molecule as RNA1{R(A).PR(C).PR(T).PR(G)}$$$$ in HELM. – These strings have very different behaviour in Biovia, neither of which has the correct connection table. 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 11. Expanded SCSR file select molfile(mol('RNA1{R(A)P.R(T)P.R(C)P.R(G)}$$$$')) from dual; 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 12. Rdkit helm implementation 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016 No Caps Flavor =2 5’-Cap Flavor = 3 3’-Cap Flavor = 4 Both Caps Flavor = 5
  • 13. 3. inorganics • Sanitization failures: Insane or insanitary? • Great recent advances in RDKit. – Hexafluorophosphate – Dess-Martin Periodinane • Perhaps exceptions for remaining “platypi”. – Chlorine Dioxide O=[Cl]=O 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 14. Mdl molfile-ageDdon • Biovia 2017 changes the interpretation of MDL files. • This affects over 213097 CIDs in PubChem! 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 15. Heavy metals (it goes beyond 111) Atomic RDKit RDKit Toolkit Biovia Number Symbol IUPAC Ptable SMILES Heaviest 2017 SMARTS 103 Lr Y Y Indigo Lr 104 Rf Y Y ‘Rf’ 105 Db Y ‘Db’ 106 Sg Y ‘Sg’ 107 Bh Y ‘Bh’ 108 Hs Y ‘Hs’ 109 Mt Y ChemAxon ‘Mt’ 110 Ds Y ‘Ds’ 111 Rg Y OEChem ‘Rg’ 112 Cn Y OpenBabel ‘Cn’ 113 Nh 2016? N&h 114 Fl ‘Fl’ 115 Mc 2016? 116 Lv CDK ‘Lv’ 117 Ts 2016? 118 Og 2016? 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 16. 4. reactions • Virtual library enumerations are overly optimistic. – SAVI, SCUBIDOO, Enamine, PLC, Evotec, Novartis? • Transformations vs. Reactions is a major challenge. – Which reactions to use? – Reactions that do happen! – Reactions that don’t happen! 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 17. Which reactions? • NCI/LHASA SAVI (14+6 reactions) – Suzuki Coupling 36262 out of 1.2M USPTO examples – Sulfonamide Schotten-Baumann 14348 out of 1.2M USPTO examples – Buchwald-Harwig 6040 out of 1.2M USPTO examples – Hiyama coupling 458 out of 1.2M USPTO examples – Fukuyama coupling 2 out of 1.2M USPTO examples – Liebeskind-Srogl coupling 0 out of 1.2M USPTO examples • Hartenfeller (58 reactions) – #1 Pictet-Spengler reaction 7 out of 1.2M USPTO examples – #10+ #11 Azide-nitrile Huisgen-cycloaddition 5 out of 1.2M USPTO examples – #17 Pyridone synthesis 2 out of 1.2M USPTO examples – #20 Phthalazinone synthesis 16 out of 1.2M USPTO examples – #24 Friedlander quinoline synthesis 30 out of 1.2M USPTO examples • Enamine REAL (43 reactions) – Thiourea to guanidine (14 out of 160M examples). 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 18. Reactions that do happen! • Chloro Sonogashira Couplings (@ NCI) – The LHASA ‘CHMTRN’ rules for Transform 2267, Sonogashira couplings, (presented by Marc Nicklaus at Sheffield) states “Iodides are usually more reactive than bromide. Chlorides do not react”. Code Name Count Yield 3.3.2 Bromo Sonogashira coupling 3717 49.3% 3.3.3 Chloro Sonogashira coupling 429 44.2% 3.3.4 Iodo Sonogashira coupling 2721 64.9% • Isotopically-labelled Compounds (@Eli Lilly) – The LAAR reactions used to construct Lilly’s PLC (Nicolaou et al. 2016) forbid the presence of isotopic labels in reactants. 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 19. Reactions that don’t happen! • Paal-Knorr Pyrrole Synthesis vs. Aldehydes/Ketones Of the 430 examples of Paal-Knorr pyrrole synthesis reported in US patent applications 2001-2012, exactly zero have more than the two reacting ketones/aldehydes. 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 20. 5. polymers • The InChI Trust has added polymer support to InChI. • Or has it? 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016 InChI=1B/C2H4O/c1-2-3-1/h1-2H2/z101-1- 3(1,2,1,3,2,3) InChIKey=IAYPIBMASNFSPL-GCGQHNKHBA-N InChI=1B/C4H8O2/c1-2-6-4-3-5-1/h1-4H2/z101-1- 6(1,2,1,5,2,6,3,4,3,5,4,6) InChIKey=RYHBNJHYFVUHQT-UEXHMUNTBA-N Experimental Beta Option -Polymers
  • 21. INChI PolymerS capping Group Frame Shift InChI=1B/C4H8N2O3/c5-1-3(7)6-2-4(8)9/h1- 2,5H2,(H,6,7)(H,8,9)/z101-1,3,6-7(2-6,5-1) InChIKey=YMAWOPBAYDPSLA-NPFRMHEKBA-N InChI=1B/C4H8N2O3/c5-1-3(7)6-2-4(8)9/h1- 2,5H2,(H,6,7)(H,8,9)/z101-2-3,6-7(1-3,4-2) InChIKey=YMAWOPBAYDPSLA-LCMLVUGIBA-N InChI=1B/C4H8N2O3/c5-1-3(7)6-2-4(8)9/h1- 2,5H2,(H,6,7)(H,8,9)/z101-2,4,6,8(3-6,9-4) InChIKey=YMAWOPBAYDPSLA-YYSJEQPSBA-N 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 22. 6. smirks • SMIRKS is Daylight’s Reaction Transform Language. • SMILES is nearly portable, SMARTS is improving, but... • The RSC’s CVSP platform contains rules such as: [N-,P-:1][C:2]=[N+,P+:3]>>[N,P:1]=[C:2][N,P:3] 1. Daylight SMIRKS requires SMILES on the RHS, but allows SMARTS on the LHS. 2. In SMIRKS, a property is matched/modified if specified. [N-,P-:1][C:2]=[N+,P+:3]>>[*+0:1]=[C:2][*+0:3] 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 23. Need an Opensmirks standard? • Product Primitives – SMILES Primitives • #N Atomic Numbers • N Isotopic Mass • +N/-N Formal Charge – SMARTS Primitives • hN/HN (Implicit) Hydrogen Count • A/a Aliphatic/Aromatic 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016
  • 24. acknowledgements • The RDKit Hackers at Novartis (and T5) – Greg Landrum – Nadine Schneider – Joann Prescott-Roy • The team at NextMove Software – John Mayfield – Noel O’Boyle – Daniel Lowe • And many thanks for your time! 5th RDKit User Group Meeting, Novartis, Basel, Switzerland, Thursday 27th October 2016