Leah R. McEwen & Alex M. Clark
Mixtures
informatics for formulations and consumer products
Who
✤ InChI Trust / IUPAC

‣ https://www.inchi-trust.org

‣ Mixtures InChI notation

✤ Collaborative Drug Discovery

‣ https://collaborativedrug.com

‣ Mixfiles & tools
Introduction
✤ Cheminformatics has 40 years of practice representing abstract molecules

✤ Very successful applications for pharmaceutical drug discovery
✤ But the reality of chemicals in the lab is that

‣ nothing is ever completely pure

‣ most activities involve explicitly mixing chemicals

‣ mixtures are crafted for specific purposes

✤ Yet there is no standard way to describe mixtures
SMILES: CC(=O)OC1=CC=CC=C1C(=O)O
InChI: InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)
Molfile: [Molfile block not shown]
Mixfile/MInChI
✤ Format needs to be:

‣ hierarchical

‣ embed structures when possible

‣ include concentration information

‣ tolerate uncertainty

✤ More verbose ELN-friendly form is Mixfile

✤ Concise form with canonical components is
MInChI (mixtures InChI)
MInChI=0.00.1S/C4H8O/c1-2-4-5-3-1/h1-4H2&C6H12/c1-6-4-2-3-5-6/h6H,2-5H2,1H3&C6H14/c1-3-5-6-4-2/h3-6H2,1-2H3&C6H14/c1-4-5-6(2)3/h6H,4-5H2,1-3H3&C6H14/c1-4-6(3)5-2/h6H,4-5H2,1-3H3&C6H14N.Li/c1-5(2)7-6(3)4;/h5-6H,1-4H3;/q-1;+1/n{6&{1&{3&2&4&5}}}/g{1mr0&{1vp0&{5:7pp1&1:2pp1&1:5pp0&1:5pp0}7vp0}}
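As a rough illustration of how the layers of the MInChI string above fit together, the sketch below splits it into component InChI bodies, the /n indexing layer, and the /g concentration layer, then turns the nested braces of the indexing layer into nested lists. The function names are invented here, the parsing rules are inferred from this single example rather than taken from the MInChI specification, and the concentration codes are deliberately left uninterpreted.

```python
# Minimal sketch: split one MInChI string into its layers.
# Assumption: the layout is inferred from the single example on this slide,
# not from the MInChI specification. Function names are hypothetical.

MINCHI = (
    "MInChI=0.00.1S/C4H8O/c1-2-4-5-3-1/h1-4H2&C6H12/"
    "c1-6-4-2-3-5-6/h6H,2-5H2,1H3&C6H14/c1-3-5-6-4-2/"
    "h3-6H2,1-2H3&C6H14/c1-4-5-6(2)3/h6H,4-5H2,1-3H3&C6H14/"
    "c1-4-6(3)5-2/h6H,4-5H2,1-3H3&C6H14N.Li/c1-5(2)7-6(3)4;/"
    "h5-6H,1-4H3;/q-1;+1/n{6&{1&{3&2&4&5}}}/"
    "g{1mr0&{1vp0&{5:7pp1&1:2pp1&1:5pp0&1:5pp0}7vp0}}"
)


def split_minchi(minchi):
    """Return (prefix, components, index_layer, conc_layer)."""
    prefix, _, rest = minchi.partition("/")
    body, _, tail = rest.partition("/n{")
    index_layer, _, conc_layer = ("{" + tail).partition("/g")
    return prefix, body.split("&"), index_layer, conc_layer


def parse_index_layer(layer):
    """Convert '{6&{1&{3&2&4&5}}}' into [6, [1, [3, 2, 4, 5]]]."""
    stack = [[]]
    token = ""
    for ch in layer:
        if ch == "{":
            stack.append([])
        elif ch in "&}":
            if token:
                stack[-1].append(int(token))
                token = ""
            if ch == "}":
                finished = stack.pop()
                stack[-1].append(finished)
        else:
            token += ch
    return stack[0][0]


prefix, components, index_layer, conc_layer = split_minchi(MINCHI)
hierarchy = parse_index_layer(index_layer)

for number, component in enumerate(components, start=1):
    print(number, component.split("/")[0])  # formula of each component
print(hierarchy)
```

Read this way, the indexing layer groups component 6 with a branch that holds component 1 and a four-member sub-branch, which is the kind of hierarchy the format is meant to carry.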
Formulation Example
✤ Many consumer products are well described
from a chemical perspective

✤ Some components are more easily defined
than others

✤ When structure is not available, can use
external identifiers

✤ Hierarchy encodes information about the
design of the product

✤ Concentrations can be expressed with
uncertainties
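The points above can be sketched as a small tree of records, where each node may carry a structure, external identifiers, and a concentration with an uncertainty. This is only an illustration: the field names are assumptions chosen for readability and are not the Mixfile schema, and the product, names, and numbers are invented placeholders.

```python
# Hypothetical data model for a hierarchical mixture.
# Field names are illustrative assumptions, not the Mixfile schema.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    name: str
    inchi: Optional[str] = None       # structure, when one is known
    identifiers: dict = field(default_factory=dict)  # e.g. {"INCI": "..."}
    quantity: Optional[float] = None  # concentration value, if stated
    error: Optional[float] = None     # +/- uncertainty on quantity
    units: Optional[str] = None
    contents: list = field(default_factory=list)     # sub-components


def leaves(node):
    """Yield every component that has no sub-components."""
    if not node.contents:
        yield node
    for child in node.contents:
        yield from leaves(child)


# Invented example product; all names and numbers are placeholders.
product = Component(
    name="example cleaning product",
    contents=[
        Component("water", inchi="InChI=1S/H2O/h1H2",
                  quantity=80.0, error=5.0, units="%"),
        Component("surfactant blend", quantity=15.0, units="%", contents=[
            Component("surfactant A", identifiers={"INCI": "COCAMIDE DEA"}),
            Component("surfactant B"),
        ]),
        Component("fragrance"),  # neither structure nor concentration known
    ],
)

for leaf in leaves(product):
    if leaf.inchi:
        known = "structure"
    elif leaf.identifiers:
        known = "identifier"
    else:
        known = "name only"
    print(f"{leaf.name}: {known}")
```

Keeping the blend as its own branch preserves how the product was put together, while the optional fields let a record say only as much as is actually known.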
Design of Mixtures
✤ Each branch is a thing

✤ Each leaf is a concept
✤ Layout can correspond to how the
mixture is formulated
Knowledge Capture
✤ Capture what we know about the mixture, and nothing more

‣ ideally each leaf node has well defined
structure & precise concentration

‣ the closer we get to this, the more analysis we
can do

✤ Concentrations often variable, unknown, vague,
or implied

✤ Structure(s) can be hard to pin down...

‣ ... not always a single, well defined, easy to
draw molecule
Cerium(IV) and Perchloric acid, Etchant Solution, Ce(IV) concentration 0.22N
1,2-Diphenylcyclopropane, cis + trans, 97%
1,3,5,7-Cyclooctatetraene, 98%, stab. with 0.1% Hydroquinone
Lithium bis(trimethylsilyl)amide, 20% (ca 1.06M) soln. in THF/ethylbenzene, packaged in resealable septum cap bottle
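To make the point about vague or implied concentrations concrete, here is a minimal sketch that pulls concentration-like fragments out of the four catalogue names above. The regular expression is an assumption that covers only the notations seen in these names (percent, "M", "N", and an optional "ca"); it is not a general parser, and the function name is invented.

```python
import re

# Assumption: this pattern only covers the notations seen in the four
# catalogue names above (percent, molarity "M", normality "N", optional "ca").
CONC = re.compile(r"(?:\b(ca)\.?\s*)?(\d+(?:\.\d+)?)\s*(%|[MN]\b)")


def find_concentrations(name):
    """Return (relation, value, unit) tuples; '~' marks an approximate value."""
    return [("~" if approx else "=", float(value), unit)
            for approx, value, unit in CONC.findall(name)]


NAMES = [
    "Cerium(IV) and Perchloric acid, Etchant Solution, "
    "Ce(IV) concentration 0.22N",
    "1,2-Diphenylcyclopropane, cis + trans, 97%",
    "1,3,5,7-Cyclooctatetraene, 98%, stab. with 0.1% Hydroquinone",
    "Lithium bis(trimethylsilyl)amide, 20% (ca 1.06M) soln. in "
    "THF/ethylbenzene, packaged in resealable septum cap bottle",
]

for name in NAMES:
    print(find_concentrations(name))
```

The extracted numbers still say nothing about which component they belong to, what kind of percentage is meant, or what the remainder is. That missing context is what a hierarchical mixture record is meant to hold.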
Properties
density 0.719 g/mL
density 0.79 g/mL, CAS 4111-54-0
b.p. 65-67°C, density 0.889 g/mL, CAS 109-99-9
b.p. 68-70°C, density 0.672 g/mL, CAS 110-54-3
b.p. 72°C, density 0.748 g/mL, CAS 96-37-7
b.p. 60-62°C, density 0.653 g/mL, CAS 107-83-5
b.p. 63-64°C, density 0.664 g/mL, CAS 96-14-0
b.p. 68-70°C, density 0.672 g/mL, CAS 73513-42-5
✤ Metadata is attached to a position:

‣ root = the whole thing

‣ leaf = individual component

‣ branch = several components
Structures by External Definition
✤ Sometimes have to resort to describing a component by method of
preparation, means of extraction, measured properties, etc.

✤ External database identifiers can be useful:

‣ CASRN: Chemical Abstracts literature extraction

‣ INCI: International Nomenclature of Cosmetic Ingredients

‣ UNII: Food & Drug Administration database

✤ Database identifiers are not ideal for machine readability, but they can be used
to establish equivalence between components.
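One simple reading of "establish equivalence" is that two component records refer to the same thing if they share a structure or share a value under the same identifier scheme. The sketch below assumes plain dictionaries with invented keys. The CASRN and InChI are values that appear elsewhere in this deck, and pairing them for tetrahydrofuran is my own reading rather than something the slides state.

```python
# Minimal sketch: treat two records as equivalent when they share an InChI
# or any identifier under the same scheme. Record keys are assumptions.

def equivalent(a, b):
    """True if two component records share a structure or an identifier."""
    if a.get("inchi") and a.get("inchi") == b.get("inchi"):
        return True
    ids_b = b.get("identifiers", {})
    return any(ids_b.get(scheme) == value
               for scheme, value in a.get("identifiers", {}).items())


thf_drawn = {
    "name": "THF",
    "inchi": "InChI=1S/C4H8O/c1-2-4-5-3-1/h1-4H2",
    "identifiers": {"CASRN": "109-99-9"},
}
thf_catalogue = {
    "name": "tetrahydrofuran",  # no structure on file, identifier only
    "identifiers": {"CASRN": "109-99-9"},
}
surfactant = {
    "name": "surfactant",
    "identifiers": {"INCI": "COCAMIDE DEA"},
}

print(equivalent(thf_drawn, thf_catalogue))
print(equivalent(thf_catalogue, surfactant))
```

A real system would also need to normalise identifier formatting and decide how much to trust each scheme; none of that is attempted here.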
Comparisons with Structures
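[Figure: example comparisons between components; structure drawings not reproduced]
InChI=1S/C7H6O3/c8-6-4-2-1-3-5(6)7(9)10/h1-4,8H,(H,9,10) ≡ [structure]
[structure] substructure of [structure]
[polymer, MW 300-500] ≅ [polymer, MW 400-700]
Y1.2Ba0.8CuO4 ≅ YBa2Cu3O7−δ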
Search Queries
[Figure: query conditions: has (>40%); has INCI: COCAMIDE DEA; has not substructure]
✤ Looking for a certain subset of
external cleaning surfactants,
phosphate-free
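A query of this shape can be sketched as three predicates combined over a collection of mixture records. Everything in the data below is invented for illustration, the record keys are assumptions, and the "has not substructure" condition is replaced by a much cruder element check on the formula, since a real substructure search would need a cheminformatics toolkit.

```python
import re

# Hypothetical records: names, formulas, and percentages are invented.
PRODUCTS = [
    {"name": "product 1", "contents": [
        {"name": "water", "formula": "H2O", "percent": 60},
        {"name": "surfactant", "identifiers": {"INCI": "COCAMIDE DEA"}},
    ]},
    {"name": "product 2", "contents": [
        {"name": "water", "formula": "H2O", "percent": 50},
        {"name": "surfactant", "identifiers": {"INCI": "COCAMIDE DEA"}},
        {"name": "builder", "formula": "Na3O4P", "percent": 5},
    ]},
    {"name": "product 3", "contents": [
        {"name": "water", "formula": "H2O", "percent": 30},
        {"name": "surfactant", "identifiers": {"INCI": "COCAMIDE DEA"}},
    ]},
]


def walk(node):
    """Yield a node and all of its descendants."""
    yield node
    for child in node.get("contents", []):
        yield from walk(child)


def has_identifier(mixture, scheme, value):
    return any(n.get("identifiers", {}).get(scheme) == value
               for n in walk(mixture))


def has_component_above(mixture, percent):
    return any(n.get("percent", 0) > percent for n in walk(mixture))


def contains_element(mixture, symbol):
    # Crude stand-in for substructure search: element symbol in the formula.
    pattern = re.compile(re.escape(symbol) + r"(?![a-z])")
    return any(pattern.search(n.get("formula", "")) for n in walk(mixture))


hits = [p["name"] for p in PRODUCTS
        if has_component_above(p, 40)
        and has_identifier(p, "INCI", "COCAMIDE DEA")
        and not contains_element(p, "P")]
print(hits)
```

Because each condition is evaluated over the whole tree, the same query works whether the surfactant sits at the top level or inside a nested blend.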
Informatics Example
✤ Solubility of theophylline

✤ Often delivered in liquid form with mixed solvents: optimising
proportion of drug is important

✤ Consider a scenario where:

‣ all data was provided in Mixtures InChI form

‣ these data exist in openly available repositories

✤ Query:

‣ check that theophylline is present and has concentration

‣ check that other ingredients are solvents

✤ Consider 4 papers with relevant solubility, published over 20 years...
[Figure: theophylline; nasal anti-inflammatory]
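The query described above can be sketched as a filter over flattened mixture records. The record layout and function name are assumptions, the solvent list is a tiny placeholder, and the theophylline InChI is quoted from memory, so it should be checked against a trusted source before being relied on.

```python
# Assumed standard InChI for theophylline; verify before use.
THEOPHYLLINE = ("InChI=1S/C7H8N4O2/c1-10-5-4(8-3-9-5)6(12)11(2)7(10)13/"
                "h3H,1-2H3,(H,8,9)")

# Placeholder solvent whitelist; a real one would be much longer.
SOLVENTS = {
    "InChI=1S/H2O/h1H2",                  # water
    "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",  # ethanol
}


def usable_for_solubility(components):
    """components: list of dicts with 'inchi' and an optional 'quantity'.

    Accept a mixture only if theophylline is present exactly once with a
    stated concentration and every other ingredient is a known solvent.
    """
    drug = [c for c in components if c.get("inchi") == THEOPHYLLINE]
    others = [c for c in components if c.get("inchi") != THEOPHYLLINE]
    if len(drug) != 1 or drug[0].get("quantity") is None:
        return False
    return all(c.get("inchi") in SOLVENTS for c in others)


# Invented records; the quantity is a placeholder, not a literature value.
good = [
    {"inchi": THEOPHYLLINE, "quantity": 1.0},
    {"inchi": "InChI=1S/H2O/h1H2"},
    {"inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"},
]
no_quantity = [{"inchi": THEOPHYLLINE}, {"inchi": "InChI=1S/H2O/h1H2"}]
unknown_extra = good + [{"name": "unidentified excipient"}]

for record in (good, no_quantity, unknown_extra):
    print(usable_for_solubility(record))
```

Matching on canonical InChI strings is what makes this kind of check possible across papers that name the same compounds differently.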
Paper #1
✤ Valizadeh et al, Adv. Pharm. Bull. (2011), DOI 10.5681/apb.2011.003
Paper #2
✤ Yan et al, J. Chem. Eng. Data (2017), DOI 10.1021/acs.jced.7b00065
Paper #3
✤ Martínez et al, J. Solution Chem. (2017), DOI 10.1007/s10953-017-0666-z
Paper #4
✤ Campisi et al, J. Pharm. Biomed. Anal. (1998), DOI 10.1016/S0731-7085(98)00175-7
+ 14 more measurements
All Together for QSAR
Solubility   Solvent fractions
0.699        1
15.19        1
1.04         1
3.142        1
0.784        1
0.91         1
6.3          1
13.7         1
11.6         1
13.58        1
6.73         1
9.3          1
8.20         0.8 0.2
16.38        0.5 0.5
13.60        0.2 0.8
(+8 more similar)
15.39        0.333 0.667
26.6         0.5 0.5
17.97        0.083 0.584 0.333
19.06        0.708 0.292
22.59        0.417 0.25 0.333
26.52        0.283 0.25 0.3 0.167
(+14 more similar)
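For modelling, each measurement has to become a fixed-length row, with one column per solvent and zeros where a solvent is absent. The sketch below shows that step under stated assumptions: the solvent labels are placeholders, and although the numbers are taken from the table above, their pairing with particular solvents is invented because the column headings are not available here.

```python
# Sketch: turn per-measurement solvent fractions into fixed-width rows.
# Solvent labels are placeholders; the fraction-to-solvent pairing is invented.

def to_matrix(records, solvents, tolerance=0.01):
    """Return one row per record, with fractions in the order of `solvents`."""
    rows = []
    for fractions in records:
        total = sum(fractions.values())
        if abs(total - 1.0) > tolerance:
            raise ValueError(f"fractions sum to {total}, expected 1")
        rows.append([fractions.get(s, 0.0) for s in solvents])
    return rows


SOLVENT_COLUMNS = ["solvent_1", "solvent_2", "solvent_3"]

solubility = [8.20, 16.38, 17.97]
compositions = [
    {"solvent_1": 0.8, "solvent_2": 0.2},
    {"solvent_1": 0.5, "solvent_2": 0.5},
    {"solvent_1": 0.083, "solvent_2": 0.584, "solvent_3": 0.333},
]

features = to_matrix(compositions, SOLVENT_COLUMNS)
for y, row in zip(solubility, features):
    print(y, row)
```

The check that fractions sum to one is a cheap guard against rows where a solvent was dropped or misassigned during data collection.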
Source Data & Inventory
✤ Gather public content like INCI and UNII:

✤ Chemical Abstracts in mixture form would be nice

✤ Reagents and materials from vendors: could search, copy, paste from site
Hazards
✤ Automated lookup of hazard classifications, toxicity data, etc. needs
structures and mixture context... each of these is not like the other
✤ Machine readability and open access are both major hurdles
Longevity
✤ Any ELN, private registration system or public database:

‣ capture data in machine readable form

‣ if a machine can understand it, so can a human

‣ if standards are followed, data will always be interpretable

‣ data can be shared as much or as little as needed

✤ Sophisticated queries and analysis become possible

✤ Institutional knowledge does not evaporate

✤ An open ecosystem means that tools will evolve

‣ tools can be free or proprietary, general purpose or specific
Questions?
✤ Contact:

‣ Leah R. McEwen lrm1@cornell.edu (Cornell University, IUPAC/InChI Trust)

‣ Alex M. Clark alex@collaborativedrug.com (Collaborative Drug Discovery)
Journal of Cheminformatics (2019), DOI 10.1186/s13321-019-0357-4
