Evolution of open chemical information
Valery Tkachenko
Royal Society of Chemistry
ACS Fall 2016
Philadelphia, PA
The Short History of Time
Image credit: Rhys Taylor, Cardiff University
~1992
Chemical database
PubChem
• 57 million chemicals and growing
• Data sourced from >500 different sources
• Crowdsourced curation and annotation
• Ongoing deposition of data from our journals and our collaborators
• A structure-centric hub for web searching
ChemSpider
ChemSpider real-time curation
Article X-ray
Compounds
Reaction
Analytical Data
Text and References
Reaction 1: NextMove reaction text-mined from RSC archive – original article
Reaction 1: NextMove reaction text-mined from RSC archive – CML output
<?xml version="1.0" encoding="UTF-8"?>
<reactionList xmlns="http://www.xml-cml.org/schema" xmlns:cmlDict="http://www.xml-cml.org/dictionary/cml/" xmlns:nameDict="http://www.xml-cml.org/dictionary/cml/name/"
xmlns:unit="http://www.xml-cml.org/unit/" xmlns:cml="http://www.xml-cml.org/schema" xmlns:dl="http://bitbucket.org/dan2097">
<reaction>
<dl:source>
<dl:documentId>c3ra45871g</dl:documentId>
<dl:paragraphText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL)
at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL), concentrated. The residue
was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product
was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid. [α]D20 −24.2 (c 1.1, CHCl3); 1H NMR
(CDCl3, 300 MHz) δ 0.04 (s, 3H), 0.07 (s, 3H), 0.85 (s, 9H), 1.34 (s, 3H), 1.44 (s, 3H), 2.16 (br, 1H), 3.68–3.81 (m, 3H), 4.16 (t, J = 13.8 Hz, J = 13.8 Hz, 1H), 4.59 (t, J = 6.6 Hz, J = 6.6
Hz, 1H), 5.22 (d, J = 10.7 Hz, 1H), 5.34 (d, J = 17.1 Hz, 1H), 5.90 (ddd, J = 7.2 Hz, J = 10.2 Hz, J = 17.2 Hz, 1H); 13C NMR (CDCl3, 75 MHz) δ 134.1, 118.4, 108.5, 79.5, 78.8, 70.8,
65.0, 27.8, 25.9, 25.4, 18.1, −3.7, −4.4. HRMS (ESI) calcd for [M + Na]+ (C15H30O4SiNa) 325.1811, found 325.1807.</dl:paragraphText>
</dl:source>
<dl:reactionSmiles>[H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C&gt;ClCCl&gt;[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1|</dl:reactionSmiles>
<productList>
<product role="product">
<molecule id="m0">
<name dictRef="nameDict:unknown">10</name>
<dl:nameResolved>(R)-2-((tert-Butyldimethylsilyl)oxy)-2-((4S,5S)-2,2-dimethyl-5-vinyl-1,3-dioxolan-4-yl)ethanol</dl:nameResolved>
</molecule>
<amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount>
<amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount>
<amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount>
<amount dl:propertyType="CALCULATEDPERCENTYIELD" dl:normalizedValue="79.1" units="unit:percentYield">79.1</amount>
<identifier dictRef="cml:smiles" value="C(C)(C)(C)[Si](O[C@H](CO)[C@H]1OC(O[C@H]1C=C)(C)C)(C)C"/>
<identifier dictRef="cml:inchi" value="InChI=1S/C15H30O4Si/c1-9-11-13(18-15(5,6)17-11)12(10-16)19-20(7,8)14(2,3)4/h9,11-13,16H,1,10H2,2-8H3/t11-,12+,13-/m0/s1"/>
<dl:entityType>definiteReference</dl:entityType>
<dl:appearance>colourless</dl:appearance>
<dl:state>liquid</dl:state>
</product>
</productList>
<reactantList>
<reactant role="reactant">
<molecule id="m1">
<name dictRef="nameDict:unknown">Diisobutylaluminium hydride</name>
</molecule>
<amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00323">3.23 mmol</amount>
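The `<dl:reactionSmiles>` value in the CML output above packs reactants, agents and products into one string separated by `>` (escaped as `&gt;` in XML). A minimal Python sketch of how that string could be taken apart is given here. The function name `split_reaction_smiles` is hypothetical, and reading the trailing `|f:0.1|` block as a ChemAxon-style extension that groups the first two fragments is an assumption, not something the slide states.

```python
from xml.sax.saxutils import unescape

# Reaction SMILES copied from the <dl:reactionSmiles> element on the slide,
# still carrying the XML-escaped '>' separators.
RAW = (
    "[H-].C([Al+]CC(C)C)C(C)C."
    "C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])"
    "([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])"
    "([CH3:27])[O:21]1)(=O)C(C)(C)C"
    "&gt;ClCCl&gt;"
    "[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]"
    "([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])"
    "([CH3:36])([CH3:35])[CH3:34] |f:0.1|"
)


def split_reaction_smiles(raw: str) -> dict:
    """Split 'reactants>agents>products |extension|' into its parts.

    Hypothetical helper: plain string handling only, no chemistry toolkit.
    """
    text = unescape(raw)                 # turn &gt; back into >
    core, _, ext = text.partition(" |")  # separate the trailing |...| block
    reactants, agents, products = core.split(">")

    def fragments(side: str) -> list:
        # '.' separates disconnected fragments within one side
        return [frag for frag in side.split(".") if frag]

    return {
        "reactants": fragments(reactants),
        "agents": fragments(agents),
        "products": fragments(products),
        "extension": ext.rstrip("|"),
    }


parsed = split_reaction_smiles(RAW)
print(len(parsed["reactants"]), parsed["agents"], len(parsed["products"]))
```

Under the assumption above, the hydride and the aluminium cation (fragments 0 and 1) would together represent the diisobutylaluminium hydride reagent, and dichloromethane (`ClCCl`) appears as the only agent.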
Reaction 1: procedure steps
Diisobutylaluminium hydride (1.1 M in
cyclohexane, 2.93 mL, 3.23 mmol) was added
dropwise to the solution of 9 (500 mg, 1.29
mmol) and dichloromethane (20 mL) at −78 °C.
The reaction mixture was stirred at −78 °C for
another 2 h, warmed up to rt, quenched with
methanol (3 mL) and citric acid (aq) (w/w, 10%,
5 mL), concentrated. The residue was added
with water (10 mL) and extracted with
dichloromethane (12 mL × 3). The organic
layers were combined, dried over Na2SO4,
filtered and concentrated. The crude product
was further purified by column chromatography
(SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give
10 (308 mg, 1.02 mmol, 79%) as a colourless
liquid.
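The CML output reports both the stated yield (79%) and a `CALCULATEDPERCENTYIELD` of 79.1. A short Python sketch of the arithmetic that would reproduce those numbers from the procedure text follows. The function names are illustrative, the atomic masses are standard rounded values, and whether the text-mining software computes the figure exactly this way is an assumption.

```python
# Rounded standard atomic masses (g/mol)
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "Si": 28.085}


def molar_mass(formula_counts: dict) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula_counts.items())


def percent_yield(product_mmol: float, limiting_mmol: float) -> float:
    """Yield relative to the limiting reactant, in percent."""
    return 100.0 * product_mmol / limiting_mmol


# Product 10 is C15H30O4Si according to the InChI in the CML output.
mw_product = molar_mass({"C": 15, "H": 30, "O": 4, "Si": 1})

# 308 mg of product converted to mmol (mg divided by g/mol gives mmol).
product_mmol = 308 / mw_product

# Compound 9 (1.29 mmol) is taken as the limiting reactant.
calc_yield = percent_yield(1.02, 1.29)

print(round(product_mmol, 2))  # 1.02
print(round(calc_yield, 1))    # 79.1
```

Both results agree with the values quoted in the paragraph (1.02 mmol, 79%) and with the normalized value 79.1 in the CML.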
Text mining breaks down procedure summary into steps:
<dl:reactionActionList/dl:reactionActions> dl:phraseTexts
• action="Add": Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C
• action="Stir": The reaction mixture was stirred at −78 °C for another 2 h
• action="Heat": warmed up to rt
• action="Quench": quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)
• action="Concentrate": concentrated
• action="Add": The residue was added with water (10 mL)
• action="Extract": extracted with dichloromethane (12 mL × 3)
• action="Dry": dried over Na2SO4
• action="Filter": filtered
• action="Concentrate": concentrated
• action="Purify": The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33)
• action="Yield": to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid
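The action list above could be read back from the CML with a few lines of Python. This is only a sketch: the slide names `dl:reactionActionList`, `dl:reactionAction` and `dl:phraseText` but does not show the markup, so the nesting used in `SAMPLE` (one `dl:phraseText` child per `dl:reactionAction`, with the action type in an `action` attribute) is an assumption, and `list_actions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns:dl declaration in the CML output.
NS = {"dl": "http://bitbucket.org/dan2097"}

# Assumed layout, filled with the first four steps from the slide.
SAMPLE = """<dl:reactionActionList xmlns:dl="http://bitbucket.org/dan2097">
  <dl:reactionAction action="Add">
    <dl:phraseText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C</dl:phraseText>
  </dl:reactionAction>
  <dl:reactionAction action="Stir">
    <dl:phraseText>The reaction mixture was stirred at −78 °C for another 2 h</dl:phraseText>
  </dl:reactionAction>
  <dl:reactionAction action="Heat">
    <dl:phraseText>warmed up to rt</dl:phraseText>
  </dl:reactionAction>
  <dl:reactionAction action="Quench">
    <dl:phraseText>quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)</dl:phraseText>
  </dl:reactionAction>
</dl:reactionActionList>"""


def list_actions(xml_text: str) -> list:
    """Return (action, phrase) pairs in document order."""
    root = ET.fromstring(xml_text)
    steps = []
    for node in root.findall("dl:reactionAction", NS):
        phrase = node.findtext("dl:phraseText", default="", namespaces=NS)
        steps.append((node.get("action"), phrase.strip()))
    return steps


steps = list_actions(SAMPLE)
for number, (action, phrase) in enumerate(steps, start=1):
    print(f"{number}. {action}: {phrase}")
```

Keeping the steps as ordered pairs preserves the sequence of the procedure, which is what makes the mined text usable as a machine-readable protocol.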
http://www.wired.com/2014/04/google-project-ara/
http://www.wsj.com/articles/googles-modular-phones-to-go-on-sale-next-year-1463783371
The World we are heading into
http://www.gartner.com/newsroom/id/3143521
Our World is hyperconnected
Standards?
Data quality issues
Robochemistry
Proliferation of errors in public and
private databases
Automated quality control system
CVSP
CVSP – submission details
CVSP – issues review
J. Brecher, IUPAC
Graphical Representation of Stereochemical Configuration
Section: ST-1.1.10
DB06287
CVSP - mapping
CVSP – rules
Dimensions and complexity of science
D2I2K2W
info@openphactsfoundation.org @Open_PHACTS
Open PHACTS Practical Semantics
OpenPHACTS
GlaxoSmithKline – Coordinator
Universität Wien – Managing entity
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmbH
Consorci Mar Parc de Salut de Barcelona
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
Novartis
Merck Serono
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Janssen
Esteve
Almirall
OpenLink
SciBite
The Open PHACTS Foundation
Spanish National Cancer Research Centre
University of Manchester
Maastricht University
Aqnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
AstraZeneca
Pfizer
Why is it so hard to…
Competitors?
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known pathways?
Working on now?
Connections to disease?
Expressed in right cell type?
IP?
@gray_alasdair Big Data Integration
Knowledge is federated
Publishing – then…
…and now?
http://ec.europa.eu/research/press/2016/pdf/opendata-infographic_072016.pdf
Data Market
Publishers - the guardians of knowledge
This is a poster for Guardians of the Galaxy. The poster art copyright is believed to belong to the distributor of the Film, Walt Disney Studios Motion
Pictures, the publisher, Marvel Studios, or the graphic artist.
Data Publishing
Original artist: Joseph Ferdinand Keppler (1838-1894) Restoration: Adam Cuerden - http://www.loc.gov/pictures/item/2011661385/ by way of http://adamcuerden.deviantart.com/gallery/#/d5onmxh
The World we live in
Moore’s Law
"Internet host count history". Internet Systems Consortium. Retrieved May 16, 2012.
We are on the verge of a new technical revolution
and it feels great to anticipate it and be ready to ride!
Image from surfline.com by Mike Cianciulli
Data Science @ RSC
The team. From left to right: Valery Tkachenko and Alexey Pshenichnov, based in the United States;
Aileen Day, based in Southampton; John Boyle, Peter Corbett, Colin Batchelor, Jeff White, Nicholas
Bailey and Val the plant, based at TGH
Thank you
Email: tkachenkov@rsc.org
Slides:
http://www.slideshare.net/valerytkachenko16

Weitere ähnliche Inhalte

Andere mochten auch

Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.José María
 
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)aalvarez1410
 
Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010Karoliina Luoto
 
Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...Jan Korsanke
 
Nociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptualNociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptualylerin
 
Serverless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic StackServerless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic StackEdoardo Paolo Scalafiotti
 
Érzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzésÉrzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzésZoltan Varju
 
Nociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptualNociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptualMaría Herrera
 
OpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and LearningsOpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and LearningsValery Tkachenko
 
Postcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts AheadPostcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts AheadMafel Gorne
 
How to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DCHow to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DCRichard Harbridge
 
Historic Gay Rights Decision
Historic Gay Rights DecisionHistoric Gay Rights Decision
Historic Gay Rights Decisionmaditabalnco
 
Gradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, FasterGradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, FasterAndres Almiray
 

Andere mochten auch (17)

2016 laura's resume
2016 laura's resume2016 laura's resume
2016 laura's resume
 
Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.
 
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
 
Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010
 
Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...
 
Nociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptualNociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptual
 
Serverless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic StackServerless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic Stack
 
Érzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzésÉrzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzés
 
Nociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptualNociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptual
 
OpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and LearningsOpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and Learnings
 
Postcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts AheadPostcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts Ahead
 
How to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DCHow to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DC
 
Medicina Humana
Medicina HumanaMedicina Humana
Medicina Humana
 
DEBILIDAD INTELECTUAL
DEBILIDAD INTELECTUALDEBILIDAD INTELECTUAL
DEBILIDAD INTELECTUAL
 
Historic Gay Rights Decision
Historic Gay Rights DecisionHistoric Gay Rights Decision
Historic Gay Rights Decision
 
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
 
Gradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, FasterGradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, Faster
 

Ähnlich wie Evolution of open chemical information

Not just another reaction database
Not just another reaction databaseNot just another reaction database
Not just another reaction databaseValery Tkachenko
 
IRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomicsIRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomicsPanagiotis Arapitsas
 
Method Validation: Comparison among two analitical methods for the determinat...
Method Validation: Comparison among two analitical methods for the determinat...Method Validation: Comparison among two analitical methods for the determinat...
Method Validation: Comparison among two analitical methods for the determinat...Greta SuperTramp
 
Text mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community accessText mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community accessValery Tkachenko
 
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.pptARUNNT2
 
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case StudyHemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case StudyInsideScientific
 
151 performance of a localized fiber optic
151 performance of a localized fiber optic151 performance of a localized fiber optic
151 performance of a localized fiber opticSHAPE Society
 
N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...IJERA Editor
 
Chemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachableChemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachableChemAxon
 
A041030106
A041030106A041030106
A041030106IOSR-JEN
 

Ähnlich wie Evolution of open chemical information (20)

Current initiatives in developing research data repositories at the Royal Soc...
Current initiatives in developing research data repositories at the Royal Soc...Current initiatives in developing research data repositories at the Royal Soc...
Current initiatives in developing research data repositories at the Royal Soc...
 
Not just another reaction database
Not just another reaction databaseNot just another reaction database
Not just another reaction database
 
Metabolomics.ppt
Metabolomics.pptMetabolomics.ppt
Metabolomics.ppt
 
The importance of standards for data exchange and interchange on the Royal So...
The importance of standards for data exchange and interchange on the Royal So...The importance of standards for data exchange and interchange on the Royal So...
The importance of standards for data exchange and interchange on the Royal So...
 
IRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomicsIRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomics
 
Method Validation: Comparison among two analitical methods for the determinat...
Method Validation: Comparison among two analitical methods for the determinat...Method Validation: Comparison among two analitical methods for the determinat...
Method Validation: Comparison among two analitical methods for the determinat...
 
Experiences in Hosting Big Chemistry Data Collections for the Community
Experiences in Hosting Big Chemistry Data Collections for the CommunityExperiences in Hosting Big Chemistry Data Collections for the Community
Experiences in Hosting Big Chemistry Data Collections for the Community
 
NOMAD
NOMADNOMAD
NOMAD
 
Text mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community accessText mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community access
 
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
 
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case StudyHemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
 
A chemistry data repository to serve them all
A chemistry data repository to serve them allA chemistry data repository to serve them all
A chemistry data repository to serve them all
 
151 performance of a localized fiber optic
151 performance of a localized fiber optic151 performance of a localized fiber optic
151 performance of a localized fiber optic
 
Nirs
NirsNirs
Nirs
 
151 performance of a localized fiber optic
151 performance of a localized fiber optic151 performance of a localized fiber optic
151 performance of a localized fiber optic
 
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acidSynthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
 
N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...
 
Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...
 
Chemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachableChemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachable
 
A041030106
A041030106A041030106
A041030106
 

Mehr von Valery Tkachenko

Evolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the futureEvolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the futureValery Tkachenko
 
In silico design of new functional materials
In silico design of new functional materialsIn silico design of new functional materials
In silico design of new functional materialsValery Tkachenko
 
Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...Valery Tkachenko
 
Abstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representationsAbstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representationsValery Tkachenko
 
Machine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpointsMachine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpointsValery Tkachenko
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionValery Tkachenko
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsValery Tkachenko
 
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictionsDeep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictionsValery Tkachenko
 
Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...Valery Tkachenko
 
Need and benefits for structure standardization to facilitate integration and...
Need and benefits for structure standardization to facilitate integration and...Need and benefits for structure standardization to facilitate integration and...
Need and benefits for structure standardization to facilitate integration and...Valery Tkachenko
 
Development and comparison of deep learning toolkit with other machine learni...
Development and comparison of deep learning toolkit with other machine learni...Development and comparison of deep learning toolkit with other machine learni...
Development and comparison of deep learning toolkit with other machine learni...Valery Tkachenko
 
Living in a world of federated knowledge challenges, principles, tools and ...
Living in a world of federated knowledge   challenges, principles, tools and ...Living in a world of federated knowledge   challenges, principles, tools and ...
Living in a world of federated knowledge challenges, principles, tools and ...Valery Tkachenko
 
Open chemistry registry and mapping platform based on open source cheminforma...
Open chemistry registry and mapping platform based on open source cheminforma...Open chemistry registry and mapping platform based on open source cheminforma...
Open chemistry registry and mapping platform based on open source cheminforma...Valery Tkachenko
 
Using the structured product labeling format to index versatile chemical data
Using the structured product labeling format to index versatile chemical dataUsing the structured product labeling format to index versatile chemical data
Using the structured product labeling format to index versatile chemical dataValery Tkachenko
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesValery Tkachenko
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Valery Tkachenko
 
Open Science Data Repository - the platform for materials research
Open Science Data Repository - the platform for materials researchOpen Science Data Repository - the platform for materials research
Open Science Data Repository - the platform for materials researchValery Tkachenko
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardizationValery Tkachenko
 
OMPOL – visualisation of large chemical spaces
OMPOL – visualisation of large chemical spacesOMPOL – visualisation of large chemical spaces
OMPOL – visualisation of large chemical spacesValery Tkachenko
 
Implementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTSImplementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTSValery Tkachenko
 

Mehr von Valery Tkachenko (20)

Evolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the futureEvolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the future
 
In silico design of new functional materials
In silico design of new functional materialsIn silico design of new functional materials
In silico design of new functional materials
 
Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...
 
Abstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representationsAbstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representations
 
Machine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpointsMachine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpoints
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collection
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpoints
 
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictionsDeep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
 
Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...
 
Need and benefits for structure standardization to facilitate integration and...
Need and benefits for structure standardization to facilitate integration and...Need and benefits for structure standardization to facilitate integration and...
Need and benefits for structure standardization to facilitate integration and...
 
Development and comparison of deep learning toolkit with other machine learni...
Evolution of open chemical information

  • 1. Evolution of open chemical information Valery Tkachenko Royal Society of Chemistry ACS Fall 2016 Philadelphia, PA
  • 2. The Short History of Time Image credit: Rhys Taylor, Cardiff University ~1992
  • 3.
  • 5.
  • 7.
      • 57 million chemicals and growing
      • Data sourced from >500 different sources
      • Crowdsourced curation and annotation
      • Ongoing deposition of data from our journals and our collaborators
      • A structure-centric hub for web-searching
  • 12. Reaction 1: NextMove reaction text-mined from RSC archive – original article
  • 13. Reaction 1: NextMove reaction text-mined from RSC archive – CML output
<?xml version="1.0" encoding="UTF-8"?>
<reactionList xmlns="http://www.xml-cml.org/schema"
              xmlns:cmlDict="http://www.xml-cml.org/dictionary/cml/"
              xmlns:nameDict="http://www.xml-cml.org/dictionary/cml/name/"
              xmlns:unit="http://www.xml-cml.org/unit/"
              xmlns:cml="http://www.xml-cml.org/schema"
              xmlns:dl="http://bitbucket.org/dan2097">
  <reaction>
    <dl:source>
      <dl:documentId>c3ra45871g</dl:documentId>
      <dl:paragraphText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid. [α]D20 −24.2 (c 1.1, CHCl3); 1H NMR (CDCl3, 300 MHz) δ 0.04 (s, 3H), 0.07 (s, 3H), 0.85 (s, 9H), 1.34 (s, 3H), 1.44 (s, 3H), 2.16 (br, 1H), 3.68–3.81 (m, 3H), 4.16 (t, J = 13.8 Hz, J = 13.8 Hz, 1H), 4.59 (t, J = 6.6 Hz, J = 6.6 Hz, 1H), 5.22 (d, J = 10.7 Hz, 1H), 5.34 (d, J = 17.1 Hz, 1H), 5.90 (ddd, J = 7.2 Hz, J = 10.2 Hz, J = 17.2 Hz, 1H); 13C NMR (CDCl3, 75 MHz) δ 134.1, 118.4, 108.5, 79.5, 78.8, 70.8, 65.0, 27.8, 25.9, 25.4, 18.1, −3.7, −4.4. HRMS (ESI) calcd for [M + Na]+ (C15H30O4SiNa) 325.1811, found 325.1807.</dl:paragraphText>
    </dl:source>
    <dl:reactionSmiles>[H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C&gt;ClCCl&gt;[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1|</dl:reactionSmiles>
    <productList>
      <product role="product">
        <molecule id="m0">
          <name dictRef="nameDict:unknown">10</name>
          <dl:nameResolved>(R)-2-((tert-Butyldimethylsilyl)oxy)-2-((4S,5S)-2,2-dimethyl-5-vinyl-1,3-dioxolan-4-yl)ethanol</dl:nameResolved>
        </molecule>
        <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount>
        <amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount>
        <amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount>
        <amount dl:propertyType="CALCULATEDPERCENTYIELD" dl:normalizedValue="79.1" units="unit:percentYield">79.1</amount>
        <identifier dictRef="cml:smiles" value="C(C)(C)(C)[Si](O[C@H](CO)[C@H]1OC(O[C@H]1C=C)(C)C)(C)C"/>
        <identifier dictRef="cml:inchi" value="InChI=1S/C15H30O4Si/c1-9-11-13(18-15(5,6)17-11)12(10-16)19-20(7,8)14(2,3)4/h9,11-13,16H,1,10H2,2-8H3/t11-,12+,13-/m0/s1"/>
        <dl:entityType>definiteReference</dl:entityType>
        <dl:appearance>colourless</dl:appearance>
        <dl:state>liquid</dl:state>
      </product>
    </productList>
    <reactantList>
      <reactant role="reactant">
        <molecule id="m1">
          <name dictRef="nameDict:unknown">Diisobutylaluminium hydride</name>
        </molecule>
        <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00323">3.23 mmol</amount>
  • 14. Reaction 1: procedure steps
    Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid (aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid.
    Text mining breaks down procedure summary into steps: <dl:reactionActionList/dl:reactionActions> dl:phraseTexts
      • action="Add": Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C
      • action="Stir": The reaction mixture was stirred at −78 °C for another 2 h
      • action="Heat": warmed up to rt
      • action="Quench": quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)
      • action="Concentrate": concentrated
      • action="Add": The residue was added with water (10 mL)
      • action="Extract": extracted with dichloromethane (12 mL × 3)
      • action="Dry": dried over Na2SO4
      • action="Filter": filtered
      • action="Concentrate": concentrated
      • action="Purify": The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33)
      • action="Yield": to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid
  • 16. The World we are heading into http://www.gartner.com/newsroom/id/3143521
  • 17. Our World is hyperconnected
  • 19. Data quality issues: Robochemistry; proliferation of errors in public and private databases; automated quality control system
  • 20. CVSP
  • 22. CVSP – issues review
  • 23. J. Brecher, IUPAC Graphical Representation of Stereochemical Configuration, Section: ST-1.1.10; DB06287
  • 28. Open PHACTS: Practical Semantics. Contact: info@openphactsfoundation.org, @Open_PHACTS. Partners: GlaxoSmithKline – Coordinator; Universität Wien – Managing entity; Technical University of Denmark; University of Hamburg, Center for Bioinformatics; BioSolveIT GmbH; Consorci Mar Parc de Salut de Barcelona; Leiden University Medical Centre; Royal Society of Chemistry; Vrije Universiteit Amsterdam; Novartis; Merck Serono; H. Lundbeck A/S; Eli Lilly; Netherlands Bioinformatics Centre; Swiss Institute of Bioinformatics; ConnectedDiscovery; EMBL-European Bioinformatics Institute; Janssen; Esteve; Almirall; OpenLink; Scibite; The Open PHACTS Foundation; Spanish National Cancer Research Centre; University of Manchester; Maastricht University; Aqnowledge; University of Santiago de Compostela; Rheinische Friedrich-Wilhelms-Universität Bonn; AstraZeneca; Pfizer
  • 29. Why is it so hard to… Competitors? What’s the structure? Are they in our file? What’s similar? What’s the target? Pharmacology data? Known pathways? Working on now? Connections to disease? Expressed in right cell type? IP?
  • 30. @gray_alasdair Big Data Integration: Knowledge is federated
  • 35. Publishers - the guardians of knowledge This is a poster for Guardians of the Galaxy. The poster art copyright is believed to belong to the distributor of the Film, Walt Disney Studios Motion Pictures, the publisher, Marvel Studios, or the graphic artist.
  • 36. Data Publishing. Original artist: Joseph Ferdinand Keppler (1838-1894). Restoration: Adam Cuerden - http://www.loc.gov/pictures/item/2011661385/ by way of http://adamcuerden.deviantart.com/gallery/#/d5onmxh
  • 37. The World we live in
  • 38.
  • 40. "Internet host count history". Internet Systems Consortium. Retrieved May 16, 2012.
  • 41. We are on the verge of a new technical revolution and it feels great to anticipate it and be ready to ride! Image from surfline.com by Mike Cianciulli
  • 42.
  • 43. Data Science @ RSC The team. From left to right: Valery Tkachenko and Alexey Pshenichnov, based in the United States; Aileen Day, based in Southampton; John Boyle, Peter Corbett, Colin Batchelor, Jeff White, Nicholas Bailey and Val the plant, based at TGH
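
The CML output on slide 13 is machine-readable, so the amounts it records can be pulled out and cross-checked. The following is a minimal sketch, not the RSC or NextMove pipeline: it assumes Python's standard-library `xml.etree.ElementTree`, and it uses a hand-trimmed, hand-closed fragment containing only the product `<amount>` elements, because the excerpt on the slide is cut off before the closing tags. The helper name `product_amounts` is hypothetical. The starting amount of compound 9 (1.29 mmol) is taken from the paragraph text, since its `<amount>` element is not visible in the excerpt.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared on the slide's <reactionList> element.
CML_NS = "http://www.xml-cml.org/schema"
DL_NS = "http://bitbucket.org/dan2097"

# Hand-trimmed fragment (assumption): only the product amounts from slide 13,
# with closing tags added so that it parses on its own.
FRAGMENT = """\
<reactionList xmlns="http://www.xml-cml.org/schema"
              xmlns:dl="http://bitbucket.org/dan2097">
  <reaction>
    <productList>
      <product role="product">
        <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount>
        <amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount>
        <amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount>
      </product>
    </productList>
  </reaction>
</reactionList>
"""


def product_amounts(xml_text):
    """Map each dl:propertyType to its dl:normalizedValue as a float."""
    root = ET.fromstring(xml_text)
    amounts = {}
    # Elements live in the default CML namespace; the attributes carry the dl prefix.
    for amount in root.iter(f"{{{CML_NS}}}amount"):
        kind = amount.get(f"{{{DL_NS}}}propertyType")
        value = amount.get(f"{{{DL_NS}}}normalizedValue")
        amounts[kind] = float(value)
    return amounts


amounts = product_amounts(FRAGMENT)

# 1.29 mmol of compound 9, taken from the paragraph text (in mol).
starting_mol = 1.29e-3
calc_yield = round(100 * amounts["AMOUNT"] / starting_mol, 1)

print(amounts)
print(f"calculated yield: {calc_yield}%")
```

Recomputing the yield from the molar amounts (1.02 mmol / 1.29 mmol) gives 79.1%, which agrees with the CALCULATEDPERCENTYIELD value in the slide's XML and with the 79% reported in the article text.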

Editor's notes

  1. What about science and chemistry in particular?
  2. Remember this: some of these questions are easier to answer than others.
  3. Open PHACTS was developed to support the key questions of drug discovery. Business questions have been at the heart of Open PHACTS and have driven the development of the platform. Mx/psa: how calculated, who did it? Mash up, with your data too: top layer join together, but need them all commercial. Data provided by many publishers, originally in many formats: relational, SD files and RDF. Worked closely with publishers. Data licensing was a major issue. Over 5 billion triples – 14 datasets & growing. Hosted on beefy hardware; data in memory (aim). Extensive memcaching. Pose complex queries to extract data.