New target prediction and visualization tools incorporating open source molecular fingerprints for TB Mobile version 2
1. New Target Prediction and Visualization Tools Incorporating Open Source Molecular Fingerprints for TB Mobile Version 2
Sean Ekins 1, 2, Alex M. Clark 3 and Malabika Sarker 4
1 Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94010, USA.
2 Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA.
3 Molecular Materials Informatics, 1900 St. Jacques #302, Montreal, Quebec, Canada H3J 2S1.
4 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025, USA.
2. TB key points
Tuberculosis kills 1.6-1.7 million people per year (~1 every 8 seconds)
1/3 of the world's population is infected
streptomycin (1943)
para-aminosalicylic acid (1949)
isoniazid (1952)
pyrazinamide (1954)
cycloserine (1955)
ethambutol (1962)
rifampicin (1967)
Multidrug resistance in 4.3% of cases
Extensively drug-resistant TB increasing in incidence
One new drug (bedaquiline) in 40 years
3. Big Data: Screening for New Tuberculosis Treatments
Tested >350,000 molecules: >1500 active and non-toxic
Tested ~2M: published 177
Tested 2M: 100s
Tested >300,000: 800
How many will become a new drug?
How do we learn from this big data?
What are the targets for these molecules?
Others have likely screened another 500,000
4. Predicting the target/s for small molecules
Pathway analysis
Binding site similarity to Mtb proteins
Docking
Bayesian Models - ligand similarity
5. Dataset Curation: TB molecules and target information database connects molecule, gene, pathway and literature
Multi-step process
   1. Identification of essential in vivo enzymes of Mtb involved intensive literature mining and manual curation, to extract all the genes essential for Mtb growth in vivo across species.
   2. Homolog information was collated from other studies.
   3. Collection of metabolic pathway information involved using TBDB.
   4. Identifying molecules and drugs with known or predicted targets involved searching the CDD databases for manually curated data. The structures and data were exported for combination with the other data.
   5. All data were combined with URL links to literature and TBDB and deposited in the CDD database.
Initially over 700 molecules in dataset
Sarker et al., Pharm Res 2012, 29, 2115-2127.
6. TB molecules and target information database connects molecule, gene, pathway and literature
7. TB Mobile 1. layout on iPhone and Android
(screenshots: iPhone and Android)
8. 14 first-line drugs active against Mtb evaluated in TB Mobile app and the top 3 molecules shown
Confirms all in TB Mobile and retrieved
9. Predicted targets of GSK TB hits months earlier using TB Mobile
GSK report hits Dec 2012
24th Jan 2013 http://goo.gl/9LKrPZ
GSK predict targets Oct 2013
10. Predicted targets using TB Mobile
No verification yet
Ekins et al., Tuberculosis 94: 162-169 (2014)
11. Chemical property space of TB Mobile compounds
PCA of 745 compounds with Mtb targets (blue) and 1200 Mtb active and non-cytotoxic hit compounds (yellow)
Ekins et al., Tuberculosis 94: 162-169 (2014)
12. Chemical property space of TB Mobile and GSK lead compounds
PCA of 745 compounds with Mtb targets (blue) and 177 GSK Mtb leads (yellow)
Ekins et al., J Chem Inf Model 53: 3054 (2013)
13. TB Mobile 2. layout on iPhone
About CDD
Molecule search
Filters
Action Menu Molecule prediction Clustering About TB Mobile
Control block
Compound list
Text search
14. TB Mobile 2. iPhone vs TB Mobile 1. Android
Molecule Detail and Links
iPhone Android
Bookmark
copy
open-in
cluster
close
15. TB Mobile 2. iPhone vs TB Mobile 1. Android
Similarity Searching in the app
iPhone Android
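The slide does not state which similarity metric the app uses. As a minimal sketch, assuming fingerprints are stored as sets of integer hashes and that similarity is measured with the Tanimoto coefficient (a common choice for such fingerprints, but an assumption here), a similarity search that returns the top-ranked molecules could look like this. The molecule names and fingerprint values are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of integer hashes."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(query_fp, library):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical library: names and hash values are made up for illustration.
library = {
    "mol_A": {1, 2, 3, 4},
    "mol_B": {3, 4, 5, 6},
    "mol_C": {7, 8},
}
query = {1, 2, 3}

ranked = rank_by_similarity(query, library)
top3 = ranked[:3]  # analogous to showing the top 3 molecules for a query
```

Storing fingerprints as sets keeps the comparison to simple set intersection, which is cheap enough to run over a few hundred molecules on a phone.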
16. TB Mobile 2. iPhone vs TB Mobile 1. Android
Filtering and Sharing Functions
Each molecule can be copied to the clipboard then
opened with other apps (e.g. MMDS, MolPrime,
MolSync, ChemSpider, and from these exported via
Twitter or email) or shared via Dropbox.
17. TB Mobile 2. – Filtering and Sharing Functions
Data can also be filtered by target name, pathway name,
essentiality and human ortholog
18. Chemical property space of screening hits and molecules evaluated in TB Mobile 2.
PCA of 745 compounds with Mtb targets (blue) and 60 new compounds (yellow)
PCA of 745 compounds with Mtb targets (blue) and 20 new test compounds (yellow)
20. Open Extended Connectivity Fingerprints
ECFP_6 FCFP_6
• Collected, deduplicated, hashed
• Sparse integers
• Invented for Pipeline Pilot: public method, proprietary details
• Often used with Bayesian models: many published papers
• Built a new implementation: open source, Java, CDK
– stable: fingerprints don't change with each new toolkit release
– well defined: easy to document precise steps
– easy to port: already migrated to iOS (Objective-C) for TB Mobile app
• Provides core basis feature for CDD open source model service
• Clark et al., J Cheminformatics 6: 38 (2014)
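The "collected, deduplicated, hashed" and "sparse integers" points can be illustrated with a toy circular fingerprint. This is a simplified sketch, not the CDK or Pipeline Pilot algorithm: the atom invariants (element and degree only), the use of `zlib.crc32` as the hash, and the value-only deduplication are all assumptions made for brevity. A diameter of 6 corresponds to three rounds of neighbourhood expansion.

```python
import zlib


def _hash(parts):
    """Stable 32-bit hash of a tuple (built-in hash() of strings varies between runs)."""
    return zlib.crc32(repr(parts).encode("utf-8"))


def circular_fingerprint(elements, bonds, diameter=6):
    """Toy ECFP-like fingerprint returned as a sorted list of sparse integers.

    elements: list of element symbols, one per heavy atom
    bonds: list of (i, j, order) tuples using 0-based atom indices
    """
    neighbours = {i: [] for i in range(len(elements))}
    for i, j, order in bonds:
        neighbours[i].append((order, j))
        neighbours[j].append((order, i))

    # Iteration 0: one identifier per atom from simple invariants (assumed set).
    ids = [_hash((elements[i], len(neighbours[i]))) for i in range(len(elements))]
    collected = set(ids)  # a set deduplicates identical identifiers

    # Each round folds in the neighbours' identifiers, growing the radius by one bond.
    for radius in range(1, diameter // 2 + 1):
        new_ids = []
        for i in range(len(elements)):
            env = sorted((order, ids[j]) for order, j in neighbours[i])
            new_ids.append(_hash((radius, ids[i], tuple(env))))
        ids = new_ids
        collected.update(ids)

    return sorted(collected)


# Ethanol heavy atoms (C-C-O), listed in two different atom orders.
ethanol = circular_fingerprint(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethanol_rev = circular_fingerprint(["O", "C", "C"], [(0, 1, 1), (1, 2, 1)])
propane = circular_fingerprint(["C", "C", "C"], [(0, 1, 1), (1, 2, 1)])
```

Sorting each atom's neighbour list before hashing is what makes the result independent of atom ordering, which is one ingredient of the "stable" and "well defined" properties listed above.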
21. Testing the fingerprints – comparing to published data
Dataset | Leave one out ROC (published) | Reference | Leave one out ROC in this study
Combined model (5304 molecules), ECFP_6 fingerprints | N/A | N/A | 0.77
Combined model (5304 molecules), FCFP_6 fingerprints | 0.71 | J Chem Inf Model 53:3054-3063 | 0.77
MLSMR dual event model (2273 molecules), ECFP_6 fingerprints | N/A | N/A | 0.84
MLSMR dual event model (2273 molecules), FCFP_6 fingerprints | 0.86 | PLOSONE 8:e63240 | 0.83
Published models also include 8 additional descriptors as well as fingerprints
• Clark et al., J Cheminformatics 6: 38 (2014)
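For readers unfamiliar with the metric in the table, a leave-one-out ROC value can be computed as sketched below: each molecule is scored by a model trained on all the others, and the ROC AUC is the probability that a randomly chosen active outscores a randomly chosen inactive. The `fit` and `predict` functions and the toy data are hypothetical placeholders, not the models used in the cited studies.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly; ties count half."""
    actives = [s for l, s in zip(labels, scores) if l]
    inactives = [s for l, s in zip(labels, scores) if not l]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def leave_one_out_scores(samples, labels, fit, predict):
    """Score each sample with a model trained on every other sample."""
    scores = []
    for k in range(len(samples)):
        train_x = samples[:k] + samples[k + 1:]
        train_y = labels[:k] + labels[k + 1:]
        model = fit(train_x, train_y)
        scores.append(predict(model, samples[k]))
    return scores


# Hypothetical toy model: remember features seen in training actives,
# then score a molecule by the fraction of its features that were seen.
def fit(train_x, train_y):
    seen = set()
    for fp, active in zip(train_x, train_y):
        if active:
            seen |= fp
    return seen


def predict(model, fp):
    return len(fp & model) / len(fp)


samples = [{1, 2}, {1, 3}, {2, 3}, {4, 5}, {4, 6}, {1, 5}]
labels = [True, True, True, False, False, False]
loo_auc = roc_auc(labels, leave_one_out_scores(samples, labels, fit, predict))
```

The pairwise AUC loop is quadratic in the number of molecules; for datasets of several thousand molecules a rank-based formula would be the usual replacement.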
22. Building Bayesian models for each target
Predictions for the InhA target: (a) the ROC curve with ECFP_6 and FCFP_6 fingerprints; (b) modified Bayesian estimators for active and inactive compounds; (c) structures of selected binders.
For each listed target with at least two binders, it is first assumed that all of the molecules in the collection that do not indicate this as one of their targets are inactive.
In the app we used ECFP_6 fingerprints
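The per-target labelling rule described above can be sketched as follows. The estimator shown is the widely published Laplacian-corrected naive Bayes form, log((A + 1) / (T × R + 1)), where A is the number of actives containing a feature, T the number of molecules containing it, and R the overall active ratio. Whether the app uses exactly this form is an assumption; see Clark et al. for the actual method. The fingerprints and the target names other than InhA are hypothetical.

```python
import math
from collections import Counter


def build_model(fingerprints, is_active):
    """Laplacian-corrected naive Bayes: map each feature to its log contribution."""
    base_rate = sum(is_active) / len(fingerprints)
    seen = Counter()
    seen_active = Counter()
    for fp, active in zip(fingerprints, is_active):
        for feature in set(fp):
            seen[feature] += 1
            if active:
                seen_active[feature] += 1
    return {
        f: math.log((seen_active[f] + 1.0) / (seen[f] * base_rate + 1.0))
        for f in seen
    }


def score(model, fp):
    """Sum the contributions of known features; unseen features contribute nothing."""
    return sum(model.get(f, 0.0) for f in set(fp))


def build_target_models(collection, min_binders=2):
    """collection: list of (fingerprint, set_of_target_names) pairs.

    For each target with at least min_binders binders, every molecule that
    does not list the target is treated as inactive.
    """
    fps = [fp for fp, _ in collection]
    counts = Counter(t for _, targets in collection for t in targets)
    models = {}
    for target, n in counts.items():
        if n < min_binders:
            continue
        labels = [target in targets for _, targets in collection]
        models[target] = build_model(fps, labels)
    return models


# Hypothetical collection: hash values and TargetB/TargetC are made up.
collection = [
    ({1, 2, 3}, {"InhA"}),
    ({1, 2, 4}, {"InhA"}),
    ({5, 6}, {"TargetB"}),
    ({5, 7}, {"TargetB"}),
    ({8, 9}, {"TargetC"}),  # single binder, so no model is built
]
models = build_target_models(collection)
inha_like = score(models["InhA"], {1, 2})
inha_unlike = score(models["InhA"], {5, 6})
```

Predicting targets for a new molecule then amounts to scoring its fingerprint against every model and ranking the targets by score.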
23. Bayesian predictions, data export and clustering
Predict targets
Cluster molecules
Open in MMDS
Clark et al., J Cheminformatics, 6: 38 (2014)
24. Process used to evaluate TB Mobile
Draw structures either in app, paste or open from other apps e.g. MMDS
TB Mobile ranks content
TB Mobile can use built-in target Bayesian models to predict target
Take a screenshot of results
Output Bayesian model predictions to MMDS
Compare to published data
Annotate results, tabulate
25. Molecules active against Mtb evaluated in TB Mobile app
We have curated an additional set of 20 molecules that have activity against Mtb and were identified by HTS or other methods
Several targets were not in the database
• Clark et al., J Cheminformatics 6: 38 (2014)
26. What next?
Continue to update with more data
Outreach to increase awareness of app and data
Add machine learning algorithms to predict activity (in vitro and in vivo whole cell activity)
Could we appify similar target data for other neglected diseases/targets e.g. malaria
27. Data sources and tools we could integrate
In vitro data
In vivo data
Target data
ADME/Tox data & Models
Drug-like scaffold creation
TB Prediction Tools
TB Publications
28. Benefits of creating TB Mobile
Exposure of CDD content from collaboration with SRI
More visibility for brand in new places
Experiment in small database with focus on content delivery
A functional app to reach scientists that may not have cheminformatics or bioinformatics training
Pushing the boundaries of what an app can do
31. Papers published on TB Mobile or using dataset
Ekins et al., Tuberculosis 94: 162-169 (2014)
Ekins et al., J Chem Inf Model 53: 3054 (2013)
Clark et al., J Cheminformatics 6: 38 (2014)
Ekins et al., J Cheminformatics 5: 13 (2013)
32. You can find me @...
PAPER ID: 22104 “Collaborative sharing of molecules and data in the mobile age” (final paper number: 43)
DIVISION: COMP; DAY & TIME OF PRESENTATION: August 10, 2014 from 4:45 pm to 5:15 pm
LOCATION: Moscone Center, West Bldg., Room: 2005
PAPER ID: 22094 “Expanding the metabolite mimic approach to identify hits for Mycobacterium tuberculosis” (final paper number: 78)
DIVISION: COMP; DAY & TIME OF PRESENTATION: August 11, 2014 from 9:00 am to 9:30 am
LOCATION: Moscone Center, West Bldg., Room: 2005
PAPER ID: 22120 “Why there needs to be open data for ultrarare and rare disease drug discovery” (final paper number: 48)
DIVISION: CINF; DAY & TIME OF PRESENTATION: August 11, 2014 from 10:50 am to 11:20 am
LOCATION: Palace Hotel, Room: Marina
PAPER ID: 22183 “Progress in computational toxicology” (final paper number: 125)
DIVISION: TOXI; DAY & TIME OF PRESENTATION: August 12, 2014 from 6:30 pm to 10:30 pm
LOCATION: Moscone Center, North Bldg., Room: 134
PAPER ID: 22091 “Examples of how to inspire the next generation to pursue computational chemistry/cheminformatics” (final paper number: 100)
DIVISION: CINF; DAY & TIME OF PRESENTATION: August 13, 2014 from 8:25 am to 8:50 am
LOCATION: Palace Hotel, Room: Presidio
PAPER ID: 22176 “Applying computational models for transporters to predict toxicity” (final paper number: 132)
DIVISION: TOXI; DAY & TIME OF PRESENTATION: August 13, 2014 from 9:45 am to 10:05 am
LOCATION: InterContinental San Francisco, Room: Grand Ballroom A
PAPER ID: 22186 “New target prediction and visualization tools incorporating open source molecular fingerprints for TB mobile version 2” (final paper number: 123)
DIVISION: CINF; DAY & TIME OF PRESENTATION: August 13, 2014 from 1:35 pm to 2:05 pm
LOCATION: Palace Hotel, Room: California Parlor
33. All at CDD, SRI, IDRI and many others …
Funding: 2R42AI088893-02 NIAID, NIH; 9R44TR000942-02 NCATS, NIH; CDD TB has been developed thanks to funding from the Bill and Melinda Gates Foundation (Grant#49852)