Data Science in Drug
Discovery
Marina Sirota, PhD
Assistant Professor
Institute for Computational Health
Sciences
October 22, 2015
Data Driven Research
Integrative Personal “Omics”
Profiling
Genome
Transcriptome
Epigenome
Microbiome
Proteome / Metabolome
Antibodyome
Moore’s Law – Biology and
Computation
Cost Per Genome
Cost of Computational Resources
Can we use data integration
to…
Biomarker
Discovery
… to find
better
diagnostic
markers?
Disease
Mechanism
… understand
disease
better?
Therapeutics
… find new
uses for
existing
drugs?
Motivation
• Problem:
– Takes roughly 15 years and over
$800 million to develop and bring
a novel drug to market
– 90% of drugs fail in early
development
• Solution: Drug Repurposing
– Lower cost
– Reduce risk of failure
Problem Statement
Can we use public data to
systematically predict relationships
between drugs and diseases?
Sirota M, Dudley JT, Kim J, Chiang AP, Morgan AA, Sweet-Cordero A, Sage J, Butte AJ. Discovery and Validation of
Drug Indications Using Compendia of Public Gene Expression Data. Science Translational Medicine. Aug 2011.
What is Gene Expression
Profiling?
• Global snapshot of cellular function and
activity
– Genome sequence – what might be going on
– Expression – what is actually going on
• 25,000 genes → 1,000,000 proteins
• We can measure a few thousand proteins,
but gene expression is a global proxy
How Can We Measure Expression?
Microarrays
• Thousands of probes are attached to a solid
surface
• Takes advantage of complementary DNA
sequences
• Process:
– RNA is extracted from the sample
– Fluorescent labeling
– Hybridization and wash
– Scanning and signal processing
– Normalization and analysis!
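The last step above can be sketched in a few lines. This is a minimal illustration only, assuming a log2 transform followed by median centering per array; the slides do not say which normalization was used, and the function name and intensity values are hypothetical.

```python
import math
from statistics import median


def log2_median_center(intensities):
    """Log2-transform raw probe intensities and subtract the array median.

    Assumption: one simple per-array normalization; real pipelines often
    use more elaborate schemes.
    """
    # Floor at 1.0 so that zero or negative readings do not break the log.
    logged = [math.log2(max(x, 1.0)) for x in intensities]
    center = median(logged)
    return [value - center for value in logged]


# Hypothetical raw intensities for one array (illustrative values only).
raw = [64.0, 128.0, 256.0, 512.0, 1024.0]
normalized = log2_median_center(raw)
print(normalized)
```

After centering, values from different arrays sit on a comparable scale, which is what makes later disease-versus-control comparisons meaningful.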
Data Sources
Connectivity Map (Lamb et al. 2006)
• Collection of expression data from cultured human cells
• 453 experiments of 164 drugs
• Covers broad range of effects
– FDA approved drugs
– Non-drug bioactive small molecules
NCBI GEO (Barrett et al. 2009)
• Publicly available gene expression repository
– Platforms – 11,745
– Samples – 961,202
– Series – 39,679
• There are numerous experiments dealing with over 200 diseases
Barrett et al. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2009.
Lamb et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006.
Disease Gene Expression Data (GEO)
Butte AJ, Chen R. AMIA, 2006.
• Download all GDS experiments from GEO
• Identify disease-associated experiments
• Identify normal vs. disease experiments
• 176 datasets, 3113 arrays, 100 diseases
Dudley J, Butte AJ. PSB, 2008.
Dudley JT, Tibshirani R, Deshpande T, Butte AJ. Disease signatures are robust across tissues and experiments. Mol
Disease Gene Expression Signature
[Figure: disease individuals compared with healthy controls]
Drug Gene Expression Profile
[Figure: treated sample compared with untreated sample]
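Both comparisons above (disease vs. healthy, treated vs. untreated) reduce to the same operation: a per-gene difference between two groups. The sketch below assumes log2 expression values and a mean difference as the effect measure; the function name, gene names, and values are hypothetical, and the original analysis may have used a different statistic.

```python
from statistics import mean


def expression_signature(case, control, top_n=2):
    """Return (up, down) gene lists ranked by mean log2 difference.

    `case` and `control` map gene name -> list of log2 expression values.
    The difference is computed as case minus control.
    """
    diff = {g: mean(case[g]) - mean(control[g]) for g in case if g in control}
    ranked = sorted(diff, key=diff.get, reverse=True)
    # Most increased genes first, keeping only true increases.
    up = [g for g in ranked[:top_n] if diff[g] > 0]
    # Most decreased genes first, keeping only true decreases.
    down = [g for g in ranked[::-1][:top_n] if diff[g] < 0]
    return up, down


# Hypothetical log2 values; gene names are placeholders.
disease = {"G1": [9.0, 9.2], "G2": [5.0, 5.2], "G3": [7.0, 7.0], "G4": [3.0, 3.2]}
healthy = {"G1": [7.0, 7.2], "G2": [5.1, 5.1], "G3": [8.0, 8.0], "G4": [6.0, 6.2]}
up, down = expression_signature(disease, healthy, top_n=1)
print(up, down)
```

Passing treated and untreated samples instead of disease and healthy ones would give a drug profile in the same form.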
Hypothesis
[Figure: gene expression profiles (genes up-regulated or down-regulated) for a disease paired with Drug A, Drug B, and Drug C; panel labels: Treatment, Adverse Reaction, ?]
Lamb et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease.
Computational Pipeline
Disease Gene Expression Signature
Genes
Drugs
Disease-Drug Scores
Drugs Similar
to Disease
Drugs Opposite
to Disease
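A rough sketch of how one disease-drug score in this pipeline could be computed. It assumes a simplified mean-rank score rather than the Kolmogorov-Smirnov-style statistic used by the Connectivity Map; `reversal_score` and the gene names are hypothetical.

```python
from statistics import mean


def reversal_score(drug_ranked, disease_up, disease_down):
    """Score a drug against a disease signature, in the range [-1, 1].

    `drug_ranked` lists genes from most up-regulated to most
    down-regulated by the drug. A positive score means the drug profile
    resembles the disease; a negative score means it opposes the disease.
    """
    n = len(drug_ranked)
    if n < 2:
        return 0.0
    # Relative position: 0.0 is the top of the list, 1.0 is the bottom.
    pos = {gene: i / (n - 1) for i, gene in enumerate(drug_ranked)}
    up = [pos[g] for g in disease_up if g in pos]
    down = [pos[g] for g in disease_down if g in pos]
    if not up or not down:
        return 0.0
    return mean(down) - mean(up)


# Hypothetical drug rankings over four placeholder genes.
opposite_drug = ["G4", "G3", "G2", "G1"]
similar_drug = ["G1", "G2", "G3", "G4"]
print(reversal_score(opposite_drug, ["G1"], ["G4"]))
print(reversal_score(similar_drug, ["G1"], ["G4"]))
```

Scoring every drug against every disease signature fills a drugs-by-diseases matrix. Strongly negative entries would be the therapeutic candidates.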
Drug-Disease Relationships
Drugs
Diseases
Positive Correlation – Adverse Reaction? Negative Correlation - Therapeutic
Similar Diseases Cluster
Based on Disease-Drug Similarity
Families of Drugs Cluster
Based on Disease-Drug Similarity
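The clustering on these slides can be illustrated with a small agglomerative routine. This is a sketch under assumptions: average linkage and a distance of 1 minus Pearson correlation are choices made here, not details given in the slides, and the drug names and scores are made up.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length lists (non-constant inputs)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        # Average of (1 - r) over all cross-cluster member pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical scores of four drugs across four diseases.
drug_scores = {
    "drugA": [0.9, 0.8, -0.7, -0.6],
    "drugB": [0.8, 0.9, -0.6, -0.7],
    "drugC": [-0.8, -0.7, 0.9, 0.6],
    "drugD": [-0.9, -0.6, 0.8, 0.7],
}
groups = average_linkage(drug_scores, n_clusters=2)
print(groups)
```

Transposing the score matrix (one vector per disease across all drugs) would cluster diseases in the same way.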
Crohn’s Disease
• An inflammatory disease of the intestines
that has an autoimmune component
• Affects 500,000 people in North America
• No known pharmaceutical cure
• Current solutions:
– Reduce inflammation with anti-inflammatory drugs
and corticosteroids (prednisone)
– Bad side effects
– Surgical solutions
Therapeutic Predictions for
Crohn’s Disease
Topiramate – An Anti-Seizure
Drug
• Suppresses the rapid and excessive
firing of neurons that start a seizure
• Enhances GABA-activation
• Used to treat epilepsy, bipolar disorder
• Antidepressant
• Investigated as potential
treatment for obesity and
type II diabetes
Topiramate and Crohn’s
Genes that are
up-regulated by the drug are
down-regulated in the disease
Genes that are
down-regulated by the drug are
up-regulated in the disease
Animal Model for Crohn’s
• TNBS (trinitrobenzene sulfonic acid) +
ethanol induced rats:
– Excellent and reproducible experimental model
for Inflammatory Bowel Disease (Crohn’s and
Ulcerative Colitis)
– Toxin-based model
Normal TNBS Induced
Pilot Validation Study Design
• Pilot Study – 18 rats
– Healthy (control)
– TNBS-Induced Untreated
– TNBS-Induced Treated
• 80 mg/kg topiramate, injected daily
• Colon tissue macroscopic damage score
Reetesh Pai, Mohan Shenoy and Pankaj Jay Pasricha
Validation Results
Two Follow-up Validation
Studies
• 48 rats each – 4 groups of 12 rats
– Healthy Controls
– TNBS + Vehicle
– TNBS + Prednisolone
– TNBS + Topiramate
• 7 days
• Clinical Signs, Pathology Score, Histology
• Endoscopy Images
Clinical Signs
Pathology Scores
[Figure: gross pathology score (axis 0 to 5) for the Vehicle, TNBS+Veh, TNBS+Pred, and TNBS+Top groups; **** significance marker]
Histology
Endoscopy
Drug-Disease Signature
Ongoing work
• Extending the drug datasets to use structural
data
• Incorporating meta-analysis methods
• Application to cancer (lung cancer, liver
cancer, medulloblastoma)
• More focused cell line selection
• Looking at dosage response and combination
therapy prediction
• Leveraging EMR and clinical trial data
Chen B, Sirota M, Fan-Minogue H, Hadley D, Butte AJ. Relating Hepatocellular Carcinoma Tumor Samples and Cell Lines Using Gene Expression Data in Translational Research. BMC Medical Genomics, 2015.
Wu M, Sirota M, Butte AJ, Chen B. Characteristics of drug combination therapy in oncology by analyzing clinical trial data on clinicaltrials.gov. Pac Symp Biocomput. 2015.
Can we use data integration
to…
Biomarker
Discovery
… to find
better
diagnostic
markers?
Disease
Mechanism
… understand
disease
better?
Therapeutics
… find new
uses for
existing
drugs?
Precision Medicine
responders
non-responders
test
Acknowledgements
Atul Butte
Joel Dudley
Annie P. Chiang
Alex Morgan
Pankaj Jay Pasricha
Mohan Shenoy
Minnie Sarwal
Reetesh Pai
Julien Sage
Silke Roedder
Alejandro Sweet-Cordero
Bin Chen
Hanna Paik
Dexter Hadley
Institute for Computational Health
Sciences @ UCSF
Thanks!
marina.sirota@ucsf.edu


Editor's notes

  1. Good morning, my name is Marina Sirota. I’m currently a lead research scientist in the division of systems medicine at Stanford University. Previously I worked at Pfizer under David Cox in a genetics group, applying next-gen sequencing technologies to discover novel drug targets and develop population stratification techniques for clinical trials. Today I will tell you a bit about translational bioinformatics, systems medicine, and how they might impact transplantation practice in the near future.
  2. Since then we have come a long way. Thousands of people have been sequenced and millions of individuals have been genotyped. These are all resources that have been created and, most importantly, they are open to the public.
  3. People are also starting to use sequencing in creative ways: immunome, metagenome, epigenetics, cell-free DNA.
  4. Moore's Law comes from the observation made in 1965 by Gordon Moore, co-founder of Intel, that the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented. Moore predicted that this trend would continue for the foreseeable future.
  5. Drug repurposing is finding a novel indication for a known FDA-approved drug. Computational work can be instrumental in making this feasible. Example: Viagra was initially studied for use in hypertension.
  6. So far I have focused on the genetics piece, the DNA. I would like to talk a little bit about RNA, the middle piece of the central dogma, and ways we can measure it using gene expression profiling. Expression profiles can, for example, distinguish between cells that are actively dividing, or show how the cells react to a particular treatment. They are used to generate hypotheses and to study mechanism of action. Other factors: post-translational modification, alternative splicing.
  7. Microarrays are one technology to measure gene expression. They were first developed in 1995, nearly 20 years ago.
  8. We have GEO data, but the format doesn’t make it easy to ask relevant questions. We have built an infrastructure to enable this sort of analysis.
  9. This approach is especially likely to yield good results for diseases with a strong gene dysregulation component, such as autoimmune disease or cancer.
  10. If the up-regulated disease genes appear near the top (up-regulated end) of the rank-ordered drug gene expression list and the down-regulated disease genes fall near the bottom (down-regulated end), we can conclude that the drug and disease expression profiles are similar. If the up-regulated disease genes fall near the bottom of the rank-ordered drug list and the down-regulated disease genes are near the top, the profiles are opposite and the drug is a therapeutic candidate. Significance: randomization by picking a signature at random and recomputing drug-disease scores 100 times; FDR.
  11. 100 diseases and 164 drugs give about 16,000 drug-disease pairs (100 × 164 = 16,400). 53 diseases have significant predictions. Not everything is treatable.
  12. Hierarchical clustering groups: brain cancers, other cancers, lung injury, UC and Crohn’s.
  13. Histone deacetylase (HDAC) inhibitors (in red). Drugs known to affect different parts of the same pathway also cluster together: phosphatidylinositol-3-kinase (PI3K) inhibitors LY-294002 and wortmannin (in green).
  14. heat shock protein 90 (HSP90) inhibitors (in orange)
  15. Chose Crohn’s but have others
  16. Known drug. Two that are better. One is FDA approved, so go for this one.
  17. Looked for an animal model of Crohn’s and found one
  18. Define macroscopic damage score. Scale 0-6 and what the values mean. Define axes.
  19. Earlier this year President Obama launched a $215 million investment in the Precision Medicine Initiative, which will pioneer a new model of patient-powered research that promises to accelerate biomedical discoveries and provide clinicians with new tools, knowledge, and therapies to select which treatments will work best for which patients.
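
Note 10 mentions randomization (random signatures, scores recomputed 100 times) and FDR. A minimal sketch of that idea follows, assuming an empirical two-sided p-value and the Benjamini-Hochberg adjustment; the notes only say "FDR", so the specific procedure, the function names, and the example numbers are assumptions.

```python
import random


def null_distribution(score_fn, genes, sig_size, n_perm=100, seed=0):
    """Score `n_perm` random signatures drawn from `genes`.

    `score_fn(up, down)` is any scoring callable (hypothetical here).
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_perm):
        picked = rng.sample(genes, 2 * sig_size)
        scores.append(score_fn(picked[:sig_size], picked[sig_size:]))
    return scores


def empirical_p(observed, null_scores):
    """Two-sided empirical p-value with the usual +1 correction."""
    extreme = sum(1 for s in null_scores if abs(s) >= abs(observed))
    return (extreme + 1) / (len(null_scores) + 1)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative numbers only.
p = empirical_p(0.9, [0.1, -0.2, 0.95, 0.3])
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(p, q)
```

Drug-disease pairs whose adjusted value falls below a chosen threshold would then count as significant predictions.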