A Systems Approach to Personalized Medicine
This talk discusses how one man used various omics technologies like genomics, metagenomics, metabolomics, and imaging to gain insights into his own health. Over a decade, he tracked over a billion data points about himself including his microbiome, genome, blood variables, and medical images. This led to the discovery that he had an inflammatory bowel disease. He then used multi-omics analyses and computing resources to study his condition and microbiome in detail over time. This is an example of a systems approach to personalized medicine.
1. “A Systems Approach
to Personalized Medicine”
Talk and Discussion
NASA Ames
Mountain View, CA
March 28, 2013
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information
Technology
Harry E. Gruber Professor,
Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD
2. From One to a Billion Data Points Defining Me:
The Exponential Rise in Body Data in Just One Decade!
Billion: My Full DNA, MRI/CT Images, Microbial Genome
Million: My DNA SNPs, Zeo, FitBit
Hundred: My Blood Variables
One: My Weight
Improving Body
Discovering Disease
4. Visualizing Time Series of
150 LS Blood and Stool Variables, Each Over 5 Years
Calit2 64 megapixel VROOM
5. Only One of My Blood Measurements
Was Far Out of Range--Indicating Chronic Inflammation
27x Upper Limit
Episodic Peaks in Inflammation
Followed by Spontaneous Drops
Antibiotics
Antibiotics
Normal Range <1 mg/L
C-Reactive Protein (CRP) is a Blood Biomarker
for Detecting Presence of Inflammation
6. High Values of Lactoferrin (Shed from Neutrophils)
From Stool Sample Suggested Inflammation in Colon
124x Upper Limit
Typical Lactoferrin Value for Active IBD
Stool Samples Analyzed by www.yourfuturehealth.com
Antibiotics
Antibiotics
Normal Range <7.3 µg/mL
Lactoferrin is a Sensitive and Specific Biomarker for
Detecting Presence of Inflammatory Bowel Disease (IBD)
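The "27x Upper Limit" (CRP) and "124x Upper Limit" (lactoferrin) labels can be turned back into approximate concentrations. The sketch below assumes the upper limits are exactly the stated normal-range bounds (1 mg/L and 7.3 µg/mL); the slides do not give the measured peak values, so the results are implied estimates only, and the function names are illustrative.

```python
def value_from_fold(fold, upper_limit):
    """Concentration implied by a 'fold x upper limit' label."""
    return fold * upper_limit


def fold_over_limit(value, upper_limit):
    """How many times a measured value exceeds the upper limit of normal."""
    return value / upper_limit


# Assumption: upper limits equal the normal-range bounds shown on the slides.
crp_peak_mg_per_l = value_from_fold(27, 1.0)
lactoferrin_peak_ug_per_ml = value_from_fold(124, 7.3)

print(f"Implied CRP peak: about {crp_peak_mg_per_l:.0f} mg/L")
print(f"Implied lactoferrin peak: about {lactoferrin_peak_ug_per_ml:.0f} µg/mL")
```

The same `fold_over_limit` helper makes values measured in different units comparable across biomarkers.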
7. High Lactoferrin Biomarker Led Me to Hypothesis
I Had Inflammatory Bowel Disease (IBD)
IBD is an Autoimmune Disease Which Comes in Two Subtypes:
Crohn’s and Ulcerative Colitis
Scand J Gastroenterol.
42, 1440-4 (2007)
My Values May 2011
My Values 2009-10
Colonoscopy Revealed
Inflamed Tissue
9. Confirming the IBD (Crohn’s) Hypothesis:
Finding the “Smoking Gun” with MRI Imaging
I Obtained the MRI Slices From UCSD Medical Services
and Converted to Interactive 3D Working With Calit2 Staff & DeskVOX Software
MRI Jan 2012
Image Labels: Liver, Transverse Colon, Small Intestine, Descending Colon, Cross Section, Diseased Sigmoid Colon, Major Kink, Sigmoid Colon Threading Iliac Arteries
11. An MRI Shows Sigmoid Colon Wall Thickened
Indicating Probable Diagnosis of Crohn’s Disease
12. Why Did I Have an Autoimmune Disease like IBD?
Despite decades of research,
the etiology of Crohn's disease
remains unknown.
Its pathogenesis may involve
a complex interplay between
host genetics,
immune dysfunction,
and microbial or environmental factors.
--Paul B. Eckburg & David A. Relman, "The Role of Microbes in Crohn's Disease,"
Clin Infect Dis. 44:256-262 (2007)
So I Set Out to Quantify All Three!
13. I Wondered if Crohn’s is an Autoimmune Disease,
Did I Have a Personal Genomic Polymorphism?
From www.23andme.com
Polymorphism in Interleukin-23 Receptor Gene: 80% Higher Risk of Pro-inflammatory Immune Response
SNPs Associated with CD: ATG16L1, IRGM, NOD2
Now Comparing 163 Known IBD SNPs with 23andme SNP Chip
14. Four Immune Biomarkers Over Time
Compared with Four Signs/Symptoms
Gut Microbiome Samples
1/2009 1/2010 1/2011 1/2012 1/2013
Here Immune biomarkers are normalized 0 to 1,
with 1 being the highest value in five years
Source: Photo of Calit2 64-megapixel VROOM
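The 0-to-1 normalization described on this slide can be sketched as follows. This is a minimal illustration that assumes each biomarker series is simply divided by its own five-year maximum; the slide does not say whether a minimum was also subtracted, and the sample values are hypothetical.

```python
def normalize_to_peak(values):
    """Scale a series so its highest value becomes 1.0."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("series needs at least one positive value")
    return [v / peak for v in values]


# Hypothetical measurements for one biomarker (not data from the talk).
biomarker_series = [2.0, 4.0, 1.0, 3.0]
print(normalize_to_peak(biomarker_series))
```

Scaling each biomarker by its own peak lets series with very different units share one vertical axis.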
15. However, Most Biological Diversity on Earth
is in the Microbial World
You
Are
Here
So You Have Many Phyla of Microbes Within You!
Source: Carl Woese, et al
16. Cultured Bacteria From Stool Tests
Showed Large Time Variations in Gut Microbiome
16 = All 4 at Full Strength
Antibiotics Antibiotics
Antibiotics: Levaquin & Metronidazole
Values From www.yourfuturehealth.com stool test
17. But How Can You Determine
Which Microbes Are Within You?
“The emerging field of metagenomics, where the DNA of entire communities of microbes is studied simultaneously, presents the greatest opportunity -- perhaps since the invention of the microscope -- to revolutionize understanding of the microbial world.”
-- National Research Council, March 27, 2007
NRC Report: Metagenomic data should be made publicly available in international archives as rapidly as possible.
18. Intense Scientific Research is Underway
on Understanding the Human Microbiome
June 8, 2012 June 14, 2012
From Culturing Bacteria to Sequencing Them
19. To Map My Gut Microbes, I Sent a Stool Sample to
the Venter Institute for Metagenomic Sequencing
Shipped Stool Sample December 28, 2011
Sequencing Funding Provided by UCSD School of Health Sciences
I Received a Disk Drive April 3, 2012 With 35 GB FASTQ Files
Weizhong Li, UCSD NGS Pipeline: 230M Reads, Only 0.2% Human, Required 1/2 cpu-yr Per Person Analyzed!
Gel Image of Extract from Smarr Sample; Next is Library Construction
Manny Torralba, Project Lead - Human Genomic Medicine, J Craig Venter Institute, January 25, 2012
20. We Used Weizhong Li Group’s Metagenomic
Computational NextGen Sequencing Pipeline
Pipeline stages from the flowchart (data -> step: tools -> result):
• Raw reads -> Reads QC -> HQ reads
• Filter human: Bowtie/BWA against Human genome and mRNAs -> Filtered reads
• Filter duplicate: CD-HIT-Dup, for single or PE reads -> Unique reads
• Filter errors: Cluster-based Denoising -> Further filtered reads
• Read recruitment: FR-HIT against Non-redundant microbial genomes -> Taxonomy binning, Visualization (FRV)
• Assemble: Velvet, SOAPdenovo, Abyss, with K-mer setting -> Contigs
• Mapping: BWA, Bowtie -> Contigs with Abundance
• ORF calling: ORF-finder, Metagene -> ORFs
• tRNA-scan -> tRNAs; rRNA - HMM -> rRNAs
• Clustering: Cd-hit at 95% -> Non redundant ORFs; Cd-hit at 60% -> Core ORF clusters; Cd-hit at 30% -> Protein families
• Annotation: Hmmer, RPS-blast, blast against Pfam, Tigrfam, COG, KOG, PRK, KEGG, eggNOG -> Function, Pathway Annotation
• Cutoff shown on chart: 1e-6
PI: (Weizhong Li, UCSD):
NIH R01HG005978 (2010-2013, $1.1M)
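To make the duplicate-filtering stage of this pipeline concrete, here is a toy stand-in that keeps the first read for each distinct sequence. It is only a sketch: it handles exact duplicates of single-end reads, whereas CD-HIT-Dup is a separate tool whose algorithm and options are not reproduced here. The function name and sample reads are hypothetical.

```python
def drop_exact_duplicates(reads):
    """Keep the first (read_id, sequence) pair seen for each distinct sequence."""
    seen = set()
    unique = []
    for read_id, sequence in reads:
        if sequence not in seen:
            seen.add(sequence)
            unique.append((read_id, sequence))
    return unique


# Hypothetical reads; r3 repeats the sequence of r1.
reads = [("r1", "ACGTAC"), ("r2", "TTGACA"), ("r3", "ACGTAC")]
unique_reads = drop_exact_duplicates(reads)
print(f"{len(reads)} reads in, {len(unique_reads)} unique reads out")
```

Removing artificial duplicates before abundance estimation keeps over-amplified fragments from inflating species counts.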
22. We Used SDSC’s Gordon Data-Intensive Supercomputer
to Analyze JCVI Sequences of LS Gut Microbiome
• Analyzed Healthy and IBD Patients:
– LS, 13 Crohn's Disease & 11 Ulcerative Colitis Patients, + 150 HMP Healthy Subjects
• Venter Sequencing of LS Gut Microbiome:
– 230 M Reads
– 101 Bases Per Read
– 23 Billion DNA Bases
• Gordon Compute Time
– ~1/2 CPU-Year Per Sample
– > 200,000 CPU-Hours so far
• Gordon RAM Required
– 64GB RAM for Most Steps
– 192GB RAM for Assembly
• Gordon Disk Required
– 8TB for All Subjects
– Input, Intermediate and Final Results
Enabled by a Grant of Time on Gordon from SDSC Director Mike Norman
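The figures on this slide can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted in the talk (230 M reads, 101 bases per read, 0.2% human reads, about half a CPU-year per sample). The projected total assumes every one of the 175 listed subjects costs the same half CPU-year, which the slide does not state.

```python
reads = 230_000_000
bases_per_read = 101
total_bases = reads * bases_per_read      # roughly the "23 Billion DNA Bases"

human_reads = reads * 2 // 1000           # 0.2% of reads mapped to human

hours_per_cpu_year = 365 * 24
cpu_hours_per_sample = 0.5 * hours_per_cpu_year

# Assumption: LS + 13 Crohn's + 11 UC + 150 HMP subjects all cost the same.
subjects = 1 + 13 + 11 + 150
projected_cpu_hours = subjects * cpu_hours_per_sample

print(f"Total bases: {total_bases:,}")
print(f"Human reads: {human_reads:,}")
print(f"CPU-hours per sample: {cpu_hours_per_sample:,.0f}")
print(f"Projected CPU-hours for {subjects} subjects: {projected_cpu_hours:,.0f}")
```

Under that assumption, the "> 200,000 CPU-Hours so far" figure would correspond to roughly 45 samples' worth of compute.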
23. Analysis of Clusters of Orthologous Groups (COGs) -
Gene Family Distribution in LS Gut Microbiome
Analysis: Weizhong Li & Sitao Wu, UCSD
24. Using Calit2’s 64 Megapixel Tiled Display Wall
To Analyze Human Microbiome Complexity
Comparing 3 LS Time Snapshots (Left)
with Healthy, Crohn’s, UC (Right Top to Bottom)
Calit2 VROOM-FuturePatient Expedition
25. LS Gut Microbe Species 12/28/11 (red)
compared to Average of Healthy Subjects (blue)
Species are Organized by Microbial Phyla
Each Species is a Bar,
Height is Logarithmic Abundance,
Derived from metagenomic sequencing of LS stool sample.
Source: Photo of Calit2 64-megapixel VROOM
26. Almost All Abundant Species (≥1%) in Healthy Subjects
Are Severely Depleted in LS Gut
27. Top 20 Most Abundant Microbial Species
In LS vs. Average Healthy Subject
Number Above LS Blue Bar is Multiple of LS Abundance
Compared to Average Healthy Abundance Per Species
Multiples Shown: 152x, 765x, 148x, 849x, 483x, 220x, 201x, 169x, 522x
Source: Sequencing JCVI; Analysis Weizhong Li, UCSD
LS December 28, 2011 Stool Sample
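The per-species multiples on this slide, and the logarithmic bar heights used in the species charts, can be sketched as below. This is an illustration under assumptions: abundances are treated as relative fractions, the healthy reference is a plain arithmetic mean, and the `floor` used to avoid taking the log of zero is a hypothetical choice. None of the sample numbers come from the talk.

```python
import math


def fold_vs_healthy(ls_abundance, healthy_abundances):
    """Multiple of one subject's abundance over the mean of healthy subjects."""
    healthy_mean = sum(healthy_abundances) / len(healthy_abundances)
    return ls_abundance / healthy_mean


def log_height(abundance, floor=1e-6):
    """Bar height on a log10 scale; values below `floor` are clamped."""
    return math.log10(max(abundance, floor))


# Hypothetical relative abundances for a single species.
multiple = fold_vs_healthy(8.0, [1.0, 2.0, 3.0])
print(f"Multiple over healthy average: {multiple:.0f}x")
print(f"Log-scale bar height for 1% abundance: {log_height(0.01):.1f}")
```

A log scale keeps species that differ by several orders of magnitude visible on the same chart.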
28. 200 LS Gut Microbe Species at 3 Times
12/28/11, 4/3/12, 8/7/12
Red is at Highest Value of CRP
Blue is the Day After End of Antibiotic/Prednisone Therapy
Green is Four Months Later
Source: Photo of Calit2 64-megapixel VROOM
29. Closeup of Uncommon LS Microbes
12/28/11 Stool Sample
Chart Labels: 45x Reduced By Therapy; 8% Increased By Therapy; 90x Reduced By Therapy
Two separate research teams have found strikingly high concentrations of Fusobacterium in tumor samples collected from colorectal cancer patients.
October 18, 2011
30. DIY Systems Biology -
Toward P4 Healthcare
Over 1000 Downloads So Far
Download pdfs from Journal:
http://onlinelibrary.wiley.com/doi/10.1002/biot.201100495/full
32. CAMERA as an Example
for the NOMIC Portal Query/Hierarchy System
Source:
Jeff Grethe,
CRBS, UCSD
33. Ecosystem to Amplify Understanding of
Microbial Community Structure & Function
Source: Jeff Grethe, CRBS, UCSD
34. Access to Computing Resources Tailored by User’s
Requirements and Resources
Core CAMERA HPC Resource: UCSD Triton
NSF/SDSC Gordon, NSF/SDSC Trestles, NSF/TACC Lonestar, NSF/TACC Ranger, NSF/RCAC Steele
Infrastructure Services Extend CAMERA Computations to 3rd Party Compute Resources
EAGER: Multi-Domain, Workflow-Driven Computation System for Microbial Ecology Research and Analysis
Source: Jeff Grethe, CRBS, UCSD
35. PhyloMETAREP
Explore, Analyze & Compare Transcriptomes
A new community resource for comparing complex microbial gene expression patterns
Data
Data Analysis
Diverse Analysis Functions
Source: Jeff Grethe, CRBS, UCSD
36. VIROME
Explore, Analyze & Compare Viral Genomes/Metagenomes
Data
Resource for analysis
of viral metagenomes
Data Analysis
Source:
Jeff Grethe,
CRBS, UCSD
Diverse Analysis Functions
37. Fragment Recruitment Viewer (FRV) Interface
X-axis is the genome coordinate, and y-axis is alignment identity (%). The top is genome coverage.
The bottom shows genes or other genomic features. Users can zoom, resize, and pan the plot by
mouse or using icons at corners in a similar way as Google Maps. Right illustrates new functions
and interface to be implemented in order to handle multiple integrated omics data types by using
multiple synchronized FRV panels.
Source: Weizhong Li, UCSD
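The two data series behind a fragment recruitment plot, as described in this caption, can be sketched in a few lines: one point per aligned read (genome coordinate against percent identity) and a per-base coverage track for the top panel. This is not the FRV implementation; the tuple layout `(start, end, identity)` with 0-based, end-exclusive coordinates and the sample alignments are assumptions made for illustration.

```python
def recruitment_points(alignments):
    """One (x, y) point per read: alignment midpoint and identity in percent."""
    return [((start + end) / 2, identity) for start, end, identity in alignments]


def coverage_depth(alignments, genome_length):
    """Number of reads covering each genome position."""
    depth = [0] * genome_length
    for start, end, _identity in alignments:
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    return depth


# Hypothetical alignments on a 10-base toy genome.
alignments = [(0, 5, 98.0), (3, 8, 91.5), (6, 10, 85.0)]
points = recruitment_points(alignments)
depth = coverage_depth(alignments, genome_length=10)
print(points)
print(depth)
```

Plotting `points` as a scatter with `depth` as a track above it gives the basic layout the caption describes.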
38. Combined 16S, Metagenomics
and Metatranscriptomics Pipeline
(a) Pooled 16S
• Raw reads
• Internal scripts to deconvolve pooled samples, trim barcode and primer sequences, and QC data -> HQ reads
• Sample 1, Sample 2, ... Sample n
• Tools: ChimeraSlayer, Mothur, Cd-hit-otu, MGAviewer
• Database: Ribosomal Database Project
• Taxonomic classification, identification of Operational Taxonomic Units, computation of community richness and diversity
• Multivariate Statistical approaches: Sample comparison, clustering, ordination
(b) WGS, transcriptomics
• Raw reads -> QC (QC scripts) -> HQ reads
• 1 Human seq. removal: BWA, Bowtie, FR-HIT, Blat etc against Human genome & mRNAs
• 2 Artificial duplicates removal: Cd-hit-dup
• 3 rRNA removal: Meta-RNA (Transcriptomics only)
• Filtered reads
• Taxonomy profiling: FR-HIT, Blat, Blast against Curated ref. genomes -> Alignment, Visualization
• Seq. error & redundancy removal: K-mer based, Clustering-based -> Denoised reads
• Assembly: Velvet, SOAPdenovo, Abyss -> Assembled metagenomes
• Abundance mapping of Metagenome Reads: BWA, Bowtie -> Abundance
• ORF call: ORF_finder, Metagene, FragGeneScan -> Genes, Gene Abundance
• Annotation: Blastp, RPS-blast, HMMER3 against Tigrfam, Pfam, COG, KOG, KEGG, eggNOG -> Function, pathway annotation
• Proteomics analysis
Legend: Data, Tool, Database
Source: Weizhong Li, UCSD
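The "community richness and diversity" step of the 16S branch can be illustrated with two common measures. The slide does not say which indices the pipeline reports, so the choice of observed richness and the Shannon index here is an assumption, and the OTU counts are hypothetical.

```python
import math


def richness(otu_counts):
    """Number of OTUs observed at least once."""
    return sum(1 for count in otu_counts.values() if count > 0)


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with nonzero counts."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any OTU")
    proportions = [c / total for c in otu_counts.values() if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical read counts per OTU for one sample.
sample = {"OTU_1": 50, "OTU_2": 50, "OTU_3": 50, "OTU_4": 50, "OTU_5": 0}
print(f"Richness: {richness(sample)}")
print(f"Shannon index: {shannon_index(sample):.3f}")
```

With four equally abundant OTUs the Shannon index equals ln 4; a community dominated by one OTU scores lower for the same richness.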
39. UCSD Center for Computational Mass Spectrometry
Becoming Global MS Repository
ProteoSAFe: Compute-intensive discovery MS at the click of a button
MassIVE: repository and identification platform for all MS data in the world
Source:
Nuno Bandeira,
Vineet Bafna,
Pavel Pevzner,
Ingolf Krueger,
UCSD
proteomics.ucsd.edu