Data Shapes Reveal Cancer Insights
1. Data has Shape, Shape has Meaning
Pek Lum, Ph.D.
Chief Data Scientist
VP of Solutions
2. Company Timeline
2000: NSF funds Stanford math professor Gunnar Carlsson to research Topological Data Analysis
2005: DARPA invests in applying TDA to massive, complex, multi-modal DoD data
2008: AYASDI founded with DARPA and IARPA funding
2010: Floodgate and private investors provide seed capital
2013: Khosla Ventures and Institutional Venture Partners lead growth financing rounds
COMPANY CONFIDENTIAL 2
Tony Tether, Director
Defense Advanced Research Projects Agency (2001-2009)
Ayasdi’s approach, using Topological Data Analysis, is one of the top 10 innovations developed at DARPA in the last decade.
3. Recent Awards
Ayasdi named one of the Top 10 Most Innovative Companies in Big Data for 2013
Won the Gigaom Structure Data Award for Most Promising Machine Learning/Artificial Intelligence Startup
In collaboration with UCSF, won the NFL/GE Head Health Challenge for their work in mild traumatic brain injury
5. The Problem
How can the life sciences field fully leverage data to gain knowledge and extract insights towards better health outcomes?
Why is complex (and big) data not tractable or accessible to everyone who has a stake in the results, from computational folks to the physicians who collected the data?
The Solution
Easy access to game-changing analytical methods. Computer-augmented analysis. Automation. Visualization. No need to think of queries first: answers first, hypothesis building after that.
The Impact
Obtaining insights from data can become more democratized, more collaborative among disparate disciplines, and more meaningful, faster.
6. My talk today
TDA as a framework for ML methods
Automation
AYASDI
TDA
Interactive visualization
What is TDA?
Why TDA?
Shape of Data
Speed to insights
Leveraging big data for the rest of us
8. Introducing The Shape of Data
TDA methods will transform the way that doctors triage patients, through construction of non-linear, non-invasive medical statistics to assess patients in intensive and critical care situations.
Topological Data Analysis (TDA)
A mathematical concept that began in the 1700s. Uses the shape of data to find unknown phenomena.
Math + Computer Science + User Experience
Automated discovery of shapes
9. What do I mean by data having shape?
Age, weight and height sampled uniformly at random
In reality, age, weight and height are correlated, and that data has a shape
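The contrast on this slide can be sketched numerically. The following is a minimal illustration in Python with NumPy. The ranges, means and covariance values are made-up assumptions, not numbers from the talk. Uniform sampling fills a box with no structure, while correlated variables concentrate along an elongated shape.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical ranges for age (years), weight (kg), height (cm).
low = np.array([0.0, 3.0, 50.0])
high = np.array([80.0, 120.0, 200.0])
uniform = rng.uniform(low, high, size=(n, 3))

# Hypothetical correlated population: the covariance matrix is an
# illustrative assumption (correlations of 0.5, 0.5 and 0.8).
mean = np.array([40.0, 70.0, 165.0])
cov = np.array([
    [400.0, 150.0, 100.0],
    [150.0, 225.0, 120.0],
    [100.0, 120.0, 100.0],
])
correlated = rng.multivariate_normal(mean, cov, size=n)


def max_abs_offdiag_corr(x):
    """Largest absolute correlation between two different columns of x."""
    c = np.corrcoef(x, rowvar=False)
    return np.abs(c - np.eye(c.shape[0])).max()


print("uniform:   ", round(max_abs_offdiag_corr(uniform), 3))
print("correlated:", round(max_abs_offdiag_corr(correlated), 3))
```

The uniform sample shows correlations near zero, while the correlated sample shows strong ones. That is the simplest numerical trace of the data "having a shape".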
12. What does the TDA approach bring to the table?
1. Coordinate-free representations are vital when one is studying data collected with different technologies: studies done at different times, multivariate data types collected
2. Deformation invariance introduces a degree of robustness into the analysis, which is important in the study of real-world data: human heterogeneity is complex and needs an approach that is resistant to deformation (variation)
3. Compact representations are important for visualizing large and complex datasets, so that signals can be easily identified in the form of “shapes” in the network
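The coordinate-free point can be illustrated with a small check: pairwise distances between data points do not depend on the coordinate system used to record them. This is a minimal sketch, assuming Euclidean distance as the similarity measure and a random orthogonal transform plus a shift as the change of coordinates. Both are illustrative choices, not taken from the talk.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


rng = np.random.default_rng(1)
points = rng.normal(size=(50, 3))

# A random orthogonal matrix (rotation or reflection) from a QR decomposition.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))

# The same points expressed in rotated and shifted coordinates.
moved = points @ q.T + np.array([10.0, -5.0, 2.0])

same = np.allclose(pairwise_distances(points), pairwise_distances(moved))
print("distances unchanged:", same)
```

Any analysis built only on these distances gives the same result in both coordinate systems, which is what makes it possible to compare data recorded in different ways.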
13. TDA summarizes the shape of data with no preconceived model of what it should be
14. Bringing TDA (math world) into the real data world = Ayasdi software
Automation
Interactive, intuitive visualization
Handling “Big Data”
No coding needed
Topology for the rest of us
TDA as a framework for ML methods
15. Ayasdi Network Orientation
A node represents a group of similar objects.
Edges between nodes are drawn when the nodes are very similar to each other. Nodes that are not connected are less similar to each other.
The coloring reflects values of interest. The position of a node on the screen is irrelevant; only its connection to other nodes matters.
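A network of this kind can be sketched with a toy version of the Mapper construction from the TDA literature. This is an illustrative sketch, not Ayasdi's implementation. The function names, the lens (the x coordinate), the interval count, the overlap and the `eps` clustering radius are all assumptions. In this sketch, two nodes are linked when they share at least one data point; the slides do not state the exact rule the software uses.

```python
import numpy as np


def single_linkage_clusters(points, idx, eps):
    """Split the rows in idx into groups connected by steps of length <= eps."""
    remaining = set(idx.tolist())
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            near = [r for r in remaining
                    if np.linalg.norm(points[p] - points[r]) <= eps]
            for r in near:
                remaining.remove(r)
                cluster.add(r)
                frontier.append(r)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, lens, n_intervals, overlap, eps):
    """Toy Mapper: nodes are clusters of row indices, edges join overlapping nodes."""
    lo, hi = lens.min(), lens.max()
    width = (hi - lo) / n_intervals
    nodes = []
    for i in range(n_intervals):
        # Overlapping interval of lens values.
        a = lo + i * width - overlap * width
        b = lo + (i + 1) * width + overlap * width
        idx = np.where((lens >= a) & (lens <= b))[0]
        nodes.extend(single_linkage_clusters(points, idx, eps))
    # Two nodes are linked when they share at least one data point.
    edges = {(i, j)
             for i in range(len(nodes))
             for j in range(i + 1, len(nodes))
             if nodes[i] & nodes[j]}
    return nodes, edges


# Example: points on a circle should produce a network shaped like a loop.
angles = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])
nodes, edges = mapper_graph(circle, lens=circle[:, 0],
                            n_intervals=4, overlap=0.3, eps=0.2)
degrees = [sum(i in e for e in edges) for i in range(len(nodes))]
print(len(nodes), "nodes,", len(edges), "edges, degrees:", degrees)
```

For the circle, every node ends up with exactly two neighbors, so the network is a single loop. The screen position of the nodes plays no role; only the connections carry the shape.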
17. The Cancer Genome Atlas
Landscape of cancer and path towards more precise treatments
The Problem
Cancer is very heterogeneous. Data generated from TCGA is extremely large, generally not viewable all at once, and very hard for non-computational folks to access and analyze.
The Approach
View 12 cancers at once before drilling down to sub-populations
The Results
The ability to view all cancers at once allows very quick scans of the landscape of unique and common mutations. We also show that TP53 mutation is the common link between triple-negative breast cancer and ovarian cancer.
18. 12 cancer types from The Cancer Genome Atlas
DNA exome sequencing: High Volume, High Complexity
Over 2400 tumors, 12 cancer types, over half a million unique variants analyzed simultaneously
Breast invasive carcinoma
Kidney renal clear cell carcinoma
Bladder urothelial carcinoma
Cervical squamous cell carcinoma and endocervical adenocarcinoma
Lung squamous cell carcinoma
Ovarian serous cystadenocarcinoma
Uterine corpus endometrioid carcinoma
Colon adenocarcinoma
Glioblastoma multiforme
Prostate adenocarcinoma
Rectum adenocarcinoma
Acute myeloid leukemia
19. Landscape of p53 mutations across all 12 cancers
Figure labels: Breast invasive carcinoma; Kidney renal clear cell carcinoma; Bladder urothelial carcinoma; Cervical squamous cell carcinoma and endocervical adenocarcinoma; Lung squamous cell carcinoma; Ovarian serous cystadenocarcinoma; Uterine corpus endometrioid carcinoma; Colon adenocarcinoma; Glioblastoma multiforme; Prostate adenocarcinoma; Rectum adenocarcinoma; Acute myeloid leukemia
red = enriched for p53 mutations; blue = not enriched
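The coloring rule in the legend can be sketched as a per-node comparison with the overall mutation rate. This is a minimal illustration with made-up tumor ids and mutation flags, not TCGA data. The rule used here (a node is "enriched" when its share of mutated tumors exceeds the overall share) is an assumption; the talk does not say how enrichment was computed.

```python
def node_colors(nodes, mutated):
    """Color a node red if its share of mutated tumors exceeds the overall share."""
    overall = sum(mutated.values()) / len(mutated)
    colors = []
    for members in nodes:
        frac = sum(mutated[t] for t in members) / len(members)
        colors.append("red" if frac > overall else "blue")
    return overall, colors


# Hypothetical toy data: tumor id -> carries a p53 mutation.
mutated = {0: True, 1: True, 2: True, 3: False,
           4: False, 5: False, 6: False, 7: True}

# Hypothetical network nodes, each a group of tumor ids (nodes may overlap).
nodes = [{0, 1, 2}, {2, 3}, {4, 5, 6}, {6, 7}]

overall, colors = node_colors(nodes, mutated)
print(overall, colors)
```

A real analysis would more likely use a statistical test per node rather than a bare comparison of fractions; the sketch only shows how a value of interest turns into a node color.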
21. Identifying commonalities between breast and ovarian cancers
Merging both DNA mutations + gene expression data together
Figure labels: Triple negative breast cancers; Ovarian cancers; Breast cancers; A; B
22. TP53 mutation in the tumors is the common theme between TNBC and ovarian cancers
Figure labels: Triple negative breast cancers; Ovarian cancers; Breast cancers
Mutations in p53: red = many; dark blue = none
23. Common target for TNBC and ovarian cancer?
Figure panels: FOXA1 gene levels; GO_pathway_positive regulation of potassium ion transport; PSAT1 gene levels; Mutations in TP53
Figure labels: Triple negative breast cancer; Ovarian tumors
24. Data has Shape, Shape has Meaning
Pek Lum, Ph.D.
Chief Data Scientist
VP of Solutions