Data has Shape, Shape has Meaning
Pek Lum, Ph.D.
Chief Data Scientist
VP of Solutions
Company Timeline
2000: NSF funds Stanford Math Professor Gunnar Carlsson to research Topological Data Analysis
2005: DARPA invests in applying TDA to massive, complex, multi-modal DoD data
2008: AYASDI founded with DARPA and IARPA funding
2010: Floodgate and private investors provide seed capital
2013: Khosla Ventures and Institutional Venture Partners lead growth financing rounds
COMPANY CONFIDENTIAL 2
Tony Tether, Director
Defense Advanced Research Projects Agency (2001-2009)
Ayasdi’s approach to using Topological Data Analysis is one of the
top 10 innovations developed at DARPA in the last decade.
Recent Awards
Ayasdi named one of the Top 10 Most Innovative Companies in Big Data for 2013
Won the Gigaom Structure Data Award for Most Promising Machine Learning/Artificial Intelligence Startup
In collaboration with UCSF, won the NFL/GE Head Health Challenge for their work in mild traumatic brain injury
The Problem
How can the life sciences field fully leverage data to gain knowledge and extract insights towards better health outcomes?
Why is complex (and big) data not tractable or accessible to everyone who has a stake in the results, from computational folks to the physicians who collected the data?
The Solution
Easy access to game-changing analytical methods. Computer-augmented analysis. Automation. Visualization. No need to think of queries first: answers first, hypothesis building after that.
The Impact
Obtaining insights from data can become more democratized, more collaborative among disparate disciplines, and more meaningful, faster.
My talk today: TDA and Ayasdi
What is TDA?
Why TDA?
Shape of data
TDA as a framework for ML methods
Automation
Interactive visualization
Speed to insights
Leveraging big data for the rest of us
Data has shape and
shape has meaning.
TDA methods will transform the way that doctors triage patients,
through construction of non-linear, non-invasive medical statistics to
assess patients in intensive and critical care situations.
Introducing The Shape of Data
A mathematical
concept that began
in the 1700s.
Uses the shape of data
to find unknown
phenomena.
Topological Data Analysis (TDA)
Math
+ Computer Science
+ User Experience
Automated discovery of
shapes
What do I mean by data having shape?
Age, weight and height sampled uniformly at random
In reality, age, weight and height are correlated, and that data has a shape
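The contrast between the two pictures can be sketched numerically. This is a minimal illustration, not the data from the talk: the growth-curve numbers and the helper names (`pearson`, `uniform_sample`, `correlated_sample`) are assumptions made up for the example. Independent uniform draws fill a featureless box, while correlated measurements concentrate along a curve.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def uniform_sample(n, rng):
    """Age and height drawn independently and uniformly: no shape beyond a box."""
    ages = [rng.uniform(0, 18) for _ in range(n)]
    heights = [rng.uniform(50, 180) for _ in range(n)]
    return ages, heights

def correlated_sample(n, rng):
    """Toy growth curve (hypothetical numbers): height depends on age plus noise."""
    ages = [rng.uniform(0, 18) for _ in range(n)]
    heights = [50 + 7 * a + rng.gauss(0, 5) for a in ages]
    return ages, heights

rng = random.Random(0)
r_uniform = pearson(*uniform_sample(2000, rng))
r_correlated = pearson(*correlated_sample(2000, rng))
print(f"uniform box:     r = {r_uniform:+.2f}")
print(f"correlated data: r = {r_correlated:+.2f}")
```

A single correlation coefficient only captures linear structure; the point of the slide is that real data occupies a structured subset of the space, which the second sample mimics in the simplest possible way.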
What does the TDA approach bring to the table?
1. Coordinate-free representations are vital when one is studying data collected with different technologies: studies done at different times, with multivariate data types collected.
2. Deformation invariance introduces a degree of robustness into the analysis, which is important in the study of real-world data. Human heterogeneity is complex and needs an approach that is resistant to deformation (variation).
3. Compact representations are important for visualizing large and complex datasets, so that signals can be easily identified in the form of “shapes” in the network.
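The coordinate-free property in the first point can be illustrated with a small check. This is a sketch under a simplifying assumption: "coordinate free" is read here as "depends only on pairwise distances", and the helper names are invented for the example. Rotating and shifting a point cloud changes every coordinate but leaves every pairwise distance unchanged, so any summary built from distances is unaffected.

```python
import math
import random

def pairwise_distances(points):
    """Full matrix of Euclidean distances between 2-D points."""
    return [[math.dist(p, q) for q in points] for p in points]

def rotate_and_shift(points, angle, dx, dy):
    """Apply a rigid motion: rotate by `angle` radians, then translate by (dx, dy)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(50)]
moved = rotate_and_shift(points, angle=0.7, dx=10.0, dy=-3.0)

d_before = pairwise_distances(points)
d_after = pairwise_distances(moved)
max_change = max(
    abs(a - b)
    for row_before, row_after in zip(d_before, d_after)
    for a, b in zip(row_before, row_after)
)
print(f"largest change in any pairwise distance: {max_change:.2e}")
```

The printed change is at the level of floating-point rounding. Deformation invariance (the second point) is a stronger property than this rigid-motion check and is not demonstrated here.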
TDA summarizes the shape of data with no pre-conceived
model of what it should be
Bringing TDA (math world) into the real data world = Ayasdi software
Automation
Interactive, intuitive visualization
Handling “Big Data”
No coding needed
Topology for the rest of us
TDA as a framework for ML methods
Ayasdi Network Orientation
A node represents a group of similar objects.
Edges between nodes are drawn when the nodes are very similar to each other. Nodes that are not connected are less similar to each other.
The coloring reflects values of interest. The position of a node on the screen is irrelevant; only its connection to other nodes matters.
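A network of this kind can be sketched with the publicly described Mapper construction. This is a minimal illustration, not Ayasdi's implementation: the function names, the parameters (`n_intervals`, `overlap`, `eps`) and the toy data are all assumptions. It also assumes that "very similar" nodes are those sharing at least one object, which is how Mapper draws edges. Points are sliced into overlapping ranges of a lens function, each slice is clustered, each cluster becomes a node, and two nodes are joined when they have a point in common.

```python
import math
from itertools import combinations

def connected_components(indices, points, eps):
    """Cluster indices by linking any two points closer than eps (single linkage)."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining if math.dist(points[i], points[j]) < eps}
            remaining -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters

def mapper_graph(points, lens, n_intervals=6, overlap=0.5, eps=0.5):
    """Return (nodes, edges): nodes are sets of point indices, edges join nodes sharing points."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_intervals
    pad = step * overlap / 2  # widen each interval so neighbours overlap
    nodes = []
    for k in range(n_intervals):
        a, b = lo + k * step - pad, lo + (k + 1) * step + pad
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(connected_components(members, points, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2) if nodes[u] & nodes[v]}
    return nodes, edges

# Toy data: 60 points on a unit circle; the lens is the x coordinate.
circle = [(math.cos(2 * math.pi * t / 60), math.sin(2 * math.pi * t / 60)) for t in range(60)]
nodes, edges = mapper_graph(circle, lens=lambda p: p[0])
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy circle the graph comes out as a single closed loop: the loop in the data survives as a loop in the network, regardless of where the nodes are drawn on screen.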
Case Study: Identification of patient subtypes
Ayasdi Network Orientation
Topological map of patient-patient relationships according to their tumors' molecular characteristics (in this case, gene expression)
Figure labels: each node contains a subset of patients; some patients are eccentric (away from the center of the data), others are close to the center of the data; color scheme.
© 2012 Ayasdi Inc.
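The notion of a patient being "eccentric" can be made concrete with one common definition: the mean distance from a point to all other points. The slide does not say which definition was used, so this is only a sketch under that assumption, with made-up 2-D coordinates standing in for gene expression profiles.

```python
import math

def eccentricity(points):
    """Mean distance from each point to every other point (larger = further from the center)."""
    n = len(points)
    return [sum(math.dist(p, q) for q in points) / (n - 1) for p in points]

# Toy "patients" (hypothetical): a tight core plus one outlier.
patients = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1), (3.0, 3.0)]
scores = eccentricity(patients)
most_eccentric = max(range(len(patients)), key=scores.__getitem__)
most_central = min(range(len(patients)), key=scores.__getitem__)
print("most eccentric:", patients[most_eccentric])
print("most central:  ", patients[most_central])
```

A score like this can serve both as a lens for building the network and as a value for coloring its nodes.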
Landscape of cancer and path towards more precise treatments
The Problem
Cancer is very heterogeneous. Data generated from TCGA is extremely large, generally not viewable all at once, and very hard for non-computational people to access and analyze.
The Approach
View 12 cancers at once before drilling down to sub-populations.
The Results
The ability to view all cancers at once allows very quick scans of the landscape of unique and common mutations. We also show that TP53 mutation is the common link between triple-negative breast cancer and ovarian cancer.
The Cancer Genome Atlas
12 cancer types from The Cancer Genome Atlas:
Breast invasive carcinoma
Kidney renal clear cell carcinoma
Bladder urothelial carcinoma
Cervical squamous cell carcinoma and endocervical adenocarcinoma
Lung squamous cell carcinoma
Ovarian serous cystadenocarcinoma
Uterine corpus endometrioid carcinoma
Colon adenocarcinoma
Glioblastoma multiforme
Prostate adenocarcinoma
Rectum adenocarcinoma
Acute myeloid leukemia
DNA exome sequencing: High Volume, High Complexity
Over 2400 tumors, 12 cancer types, over half a million unique variants analyzed simultaneously
Landscape of p53 mutations across all 12 cancers
Figure: network of the 12 cancer types listed above; red = enriched for p53 mutations; blue = not enriched.
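The talk does not say how "enriched" was computed for each node. One standard option, shown here purely as an assumption, is a one-sided hypergeometric test: given how many tumors in the whole cohort carry a mutation, how surprising is the count seen inside one node? The function name and all counts below are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(node_size, node_mutated, total, total_mutated):
    """P(X >= node_mutated) when node_size tumors are drawn at random from the cohort."""
    upper = min(node_size, total_mutated)
    tail = sum(
        comb(total_mutated, k) * comb(total - total_mutated, node_size - k)
        for k in range(node_mutated, upper + 1)
    )
    return tail / comb(total, node_size)

# Hypothetical counts, not taken from the talk.
total, total_mutated = 2400, 800
p_enriched = hypergeom_enrichment_p(
    node_size=30, node_mutated=25, total=total, total_mutated=total_mutated
)
p_typical = hypergeom_enrichment_p(
    node_size=30, node_mutated=10, total=total, total_mutated=total_mutated
)
print(f"25 of 30 mutated: p = {p_enriched:.2e}")
print(f"10 of 30 mutated: p = {p_typical:.2f}")
```

A node with a very small p-value would be drawn red under this scheme, and a node whose count matches the cohort rate would be drawn blue. With many nodes tested at once, a multiple-testing correction would also be needed.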
GATA3 mutations quite unique to breast cancer
Identifying commonalities between breast and ovarian cancers
Figure labels: triple-negative breast cancers; ovarian cancers; breast cancers; regions A and B.
Merging DNA mutation and gene expression data
Figure labels: triple-negative breast cancers; ovarian cancers; breast cancers. Color: mutations in p53 (red = many; dark blue = none).
TP53 mutation in the tumors is the common theme
between TNBC and Ovarian cancers
Figure panels: FOXA1 gene levels; GO pathway: positive regulation of potassium ion transport; PSAT1 gene levels; mutations in TP53. Labeled regions: triple-negative breast cancer; ovarian tumors.
Common target for TNBC and ovarian cancer?

Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Data Shapes Reveal Cancer Insights

  • 1. Data has Shape, Shape has Meaning Pek Lum, Ph.D. Chief Data Scientist VP of Solutions
  • 2. Company Timeline (milestones shown along an axis marked 2000, 2005, 2008, 2010, 2013): NSF funds Stanford math professor Gunnar Carlsson to research Topological Data Analysis; Floodgate and private investors provide seed capital; Khosla Ventures and Institutional Venture Partners lead growth financing rounds; DARPA invests in applying TDA to massive, complex, multi-modal DoD data; AYASDI founded with DARPA and IARPA funding. Tony Tether, Director, Defense Advanced Research Projects Agency (2001-2009): Ayasdi’s approach, using Topological Data Analysis, is one of the top 10 innovations developed at DARPA in the last decade.
  • 3. Recent Awards: Ayasdi named one of the Top 10 Most Innovative Companies in Big Data for 2013. Won the Gigaom Structure Data Award for Most Promising Machine Learning/Artificial Intelligence Startup. In collaboration with UCSF, won the NFL/GE Head Health challenge for their work in mild traumatic brain injury.
  • 4.
  • 5. The Problem: How can the Life Sciences field fully leverage data to gain knowledge and extract insights towards better outcomes for health? Why is complex (and big) data not tractable or accessible to everyone who has a stake in the results, from computational folks to the physicians who collected the data? The Solution: Easy access to game-changing analytical methods. Computer-augmented analysis. Automation. Visualization. No need to think of queries first: answers first, hypothesis building after that. The Impact: Obtaining insights from data can become more democratized, more collaborative among disparate disciplines, and more meaningful, faster.
  • 6. My talk today. TDA: What is TDA? Why TDA? Shape of data. Speed to insights. Leveraging big data for the rest of us. AYASDI: TDA as a framework for ML methods. Automation. Interactive visualization.
  • 7. Data has shape and shape has meaning.
  • 8. Introducing the Shape of Data. Topological Data Analysis (TDA): a mathematical concept that began in the 1700s. Uses the shape of data to find unknown phenomena. Math + Computer Science + User Experience. Automated discovery of shapes. TDA methods will transform the way that doctors triage patients, through construction of non-linear, non-invasive medical statistics to assess patients in intensive and critical care situations.
  • 9. What do I mean by data having shape? Age, weight and height sampled uniformly at random. In reality, age, weight and height are correlated, and that data has a shape.
  • 10.
  • 11.
  • 12. What does the TDA approach bring to the table? 1. Coordinate-free representations are vital when one is studying data collected with different technologies: studies done at different times, multivariate data types collected. 2. Deformation invariance introduces a degree of robustness into the analysis, which is important in the study of real-world data: human heterogeneity is complex and needs an approach that is resistant to deformation (variation). 3. Compact representations are important for visualizing large and complex datasets, so that signals can be easily identified in the form of “shapes” in the network.
  • 13. TDA summarizes the shape of data with no pre-conceived model of what it should be
  • 14. Bringing TDA (math world) into the real data world = Ayasdi software. TDA as a framework for ML methods. Automation. Interactive, intuitive visualization. Handling “Big Data”. No coding needed. Topology for the rest of us.
  • 15. Ayasdi Network Orientation. A node represents a group of similar objects. Edges between nodes are drawn when the nodes are very similar to each other. Nodes that are not connected are less similar to each other. The coloring reflects values of interest. The position of a node on the screen is irrelevant; only its connection to other nodes matters.
  • 16. Case Study > Identification of patient subtypes. Ayasdi Network Orientation: topological map of patient-patient relationships according to their tumors' molecular characteristics (in this case, gene expression). Each node contains subsets of patients. Figure annotations: these patients are eccentric (away from the center of the data); these patients are close to the center of the data; color scheme. © 2012 Ayasdi Inc.
  • 17. The Cancer Genome Atlas: landscape of cancer and path towards more precise treatments. The Problem: Cancer is very heterogeneous. Data generated from TCGA is extremely large, generally not viewable all at once, and very hard for non-computational folks to access and analyze. The Approach: View 12 cancers at once before drilling down to sub-populations. The Results: The ability to view all cancers at once allows very quick scans of the landscape of unique and common mutations. We also show that TP53 mutation is the common link between triple-negative breast cancer and ovarian cancer.
  • 18. 12 cancer types from The Cancer Genome Atlas. DNA exome sequencing: high volume, high complexity. Over 2400 tumors, 12 cancer types, over half a million unique variants analyzed simultaneously. Cancer types: breast invasive carcinoma; kidney renal clear cell carcinoma; bladder urothelial carcinoma; cervical squamous cell carcinoma and endocervical adenocarcinoma; lung squamous cell carcinoma; ovarian serous cystadenocarcinoma; uterine corpus endometrioid carcinoma; colon adenocarcinoma; glioblastoma multiforme; prostate adenocarcinoma; rectum adenocarcinoma; acute myeloid leukemia.
  • 19. Landscape of p53 mutations across all 12 cancers. Color scale: red = enriched for p53 mutations; blue = not enriched. Cancer types labeled on the network: breast invasive carcinoma; kidney renal clear cell carcinoma; bladder urothelial carcinoma; cervical squamous cell carcinoma and endocervical adenocarcinoma; lung squamous cell carcinoma; ovarian serous cystadenocarcinoma; uterine corpus endometrioid carcinoma; colon adenocarcinoma; glioblastoma multiforme; prostate adenocarcinoma; rectum adenocarcinoma; acute myeloid leukemia.
  • 20. GATA3 mutations quite unique to breast cancer
  • 21. Identifying commonalities between breast and ovarian cancers. Merging both DNA mutations and gene expression data together. Figure labels: triple negative breast cancers; ovarian cancers; breast cancers; markers A, A, B.
  • 22. TP53 mutation in the tumors is the common theme between TNBC and ovarian cancers. Color scale for mutations in p53: red = many; dark blue = none. Figure labels: triple negative breast cancers; ovarian cancers; breast cancers.
  • 23. Common target for TNBC and ovarian cancer? Panels: FOXA1 gene levels; GO_pathway_positive regulation of potassium ion transport; PSAT1 gene levels; mutations in TP53. Figure labels: triple negative breast cancer; ovarian tumors.
  • 24. Data has Shape, Shape has Meaning Pek Lum, Ph.D. Chief Data Scientist VP of Solutions
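
Slide 9 contrasts measurements sampled uniformly at random with real measurements that are correlated. A minimal sketch of that contrast, using two of the three variables and a made-up toy model (the ranges, the slope and the noise level are assumptions for illustration, not data from the talk):

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000

# Case 1: height (cm) and weight (kg) drawn independently and uniformly.
# The point cloud fills a rectangle: no structure to discover.
heights = [rng.uniform(150, 190) for _ in range(n)]
weights_uniform = [rng.uniform(45, 95) for _ in range(n)]

# Case 2: weight depends on height plus noise (hypothetical toy model).
# The point cloud concentrates along a band: the data has a shape.
weights_linked = [0.9 * (h - 100) + rng.gauss(0, 8) for h in heights]

r_uniform = pearson(heights, weights_uniform)
r_linked = pearson(heights, weights_linked)
print(f"independent sampling: r = {r_uniform:+.2f}")
print(f"linked variables:     r = {r_linked:+.2f}")
```

A correlation coefficient only detects linear structure; the argument of the talk is that topological summaries can also pick up shapes such as loops and flares that a single coefficient would miss.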
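
The network described on slide 15 (nodes as groups of similar objects, edges between related groups) can be illustrated with a Mapper-style construction. This is a sketch of the general technique from the TDA literature, not Ayasdi's software: the function names, the choice of lens, and the parameters `n_intervals`, `overlap` and `eps` are all hypothetical, and it assumes an edge is drawn when two nodes share at least one data point.

```python
import math
from itertools import combinations


def single_linkage(points, idxs, eps):
    """Split idxs into clusters: chains of points whose neighbours are within eps."""
    remaining = set(idxs)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, lens, n_intervals, overlap, eps):
    """Cover the lens range with overlapping intervals, cluster each slice,
    and connect clusters that share points."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        a = lo + k * width - overlap * width
        b = lo + (k + 1) * width + overlap * width
        idxs = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage(points, idxs, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


# Toy data: 60 points on a unit circle, with the x coordinate as the lens.
points = [(math.cos(2 * math.pi * i / 60), math.sin(2 * math.pi * i / 60))
          for i in range(60)]
nodes, edges = mapper_graph(points, lens=lambda p: p[0],
                            n_intervals=4, overlap=0.25, eps=0.3)
degree = [sum(1 for e in edges if u in e) for u in range(len(nodes))]
print(len(nodes), "nodes,", len(edges), "edges, degrees:", degree)
```

With these settings the graph should close into a single ring, which is the compact summary of a circle; as the slide notes, only the connections matter, not where a node is drawn.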
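
Slide 19 colors nodes by whether they are enriched for p53 mutations. The deck does not say which statistic was used; one common way to quantify enrichment is a hypergeometric tail probability, sketched below. The function name and all counts in the usage example are hypothetical (only the total of roughly 2400 tumors comes from the slides).

```python
from math import comb


def enrichment_p_value(population, mutated_total, node_size, mutated_in_node):
    """Probability of seeing at least mutated_in_node mutated tumors in a node
    of node_size tumors drawn without replacement from the whole population."""
    upper = min(node_size, mutated_total)
    total = comb(population, node_size)
    tail = sum(
        comb(mutated_total, i) * comb(population - mutated_total, node_size - i)
        for i in range(mutated_in_node, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2400 tumors overall, 800 of them carrying the mutation,
# and a node of 40 tumors of which 30 carry it.
p_node = enrichment_p_value(population=2400, mutated_total=800,
                            node_size=40, mutated_in_node=30)
print(f"enrichment p-value for the node: {p_node:.3g}")
```

A small p-value would correspond to a node at the "enriched" end of the color scale; a real analysis over many nodes would also need a multiple-testing correction.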