Rapid identification of phylogenetically informative
data from next-gen sequencing
Rachel Schwartz
The Biodesign Institute
Arizona State University
Rachel.Schwartz@asu.edu
July 16, 2015
Big data for phylogenetics
Phylogenomics requires a lot of time and money
SISRS: Site Identification from Short Read Sequences
A composite genome for reference
Shotgun sequencing (repeated for each sample)
Assemble composite genome
Align reads to composite genome
Call genotype at each site for each sample
Remove sites with missing data
Output alignment
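The last three steps (call a genotype at each site for each sample, remove sites with missing data, output an alignment) can be sketched in Python. This is a minimal illustration under stated assumptions, not the SISRS implementation: the function names `call_base` and `build_alignment`, the representation of aligned reads as one string of bases per site, and the thresholds `min_reads`, `min_fraction`, and `max_missing` are all hypothetical choices made for this example.

```python
from collections import Counter


def call_base(pileup, min_reads=3, min_fraction=1.0):
    """Call one base for one sample at one site, or None if the call is missing.

    pileup is a string of the read bases covering the site (assumed input format).
    min_reads and min_fraction are illustrative thresholds, not SISRS defaults.
    """
    if len(pileup) < min_reads:
        return None  # too few reads to call this site
    base, count = Counter(pileup).most_common(1)[0]
    if count / len(pileup) < min_fraction:
        return None  # reads disagree too much to call this site
    return base


def build_alignment(pileups, max_missing=0):
    """Build an alignment from per-sample pileups.

    pileups maps each sample name to a list of pileup strings, one per site
    of the composite genome. Sites where more than max_missing samples have
    no call are removed; remaining missing calls are written as 'N'.
    """
    samples = sorted(pileups)
    n_sites = len(pileups[samples[0]])
    alignment = {sample: [] for sample in samples}
    for site in range(n_sites):
        calls = [call_base(pileups[sample][site]) for sample in samples]
        if calls.count(None) > max_missing:
            continue  # remove sites with too much missing data
        for sample, call in zip(samples, calls):
            alignment[sample].append(call or "N")
    return {sample: "".join(bases) for sample, bases in alignment.items()}


if __name__ == "__main__":
    example = {
        "A": ["AAA", "CCC", "GG"],
        "B": ["AAA", "CCT", "GGG"],
        "C": ["TTT", "CCC", "GGG"],
    }
    for name, sequence in build_alignment(example, max_missing=1).items():
        print(f">{name}\n{sequence}")
```

Raising `max_missing` keeps more sites at the cost of more gaps in the alignment, which is the trade-off examined later in the talk for the angiosperm data.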
Simulations
[Figure: simulated eight-taxon trees (tips A to H). Ladder trees: equal branch length, long deep branches, short deep branches. Balanced trees.]
Simulation Results: 1 million bp genome
[Figure: number of correct mappable sites (log scale, 1 to 100000) versus coverage (1, 2, 4, 8, 10, 20, 50) for slow genes and fast genes. Ladder-tree panels: equal branch length, long deep branches, short deep branches.]
Schwartz et al. (2015) BMC Bioinformatics
Simulation Results: by depth
[Figure: number of correct mappable sites (log scale, 1 to 100000) versus coverage (1, 2, 4, 8, 10, 20, 50), plotted by depth (Depth 1 to Depth 6) for ladder trees and balanced trees.]
Schwartz et al. (2015) BMC Bioinformatics
Phylogeny of apes from SISRS data
Bonobo
Human
Gorilla
Orangutan
Rhesus macaque
Crab macaque
Chimp
Phylogeny of mammals from SISRS data
[Figure: tree of 30 taxa with node support values ranging from 51 to 100. Tip labels: treeshrew, horse, pig, cow, toothed whale, baleen whale, pangolin, dog, cat, bat, megabat, shrew, star nosed mole, aardvark, tenrec, elephant shrew, manatee, elephant, sloth, armadillo, opossum, wallaby, rabbit, pika, rat, mouse, colugo, lemur, human, macaque.]
Schwartz et al. (2015) BMC Bioinformatics
Phylogeny of mammals from SISRS data
[Figure: three alternative topologies for the same 30 mammal taxa, each with node support values (ranging from 53 to 100, 51 to 100, and 61 to 100 respectively). The three trees differ in the placement of taxa such as colugo, treeshrew, and lemur.]
Schwartz et al. (2015) BMC Bioinformatics
Phylogeny of mammals from SISRS data
[Figure: tree with tip labels Opossum, Wallaby, aardvarkG, armadillo, baleenwhaleG, bat, cat, colugoG, cow, dog, elephant, eshrewG, horse, human, lemur, macaque, manateeG, megabatG, mouse, pangolinG, pig, pika, rabbit, ratT, shrew, slothG, sn moleG, tenrecG, toothedwhale, treeshrew.]
Phylogenies of angiosperms from SISRS data
[Figure: Comparing Trees Generated With Varying Amounts of Missing Data. Robinson-Foulds distance versus number of gaps allowed at a site; legend: Distance, Nodes In Tree.]
Adam Orr
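The Robinson-Foulds distance used for this comparison counts the bipartitions (splits of the taxa induced by internal branches) that appear in one tree but not the other. The sketch below illustrates the idea on small trees written as nested tuples. The tuple representation and the function names `leaf_set`, `splits`, and `robinson_foulds` are assumptions made for this example; it is not the code used to produce the figure, and a real analysis would normally rely on an established phylogenetics library.

```python
def leaf_set(tree):
    """Return the set of tip names in a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of a tree, treated as unrooted.

    Each bipartition is stored as the side that does not contain a fixed
    anchor tip, so the same split always gets the same representation.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if anchor in clade else clade
        # Keep only splits with at least two tips on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Count the bipartitions present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same tips")
    return len(splits(tree_a) ^ splits(tree_b))


if __name__ == "__main__":
    first = ((("A", "B"), "C"), ("D", "E"))
    second = ((("A", "C"), "B"), ("D", "E"))
    print(robinson_foulds(first, second))
```

Because the count depends on the number of internal branches, larger trees can reach larger distances, which is presumably why the figure also reports the number of nodes in each tree.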
SISRS rapidly identifies phylogenetically informative
data from next-gen sequencing reads
Apes: 3 days
Mammals: 7 days
Leishmania: 12 hours
No reference genome is required.
Minimal assembly required (completely automated).
Results are comparable to slower, labor-intensive methods.
Divergence dating
Branch length estimation
[Figure: example nine-taxon trees (tips T1 to T9), tree height = 1, with a scatter plot of estimated branch length versus simulated branch length; both axes run from 0.0 to 1.0.]
Branch length estimation
[Figure: example nine-taxon trees (tips T1 to T9), height = sim height, with a scatter plot of estimated branch length versus simulated branch length; both axes run from 0.0 to 0.6.]
Branch length estimation: most variable loci
[Figure: scatter plot of estimated branch length versus simulated branch length; both axes run from 0.0 to 0.6.]
Slope   Cor    Max Brlen
2.06    1      0.03
1.3     0.99   0.05
0.7     0.99   0.08
0.29    1      0.28
0.24    0.93   0.29
0.21    0.61   0.52
Branch length estimation: most conserved loci
[Figure: scatter plot of estimated branch length versus simulated branch length; both axes run from 0.0 to 0.6.]
Slope   Cor    Max Brlen
0.85    0.72   0.03
0.73    0.96   0.05
0.73    0.9    0.08
0.3     0.97   0.28
0.19    0.58   0.29
0.17    0.78   0.52
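A slope and a correlation between simulated and estimated branch lengths, like those in the Slope and Cor columns, can be computed as sketched below. The slides do not say how these statistics were calculated, so this example assumes an ordinary least-squares fit with an intercept and a Pearson correlation; the function name `slope_and_correlation` and the input values are hypothetical.

```python
import math


def slope_and_correlation(simulated, estimated):
    """Return (slope, Pearson correlation) of estimated versus simulated values.

    The slope is from an ordinary least-squares fit with an intercept,
    which is an assumption about how the slides' values were obtained.
    """
    if len(simulated) != len(estimated) or len(simulated) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    n = len(simulated)
    mean_x = sum(simulated) / n
    mean_y = sum(estimated) / n
    sxx = sum((x - mean_x) ** 2 for x in simulated)
    syy = sum((y - mean_y) ** 2 for y in estimated)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(simulated, estimated))
    slope = sxy / sxx
    correlation = sxy / math.sqrt(sxx * syy)
    return slope, correlation


if __name__ == "__main__":
    # Made-up branch lengths in which every estimate is half the simulated value.
    simulated = [0.1, 0.2, 0.3, 0.4]
    estimated = [0.05, 0.1, 0.15, 0.2]
    slope, correlation = slope_and_correlation(simulated, estimated)
    print(f"slope={slope:.2f} cor={correlation:.2f}")
```

A slope below 1 with a correlation near 1, as in several rows of the tables, would mean the estimates are consistently too short while still preserving the relative branch lengths.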
Conclusions
SISRS rapidly identifies data for phylogenetics from
next-gen sequencing reads
Different (SISRS) data = alternative topologies
Use SISRS data to estimate branch lengths and
divergence dates accurately
Acknowledgements
Co-authors / Collaborators
Reed Cartwright (ASU)
Kelly Harkins (ASU and UCSC)
Anne Stone (ASU)
Kael Dai (ASU)
Adam Orr (ASU)
Mike Miller (Villanova)
Funding
NSF DBI-1356548
NIH R01-GM101352-01A1
NSF DDIG BCS-1232582
ASU Startup Funds
SISRS is available at
https://github.com/rachelss/SISRS

SMBE 2015: Rapid Identification of Phylogenetically Informative Data from Next-Gen Sequencing
