Rapid identification of phylogenetically informative
data from next-gen sequencing
Rachel Schwartz
The Biodesign Institute
Arizona State University
Rachel.Schwartz@asu.edu
July 16, 2015
Big data for phylogenetics
Phylogenomics requires a lot of time and money
SISRS: Site Identification from Short Read Sequences
A composite genome for reference
Shotgun sequencing of each sample
Assemble composite genome
Align reads to composite genome
Call genotype at each site for each sample
Remove sites with missing data
Output alignment
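The last three steps above (call genotypes, remove sites with missing data, output an alignment) can be illustrated with a toy sketch. This is not the SISRS implementation: the function names, the `min_reads` and `min_fraction` thresholds, and the input layout (one string of observed read bases per site per sample) are all assumptions made for illustration.

```python
from collections import Counter


def call_base(bases, min_reads=3, min_fraction=1.0):
    """Return the consensus base at one site, or None if the call is missing.

    min_reads and min_fraction are hypothetical thresholds, not SISRS defaults.
    """
    if len(bases) < min_reads:
        return None  # too few reads cover this site
    base, count = Counter(bases).most_common(1)[0]
    if count / len(bases) < min_fraction:
        return None  # reads disagree too much to make a call
    return base


def build_alignment(pileups, max_missing=0):
    """Keep sites where at most max_missing samples lack a call.

    pileups maps sample name -> list of read-base strings, one per site of
    the composite genome; every sample must list the same sites in order.
    """
    samples = sorted(pileups)
    n_sites = len(pileups[samples[0]])
    columns = []
    for i in range(n_sites):
        calls = [call_base(pileups[s][i]) for s in samples]
        if sum(c is None for c in calls) <= max_missing:
            columns.append([c if c is not None else "N" for c in calls])
    return {s: "".join(col[j] for col in columns) for j, s in enumerate(samples)}


# Toy data: four sites, three samples (made up for this example).
toy = {
    "sampleA": ["AAA", "CCCC", "GG", "TTT"],
    "sampleB": ["AAA", "CCTC", "GGG", "TTT"],
    "sampleC": ["GGG", "CCCC", "GGG", "CCC"],
}
alignment = build_alignment(toy)
for name, seq in alignment.items():
    print(f">{name}\n{seq}")
```

With the default `max_missing=0`, only sites called in every sample survive; raising it keeps more sites at the cost of `N` entries in the alignment.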
Simulations
[Figure: simulated eight-taxon trees with tips A to H. Ladder trees: equal branch length, long deep branches, short deep branches. Balanced trees.]
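As a rough illustration of the two tree shapes named on this slide, the sketch below builds Newick strings for a ladder (caterpillar) topology and a balanced topology on tips A to H. It covers topology only: the branch lengths used in the simulations are not given in the slide text, so none are included, and the function names are hypothetical.

```python
def ladder_newick(tips):
    """Ladder (caterpillar) tree: tips join the growing clade one at a time."""
    tree = f"({tips[0]},{tips[1]})"
    for tip in tips[2:]:
        tree = f"({tree},{tip})"
    return tree + ";"


def balanced_newick(tips):
    """Balanced tree: split the tips in half recursively."""
    def build(group):
        if len(group) == 1:
            return group[0]
        mid = len(group) // 2
        return f"({build(group[:mid])},{build(group[mid:])})"
    return build(list(tips)) + ";"


tips = "ABCDEFGH"
print(ladder_newick(tips))
print(balanced_newick(tips))
```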
Simulation Results: 1 million bp genome
[Figure: number of correct mappable sites (y-axis, log scale, 1 to 100000) against coverage (x-axis: 1, 2, 4, 8, 10, 20, 50) for ladder trees with equal branch length, long deep branches, and short deep branches. Legend: slow genes, fast genes.]
Schwartz et al. (2015) BMC Bioinformatics
Simulation Results: by depth
[Figure: number of correct mappable sites (y-axis, log scale, 1 to 100000) against coverage (x-axis: 1, 2, 4, 8, 10, 20, 50). Tree labels: Ladder trees, Equal branch length, Long, Balanced trees. Legend: Depth 1, Depth 2, Depth 3, Depth 4, Depth 5, Depth 6.]
Schwartz et al. (2015) BMC Bioinformatics
Phylogeny of apes from SISRS data
[Figure: tree with tips Bonobo, Human, Gorilla, Orangutan, Rhesus macaque, Crab macaque, Chimp.]
Phylogeny of mammals from SISRS data
[Figure: tree of 30 mammal taxa. Tip order: treeshrew, horse, pig, cow, toothed whale, baleen whale, pangolin, dog, cat, bat, megabat, shrew, star nosed mole, aardvark, tenrec, elephant shrew, manatee, elephant, sloth, armadillo, opossum, wallaby, rabbit, pika, rat, mouse, colugo, lemur, human, macaque. Node support values: 21 of 27 are 100; the others are 90, 91, 61, 86, 51, 72.]
Schwartz et al. (2015) BMC Bioinformatics
Phylogeny of mammals from SISRS data
[Figure: three alternative trees of the same 30 mammal taxa.
Tree 1 tip order: colugo, sn mole, shrew, horse, pig, cow, baleen whale, toothed whale, pangolin, dog, cat, bat, megabat, aardvark, tenrec, e shrew, elephant, manatee, opossum, wallaby, sloth, armadillo, treeshrew, rat, mouse, rabbit, pika, lemur, human, macaque. Node support values below 100: 60, 53, 99, 75.
Tree 2 tip order: treeshrew, horse, pig, cow, toothed whale, baleen whale, pangolin, dog, cat, bat, megabat, shrew, sn mole, aardvark, tenrec, e shrew, manatee, elephant, sloth, armadillo, opossum, wallaby, rabbit, pika, rat, mouse, colugo, lemur, human, macaque. Node support values below 100: 90, 91, 61, 86, 51, 72.
Tree 3 tip order: lemur, colugo, bat, megabat, horse, pig, cow, toothed whale, baleen whale, pangolin, dog, cat, sn mole, shrew, manatee, elephant, tenrec, aardvark, e shrew, wallaby, opossum, armadillo, sloth, treeshrew, rabbit, pika, rat, mouse, human, macaque. Node support values below 100: 61, 62, 61, 87, 80, 92.]
Schwartz et al. (2015) BMC Bioinformatics
Phylogeny of mammals from SISRS data
[Figure: taxon labels as shown on the slide: Opossum, Wallaby, aardvarkG, armadillo, baleenwhaleG, bat, cat, colugoG, cow, dog, elephant, eshrewG, horse, human, lemur, macaque, manateeG, megabatG, mouse, pangolinG, pig, pika, rabbit, ratT, shrew, slothG, sn moleG, tenrecG, toothedwhale, treeshrew.]
Phylogenies of angiosperms from SISRS data
[Figure: Comparing Trees Generated With Varying Amounts of Missing Data. x-axis: Number of Gaps Allowed At A Site (ticks 20, 40, 60). y-axis: Robinson-Foulds Distance (ticks 50, 75, 100, 125). Legend: Distance, Nodes In Tree.]
Adam Orr
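For readers unfamiliar with the metric on the y-axis, the sketch below computes an unrooted Robinson-Foulds distance: the number of non-trivial bipartitions found in one tree but not the other. It is a minimal illustration, not the code used for the plot; representing trees as nested tuples of tip names is an assumption made for brevity, and the example trees are made up.

```python
def clades(tree):
    """Return (leaf set of tree, leaf sets of all internal nodes in tree)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def bipartitions(tree):
    """Non-trivial splits, each stored as the side that lacks a fixed anchor tip."""
    all_leaves, found = clades(tree)
    anchor = min(all_leaves)
    splits = set()
    for clade in found:
        side = clade if anchor not in clade else all_leaves - clade
        # A split is informative only if both sides hold at least two tips.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Count splits present in exactly one of the two trees."""
    if clades(tree_a)[0] != clades(tree_b)[0]:
        raise ValueError("trees must have the same tips")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Two toy five-tip trees that differ in one grouping.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(t1, t2))
```

Storing each split as the side without the anchor tip makes the comparison independent of where either tree is rooted.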
SISRS rapidly identifies phylogenetically informative
data from next-gen sequencing reads
Apes: 3 days
Mammals: 7 days
Leishmania: 12 hours
No reference genome is required.
Minimal assembly required (completely automated).
Results are comparable to slower, labor-intensive methods.
Divergence dating
Branch length estimation
[Figure: five example trees with tips T1 to T9. Tree height = 1. Scatter plot of estimated branch length (y-axis, 0.0 to 1.0) against simulated branch length (x-axis, 0.0 to 1.0).]
Branch length estimation
[Figure: five example trees with tips T1 to T9. Height = Sim height. Scatter plot of estimated branch length (y-axis, 0.0 to 0.6) against simulated branch length (x-axis, 0.0 to 0.6).]
Branch length estimation: most variable loci
[Figure: scatter plot of estimated branch length (y-axis, 0.0 to 0.6) against simulated branch length (x-axis, 0.0 to 0.6).]

Slope   Cor    Max Brlen
2.06    1      0.03
1.3     0.99   0.05
0.7     0.99   0.08
0.29    1      0.28
0.24    0.93   0.29
0.21    0.61   0.52
Branch length estimation: most conserved loci
[Figure: scatter plot of estimated branch length (y-axis, 0.0 to 0.6) against simulated branch length (x-axis, 0.0 to 0.6).]

Slope   Cor    Max Brlen
0.85    0.72   0.03
0.73    0.96   0.05
0.73    0.9    0.08
0.3     0.97   0.28
0.19    0.58   0.29
0.17    0.78   0.52
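The Slope and Cor columns presumably summarize how estimated branch lengths track simulated ones. As a hedged sketch of one way such numbers could be computed, the code below fits an ordinary least-squares slope (with an intercept) and a Pearson correlation. Whether the slides used this exact fit is not stated, and the input values are invented toy numbers, not data from the talk.

```python
import math


def slope_and_correlation(simulated, estimated):
    """Least-squares slope of estimated on simulated, and Pearson correlation."""
    n = len(simulated)
    mean_x = sum(simulated) / n
    mean_y = sum(estimated) / n
    sxx = sum((x - mean_x) ** 2 for x in simulated)
    syy = sum((y - mean_y) ** 2 for y in estimated)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(simulated, estimated))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Toy branch lengths: every estimate is exactly twice the simulated value.
simulated = [0.01, 0.02, 0.03]
estimated = [0.02, 0.04, 0.06]
slope, cor = slope_and_correlation(simulated, estimated)
print(f"slope={slope:.2f} cor={cor:.2f}")
```

A slope above 1 with a correlation near 1 would mean branch lengths are overestimated by a roughly constant factor, while a slope well below 1 would mean they are compressed.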
Conclusions
SISRS rapidly identifies data for phylogenetics from
next-gen sequencing reads
Different (SISRS) data = alternative topologies
Use SISRS data to estimate branch lengths and
divergence dates accurately
Acknowledgements
Co-authors / Collaborators
Reed Cartwright (ASU)
Kelly Harkins (ASU and UCSC)
Anne Stone (ASU)
Kael Dai (ASU)
Adam Orr (ASU)
Mike Miller (Villanova)
Funding
NSF DBI-1356548
NIH R01-GM101352-01A1
NSF DDIG BCS-1232582
ASU Startup Funds
SISRS is available at
https://github.com/rachelss/SISRS

SMBE 2015: Rapid Identification of Phylogenetically Informative Data from Next-Gen Sequencing