Transposable element proliferation and genome size evolution in Asparagales

Kate L. Hertweck
National Evolutionary Synthesis Center (NESCent), Durham, NC, USA
k8hertweck@gmail.com, Twitter: @k8hert, http://k8hert.blogspot.com, http://www.slideshare.net/katehertweck

Image credits: Naturehills.com, Erica Wheeler, ag.arizona.edu, wikicommons
Introduction

Asparagales as a model system
● 14 families, 1122 genera, ~26000 species; diverged over 100 mya (Stevens, 2001 onwards)
● Many edible and ornamental species
● Variation in karyotype and genome size (Pires et al., 2006; Figure 1)
● Paucity of genomic resources, especially for TEs (see below)
● “Core” Asparagales represents a monophyletic lineage of three closely related families

Transposable elements (TEs)
● Mobile genetic elements able to replicate and move throughout a genome
● Represent at least 50% of the DNA in many eukaryotic genomes
● Have both fine- and coarse-scale implications for genomic and organismal evolution
● Contribute to increases in genome size independent of, but sometimes in conjunction with, polyploidy and other types of sequence duplication (Fedoroff, 2012)

Research objectives
● Assemble consensus sequences of the most abundant (recently proliferated) TEs in Asparagales genomes
● Estimate the relative abundance of each type of TE

Family             Subfamily             # of species   Chromosome #      Genome size (pg/1C)
                                                        (taxa sampled)    (taxa sampled)
(outgroup)         Xeronemataceae        2              34 (1)            3.28 (1)
Xanthorrhoeaceae   Asphodeloideae*       785            12-78 (128)       5.25-38.3 (139)
Xanthorrhoeaceae   Hemerocallidoideae    85             32 (1)            0.76 (1)
Xanthorrhoeaceae   Xanthorrhoeoideae*    30             22 (1)            1.04 (1)
Amaryllidaceae     Agapanthoideae        9              30 (7)            11.23-23.78 (9)
Amaryllidaceae     Allioideae            795            10-66 (153)       7.6-74.5 (162)
Amaryllidaceae     Amaryllidoideae       800            10-72 (93)        6.15-82.15 (112)
Asparagaceae       Lomandroideae         178            8-32 (6)          1.25-25.3 (8)
Asparagaceae       Asparagoideae         165-295        20-112 (3)        1.28-4.18 (3)
Asparagaceae       Nolinoideae*          475            30-108 (11)       0.93-53.5 (33)
Asparagaceae       Aphyllanthoideae      1              N/A               0.65 (1)
Asparagaceae       Agavoideae*           637            16-180 (56)       2.55-19.6 (98)
Asparagaceae       Scilloideae           770-1000       6-54 (75)         2.6-75.9 (109)
Asparagaceae       Brodiaeoideae         62             4 (1)             10.65-18.15 (3)

Figure 1. Phylogeny of subfamilies in core Asparagales based on all plastome genes (Steele et al., 2012). Classification based on APGIII (2009). Subfamilies in green were included in sampling for the present study. Species estimates for each subfamily are from the Angiosperm Phylogeny Website (Stevens, 2001 onwards). Chromosome number and genome size ranges obtained from the Plant DNA C-values Database (Bennett and Leitch, 2010). Asterisks (*) indicate subfamilies containing taxa with bimodal karyotypes.

Results, conclusions, future directions
● Assembly of all classes of TEs possible from GSS
● Most scaffolds are partial sequences, although full-length TEs occur
● Proportion of different TEs varies independent of genome size and phylogeny
● What variation is there among families of each TE type?
● Are there unique TE families in Asparagales?
● What is the sequence variation of reads mapping to these scaffolds?
● Are there correlations between TE presence/abundance and life history traits?

Raw fastq files from low coverage, anonymous, sequencing of total genomic DNA
↓
De novo genome assembly (MSR-CA, http://www.genome.umd.edu/SR_CA_MANUAL.htm)
↓
Filter out scaffolds that BLAST to reference organellar genomes
↓
Run RepeatMasker to identify similarity to known repeats (3110 repeats, 98.7% are from grasses)
↓
Discard unknown scaffolds and “unimportant” repeats, categorize others by type
↓
Map raw reads back to scaffolds to estimate relative proportion of TE

Figure 2. Diagram of the bioinformatics pipeline to assemble and annotate TEs from Illumina GSS data in this study.
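
The last two steps of the pipeline in Figure 2 (categorizing annotated scaffolds by type, then using mapped reads to estimate relative proportions) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's scripts: the function names, scaffold IDs, and read counts are hypothetical, and the mapping from RepeatMasker-style class labels (such as LTR/Gypsy) to the categories plotted in Figure 3 is an assumed one.

```python
from collections import Counter


def te_category(repeat_class):
    """Map a RepeatMasker-style class label to a Figure 3 category.

    The grouping is an assumption made for illustration; anything that is
    not a LINE, Copia LTR, Gypsy LTR, or DNA TE is lumped into "other".
    """
    if repeat_class.startswith("LINE"):
        return "LINEs"
    if repeat_class == "LTR/Copia":
        return "Copia LTRs"
    if repeat_class == "LTR/Gypsy":
        return "Gypsy LTRs"
    if repeat_class.startswith("DNA"):
        return "DNA TEs"
    return "other"


def te_percentages(reads_per_scaffold, scaffold_class):
    """Percentage of mapped reads per TE category (of the repetitive fraction)."""
    totals = Counter()
    for scaffold, n_reads in reads_per_scaffold.items():
        totals[te_category(scaffold_class[scaffold])] += n_reads
    total = sum(totals.values())
    if total == 0:
        return {}
    return {category: 100 * n / total for category, n in totals.items()}


# Hypothetical scaffolds and read counts, for illustration only.
reads_per_scaffold = {"scf1": 600, "scf2": 300, "scf3": 50, "scf4": 50}
scaffold_class = {
    "scf1": "LTR/Gypsy",
    "scf2": "LTR/Copia",
    "scf3": "DNA/hAT-Ac",
    "scf4": "Simple_repeat",
}

percentages = te_percentages(reads_per_scaffold, scaffold_class)
for category, pct in sorted(percentages.items()):
    print(f"{category}: {pct:.1f}%")
```

Counting reads rather than scaffolds is the point of the read-mapping step: at low coverage, a scaffold assembled from a recently proliferated TE attracts many reads, so read share is a proxy for genomic abundance.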

Methods

Sequencing
● Genome survey sequences (GSS): anonymous, low-coverage sequencing from total genomic DNA
● Illumina GAIIx, single-end, 80 bp reads (Steele et al., 2012)
● Proof-of-concept and quality control with six Poaceae taxa, the monocot genomic model system (data not shown)

Bioinformatics
● De novo genome assembly, TE annotation, scaffold filtering, read mapping (Figure 2)
● Custom scripts available at http://github.com/k8hertweck/AsparagalesTEscripts
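
"Low coverage" here means well under 1X. As the Table 1 caption notes, average genome coverage is calculated from 1C genome size, read length, and number of reads. A minimal sketch of that arithmetic follows. The conversion factor of roughly 978 Mbp per picogram of DNA is a standard value that is not stated on the poster, and the read count in the example is hypothetical, chosen only to show the order of magnitude involved.

```python
# Standard conversion, not taken from the poster: 1 pg of DNA is about 978 Mbp.
BP_PER_PG = 0.978e9


def average_coverage(genome_size_pg, n_reads, read_length_bp=80):
    """Average genome coverage = total sequenced bases / 1C genome size in bp."""
    genome_size_bp = genome_size_pg * BP_PER_PG
    return n_reads * read_length_bp / genome_size_bp


# Lomandra has a 1C genome size of 1.15 pg (Table 1); the read count below
# is a hypothetical value, not the number of reads actually sequenced.
lomandra_coverage = average_coverage(1.15, 4_600_000)
print(f"{lomandra_coverage:.2f}X")
```

The same number of 80 bp reads spread over a genome ten times larger gives one tenth of the coverage, which is consistent with the larger genomes in Table 1 having the lowest coverage values.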


Subfamily          Taxon           Genome size   Average genome   MSR         % organellar   % repeat
                                   (pg/1C)       coverage         scaffolds   scaffolds      scaffolds
Asphodeloideae     Haworthia       15.2          0.02X            1360        2.6            29.1
Agapanthoideae     Agapanthus      10.5          0.01X            438         7.8            34.9
Allioideae         Allium          13.2          0.03X            1858        7.6            23.9
Amaryllidoideae    Scadoxus        22.1          0.02X            1336        4.1            30.0
Lomandroideae      Lomandra        1.15          0.33X            1491        7.6            29.2
Asparagoideae      Asparagus       1.36          0.30X            1977        2.6            26.5
Nolinoideae        Sansevieria     1.25          0.32X            835         6.9            26.7
Aphyllanthoideae   Aphyllanthes    0.65          0.34X            436         15.3           38.0
Agavoideae         Hosta           19.6          N/A              1084        6.1            34.5
Scilloideae        Ledebouria      8.85          0.04X            2481        4.4            24.8
Brodiaeoideae      Dichelostemma   9.35          0.03X            1706        1.5            27.7

Table 1. Results of TE assembly and annotation from Asparagales taxa following the bioinformatics methods in Figure 2. Genome size data for samples sequenced are described in Steele et al. (2012). Average genome coverage calculated from 1C genome size, read length, and number of reads for each sample; coverage data for Hosta is unavailable as sequencing was performed on DNA enriched for plastome. Percentages represent proportion of reads belonging to organelles and annotated repeats from total number of MSR scaffolds.

[Figure 3 plot. Left axis: percentage (0-100). Right axis: genome size (pg/1C) (0-25). Legend: LINEs; Copia LTRs; Gypsy LTRs; DNA TEs; other (RC, satellite, low complexity, simple repeats); Genome size (pg/1C).]

Figure 3. Percentage of different TE types of total repetitive fraction of representative core Asparagales taxa, arranged in order of increasing total genome size.

Acknowledgements
I acknowledge the National Science Foundation for funding (DEB 0829849 and DEB 1146603), as well as collaborators on the Monocot AToL project.

References
APGIII. 2009. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG III. Botanical Journal of the Linnean Society 161: 105-121.
Bennett, M. D., and I. J. Leitch. 2010. Angiosperm DNA C-values database. http://www.kew.org/cvalues.
Fedoroff, N. V. 2012. Transposable elements, epigenetics, and genome evolution. Science 338: 758-767.
Pires, J. C., I. J. Maureira, T. J. Givnish, K. J. Sytsma, O. Seberg, G. Petersen, J. I. Davis, et al. 2006. Phylogeny, genome size, and chromosome evolution of Asparagales. Aliso 22: 285-302.
Steele, P. R., K. L. Hertweck, D. Mayfield, M. R. McKain, J. Leebens-Mack, and J. C. Pires. 2012. Quality and quantity of data recovered from massively parallel sequencing: Examples in Asparagales and Poaceae. American Journal of Botany 99: 330-348.
Stevens, P. F. 2001 onwards. Angiosperm Phylogeny Website. http://www.mobot.org/MOBOT/research/APweb/ [accessed Jan 2013].
