
"Searching for Novel Forms of Life" talk by Jonathan Eisen 12/16/15

1,430 views

Talk by Jonathan Eisen for Diversity of Life Workshop, Pacifica, CA.

Published in: Science


  1. 1. Searching for Novel Forms of Life Jonathan A. Eisen UC Davis @phylogenomics Diversity of Life Workshop Pacifica, CA December 16, 2015
  2. 2. Once You Find Something Alive … You find a CLE
  3. 3. Once You Find Something Alive … You find a CLE Separate Origin from Known Life? Common Origin with Known Life?
  4. 4. Once You Find Something Alive … You find a CLE Separate Origin from Known Life? Common Origin with Known Life? Homologies w/ Known Life?
  5. 5. Once You Find Something Alive … You find a CLE Separate Origin from Known Life? Common Origin with Known Life? Homologies w/ Known Life? No
  6. 6. Once You Find Something Alive … You find a CLE Separate Origin from Known Life? Common Origin with Known Life? Homologies w/ Known Life? Yes How Novel Is It?
  7. 7. Once You Find Something Alive … You find a CLE Separate Origin from Known Life? Common Origin with Known Life? Homologies w/ Known Life? Yes How Novel Is It?
  8. 8. • Novel form • Novel function • Novel phylogeny How Novel Is It?
  9. 9. • Novel form • Novel function • Novel phylogeny How Novel Is It?
  10. 10. Phylogeny
  11. 11. Classification of Cultured Taxa by rRNA (Carl Woese). rRNA sequences: ACUGC ACCUAU CGUUCG; ACUCC AGCUAU CGAUCG; ACCCC AGCUCU CGCUCG. Taxa and characters: S ACUGCACCUAUCGUUCG; R ACUCCACCUAUCGUUCG; E ACUCCAGCUAUCGAUCG; F ACUCCAGGUAUCGAUCG; C ACCCCAGCUCUCGCUCG; W ACCCCAGCUCUGGCUCG. Tree: Bacteria, Archaea, Eukaryotes.
  12. 12. Woese 3 Domain Tree
  13. 13. rRNA Phylotyping: One Taxon (Norm Pace). DNA: ACTGC ACCTAT CGTTCG (three identical copies). Taxa and characters: B1 ACTGCACCTATCGTTCG; B2 ACTCCACCTATCGTTCG; E1 ACTCCAGCTATCGATCG; E2 ACTCCAGGTATCGATCG; A1 ACCCCAGCTCTCGCTCG; A2 ACCCCAGCTCTGGCTCG; New1 ACTGCACCTATCGTTCG. Tree: Bacteria, Archaea, Eukaryotes. Many sequences from one sample all point to the same branch on the tree.
  14. 14. Expanded Tree (Pace 1997) Archaea Eukaryotes Bacteria Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740
  15. 15. Is There Anything Like This? Archaea Eukaryotes Bacteria Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740 ??????
  16. 16. Metagenomics. Sequences: ACUGC ACCUAU CGUUCG; ACUCC AGCUAU CGAUCG; ACCCC AGCUCU CGCUCG. Taxa and characters (first matrix): S ACUGCACCUAUCGUUCG; R ACUCCACCUAUCGUUCG; E ACUCCAGCUAUCGAUCG; F ACUCCAGGUAUCGAUCG; C ACCCCAGCUCUCGCUCG; W ACCCCAGCUCUGGCUCG. Taxa and characters (second matrix): S ACUGCACCUAUCGUUCG; E ACUCCAGCUAUCGAUCG; C ACCCCAGCUCUCGCUCG. Tree: Bacteria, Archaea, Eukaryotes.
  17. 17. rRNA Tree of Life Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740 Eukaryotes ?????? Archaea Bacteria Scanned through GOS data for rRNAs that fit this pattern
  18. 18. rRNA Tree of Life Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740 Eukaryotes ?????? Archaea Bacteria ??????????
  19. 19. RecA vs. rRNA Eisen 1995 Journal of Molecular Evolution 41: 1105-1123.
  20. 20. Venter et al., Science 304: 66. 2004 RecA Phylotyping - Sargasso Metagenome
  21. 21. RecA Tree of Life? Archaea Eukaryotes Bacteria ??????????? Figure from Barton, Eisen et al. “Evolution”, CSHL Press. 2007. Based on tree from Pace 1997 Science 276:734-740
  22. 22. GOS 1 GOS 2 GOS 3 GOS 4 GOS 5 Novel RecA Sequences in GOS Data Wu et al PLoS One 2011
  23. 23. Novel RpoBs too Wu et al PLoS One 2011
  24. 24. Phylogenetic ID of Novel Lineages. GOS 1 GOS 2 GOS 3 GOS 4 GOS 5. Wu et al PLoS One 2011. "I am happy to welcome you as a new member of the 4th domain club. If by chance you are passing through Europe I will be delighted to invite you to give a seminar in Marseille and show you our strange bugs. Kind regards, Didier"
  25. 25. Virus Origins
  26. 26. 2007-2014: GEBA Figure from Barton, Eisen et al. “Evolution”, CSHL Press based on Baldauf et al Tree
  27. 27. Synapomorphies Exist
  28. 28. Missing Microbes?
  29. 29. Challenge: Poor Sampling From Wu et al. 2009 Nature 462, 1056-1060
  30. 30. JGI Dark Matter Project. Pipeline: environmental samples (n=9); isolation of single cells (n=9,600); whole genome amplification (n=3,300); SSU rRNA gene based identification (n=2,000); genome sequencing, assembly and QC (n=201); draft genomes (n=201). Sample types: seawater, brackish/freshwater, hydrothermal, sediment, bioreactor. Site labels include SAK, HSM, HOT, GOM, GBS, EPR. [Figure: phylogenetic tree; legible labels: GN04, WS3 (Latescibacteria), GN01, LD1, WS1, Poribacteria, BRC1, Lentisphaerae, Verrucomicrobia, OP3 (Omnitrophica), Chlamydiae, Planctomycetes, NKB19 (Hydrogenedentes), WYO, Armatimonadetes, WS4, Actinobacteria, Gemmatimonadetes, NC10, SC4, WS2, Cyanobacteria, WPS-2, Deltaproteobacteria, EM19 (Calescamantes), Fervidibacteria]
  31. 31. [Figure, continued; legible labels: GAL35, Aquificae, EM3, Thermotogae, Dictyoglomi, SPAM, GAL15, CD12 (Aerophobetes), OP8 (Aminicenantes), AC1, SBR1093, Thermodesulfobacteria, Deferribacteres, Synergistetes, OP9 (Atribacteria), WPS-2, Caldiserica, AD3, Chloroflexi, Acidobacteria, Elusimicrobia, Nitrospirae, 49S1 2B, Caldithrix, GOUTA4, Marinimicrobia]
  32. 32. [Figure, continued. BACTERIA: Chlorobi, Firmicutes, Tenericutes, Fusobacteria, Chrysiogenetes, Proteobacteria, Fibrobacteres, TG3, Spirochaetes, WWE1 (Cloacamonetes), ZB3, Deinococcus-Thermus, OP1 (Acetothermia), Bacteroidetes, TM7, GN02 (Gracilibacteria), SR1, BH1, OD1 (Parcubacteria), OP11 (Microgenomates). ARCHAEA: Euryarchaeota, Micrarchaea, DSEG (Aenigmarchaea), Nanohaloarchaea, Nanoarchaea, Cren MCG, Thaumarchaeota, Cren C2, Aigarchaeota, Cren pISA7, Cren Thermoprotei, Korarchaeota, pMC2A384 (Diapherotrites). Panel titles: archaeal toxins (Nanoarchaea); lytic murein transglycosylase; stringent response (Diapherotrites, Nanoarchaea); sigma factor (Diapherotrites, Nanoarchaea), RNA polymerase; oxidoreductase; HGT from Eukaryotes (Nanoarchaea); murein (peptido-glycan); archaeal type purine synthesis (Microgenomates)]
  33. 33. UGA recoded for Gly (Gracilibacteria) [Figure labels: growing AA chain, mRNA, ribosome, recognizes UGA]. Woyke et al. Nature 2013. Tanja Woyke
  34. 34. Microbial Dark Matter Part 2 • Ramunas Stepanauskas • Tanja Woyke • Jonathan Eisen • Duane Moser • Tullis Onstott
  35. 35. • More accurate phylogeny • Rooting • Incorporating New and Fragmented Data • Lateral gene transfer • More biology about the “novel” lineages Challenge: Reference Information
  36. 36. Three Domains of Life Bacteria Archaea Eukaryotes
  37. 37. Bacteria Archaea Eukaryotes Archaea and Bacteria as Sister
  38. 38. Bacteria Archaea Eukaryotes Eukaryotes and Bacteria as
  39. 39. Bacteria Archaea Eukaryotes Archaea and Eukaryotes as
  40. 40. Bacteria Archaea Eukaryotes Other Patterns Archaea
  41. 41. Bacteria Archaea Eukaryotes Outgroup for the Tree of Life?
  42. 42. Bacteria Archaea Eukaryotes Lateral Gene Transfer Archaea
  43. 43. Bacteria Archaea Eukaryotes Lateral Gene Transfer Archaea
  44. 44. Automated Genome Tree Lang JM, Darling AE, Eisen JA (2013) Phylogeny of Bacterial and Archaeal Genomes Using Conserved Genes: Supertrees and Supermatrices. PLoS ONE 8(4): e62510. doi:10.1371/journal.pone.0062510 Jenna Lang
  45. 45. Better Reference Data (e.g., PhyEco Markers)

      Phylogenetic group      Genome Number   Gene Number   Marker Candidates
      Archaea                 62              145415        106
      Actinobacteria          63              267783        136
      Alphaproteobacteria     94              347287        121
      Betaproteobacteria      56              266362        311
      Gammaproteobacteria     126             483632        118
      Deltaproteobacteria     25              102115        206
      Epsilonproteobacteria   18              33416         455
      Bacteroides             25              71531         286
      Chlamydiae              13              13823         560
      Chloroflexi             10              33577         323
      Cyanobacteria           36              124080        590
      Firmicutes              106             312309        87
      Spirochaetes            18              38832         176
      Thermi                  5               14160         974
      Thermotogae             9               17037         684

      Wu D, Jospin G, Eisen JA (2013) Systematic Identification of Gene Families for Use as “Markers” for Phylogenetic and Phylogeny-Driven Ecological Studies of Bacteria and Archaea and Their Major Subgroups. PLoS ONE 8(10): e77033. doi:10.1371/journal.pone.0077033
  46. 46. Better Binning (e.g., Hi-C). Chris Beitel @datscimed, Aaron Darling @koadman. Beitel CW, Froenicke L, Lang JM, Korf IF, Michelmore RW, Eisen JA, Darling AE. (2014) Strain- and plasmid-level deconvolution of a synthetic metagenome by sequencing proximity ligation products. PeerJ 2:e415 http://dx.doi.org/10.7717/peerj.415

      Table 1 Species alignment fractions. The number of reads aligning to each replicon present in the synthetic microbial community are shown before and after filtering, along with the percent of total constituted by each species. The GC content (“GC”) and restriction site counts (“#R.S.”) of each replicon, species, and strain are shown. Bur1: B. thailandensis chromosome 1. Bur2: B. thailandensis chromosome 2. Lac0: L. brevis chromosome, Lac1: L. brevis plasmid 1, Lac2: L. brevis plasmid 2, Ped: P. pentosaceus, K12: E. coli K12 DH10B, BL21: E. coli BL21. An expanded version of this table can be found in Table S2.

      Sequence   Alignment    % of Total   Filtered     % of aligned   Length      GC      #R.S.
      Lac0       10,603,204   26.17%       10,269,562   96.85%         2,291,220   0.462   629
      Lac1       145,718      0.36%        145,478      99.84%         13,413      0.386   3
      Lac2       691,723      1.71%        665,825      96.26%         35,595      0.385   16
      Lac        11,440,645   28.23%       11,080,865   96.86%         2,340,228   0.46    648
      Ped        2,084,595    5.14%        2,022,870    97.04%         1,832,387   0.373   863
      BL21       12,882,177   31.79%       2,676,458    20.78%         4,558,953   0.508   508
      K12        9,693,726    23.92%       1,218,281    12.57%         4,686,137   0.507   568
      E. coli    22,575,903   55.71%       3,894,739    17.25%         9,245,090   0.51    1076
      Bur1       1,886,054    4.65%        1,797,745    95.32%         2,914,771   0.68    144
      Bur2       2,536,569    6.26%        2,464,534    97.16%         3,809,201   0.672   225
      Bur        4,422,623    10.91%       4,262,279    96.37%         6,723,972   0.68    369

      Figure 1 Hi-C insert distribution. The distribution of genomic distances between Hi-C read pairs is shown for read pairs mapping to each chromosome. For each read pair the minimum path length on the circular chromosome was calculated and read pairs separated by less than 1000 bp were discarded. The 2.5 Mb range was divided into 100 bins of equal size and the number of read pairs in each bin was recorded for each chromosome. Bin values for each chromosome were normalized to sum to 1 and plotted.

      Figure 2 Metagenomic Hi-C associations. The log-scaled, normalized number of Hi-C read pairs associating each genomic replicon in the synthetic community is shown as a heat map (see color scale, blue to yellow: low to high normalized, log scaled association rates). Replicon abbreviations as in Table 1.

      Figure 3 Contigs associated by Hi-C reads. A graph is drawn with nodes depicting contigs and [...] depicting associations between contigs as indicated by aligned Hi-C read pairs, with the count [...] depicted by the weight of edges. Nodes are colored to reflect the species to which they belong (see [...]) with node size reflecting contig size. Contigs below 5 kb and edges with weights less than 5 were exc[...] Contig associations were normalized for variation in contig size.

      Text fragments visible on the slide: "[...] E. coli K12 genome were distributed in a similar manner as previously reported (Fig. 1; (Lieberman-Aiden et al., 2009)). We observed a minor depletion of alignments spanning the linearization point of the E. coli K12 assembly (e.g., near coordinates 0 and 4686137) due to edge effects induced by BWA treating the sequence as a linear chromosome rather than circular." "[...] reference assemblies of the members of our synthetic microbial community with the same alignment parameters as were used in the top ranked clustering (described above). We first counted the number of Hi-C reads associating each reference assembly replicon (Fig. 2; [...]" "[...] typically represent the reads and variant sites as a variant graph wherein variant sit[...] represented as nodes, and sequence reads define edges between variant sites observ[...] the same read (or read pair). We reasoned that variant graphs constructed from H[...] data would have much greater connectivity (where connectivity is defined as the m[...] path length between randomly sampled variant positions) than graphs constructed [...]"
  47. 47. Phylosift - Automated Bayesian Phylogenomics. [Workflow figure: Input Sequences (600 bp); each input sequence scanned against both workflows (rRNA workflow, protein workflow; parallel option). Steps: LAST fast candidate search (search input against references); hmmalign multiple alignment and Infernal multiple alignment (profile HMMs used to align candidates to reference alignment); pplacer phylogenetic placement; Taxonomic Summaries. Sample Analysis and Comparison: Krona plots, number of reads placed for each marker gene, Edge PCA, tree visualization, Bayes factor tests.] Aaron Darling @koadman, Erik Matsen @ematsen, Holly Bik @hollybik, Guillaume Jospin @guillaumejospin, Erik Lowe. Darling AE, Jospin G, Lowe E, Matsen FA IV, Bik HM, Eisen JA. (2014) PhyloSift: phylogenetic analysis of genomes and metagenomes. PeerJ 2:e243 http://dx.doi.org/10.7717/peerj.243
  48. 48. Normalizing Across Genes Tree OTU Wu, D., Doroud, L, Eisen, JA 2013. arXiv. TreeOTU: Operational Taxonomic Unit Classification Based on Phylogenetic Dongying Wu
  49. 49. Challenge: Engaging Public
  50. 50. The Rise of Citizen Microbiology Darlene Cavalier
  51. 51. Eisen Lab Citizen Microbiology Kitty Microbiome Georgia Barguil Jack Gilbert Project MERCCURI Phone and Shoes tinyurl/kittybiome Holly Ganz David Coil
  52. 52. Acknowledgements. DOE JGI, Sloan, GBMF, NSF, DHS, DARPA. Aaron Darling, Lizzy Wilbanks, Jenna Lang, Russell Neches, Rob Knight, Jack Gilbert, Tanja Woyke, Rob Dunn, Katie Pollard, Jessica Green, Darlene Cavalier, Eddy Rubin, Wendy Brown, Dongying Wu, Phil Hugenholtz, DSMZ, Sundar, Srijak Bhatnagar, David Coil, Alex Alexiev, Hannah Holland-Moritz, Holly Bik, John Zhang, Holly Menninger, Guillaume Jospin, David Lang, Cassie Ettinger, Tim Harkins, Jennifer Gardy, Holly Ganz
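
Slide 13 illustrates rRNA phylotyping with a small taxa-by-characters matrix in which the environmental sequence New1 falls on the same branch as B1. The sketch below is only a rough illustration of that idea: it compares New1 to each reference by counting mismatched positions (Hamming distance) and reports the closest one. Real phylotyping places sequences on a tree; the distance shortcut and the function names `hamming` and `closest_reference` are assumptions made for this example, not the method used in the talk.

```python
# Reference sequences copied from the slide 13 matrix (all the same length).
REFERENCES = {
    "B1": "ACTGCACCTATCGTTCG",
    "B2": "ACTCCACCTATCGTTCG",
    "E1": "ACTCCAGCTATCGATCG",
    "E2": "ACTCCAGGTATCGATCG",
    "A1": "ACCCCAGCTCTCGCTCG",
    "A2": "ACCCCAGCTCTGGCTCG",
}


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_reference(query, references=REFERENCES):
    """Return the name of the nearest reference and all distances.

    Hypothetical helper: a stand-in for tree placement, not a replacement.
    """
    distances = {name: hamming(query, seq) for name, seq in references.items()}
    best = min(distances, key=distances.get)
    return best, distances


# New1 is the environmental sequence from the same slide.
new1 = "ACTGCACCTATCGTTCG"
best, distances = closest_reference(new1)
print(best, distances[best])
```

With the slide's data, New1 is identical to B1, which matches the slide's point that sequences from one taxon all point to the same branch.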

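
The Figure 1 caption quoted on slide 46 describes how the Hi-C insert distribution was computed: take the minimum path length between the two reads of a pair on the circular chromosome, discard pairs closer than 1000 bp, split the 2.5 Mb range into 100 equal bins, and normalize the bin counts to sum to 1. A minimal sketch of those steps follows. The function names, the input format (a list of position pairs), and the choice to skip distances at or beyond 2.5 Mb are assumptions for illustration, not the authors' code.

```python
def circular_distance(pos_a, pos_b, genome_length):
    """Minimum path length between two positions on a circular chromosome."""
    d = abs(pos_a - pos_b) % genome_length
    return min(d, genome_length - d)


def insert_distribution(pairs, genome_length, min_dist=1000,
                        max_dist=2_500_000, n_bins=100):
    """Bin read-pair distances and normalize the bin counts to sum to 1.

    pairs: iterable of (pos_a, pos_b) mapping positions (hypothetical format).
    Pairs closer than min_dist are discarded, as in the caption; pairs at or
    beyond max_dist are also skipped here (an assumption of this sketch).
    """
    counts = [0] * n_bins
    bin_width = max_dist / n_bins
    for pos_a, pos_b in pairs:
        d = circular_distance(pos_a, pos_b, genome_length)
        if d < min_dist or d >= max_dist:
            continue
        counts[int(d // bin_width)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


# E. coli K12 length taken from Table 1 on the slide; the pairs are made up.
K12_LENGTH = 4_686_137
example_pairs = [
    (100, 4_686_100),        # spans the linearization point: distance 137, discarded
    (0, 50_000),             # distance 50,000
    (10, 500),               # distance 490, discarded
    (1_000_000, 3_000_000),  # distance 2,000,000
]
dist = insert_distribution(example_pairs, K12_LENGTH)
```

Computing the distance on the circle rather than on the linear coordinate is what keeps pairs near coordinates 0 and 4686137 from appearing to be megabases apart.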