TIGRTIGRTIGRTIGR
“Nothing in biology makes sense
except in the light of evolution.”
Theodosius Dobzhansky (1973)
Talk Outline
• Complete Genome Projects - history and current
status
• What have we learned about evolutionary history
and processes from recent genome projects
• Two main themes - completeness and closeness
• Coming attractions
• Why we need more genomes
The Institute for Genomic
Research
• A not for profit institution, staff ~230
• Departments:
– Eukaryotic Genomics
– Microbial Genomics
– Functional Genomics
– Bioinformatics
– Sequencing Core
General Steps in Analysis of
Complete Genomes
• Identification/prediction of genes
• Characterization of gene features
• Characterization of genome features
• Prediction of gene function
• Prediction of pathways
• Integration with known biological data
• Comparative genomics
Complete Genome/Chromosome Progress
[Figure: chart of Complete Genomes (y-axis, 0 to 50) by Year (x-axis, 1995 to 2000), with separate series for Eukaryote, Archaea and Bacteria]
Limitations of Genome Analysis
• Functional predictions are PREDICTIONS
• Need to follow up all predictions with
experimental work
• Each genome sequence is a snapshot of one clone
• Genome analysis is not able to identify novel
processes
• Annotation needs to be updated
• Assembly can be wrong
• Some parts of genome may be missed (e.g., low
copy plasmids)
Evolutionary Genomics I:
Selection of Species
• Phylogenetic diversity
• Relatedness to model organism
• Understanding major evolutionary
transitions
• Determining right depth
• Short branch lengths
rRNA Tree - Complete/In Progress
[Figure: rRNA tree with these groups labeled: Euryarchaeota, Crenarchaeota, Alpha Proteobacteria, Beta Proteobacteria, Gamma Proteobacteria, Delta Proteobacteria, Epsilon Proteobacteria, Spirochetes, Green Sulfur bacteria, Chlamydia, Cyanobacteria, Thermotogales, Thermophilic O2 reducers, Deinococcus/Thermus, Low GC Gram-positive bacteria, High GC Gram-positive bacteria, Green Non-Sulfur bacteria]
Evolutionary Diversity Still Poorly Represented in Complete Genomes
[Figure: tree with Bacteria and Archaea labeled]
Close Relatives vs Year
[Figure: chart by year, 1995 to 2000, scale 0 to 40, with two series: Solo genera and Multiple species]
Genome sequences and evolution
• Origin of new gene function
• Gene loss
• Genome degradation
• Gene and genome duplication
• Rates and patterns of mutation,
recombination
• Gene transfer
• Species evolution
Evolutionary Genome Analysis I:
Functional Prediction
Evolutionary Genome Analysis II:
Gene Loss
Example of Tracing Gene Loss
[Figure: presence or absence of a gene in each species, listed by species abbreviation and grouped by kingdom (Euks, Arch, Bacteria), with the evolutionary origin of the gene and the inferred loss marked]
Why Identify Gene Loss
• Indicates that gene is not absolutely required for
survival
• Parallel loss of same gene in different species may
indicate selective advantage of loss of that gene
• Correlated loss of genes in a pathway indicates a
conserved association among those genes
(important for phylogenetic profiles)
• Loss in organellar genomes frequently
accompanied by gain in nuclear genome
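The point above about correlated loss can be illustrated with a small sketch of phylogenetic profile comparison. The gene names and presence/absence values below are hypothetical, and Hamming distance is only one simple way to compare profiles; the talk does not specify a measure.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) of four genes across six genomes.
profiles = {
    "geneA": (1, 1, 0, 1, 0, 0),
    "geneB": (1, 1, 0, 1, 0, 0),
    "geneC": (1, 0, 1, 1, 1, 0),
    "geneD": (0, 1, 1, 0, 1, 1),
}


def hamming(p, q):
    """Number of genomes in which the two genes differ in presence."""
    return sum(a != b for a, b in zip(p, q))


def rank_pairs(profiles):
    """Return (distance, gene, gene) tuples, most similar profiles first."""
    pairs = [
        (hamming(profiles[g], profiles[h]), g, h)
        for g, h in combinations(sorted(profiles), 2)
    ]
    return sorted(pairs)


ranked = rank_pairs(profiles)
for distance, g, h in ranked:
    print(g, h, distance)
```

Genes that are gained and lost together, such as geneA and geneB in this toy data, end up with identical profiles and rank first.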
Duplication and Loss of Mismatch Repair Genes
[Figure: A. Species tree with gene duplications (1-5) and gene loss (*) marked. B. Gene tree showing the MutS-I lineage (MutS1) and the MutS-II lineage (MutS2). Species shown: E. coli, H. influenzae, N. gonorrhoeae, H. pylori, Syn. sp, B. subtilis, S. pyogenes, M. pneumoniae, M. genitalium, A. aeolicus, D. radiodurans, T. pallidum, B. burgdorferi]
Evolution and Complete Genomes II:
Gene and Genome Duplication
Why Duplications Are Useful to Identify
• Allows division into orthologs and paralogs
• Improves functional predictions
• Helps identify mechanisms of duplication
• Can be used to study mutation processes in
different parts of a genome
• Lineage-specific duplications may be indicative
of species-specific adaptations
Expansion of MCP Family in V. cholerae
[Figure: neighbor-joining (NJ) tree of MCP family genes from E. coli, B. subtilis, Synechocystis sp., H. pylori, C. jejuni, A. fulgidus, T. maritima, T. pallidum, B. burgdorferi, P. abyssi, P. horikoshii, D. radiodurans, M. tuberculosis and V. cholerae, with a large group of V. cholerae genes (VC and VCA identifiers)]
C. pneumoniae Paralogs by Position
[Figure: dot plot; x-axis: Query Orf Position (0 to 1,250,000); y-axis: Subject Orf Position (0 to 1,250,000)]
C. pneumoniae Paralogs - Lineage Specific
[Figure: dot plot; x-axis: Query Orf Position (0 to 1,250,000); y-axis: Subject Orf Position (0 to 1,250,000)]
Evolution and Complete Genomes III:
Genome Rearrangements
X-files
Eisen et al. 2000. Genome Biology 1(6): 11.1-11.9
Also see Tillier and Collins. 2000. Nature Genetics
26(2):195-7.
V. cholerae vs. E. coli: Best Matching Proteins by Location
[Figure: dot plot; x-axis: V. cholerae ORF Coordinates (0 to 3,000,000); y-axis: E. coli ORF Coordinates (0 to 5,000,000)]
M. leprae vs. M. tuberculosis Whole Genome Alignment
[Figure: dot plot; x-axis: Mycobacterium leprae (0 to 3,000,000); y-axis: Mycobacterium tuberculosis (0 to 4,000,000)]
Duplication and Gene Loss Model
[Figure: an ancestral gene order A to F is duplicated to give A to F plus A' to F'; different copies are then lost in each lineage, leaving different subsets of the two sets in E. coli and V. cholerae]
C. trachomatis vs C. pneumoniae Dot Plot
[Figure: dot plot of C. trachomatis MoPn against C. pneumoniae AR39, with the Origin and Terminus marked]
Symmetric Inversion Model
[Figure: starting from the common ancestor of A and B, with genes numbered 1 to 32, lineage A (A1, A2, A3) and lineage B (B1, B2, B3) each undergo successive inversions around the origin or around the terminus (*)]
Why Are Inversions Symmetrical Around the Origin?
• Genetic studies in Salmonella and E. coli
suggest that there may be strong selection
against other inversions
– Mahan, Segall, Schmid and Roth
– Liu and Sanderson
– Rebollo, Francois, and Louarn
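A minimal sketch of why inversions that are symmetric around the origin give an X-shaped dot plot. It assumes a circular genome of 12 numbered genes with the origin at index 0; the genome size, the inversion sizes and the function names are hypothetical.

```python
def invert_around_origin(order, half_width):
    """Reverse the genes within half_width positions on each side of the origin.

    `order` is a circular gene order whose index 0 is the replication origin.
    """
    n = len(order)
    if not 0 < half_width <= (n - 1) // 2:
        raise ValueError("half_width must leave part of the genome uninverted")
    result = list(order)
    # Positions -half_width .. +half_width (mod n), centred on the origin.
    positions = [i % n for i in range(-half_width, half_width + 1)]
    segment = [order[p] for p in positions]
    for p, gene in zip(positions, reversed(segment)):
        result[p] = gene
    return result


def dot_plot_points(ancestor, descendant):
    """(position in ancestor, position in descendant) for every gene."""
    where = {gene: i for i, gene in enumerate(descendant)}
    return [(i, where[gene]) for i, gene in enumerate(ancestor)]


n = 12
ancestor = list(range(n))
# Two successive inversions of different sizes, both centred on the origin.
descendant = invert_around_origin(invert_around_origin(ancestor, 4), 2)
points = dot_plot_points(ancestor, descendant)

# Every gene keeps its distance from the origin, so each point lies on the
# diagonal (y == x) or the anti-diagonal (y == n - x): the two arms of an X.
on_x = all(y == x or y == (n - x) % n for x, y in points)
print(descendant, on_x)
```

Each symmetric inversion moves a gene to the opposite side of the origin without changing its distance from it, so repeated inversions of different sizes scatter genes between the two diagonals.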
Evolution and Complete Genomes IV:
Gene Transfer
Why Gene Transfers Are Useful to Identify
• Laterally transferred genes frequently involved in
environmental adaptations and/or pathogenicity
• Helps identify transposons, integrons, and other
vectors of gene transfer
• Helps identify species associations in the
environment
Tree of Life or Web of Life?
Most ‘Evidence’ for Gene Transfer
has Alternative Explanations
How to Infer Gene Transfers
• Unusual distribution patterns
• Unusual nucleotide composition
• High sequence similarity to supposedly
distantly related species
• Unusual gene trees
• Observe transfer events
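One of the signals listed above, unusual nucleotide composition, can be sketched as a GC-content outlier test. The sequences, gene names and the 2.0 standard deviation cutoff are assumptions for illustration, not values from the talk.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def flag_atypical(genes, z_cutoff=2.0):
    """Names of genes whose GC fraction is far from the mean over all genes.

    The cutoff (in standard deviations) is an arbitrary illustrative choice.
    """
    gc = {name: gc_fraction(seq) for name, seq in genes.items()}
    mu = mean(gc.values())
    sd = pstdev(gc.values())
    if sd == 0:
        return []
    return sorted(name for name, v in gc.items() if abs(v - mu) / sd > z_cutoff)


# Hypothetical genome: seven genes at 50% GC and one GC-rich gene.
genes = {f"host{i}": "ATGC" * 5 for i in range(7)}
genes["island1"] = "GCGCGCGCGA" * 2

flagged = flag_atypical(genes)
print(flagged)
```

A flagged gene is only a candidate: as the next slides argue, composition alone has alternative explanations.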
100s of DNA Islands in O157:H7 vs.
K12: Gene Loss or Transfer?
Lateral Transfer Inference Based
on Complete Genome Analysis I:
Organellar to Nuclear Transfers
in A. thaliana
A. thaliana Nuclear Proteins: Best Matches to Complete Genomes
[Figure: number of best matches (0 to 4000) to each complete genome: CHLTE, PORGI, BACSU, MCYTU, BBUR, TREPA, CHLPN, ECOLI, NEIME, RICPR, CAUCR, HELPY, SYNSP, AQUAE, DEIRA, THEMA, AERPE, ARCFU, METJA, METTH, PYRAB, CELEG, YEAST, DROME; genomes grouped as B, A and E]
Best Matches vs. Prokaryotes
[Figure: scatter plot; x-axis: Number of ORFs in Complete Genome (0 to 4500); y-axis: Number of Best Matches to This Species (0 to 900); SYNSP is labeled]
Organellar HSP60s
[Figure: HSP60 gene tree with sequences from DROME, ARATH, YEAST, CAUCR, RICPR, ECOLI, NEIME, AQUAE, CHLPN, DEIRA, BACSU, SYNSP, MCYTU, THEMA, BBUR, TREPA, PORGI, CHLTE and HELPY; labeled groups: Mitochondrial Forms, α-Proteo, Cyanobacteria, Plastid Forms]
Lateral Transfer Inference Based
on Complete Genome Analysis II:
Bacterial to Vertebrate Transfers
Based on Analysis of the Human
Genome
Lander et al. ‘Evidence’
• Genes match bacteria not non-vertebrate
eukaryotes
• Or, genes have stronger match to bacteria
than non-vertebrates
• A set of ~120 of these genes found in many
bacterial species
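The first two criteria above can be written as a simple filter on best-match scores. This is a hedged sketch, not the published pipeline: the scores, gene names and genome names (euk1 to euk3) are hypothetical, and `is_candidate` and `count_candidates` are illustrative helpers.

```python
def is_candidate(bacterial_score, nonvertebrate_scores):
    """Apply the two criteria to one vertebrate gene.

    bacterial_score: best score against any bacterium, or None if no match.
    nonvertebrate_scores: best scores against each non-vertebrate eukaryote
    genome that was searched (empty if none matched).
    """
    if bacterial_score is None:
        return False
    if not nonvertebrate_scores:
        return True  # matches bacteria, no match in non-vertebrate eukaryotes
    return bacterial_score > max(nonvertebrate_scores)  # stronger bacterial match


# Hypothetical genes: (best bacterial score, {non-vertebrate genome: score}).
genes = {
    "g1": (250, {"euk1": 300}),
    "g2": (180, {"euk2": 60, "euk3": 210}),
    "g3": (200, {"euk3": 220}),
    "g4": (150, {}),
    "g5": (None, {"euk1": 90}),
}


def count_candidates(genes, genomes_searched):
    """Count candidate transfers when only some genomes are available."""
    total = 0
    for bacterial_score, euk_scores in genes.values():
        seen = [s for g, s in euk_scores.items() if g in genomes_searched]
        if is_candidate(bacterial_score, seen):
            total += 1
    return total


searches = ([], ["euk1"], ["euk1", "euk2"], ["euk1", "euk2", "euk3"])
counts = [count_candidates(genes, set(names)) for names in searches]
print(counts)
```

In this toy data the number of candidates falls as more non-vertebrate genomes are searched, which shows how sensitive such a filter is to the set of genomes compared.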
Alternative explanations
• Gene loss from non-vertebrate eukaryotes
• Rapid divergence in non-vertebrate
eukaryotes
• Incomplete genomes (e.g., D.
melanogaster)
• Bad annotation/gene finding
• Contamination
Evolutionary Rate Variation
[Figure: diagram with six numbered lineages (1 to 6)]
Trees Don’t Support Transfer
[Figure: gene tree (scale bar 0.2) with groups labeled Bacteria, Vertebrates and Virus and clades labeled I, II and III. Virus: Paramecium bursaria Chlorella virus 1. Vertebrates: Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, Gallus gallus, Xenopus laevis, Danio rerio. Bacteria: Bradyrhizobium, Rhizobium, Mesorhizobium, Sinorhizobium and Azorhizobium species, Stigmatella aurantiaca, Streptomyces coelicolor and Streptococcus species]
Number of pBVTs is Dependent
on # of Genomes Analyzed
Birney et al, same issue of Nature
as complete genome
“The unfinished human genomic DNA may contain
contamination, particularly from bacteria but also
from other sources. Contaminating DNA is routinely
removed from finished sequence, but some is still
present in unfinished sequence. If the predicted gene
matches a bacterial gene more closely than any
vertebrate gene then it will almost always be a
contaminant.”
Evolution and Complete Genomes V:
Species Evolution
Whole Genome “Phylogeny”
Whole Genome vs. rRNA
[Figure: trees of species with complete genomes, grouped as Archaea, Bacteria and Eukarya. Archaea: Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, Pyrococcus horikoshii, Methanococcus jannaschii, Aeropyrum pernix. Bacteria: Mycobacterium tuberculosis, Bacillus subtilis, Synechocystis sp., Aquifex aeolicus, Thermotoga maritima, Deinococcus radiodurans, Treponema pallidum, Borrelia burgdorferi, Helicobacter pylori, Campylobacter jejuni, Neisseria meningitidis, Escherichia coli, Vibrio cholerae, Haemophilus influenzae, Rickettsia prowazekii, Mycoplasma pneumoniae, Mycoplasma genitalium, Chlamydia trachomatis, Chlamydia pneumoniae. Eukarya: Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae]
Deinococcus radiodurans
[Figure: 2a) RecA tree and 2b) SS-rRNA tree of bacterial species, including Deinococcus radiodurans, with groups labeled γ1, γ2, β, α, Low GC, High GC, δ, ε, Cyano and D/T]
Coming Attractions I:
Phylogenetic Profiles
Phylogenetic Profile - E. coli Flagellar Genes
[Figure: phylogenetic profiles of fhiA, fliM, fliP, fliG, flgG, fliF, flgI, flhA, flhB and gcpE]
PG Profile. C. tepidum Chlorophyll Synthesis
[Figure: profiles of CbiG, CbiP, DsrN, CbiA, CbiJH, CobN, BchH1, BchH2, CobN2, BchH3, ChlI, ChlI2 and ChlI3]
Coming Attractions II:
Uncultured Environmental Species
Genomics does not require initial
culturing step.
• Isolate, by filtration, all bacteria in a water sample
• Extract total DNA in very large pieces
• Clone those pieces as BACs into E. coli to get enough DNA
• Sequence the BACs like a bacterial genome.
[Figure: workflow: Natural water → Filter concentrate → Extract DNA → Clone into BACs → Sequence → Gene list]
Bacterial Rhodopsin:
a new photosynthesis system in the oceans
[Figure: a BAC from SAR86, an uncultured bacterium, was sequenced and analyzed, and a rhodopsin was found. Cloned into E. coli, it makes E. coli pump protons (H+) in the light; the diagram links the proton flow to ADP → ATP]
Beja O, et al., Science 2000 289:1902-6
Bacterial rhodopsin: evidence for a new type of phototrophy in the sea.
Best Matches of BAC Ends
[Figure: chart, scale 0 to 0.05, of best matches of BAC ends from depths of 0 m, 80 m and 750 m; categories: γ, α, β and ε Proteobacteria, and Archaea]
RecA-Bacteroides/Cytophaga in Monterey Bay BACs
[Figure: RecA tree with Chlorobium tepidum, Cytophaga hutchinsonii, Prevotella ruminicola, Bacteroides fragilis, Porphyromonas gingivalis and the BAC sequences MBBAD68TR and MBBAD65TR]
Wither Genomics? Not yet.
• Despite limitations, a great deal can still be
learned from genome sequence analysis.
Evolutionary Diversity Still Poorly Represented in Complete Genomes
[Figure: A. rRNA tree of Bacterial and Archaeal Major Groups. B. Groups with Completed Genomes Highlighted]
Limited Ecological and Physiological
Diversity
• All genomes from cultured species or
pathogens/symbionts
• Limited ecological diversity
– most are from pathogens or thermophiles
• Limited physiological diversity
– need whole range for particular physiologies,
not just extremes
Why Completeness is Important
• Improves characterization of genome features
– Gene order, replication origins
• Better comparative genomics
– Genome duplications, inversions
• Presence and absence of particular genes can be very
important (e.g., gene loss)
• Missing sequence might be important (e.g.,
centromere)
• Allows researchers to focus on biology not sequencing
• Facilitates large scale correlation studies
Acknowledgements
• Genome inversions: S. Salzberg, J. Heidelberg, O. White, A.
Stoltzfus, J. Peterson, H. Ochman
• Genome sequences and analysis: J. Heidelberg, T. Read, H.
Tettelin, K. Nelson, J. Peterson, R. Fleischmann, D. Bryant
• Horizontal transfers: K. Nelson, W. F. Doolittle
• TIGR: C. Fraser, J. Venter, M-I. Benito, S. Kaul, Seqcore
• $$$: NSF, NIH, ONR, DOE
Evolutionary Studies Improve
Most Aspects of Genome Analysis
• Phylogeny of species places comparative data in perspective
• Evolution of genes and gene families
– Functional predictions
– Identification of orthologs and paralogs
– Species specific mutation patterns
• Evolution of pathways
– Convergence
– Prediction of function
• Evolution of gene order/genome rearrangements
• Phylogenetic distribution patterns
• Identification of novel features
Genome Information and Analysis
Improves Studies of Evolution
• Complete genome information particularly useful
• Unbiased sampling
• More sequences of genes
• Presence/absence information needed to infer certain
events (e.g., gene loss, duplication)
• Genome wide mutation and substitution patterns (e.g.,
strand bias)
• Diversification and duplication
Tracing Gene Loss
• Need presence and absence information of orthologous
genes from different species
• Determining absence requires a complete genome
• May still miss some homologs (e.g., due to rapid
divergence)
• Helps to have closely related species
• Use standard character state reconstruction methods to
infer gene gain and loss
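The last step above can be sketched with Fitch parsimony, used here as one example of a standard character state reconstruction method; the talk does not say which method was applied. The tree, the species names and the presence/absence values are hypothetical.

```python
def fitch(tree, states):
    """Bottom-up pass of Fitch parsimony on a rooted binary tree.

    tree: a species name (leaf) or a pair (left_subtree, right_subtree).
    states: species name -> 1 (gene present) or 0 (gene absent).
    Returns (possible states at this node, minimum number of changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        return shared, left_changes + right_changes
    # No state in common: one gain or loss is needed on a branch below here.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical species tree and presence/absence of one gene.
tree = ((("spA", "spB"), "spC"), ("spD", "spE"))
presence = {"spA": 1, "spB": 1, "spC": 0, "spD": 1, "spE": 1}

root_states, changes = fitch(tree, presence)
print(root_states, changes)
```

Here the gene is inferred to be present at the root with a single change, which is most simply read as one loss on the branch leading to spC.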

9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 

Talk by Jonathan Eisen on "Phylogenomics" at Gordon Conference in 2001

  • 1. “Nothing in biology makes sense except in the light of evolution.” Theodosius Dobzhansky (1973)
  • 2. Talk Outline • Complete Genome Projects - history and current status • What have we learned about evolutionary history and processes from recent genome projects • Two main themes - completeness and closeness • Coming attractions • Why we need more genomes
  • 3. The Institute for Genomic Research • A not-for-profit institution, staff ~230 • Departments: – Eukaryotic Genomics – Microbial Genomics – Functional Genomics – Bioinformatics – Sequencing Core
  • 4. General Steps in Analysis of Complete Genomes • Identification/prediction of genes • Characterization of gene features • Characterization of genome features • Prediction of gene function • Prediction of pathways • Integration with known biological data • Comparative genomics
  • 5. Complete Genome/Chromosome Progress [Chart: complete genomes (y-axis, 0-50) by year (x-axis, 1995-2000); series: Eukaryote, Archaea, Bacteria]
  • 6. Limitations of Genome Analysis • Functional predictions are PREDICTIONS • Need to follow up all predictions with experimental work • Each genome sequence is a snapshot of one clone • Genome analysis is not able to identify novel processes • Annotation needs to be updated • Assembly can be wrong • Some parts of the genome may be missed (e.g., low-copy plasmids)
  • 7. Evolutionary Genomics I: Selection of Species • Phylogenetic diversity • Relatedness to model organism • Understanding major evolutionary transitions • Determining right depth • Short branch lengths
  • 8. rRNA Tree - Complete/In Progress [Tree figure; labeled groups: Euryarchaeota, Crenarchaeota, Alpha Proteobacteria, Epsilon Proteobacteria, Delta Proteobacteria, Spirochetes, Green Sulfur bacteria, Chlamydia, Cyanobacteria, Thermotogales, Thermophilic O2 reducers, Deinococcus/Thermus, Beta Proteobacteria, Gamma Proteobacteria, Low GC Gram-positive bacteria, High GC Gram-positive bacteria, Green Non-Sulfur bacteria]
  • 9. Evolutionary Diversity Still Poorly Represented in Complete Genomes [Tree figure; panels: Bacteria, Archaea]
  • 10. Close Relatives vs. Year [Chart: y-axis 0-40; x-axis 1995-2000; series: Solo genera, Multiple species]
  • 12. Genome sequences and evolution • Origin of new gene function • Gene loss • Genome degradation • Gene and genome duplication • Rates and patterns of mutation, recombination • Gene transfer • Species evolution
  • 13. Evolutionary Genome Analysis I: Functional Prediction
  • 15. Example of Tracing Gene Loss [Figure: presence or absence of a gene by species abbreviation, grouped by kingdom (Euks, Arch, Bacteria), with the evolutionary origin of the gene and inferred loss marked]
  • 16. Why Identify Gene Loss • Indicates that gene is not absolutely required for survival • Parallel loss of same gene in different species may indicate selective advantage of loss of that gene • Correlated loss of genes in a pathway indicates a conserved association among those genes (important for phylogenetic profiles) • Loss in organellar genomes frequently accompanied by gain in nuclear genome
  • 17. Duplication and Loss of Mismatch Repair Genes [Figure, panels A and B: MutS gene tree (MutS-I lineage with MutS1, MutS-II lineage with MutS2) and species tree, with gene duplications (1-5) and gene loss (*) marked. Species shown: E. coli, H. influenzae, N. gonorrhoeae, H. pylori, Syn. sp., B. subtilis, S. pyogenes, M. pneumoniae, M. genitalium, A. aeolicus, D. radiodurans, T. pallidum, B. burgdorferi]
  • 18. Evolution and Complete Genomes II: Gene and Genome Duplication
  • 19. Why Duplications Are Useful to Identify • Allows division into orthologs and paralogs • Improves functional predictions • Helps identify mechanisms of duplication • Can be used to study mutation processes in different parts of a genome • Lineage-specific duplications may be indicative of species-specific adaptations
  • 20. Expansion of MCP Family in V. cholerae [Figure: NJ tree of MCP homologs from E. coli, B. subtilis, Synechocystis sp., H. pylori, C. jejuni, A. fulgidus, T. maritima, T. pallidum, B. burgdorferi, P. abyssi, P. horikoshii, D. radiodurans, M. tuberculosis, and V. cholerae; the V. cholerae entries (VC and VCA locus tags) far outnumber those from any other species]
  • 21. C. pneumoniae Paralogs by Position [Dot plot: Query ORF position vs. Subject ORF position, both axes 0-1,250,000]
  • 22. C. pneumoniae Paralogs - Lineage Specific [Dot plot: Query ORF position vs. Subject ORF position, both axes 0-1,250,000]
  • 23. Evolution and Complete Genomes III: Genome Rearrangements
  • 24. X-files. Eisen et al. 2000. Genome Biology 1(6): 11.1-11.9. Also see Tillier and Collins. 2000. Nature Genetics 26(2): 195-7.
  • 25. V. cholerae vs. E. coli Best Matching Proteins by Location [Dot plot: E. coli ORF coordinates (0-5,000,000) vs. V. cholerae ORF coordinates (0-3,000,000)]
  • 26. M. leprae vs. M. tuberculosis Whole Genome Alignment [Dot plot: Mycobacterium tuberculosis (0-4,000,000) vs. Mycobacterium leprae (0-3,000,000)]
  • 27. Duplication and Gene Loss Model [Diagram: gene order A B C D E F is duplicated to give A B C D E F A’ B’ C’ D’ E’ F’; different subsets of the duplicated genes are then lost in the E. coli and V. cholerae lineages]
  • 28. C. trachomatis vs. C. pneumoniae Dot Plot [Axes: C. trachomatis MoPn, C. pneumoniae AR39; origin and terminus marked]
  • 30. Why Are Inversions Symmetrical Around the Origin? • Genetic studies in Salmonella and E. coli suggest that there may be strong selection against other inversions – Mahan, Segall, Schmid, and Roth – Liu and Sanderson – Rebollo, Francois, and Louarn
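A small simulation can illustrate why inversions that are symmetric about the origin of replication would produce an X-shaped dot plot. This is only a sketch under stated assumptions, not the analysis behind the slides: the function name `invert_about_origin`, the toy genome length, and the gene positions are all hypothetical.

```python
def invert_about_origin(position, half_width, genome_length):
    """Map a position through an inversion spanning origin +/- half_width.

    Assumes a circular genome with the replication origin at coordinate 0.
    A position inside the inverted segment moves to its mirror image on the
    other side of the origin; positions outside the segment are unchanged.
    """
    # Signed distance from the origin, in (-L/2, L/2].
    if position <= genome_length / 2:
        signed = position
    else:
        signed = position - genome_length
    if abs(signed) <= half_width:
        signed = -signed  # reflect through the origin
    return signed % genome_length


# Hypothetical toy genome: 1000 bp, inversion covering origin +/- 200 bp.
genome_length = 1000
genes = [50, 150, 300, 700, 850, 950]
pairs = [(p, invert_about_origin(p, 200, genome_length)) for p in genes]
for before, after in pairs:
    print(before, after)
```

In this toy case the moved genes satisfy before + after = genome length (the anti-diagonal of a dot plot), while unmoved genes stay on the main diagonal, and every gene keeps its distance from the origin. Together the two diagonals form the X pattern.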
  • 31. Evolution and Complete Genomes IV: Gene Transfer
  • 32. Why Gene Transfers Are Useful to Identify • Laterally transferred genes frequently involved in environmental adaptations and/or pathogenicity • Helps identify transposons, integrons, and other vectors of gene transfer • Helps identify species associations in the environment
  • 33. Tree of Life or Web of Life?
  • 34. Most ‘Evidence’ for Gene Transfer has Alternative Explanations
  • 35. How to Infer Gene Transfers • Unusual distribution patterns • Unusual nucleotide composition • High sequence similarity to supposedly distantly related species • Unusual gene trees • Observe transfer events
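As one illustration of the "unusual nucleotide composition" criterion, the sketch below flags genes whose GC fraction lies far from the mean of all genes in a genome. It is a minimal example under assumptions, not the method used in the talk: the function names, the z-score cutoff of 2.0, and the toy sequences are hypothetical, and atypical composition alone has alternative explanations, so a flagged gene is only a candidate.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


def flag_unusual_gc(genes, z_cutoff=2.0):
    """Return names of genes whose GC fraction is an outlier.

    genes: dict mapping gene name -> DNA sequence (hypothetical input).
    A gene is flagged when its GC fraction is more than z_cutoff
    population standard deviations from the mean over all genes.
    """
    fractions = {name: gc_fraction(seq) for name, seq in genes.items()}
    mu = mean(fractions.values())
    sigma = pstdev(fractions.values())
    if sigma == 0:
        return []  # every gene has the same composition
    return sorted(
        name for name, f in fractions.items() if abs(f - mu) / sigma > z_cutoff
    )


# Toy data: nine genes at 50% GC and one at 100% GC.
toy_genes = {f"gene{i}": "ATGC" * 5 for i in range(9)}
toy_genes["island1"] = "GGCC" * 5
print(flag_unusual_gc(toy_genes))
```

A real screen would typically also look at codon usage or oligonucleotide frequencies and would combine the result with the other criteria on the slide, such as gene trees and distribution patterns.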
  • 36. 100s of DNA Islands in O157:H7 vs. K12: Gene Loss or Transfer?
  • 37. Lateral Transfer Inference Based on Complete Genome Analysis I: Organellar to Nuclear Transfers in A. thaliana
  • 38. A. thaliana Nuclear Proteins: Best Matches to Complete Genomes [Bar chart: number of best matches (0-4000) per genome: CHLTE, PORGI, BACSU, MCYTU, BBUR, TREPA, CHLPN, ECOLI, NEIME, RICPR, CAUCR, HELPY, SYNSP, AQUAE, DEIRA, THEMA, AERPE, ARCFU, METJA, METTH, PYRAB, CELEG, YEAST, DROME; groups labeled B, A, E]
  • 39. Best Matches vs. Prokaryotes [Scatter plot: number of best matches to each species (0-900) vs. number of ORFs in its complete genome (0-4500); SYNSP labeled]
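The two plots above rest on a simple tally: for each query protein, find the genome holding its best match, then compare the per-genome counts with genome size. The sketch below shows that bookkeeping. It is an illustration only; the function names, the hit tuples, the scores, and the ORF counts are made-up placeholders, not values from the talk.

```python
from collections import Counter


def tally_best_matches(hits):
    """Count, per species, the queries whose top-scoring hit is in that species.

    hits: iterable of (query, species, score) tuples (hypothetical format).
    On a tied score the first hit seen is kept.
    """
    best = {}
    for query, species, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (species, score)
    return Counter(species for species, _ in best.values())


def matches_per_orf(counts, orf_counts):
    """Normalize best-match counts by the number of ORFs in each genome."""
    return {species: counts[species] / orf_counts[species] for species in orf_counts}


# Made-up hits and genome sizes, for illustration only.
hits = [
    ("q1", "SYNSP", 200), ("q1", "ECOLI", 150),
    ("q2", "SYNSP", 90),
    ("q3", "ECOLI", 300), ("q3", "SYNSP", 120),
]
counts = tally_best_matches(hits)
print(counts["SYNSP"], counts["ECOLI"])
print(matches_per_orf(counts, {"SYNSP": 1000, "ECOLI": 2000}))
```

Normalizing by ORF count helps separate a genome that collects many best matches merely because it is large from one that collects more than its size would suggest.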
  • 40. Organellar HSP60s [Tree figure: HSP60 homologs from DROME, ARATH, YEAST, CAUCR, RICPR, ECOLI, NEIME, AQUAE, CHLPN, DEIRA, BACSU, SYNSP, MCYTU, THEMA, BBUR, TREPA, PORGI, CHLTE, HELPY; clade labels: Mitochondrial Forms, α-Proteo, Cyanobacteria, Plastid Forms]
  • 41. Lateral Transfer Inference Based on Complete Genome Analysis II: Bacterial to Vertebrate Transfers Based on Analysis of the Human Genome
  • 42. Lander et al. ‘Evidence’ • Genes match bacteria not non-vertebrate eukaryotes • Or, genes have stronger match to bacteria than non-vertebrates • A set of ~120 of these genes found in many bacterial species
  • 43. Alternative explanations • Gene loss from non-vertebrate eukaryotes • Rapid divergence in non-vertebrate eukaryotes • Incomplete genomes (e.g., D. melanogaster) • Bad annotation/gene finding • Contamination
  • 45. Trees Don’t Support Transfer [Tree figure, scale bar 0.2, with groups labeled Bacteria, Vertebrates, Virus and clades I, II, III; taxa include Paramecium bursaria Chlorella virus 1; vertebrate sequences from Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, Gallus gallus, Xenopus laevis, and Danio rerio (e.g., HAS1); and bacterial sequences from Rhizobium and related genera (e.g., NodC), Stigmatella aurantiaca, Streptomyces coelicolor, and Streptococcus species (e.g., S. pyogenes HASA)]
  • 46. Number of pBVTs is Dependent on # of Genomes Analyzed
  • 47. Birney et al., same issue of Nature as complete genome: “The unfinished human genomic DNA may contain contamination, particularly from bacteria but also from other sources. Contaminating DNA is routinely removed from finished sequence, but some is still present in unfinished sequence. If the predicted gene matches a bacterial gene more closely than any vertebrate gene then it will almost always be a contaminant.”
  • 48. Evolution and Complete Genomes V: Species Evolution
  • 50. Whole Genome vs. rRNA [Tree figure. Archaea: Methanobacterium thermoautotrophicum, Archaeoglobus fulgidus, Pyrococcus horikoshii, Methanococcus jannaschii, Aeropyrum pernix. Bacteria: Mycobacterium tuberculosis, Bacillus subtilis, Synechocystis sp., Aquifex aeolicus, Thermotoga maritima, Deinococcus radiodurans, Treponema pallidum, Borrelia burgdorferi, Helicobacter pylori, Campylobacter jejuni, Neisseria meningitidis, Escherichia coli, Vibrio cholerae, Haemophilus influenzae, Rickettsia prowazekii, Mycoplasma pneumoniae, Mycoplasma genitalium, Chlamydia trachomatis, Chlamydia pneumoniae. Eukarya: Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae. Scale bar in changes]
  • 51. Deinococcus radiodurans [Figure: paired bacterial trees, 2a) RecA and 2b) SS-rRNA, covering the same broad set of bacterial species, with groups labeled γ1, γ2, β, α, Low GC, High GC, δ, ε, Cyano, D/T]
  • 53. Phylogenetic Profile - E. coli Flagellar Genes [Figure: profiles for fhiA, fliM, fliP, fliG, flgG, fliF, flgI, flhA, flhB, gcpE]
  • 54. PG Profile. C. tepidum Chlorophyll Synthesis [Figure: profiles for CbiG, CbiP, DsrN, CbiA, CbiJH, CobN, BchH1, BchH2, CobN2, BchH3, ChlI, ChlI2, ChlI3]
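The idea behind a phylogenetic profile can be sketched in a few lines: record each gene's presence or absence across a fixed list of complete genomes, then group genes that share the same pattern. The code below is an assumption-laden illustration rather than the procedure used for the slides; the function name, the gene names, and the presence sets are hypothetical, and real analyses usually tolerate near-matches instead of requiring identical profiles.

```python
from collections import defaultdict


def group_by_profile(presence, species_order):
    """Group genes that have identical presence/absence profiles.

    presence: dict mapping gene name -> set of species that carry a homolog.
    species_order: list fixing the order of species in every profile.
    Returns the groups of two or more genes, each sorted by name.
    """
    groups = defaultdict(list)
    for gene, species_with_gene in presence.items():
        profile = tuple(sp in species_with_gene for sp in species_order)
        groups[profile].append(gene)
    return sorted(sorted(genes) for genes in groups.values() if len(genes) > 1)


# Hypothetical presence/absence data for four genes in three genomes.
species_order = ["sp1", "sp2", "sp3"]
presence = {
    "geneA": {"sp1", "sp2"},
    "geneB": {"sp1", "sp2"},
    "geneC": {"sp1", "sp2", "sp3"},
    "geneD": {"sp3"},
}
print(group_by_profile(presence, species_order))
```

Genes that are gained and lost together across many genomes, as geneA and geneB are in this toy input, become candidates for a shared pathway or structure.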
  • 57. Genomics does not require an initial culturing step. • Isolate, by filtration, all bacteria in a water sample • Extract total DNA in very large pieces • Clone those pieces as BACs into E. coli to get enough DNA • Sequence the BACs like a bacterial genome [Diagram: Natural water → Filter concentrate → Extract DNA → Clone into BACs → Sequence → Gene list]
  • 58. Bacterial Rhodopsin: a new photosynthesis system in the oceans • SAR86, an uncultured bacterium • BAC sequenced and analyzed • Rhodopsin found • Cloned into E. coli • E. coli pumps protons in the light [Diagram labels: H+, light, ADP, ATP] Reference: Beja O, et al. Bacterial rhodopsin: evidence for a new type of phototrophy in the sea. Science 2000; 289:1902-6.
  • 59. Best Matches of BAC Ends [Chart: y-axis 0-0.05; depths 0 m, 80 m, 750 m; categories: γ, α, β, ε Proteobacteria, Archaea]
  • 60. RecA-Bacteroides/Cytophaga in Monterey Bay BACs [Tree figure: Chlorobium tepidum, Cytophaga hutchinsonii, Prevotella ruminicola, Bacteroides fragilis, Porphyromonas gingivalis, MBBAD68TR, MBBAD65TR]
  • 63. Wither Genomics? Not yet. • Despite limitations, a great deal can still be learned from genome sequence analysis.
  • 64. Evolutionary Diversity Still Poorly Represented in Complete Genomes [Figure: A. rRNA tree of Bacterial and Archaeal major groups; B. the same tree with groups with completed genomes highlighted]
  • 65. Limited Ecological and Physiological Diversity • All genomes from cultured species or pathogens/symbionts • Limited ecological diversity – most are from pathogens or thermophiles • Limited physiological diversity – need whole range for particular physiologies, not just extremes
  • 67. Why Completeness is Important • Improves characterization of genome features – Gene order, replication origins • Better comparative genomics – Genome duplications, inversions • Presence and absence of particular genes can be very important (e.g., gene loss) • Missing sequence might be important (e.g., centromere) • Allows researchers to focus on biology not sequencing • Facilitates large-scale correlation studies
  • 68. Acknowledgements • Genome inversions: S. Salzberg, J. Heidelberg, O. White, A. Stoltzfus, J. Peterson, H. Ochman • Genome sequences and analysis: J. Heidelberg, T. Read, H. Tettelin, K. Nelson, J. Peterson, R. Fleischmann, D. Bryant • Horizontal transfers: K. Nelson, W. F. Doolittle • TIGR: C. Fraser, J. Venter, M-I. Benito, S. Kaul, Seqcore • $$$: NSF, NIH, ONR, DOE
  • 69. Evolutionary Studies Improve Most Aspects of Genome Analysis • Phylogeny of species places comparative data in perspective • Evolution of genes and gene families – Functional predictions – Identification of orthologs and paralogs – Species-specific mutation patterns • Evolution of pathways – Convergence – Prediction of function • Evolution of gene order/genome rearrangements • Phylogenetic distribution patterns • Identification of novel features
  • 70. Genome Information and Analysis Improves Studies of Evolution • Complete genome information particularly useful • Unbiased sampling • More sequences of genes • Presence/absence information needed to infer certain events (e.g., gene loss, duplication) • Genome-wide mutation and substitution patterns (e.g., strand bias) • Diversification and duplication
  • 73. Tracing Gene Loss • Need presence and absence information of orthologous genes from different species • Determining absence requires a complete genome • May still miss some homologs (e.g., due to rapid divergence) • Helps to have closely related species • Use standard character state reconstruction methods to infer gene gain and loss
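Fitch parsimony is one standard character state reconstruction method that could be applied to presence/absence data; the slides do not say which method was used, so the sketch below is only an illustration. The tree shape, species names, and gene states are hypothetical, and the tree is assumed to be rooted and strictly binary.

```python
def fitch_sets(tree, states):
    """Bottom-up pass of Fitch parsimony on a rooted binary tree.

    tree: a leaf name (str) or a (left, right) tuple of subtrees.
    states: dict mapping leaf name -> 1 (gene present) or 0 (gene absent).
    Returns (possible states at this node, minimum number of changes below it).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch_sets(tree[0], states)
    right_set, right_changes = fitch_sets(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no change needed here.
        return common, left_changes + right_changes
    # Children disagree: one gain or loss must have occurred.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical species tree (((A, B), C), D) with the gene missing only in C.
tree = ((("A", "B"), "C"), "D")
states = {"A": 1, "B": 1, "C": 0, "D": 1}
root_states, changes = fitch_sets(tree, states)
print(root_states, changes)  # {1} 1
```

Here the root is reconstructed as carrying the gene and a single change is required, which is most simply read as one loss on the branch leading to C. Calling a gene absent in C still depends on C's genome being complete, as the slide notes.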
