FBW
22-11-2016
Wim Van Criekinge
BPC 2016 ?
Phylogenetics
Introduction
Definitions
Species concept
Examples
The Tree-of-life
Phylogenetics Methodologies
Algorithms
Distance Methods
Maximum Likelihood
Maximum Parsimony
Rooting
Statistical Validation
Conclusions
Orthologous genes
Horizontal Gene Transfer
Phylogenomics
Practical Approach: PHYLIP
Weblems
Phylogeny (phylo =tribe + genesis)
Phylogenetic trees are about visualising evolutionary
relationships. They reconstruct the pattern of events
that have led to the distribution and diversity of life.
The purpose of a phylogenetic tree is to illustrate how a
group of objects (usually genes or organisms) are
related to one another
Nothing in Biology Makes Sense Except in the Light of
Evolution. Theodosius Dobzhansky (1900-1975)
What is phylogenetics ?
Trees
• Diagram consisting of branches and nodes
• Species tree (how are my species related?)
– contains only one representative from each
species.
– all nodes indicate speciation events
• Gene tree (how are my genes related?)
– normally contains a number of genes from a
single species
– nodes relate either to speciation or gene
duplication events
Clade: A set of species which includes all of the species
derived from a single common ancestor
Species Concepts from Various Authors
D.A. Baum and K.L. Shaw - Exclusive groups of organisms, where an exclusive group is one whose members are all more closely related to
each other than to any organisms outside the group.
J. Cracraft - An irreducible cluster of organisms, diagnosably distinct from other such clusters, and within which there is a parental pattern of
ancestry and descent.
Charles Darwin - "From these remarks it will be seen that I look at the term species, as one arbitrarily given for the sake of convenience to a set
of individuals closely resembling each other, and that it does not essentially differ from the term variety, which is given to less distinct and
more fluctuating forms. The term variety, again, in comparison with mere individual differences, is also applied arbitrarily, and for mere
convenience sake" (Origin of Species, 1st ed., p. 108).
T. Dobzhansky - The largest and most inclusive reproductive community of sexual and cross-fertilizing individuals which share a common gene
pool. And later...Systems of populations, the gene exchange between which is limited or prevented by reproductive isolating mechanisms.
M. Ghiselin - The most extensive units in the natural economy, such that reproductive competition occurs among their parts.
D.M. Lambert - Groups of individuals that define themselves by a specific mate recognition system.
J. Mallet - Identifiable genotypic clusters recognized by a deficit of intermediates, both at single loci and at multiple loci.
E. Mayr - Groups of actually or potentially interbreeding natural populations which are reproductively isolated from other such groups.
C.D. Michener - A group of organisms not itself divisible by phenetic gaps resulting from concordant differences in character states (except for
morphs - such as sex, age, or caste), but separated by such phenetic gaps from other such units.
H.E.H. Patterson - That most inclusive population of individual biparental organisms which share a common fertilization system.
G.G. Simpson - A lineage of populations evolving with time, separately from others, with its own unique evolutionary role and tendencies.
P.H.A. Sneath and R.R. Sokal - The smallest (most homogeneous) cluster that can be recognized upon some given criterion as being distinct
from other clusters.
A.R. Templeton - The most inclusive population of individuals having the potential for phenotypic cohesion through intrinsic cohesion
mechanisms (genetic and/or demographic - i.e. ecological -exchangeability).
E.O. Wiley - A single lineage of ancestor-descendant populations which maintains its identity from other such lineages and which has its own
evolutionary tendencies and historical fate.
S. Wright - A species in time and space is composed of numerous local populations, each one intercommunicating and intergrading with others.
Species
I. Definitions:
Species = the basic unit of classification
> Three different ways to recognize species:
1) Morphological species = the smallest group that is
consistently and persistently distinct (Clusters in
morphospace)
species are recognized initially on the basis of
appearance; the individuals of one species look
different from the individuals of another
Plant Species
2) Biological species = a set of interbreeding or
potentially interbreeding individuals that are
separated from other species by reproductive
barriers
individuals of different species are unable to interbreed
Species
3) Phylogenetic species = the boundary between
reticulate (among interbreeding individuals) and
divergent relationships (between lineages with no
gene exchange)
Species
Phylogenetic species: recognized by the pattern of ancestor-descendant relationships
[Figure: the species boundary separates reticulate from divergent relationships]
4) Phylogenomics species = ability to transmit (and
maintain) a (stable) gene pool
Addresses the Anopheles genome topology
variations
Species
• In the tree to the left, A and B share the most recent
common ancestry. Thus, of the species in the tree,
A and B are the most closely related.
• The next most recent common ancestry is C with
the group composed of A and B. Notice that the
relationship of C is with the group containing A
and B. In particular, C is not more closely related to
B than to A. This can be emphasized by the
following two trees, which are equivalent to each
other:
Branching Order in a Phylogenetic Tree
• A common simplifying assumption is that the tree is bifurcating,
meaning that each branch node has exactly two descendants.
• The edges, taken together, are sometimes said to define the topology
of the tree
More definitions …
Branch node = internal node
Edge = branch
Leaf = tip = external node
Outgroups, rooted versus unrooted
An unrooted reptilian phylogeny with an avian outgroup and
the corresponding rooted phylogeny. The Ri represent modern
reptiles; the Ai, inferred ancestors and the B a bird.
Some definitions …
Phylogenetic methods may be used to
solve crimes, test purity of products, and
determine whether endangered species
have been smuggled or mislabeled:
– Vogel, G. 1998. HIV strain analysis debuts in
murder trial. Science 282(5390): 851-853.
– Lau, D. T.-W., et al. 2001. Authentication of
medicinal Dendrobium species by the internal
transcribed spacer of ribosomal DNA. Planta
Med 67:456-460.
Examples
– Epidemiologists use phylogenetic methods to
understand the development of pandemics,
patterns of disease transmission, and
development of antimicrobial resistance or
pathogenicity:
• Basler, C.F., et al. 2001. Sequence of the 1918
pandemic influenza virus nonstructural gene (NS)
segment and characterization of recombinant viruses
bearing the 1918 NS genes. PNAS, 98(5):2746-2751.
• Ou, C.-Y., et al. 1992. Molecular epidemiology of HIV
transmission in a dental practice. Science
256(5060):1165-1171.
• Bacillus anthracis:
Examples
• Conservation biologists may use these techniques to
determine which populations are in greatest need of
protection, and other questions of population structure:
– Trepanier, T.L., and R.W. Murphy. 2001. The Coachella Valley
fringe-toed lizard (Uma inornata): genetic diversity and
phylogenetic relationships of an endangered species. Mol
Phylogenet Evol 18(3):327-334.
– Alves, M.J., et al. 2001. Mitochondrial DNA variation in the
highly endangered cyprinid fish Anaecypris hispanica:
importance for conservation. Heredity 87(Pt 4):463-473.
• Pharmaceutical researchers may use phylogenetic
methods to determine which species are most closely
related to other medicinal species, thus perhaps sharing
their medicinal qualities:
– Komatsu, K., et al. 2001. Phylogenetic analysis based on 18S
rRNA gene and matK gene sequences of Panax vietnamensis
and five related species. Planta Med 67:461-465.
Examples
Tree-of-life
Event                                   Time (billion years ago)
Origin of the Universe                  15
Formation of the Solar System           4.6
First Self-replicating System           3.5
Prokaryotic-Eukaryotic Divergence       2.0
Plant-Animal Divergence                 1.0
Invertebrate-Vertebrate Divergence      0.5
Mammalian Radiation Beginning           0.1
Some Important Dates in History
Tree Of Life
What Sequence to Use ?
• To infer relationships that
span the diversity of
known life, it is necessary
to look at genes conserved
through the billions of
years of evolutionary
divergence.
• The gene must display an
appropriate level of
sequence conservation for
the divergences of
interest.
• If there is too much change, then
the sequences become
randomized, and there is a limit to
the depth of the divergences that
can be accurately inferred.
• If there is too little change (if the
gene is too conserved), then
there may be little or no change
between the evolutionary
branchings of interest, and it will
not be possible to infer close
(genus or species level)
relationships.
What Sequence to Use ?
Carl Woese
recognized the full potential of rRNA
sequences as a measure of phylogenetic
relatedness. He initially used an RNA
sequencing method that determined about
1/4 of the nucleotides in the 16S rRNA (the
best technology available at the time). This
amount of data greatly exceeded anything
else then available. Using newer methods,
it is now routine to determine the
sequence of the entire 16S rRNA
molecule. Today, the accumulated 16S
rRNA sequences (about 10,000) constitute
the largest body of data available for
inferring relationships among organisms.
Ribosomal RNA Genes and Their Sequences
Examples of genes in this
category are those that encode
the ribosomal RNAs (rRNAs).
Most prokaryotes have three
rRNAs, called the 5S, 16S and
23S rRNA.
What Sequence to Use ?
Name (a)   Size (nucleotides)   Location
5S         120                  Large subunit of ribosome
16S        1500                 Small subunit of ribosome
23S        2900                 Large subunit of ribosome
(a) The name is based on the rate at which the
molecule sediments (sinks) in water.
Bigger molecules sediment faster than small ones.
The extraordinary conservation of rRNA genes can
be seen in these fragments of the small subunit
rRNA gene sequences from organisms spanning
the known diversity of life:
human ...GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTAAAGTTGCTGCAGTTAAAAAG...
yeast ...GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTAAAGTTGTTGCAGTTAAAAAG...
Corn ...GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTTAAGTTGTTGCAGTTAAAAAG...
Escherichia coli ...GTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCG...
Anacystis nidulans ...GTGCCAGCAGCCGCGGTAATACGGGAGAGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGCG...
Thermotoga maritima ...GTGCCAGCAGCCGCGGTAATACGTAGGGGGCAAGCGTTACCCGGATTTACTGGGCGTAAAGGG...
Methanococcus vannielii ...GTGCCAGCAGCCGCGGTAATACCGACGGCCCGAGTGGTAGCCACTCTTATTGGGCCTAAAGCG...
Thermococcus celer ...GTGGCAGCCGCCGCGGTAATACCGGCGGCCCGAGTGGTGGCCGCTATTATTGGGCCTAAAGCG...
Sulfolobus solfataricus ...GTGTCAGCCGCCGCGGTAATACCAGCTCCGCGAGTGGTCGGGGTGATTACTGGGCCTAAAGCG...
Ribosomal RNA Genes and Their Sequences
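The conservation visible in these fragments can be put into numbers. The sketch below is only an illustration: it copies three of the fragments above (human, yeast, E. coli) and reports the percentage of identical positions. The function name `percent_identity` and the dictionary `fragments` are made up for this example, and the fragments are assumed to be aligned position by position as printed on the slide.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Fragments copied from the slide (small subunit rRNA gene).
fragments = {
    "human":   "GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTAAAGTTGCTGCAGTTAAAAAG",
    "yeast":   "GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTAAAGTTGTTGCAGTTAAAAAG",
    "E. coli": "GTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCG",
}

for name in ("yeast", "E. coli"):
    identity = percent_identity(fragments["human"], fragments[name])
    print(f"human vs {name}: {identity:.1f}% identical")
```

The two eukaryotes differ at a single position in this fragment, while the comparison with the bacterium shows many more differences, yet the first twenty bases are shared by all three.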
Other genes …
• Rate of evolution = rate of mutation
• rate of evolution for any macromolecule is
approximately constant over time (Neutral
Theory of evolution)
• For a given protein the rate of sequence
evolution is approximately constant across
lineages. Zuckerkandl and Pauling (1965)
• This would allow speciation and duplication
events to be dated accurately based on
molecular data
Molecular Clock (MC)
Novel trees using Hox genes
• (a) A traditional phylogenetic tree and
• (b) the new phylogenetic tree, each showing the
positions of selected phyla. B, bilateria; AC,
acoelomates; PC, pseudocoelomates; C,
coelomates; P, protostomes; L, lophotrochozoa; E,
ecdysozoa; D, deuterostomes.
• Local and approximate molecular
clocks more reasonable
– one amino acid subst. 14.5 My
– 1.3 × 10^-9 substitutions/nucleotide site/year
– Relative rate test (see further)
• ((A,B),C) then measure distance between
(A,C) & (B,C)
Molecular Clock (MC)
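The relative rate idea can be sketched in a few lines. Given the topology ((A,B),C) with C as the outgroup, a clock predicts that the distances A-C and B-C are equal, so their difference should be close to zero. The code below is a minimal sketch, not a statistical test: the function names and the toy sequences are invented for illustration, and the distance used is simply the proportion of differing sites.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    diffs = sum(x != y for x, y in zip(seq_a, seq_b))
    return diffs / len(seq_a)

def relative_rate(seq_a, seq_b, outgroup):
    """Return d(A,C), d(B,C) and their difference for topology ((A,B),C)."""
    d_ac = p_distance(seq_a, outgroup)
    d_bc = p_distance(seq_b, outgroup)
    return d_ac, d_bc, d_ac - d_bc

# Toy aligned sequences (hypothetical data).
a = "ACGTACGTAC"
b = "ACGTACGTTT"
c = "TCGTACGAAC"

d_ac, d_bc, delta = relative_rate(a, b, c)
print(d_ac, d_bc, delta)  # a clearly non-zero delta suggests unequal rates
```

A real analysis would also need a variance estimate for the difference before concluding that the two lineages evolve at different rates.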
Protein family                    Rate of Change       Theoretical Lookback Time
                                  (PAMs / 100 myrs)    (myrs)
Pseudogenes                       400                  45
Fibrinopeptides                   90                   200
Lactalbumins                      27                   670
Lysozymes                         24                   850
Ribonucleases                     21                   850
Haemoglobins                      12                   1500
Acid proteases                    8                    2300
Cytochrome c                      4                    5000
Glyceraldehyde-P dehydrogenase    2                    9000
Glutamate dehydrogenase           1                    18000
PAM = number of Accepted Point Mutations per 100 amino acids.
Proteins evolve at highly different rates
Multiple Alignment Method
• align
• select method (evolutionary
model)
–Distance
–ML
–MP
• generate tree
• validate tree
4-steps
Some definitions …
• Convert sequence data into a
set of discrete pairwise distance
values (n*(n-1)/2 of them for n
sequences), arranged into a matrix.
Distance methods fit a tree to
this matrix.
• The tree topology is constructed
by using a cluster analysis method
(such as UPGMA or NJ).
Distance matrix methods (upgma, nj, Fitch,...)
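The first step, turning n aligned sequences into n*(n-1)/2 pairwise distances, might look like the sketch below. It assumes the simplest possible distance (the proportion of differing sites); the function name `distance_matrix`, the use of `frozenset` pairs as keys and the toy sequences are illustrative choices, not something taken from the slides or from PHYLIP.

```python
from itertools import combinations

def distance_matrix(seqs):
    """Map each unordered pair of names to the proportion of differing sites.

    seqs: dict of name -> aligned sequence (all of the same length).
    """
    matrix = {}
    for a, b in combinations(seqs, 2):
        diffs = sum(x != y for x, y in zip(seqs[a], seqs[b]))
        matrix[frozenset((a, b))] = diffs / len(seqs[a])
    return matrix

# Toy alignment (hypothetical data).
seqs = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTTT",
    "C": "TCGTACGAAC",
    "D": "TCGAACGAAC",
}
matrix = distance_matrix(seqs)
print(len(matrix))  # 4 sequences give 4*3/2 = 6 pairwise distances
```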
Since we start with A, p(A) = 1
Distance matrix methods (upgma, nj, Fitch,...)
D = evolutionary distance ~ time
F = dissimilarity ~ (1 – PX(t))
F ~ 1 – d
Distance matrix methods (upgma, nj, Fitch,...)
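The observed dissimilarity F underestimates the evolutionary distance D, because repeated substitutions at the same site are counted at most once. One common correction, shown here purely as an example, is the Jukes-Cantor formula D = -(3/4) ln(1 - (4/3) F). The slides do not name the model they use, so treating it as Jukes-Cantor is an assumption, and the function name `jc69_distance` is invented for this sketch.

```python
import math

def jc69_distance(f):
    """Jukes-Cantor distance (substitutions per site) from dissimilarity f."""
    if not 0.0 <= f < 0.75:
        # At f >= 0.75 the sequences look random and the distance is undefined.
        raise ValueError("dissimilarity must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * f / 3.0)

for f in (0.05, 0.2, 0.5):
    print(f, round(jc69_distance(f), 4))
```

The correction is small for similar sequences and grows quickly as F approaches 0.75, which is one way to see why distance methods become unreliable for very divergent data.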
Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
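The UPGMA procedure can be sketched as follows: repeatedly join the two closest clusters, and define the distance from the new cluster to every other cluster as the average over all member pairs (computed here as a size-weighted mean). This is a minimal sketch that returns only the branching order as nested tuples and leaves out branch heights; the function name, the `frozenset` keys and the toy distances are assumptions made for the example.

```python
from itertools import combinations

def upgma(names, dist):
    """Return a nested-tuple topology built by UPGMA.

    names: list of taxon names.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    sizes = {name: 1 for name in names}   # cluster label -> number of leaves
    d = dict(dist)                        # working copy, extended as we merge
    while len(sizes) > 1:
        # Find the closest pair among the current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        n_a, n_b = sizes.pop(a), sizes.pop(b)
        for c in sizes:
            # Average distance over all leaf pairs, via a size-weighted mean.
            d[frozenset((merged, c))] = (
                n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
            ) / (n_a + n_b)
        sizes[merged] = n_a + n_b
    return next(iter(sizes))

# Toy distances (hypothetical data).
toy = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
print(upgma(["A", "B", "C", "D"], toy))
```

UPGMA implicitly assumes a constant rate along all lineages, so the result is a rooted tree in which all leaves are equally far from the root.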
Distance matrix methods: Summary
http://www.bioportal.bic.nus.edu.sg/phylip/neighbor.html
• The tree gives an estimated distance for
each pair of sequences: the sum of the
branch lengths along the path from one
sequence to the other through the tree.
• advantages :
– easy to perform ;
– quick calculation ;
– suited to sequences with high similarity scores ;
• drawbacks :
– the sequences are not considered as such (loss
of information) ;
– all sites are generally treated equally (differences
in substitution rates are not taken into
account) ;
– not applicable to distantly divergent sequences.
Distance matrix methods (upgma, nj, Fitch,...)
• In this method, the bases
(nucleotides or amino acids) of all
sequences at each site are
considered separately (as
independent), and the log-likelihood
of observing these bases is computed
for a given topology under a
particular probability model.
• The log-likelihoods are summed over
all sites, and the sum is maximized
to estimate the branch lengths of
the tree.
Maximum likelihood
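For the smallest possible case, two sequences joined by a single branch, the calculation described above can be written out in full. The sketch below assumes the Jukes-Cantor model (equal base frequencies and equal rates for all substitutions), which is a choice made for this example rather than something stated on the slides. It sums the per-site log-likelihoods and picks the branch length that maximizes the sum with a simple grid search; the function names and toy sequences are hypothetical.

```python
import math

def jc69_log_likelihood(seq_a, seq_b, d):
    """Log-likelihood of branch length d (substitutions per site) under JC69."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability that a site shows the same base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific different base
    total = 0.0
    for x, y in zip(seq_a, seq_b):
        # 0.25 is the equilibrium frequency of the base in the first sequence.
        total += math.log(0.25 * (p_same if x == y else p_diff))
    return total

def ml_branch_length(seq_a, seq_b, step=0.001, max_d=3.0):
    """Grid search for the branch length with the highest log-likelihood."""
    grid = [i * step for i in range(1, int(max_d / step) + 1)]
    return max(grid, key=lambda d: jc69_log_likelihood(seq_a, seq_b, d))

a = "ACGTACGTAC"
b = "ACGTACGTTT"   # differs from a at 2 of 10 sites
print(ml_branch_length(a, b))
```

For two sequences the maximum coincides with the Jukes-Cantor distance formula; with more sequences the same sum has to be maximized over every branch of every candidate topology, which is where the computational cost comes from.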
• This procedure is repeated for all
possible topologies, and the topology
that shows the highest likelihood is
chosen as the final tree.
• Notes :
– ML estimates the branch lengths of the
final tree ;
– ML methods are usually consistent ;
– ML can be extended to allow different
rates of transition and
transversion.
• Drawbacks
– a long computation time is needed to
construct a tree.
Maximum likelihood
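The long computation time follows from the number of topologies that would have to be examined. For n taxa there are (2n-5)!! unrooted and (2n-3)!! rooted bifurcating trees, where !! is the double factorial. The short sketch below just evaluates these standard formulas; the function names are made up for the example.

```python
def double_factorial(k):
    """Product k * (k-2) * (k-4) * ... down to 1 (or 2)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def unrooted_topologies(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa >= 3 labelled leaves."""
    return double_factorial(2 * n_taxa - 5)

def rooted_topologies(n_taxa):
    """Number of rooted bifurcating trees for n_taxa >= 2 labelled leaves."""
    return double_factorial(2 * n_taxa - 3)

for n in (4, 5, 10, 20):
    print(n, unrooted_topologies(n), rooted_topologies(n))
```

With only ten sequences there are already about two million unrooted trees, which is why practical programs search heuristically instead of evaluating every topology.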
Parsimony criterion
• It consists of determining the minimum
number of changes (substitutions) required to
transform a sequence to its nearest neighbor.
Maximum Parsimony
• The maximum parsimony algorithm searches
for the minimum number of genetic events
(nucleotide substitutions or amino-acid
changes) to infer the most parsimonious tree
from a set of sequences.
Maximum Parsimony
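Counting the minimum number of changes for one site on a given tree can be done with Fitch's algorithm, named here explicitly because the slides do not say which counting method they use. Working from the leaves towards the root, each internal node receives the intersection of its children's state sets if that is non-empty; otherwise it receives their union and one change is counted. The sketch assumes a rooted binary tree written as nested tuples; the function name and the toy data are hypothetical.

```python
def fitch_score(tree, states):
    """Return (possible_states, min_changes) for one site.

    tree:   a leaf name (str) or a pair (left_subtree, right_subtree).
    states: dict mapping leaf name -> observed character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_score(left, states)
    right_set, right_changes = fitch_score(right, states)
    common = left_set & right_set
    if common:
        # Children can agree on a state: no extra change needed here.
        return common, left_changes + right_changes
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_changes + right_changes + 1

site = {"A": "G", "B": "G", "C": "T", "D": "T"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(fitch_score(tree_1, site)[1])  # changes needed on tree_1
print(fitch_score(tree_2, site)[1])  # changes needed on tree_2
```

Summing this count over all sites gives the parsimony score of a topology; the most parsimonious tree is the topology with the lowest total.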
Occam’s Razor
Entia non sunt multiplicanda praeter necessitatem.
William of Occam (1300-1349)
The best tree is the one which requires the fewest
substitutions.
• Problems :
– If the evolutionary clock is not constant, the
procedure generates results which can be
misleading ;
– within practical computational limits, this
often leads to the generation of tens or more
"equally most parsimonious trees", which
makes it difficult to justify the choice of a
particular tree ;
– long computation time to construct a tree.
Maximum Parsimony
Maximum Parsimony: Branch Node A or B ?
Maximum Parsimony: A requires 5 mutations
Maximum Parsimony: B (and propagating A->B) requires only 4 mutations
• There are at present no statistical
methods that allow comparison
of trees obtained from different
phylogenetic methods;
nevertheless, many studies have
compared the relative consistency
of the existing methods.
Comparative evaluation of different methods
• The consistency depends on many
factors, among them the topology
and branch lengths of the real tree,
the transition/transversion rate and
the variability of the substitution
rates.
• One expects that if the sequences have
a strong phylogenetic relationship,
different methods will produce the
same phylogenetic tree
Comparative evaluation of different methods
Comparison of methods
• Inconsistency
• Neighbour Joining (NJ) is very fast but depends on
accurate estimates of distance. This is more
difficult with very divergent data
• Parsimony suffers from Long Branch Attraction.
This may be a particular problem for very divergent
data
• NJ can suffer from Long Branch Attraction
• Parsimony is also computationally intensive
• Codon usage bias can be a problem for MP and NJ
• Maximum Likelihood is the most reliable but
depends on the choice of model and is very slow
• Methods may be combined
Rooting the Tree
• In an unrooted tree the direction of
evolution is unknown
• The root is the hypothesized ancestor
of the sequences in the tree
• The root can either be placed on a
branch or at a node
• You should start by viewing an
unrooted tree
Automatic rooting
• Many software packages will root
trees automatically (e.g. mid-point
rooting in NJPlot)
• Sometimes two trees may look very
different but, in fact, differ only in the
position of the root
• This normally involves assumptions…
BEWARE!
Rooting Using an Outgroup
1. The outgroup should be a sequence (or set
of sequences) known to be less closely
related to the rest of the sequences than they
are to each other
2. It should ideally be as closely related as
possible to the rest of the sequences while
still satisfying condition 1
The root must be somewhere between the
outgroup and the rest (either on the node or
in a branch)
How confident am I that my tree is correct?
Bootstrap values
Bootstrapping is a statistical
technique that can use random
resampling of data to determine
sampling error for tree topologies
Bootstrapping phylogenies
• Characters are resampled with replacement
to create many bootstrap replicate data sets
• Each bootstrap replicate data set is analysed
(e.g. with parsimony, distance, ML etc.)
• Agreement among the resulting trees is
summarized with a majority-rule consensus
tree
• Frequencies of occurrence of groups,
bootstrap proportions (BPs), are a measure
of support for those groups
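Once the replicate trees exist, the bootstrap proportions are simply counts. The sketch below assumes each replicate tree is rooted and written as nested tuples of leaf names, collects the leaf set under every internal node, and reports the fraction of replicates in which each group appears. The function names and the four toy trees are invented for illustration; for unrooted trees one would count bipartitions instead of rooted groups.

```python
from collections import Counter

def clades(tree):
    """Return (leaf_set, list_of_groups) for a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_groups = clades(child)
        leaves |= child_leaves
        found.extend(child_groups)
    found.append(leaves)   # the group defined by this internal node
    return leaves, found

def bootstrap_proportions(replicate_trees):
    """Fraction of replicate trees in which each group occurs."""
    counts = Counter()
    for tree in replicate_trees:
        _, found = clades(tree)
        counts.update(set(found))
    total = len(replicate_trees)
    return {group: count / total for group, count in counts.items()}

# Four toy replicate trees (hypothetical data).
replicates = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]
support = bootstrap_proportions(replicates)
print(support[frozenset("AB")])  # support for the group A + B
```

Groups that reach more than half of the replicates are the ones that appear in the majority-rule consensus tree.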
Bootstrapping - an example
Ciliate SSUrDNA - parsimony bootstrap
[Figure: majority-rule consensus tree of Ochromonas (1), Symbiodinium (2),
Prorocentrum (3), Loxodes (4), Tracheloraphis (5), Spirostomum (6),
Gruberia (7), Euplotes (8) and Tetrahymena (9); bootstrap proportions
shown on the internal branches: 100, 96, 84, 100, 100, 100]
• Bootstrapping is a very valuable and widely used
technique (it is demanded by some journals)
• BPs give an idea of how likely a given branch
would be to be unaffected if additional data, with
the same distribution, became available
• BPs are not the same as confidence intervals.
There is no simple mapping between bootstrap
values and confidence intervals. There is no
agreement about what constitutes a ‘good’
bootstrap value (> 70%, > 80%, > 85% ????)
• Some theoretical work indicates that BPs can be a
conservative estimate of confidence intervals
• If the estimated tree is inconsistent all the
bootstraps in the world won’t help you…..
Bootstrap - interpretation
Jack-knifing
• Jack-knifing is very similar to
bootstrapping and differs only in the
character resampling strategy
• Jack-knifing is not as widely
available or widely used as
bootstrapping
• Tends to produce broadly similar
results
At present only sampling techniques allow testing the
topology of a phylogenetic tree
• Bootstrapping
– It consists of drawing columns from a sample of
aligned sequences, with replacement, until one gets
a data set of the same size as the original one
(usually some columns are sampled several times
and others are left out).
• Half-jackknife
– This technique resamples half of the sequence sites
considered and eliminates the rest. The final sample
has half the initial number of sites, without
duplication.
Statistical evaluation of the obtained phylogenetic trees
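The two resampling schemes differ only in how the alignment columns are drawn, which the following sketch tries to make concrete. An alignment is assumed to be a list of equal-length strings; the function names, the toy alignment and the fixed random seed are choices made for this example.

```python
import random

def bootstrap_columns(alignment, rng):
    """Draw as many columns as the alignment has, with replacement."""
    n_sites = len(alignment[0])
    picked = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picked) for seq in alignment]

def half_jackknife(alignment, rng):
    """Keep a random half of the columns, each at most once."""
    n_sites = len(alignment[0])
    picked = sorted(rng.sample(range(n_sites), n_sites // 2))
    return ["".join(seq[i] for i in picked) for seq in alignment]

# Toy alignment (hypothetical data).
alignment = ["ACGTACGT", "ACGTTCGA", "TCGTACGA"]
rng = random.Random(0)          # fixed seed so the run is repeatable

print(bootstrap_columns(alignment, rng))
print(half_jackknife(alignment, rng))
```

Each resampled data set is then analysed with the same tree-building method as the original alignment, and the resulting trees are compared.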
Lineage.py
# Finding the lineage of an organism
# Staying with a plant example, let's now find the lineage of the
# Cypripedioideae orchid subfamily. First, we search the Taxonomy database
# for Cypripedioideae, which yields exactly one NCBI taxonomy identifier:
from Bio import Entrez
Entrez.email = "A.N.Other@example.com"  # Always tell NCBI who you are
handle = Entrez.esearch(db="Taxonomy", term="Cypripedioideae")
record = Entrez.read(handle)
handle.close()
print(record["IdList"])
# Now, we use efetch to download this entry in the Taxonomy database, and then parse it:
handle = Entrez.efetch(db="Taxonomy", id="158330", retmode="xml")
records = Entrez.read(handle)
handle.close()
# Again, this record stores lots of information:
print(records[0].keys())
# The keys include: Lineage, Division, ParentTaxId, PubDate, LineageEx,
# CreateDate, TaxId, Rank, GeneticCode, ScientificName,
# MitoGeneticCode, UpdateDate
# We can get the lineage directly from this record:
print(records[0]["Lineage"])
WWW resources for molecular phylogeny (1)
• Compilations
– A list of sites and resources:
http://www.ucmp.berkeley.edu/subway/phylogen.html
– An extensive list of phylogeny programs
http://evolution.genetics.washington.edu/phylip/software.html
• Databases of rRNA sequences and
associated software
– The rRNA WWW Server - Antwerp, Belgium.
http://rrna.uia.ac.be
– The Ribosomal Database Project - Michigan State University
http://rdp.cme.msu.edu/html/
• Database similarity searches (Blast) :
http://www.ncbi.nlm.nih.gov/BLAST/
http://www.infobiogen.fr/services/menuserv.html
http://bioweb.pasteur.fr/seqanal/blast/intro-fr.html
http://pbil.univ-lyon1.fr/BLAST/blast.html
• Multiple sequence alignment
ClustalX : multiple sequence alignment with a graphical interface
(for all types of computers).
http://www.ebi.ac.uk/FTP/index.html and go to ‘software’
Web interface to ClustalW algorithm for proteins:
http://pbil.univ-lyon1.fr/ and press “clustal”
WWW resources for molecular phylogeny (2)
• Sequence alignment editor
– SEAVIEW : for Windows and Unix
http://pbil.univ-lyon1.fr/software/seaview.html
• Programs for molecular phylogeny
– PHYLIP : an extensive package of programs for all platforms
http://evolution.genetics.washington.edu/phylip.html
– CLUSTALX : beyond alignment, it also performs NJ
– PAUP* : a high-performance commercial package
http://paup.csit.fsu.edu/index.html
– PHYLO_WIN : a graphical interface, for Unix only
http://pbil.univ-lyon1.fr/software/phylowin.html
– MrBayes : Bayesian phylogenetic analysis
http://morphbank.ebc.uu.se/mrbayes/
– PHYML : fast maximum likelihood tree building
http://www.lirmm.fr/~guindon/phyml.html
– WWW-interface at Institut Pasteur, Paris
http://bioweb.pasteur.fr/seqanal/phylogeny
WWW resources for molecular phylogeny (3)
Weblems
W6.1: The growth hormones in most mammals have very similar amino acid
sequences. (The growth hormones of the Alpaca, Dog, Cat, Horse, Rabbit, and
Elephant each differ from that of the Pig at no more than 3 positions out of 191.)
Human growth hormone is very different, differing at 62 positions. The evolution of
growth hormone accelerated sharply in the line leading to humans. By retrieving
and aligning growth hormone sequences from species closely related to humans
and our ancestors, determine where in the evolutionary tree leading to humans the
accelerated evolution of growth hormone took place.
W6.2: Humans are primates, an order that we, apes and monkeys share with lemurs
and tarsiers. On the basis of the Beta-globin gene cluster of human, a
chimpanzee, an old-world monkey, a new-world monkey, a lemur, and a tarsier,
derive a phylogenetic tree of these groups.
W6.3: Primates are mammals, a class we share with marsupials and monotremes;
Extant marsupials live primarily in Australia, except for the opossum, found also in
North and South America. Extant monotremes are limited to two animals from
Australia: the platypus and echidna. Using the complete mitochondrial genome
from human, horse (Equus caballus), wallaroo (Macropus robustus), American
opossum (Didelphis virginiana), and platypus (Ornithorhynchus anatinus), draw
an evolutionary tree, indicating branch lengths. Are monotremes more closely
related to placental mammals or to marsupials?
W6.4: Mammals are vertebrates, a subphylum that we share with fishes, sharks, birds
and reptiles, amphibia, and primitive jawless fishes (example: lampreys). For the
coelacanth (Latimeria chalumnae), the great white shark (Carcharodon
carcharias), skipjack tuna (Katsuwonus pelamis), sea lamprey (Petromyzon
marinus), frog (Rana pipiens), and Nile crocodile (Crocodylus niloticus), using
sequences of cytochromes c and pancreatic ribonucleases, derive evolutionary
trees of these species.

More Related Content

What's hot

Phylogenetics an overview
Phylogenetics an overviewPhylogenetics an overview
Phylogenetics an overviewCharthaGaglani
 
Multiple Sequence Alignment-just glims of viewes on bioinformatics.
 Multiple Sequence Alignment-just glims of viewes on bioinformatics. Multiple Sequence Alignment-just glims of viewes on bioinformatics.
Multiple Sequence Alignment-just glims of viewes on bioinformatics.Arghadip Samanta
 
Cytotaxonomy plant taxonomy
Cytotaxonomy plant taxonomyCytotaxonomy plant taxonomy
Cytotaxonomy plant taxonomySangeeta Das
 
Distance based method
Distance based method Distance based method
Distance based method Adhena Lulli
 
Phylogenetic trees
Phylogenetic treesPhylogenetic trees
Phylogenetic treesmartyynyyte
 
Molecular taxonomy
Molecular taxonomyMolecular taxonomy
Molecular taxonomyAnil kumar
 
Genetics Introduction
Genetics IntroductionGenetics Introduction
Genetics Introductionjrt004
 
Introduction to Mendelian Genetics
Introduction to Mendelian GeneticsIntroduction to Mendelian Genetics
Introduction to Mendelian GeneticsAnukriti Nigam
 
Plang functional genome
Plang functional genomePlang functional genome
Plang functional genometcha163
 
Basics of Cladistic Analysis Workshop
Basics of Cladistic Analysis WorkshopBasics of Cladistic Analysis Workshop
Basics of Cladistic Analysis WorkshopUsama Abdel-Hameed
 
IB Biology Genetics
IB Biology GeneticsIB Biology Genetics
IB Biology Geneticsguest9476bb
 
B4FA 2012 Ghana: Fundamentals of Genetics - Eric Danquah
B4FA 2012 Ghana: Fundamentals of Genetics - Eric DanquahB4FA 2012 Ghana: Fundamentals of Genetics - Eric Danquah
B4FA 2012 Ghana: Fundamentals of Genetics - Eric Danquahb4fa
 

What's hot (20)

Phylogenetics an overview
Phylogenetics an overviewPhylogenetics an overview
Phylogenetics an overview
 
Phylogeny
PhylogenyPhylogeny
Phylogeny
 
philogenetic tree
philogenetic treephilogenetic tree
philogenetic tree
 
Multiple Sequence Alignment-just glims of viewes on bioinformatics.
 Multiple Sequence Alignment-just glims of viewes on bioinformatics. Multiple Sequence Alignment-just glims of viewes on bioinformatics.
Multiple Sequence Alignment-just glims of viewes on bioinformatics.
 
Cytotaxonomy plant taxonomy
Cytotaxonomy plant taxonomyCytotaxonomy plant taxonomy
Cytotaxonomy plant taxonomy
 
Distance based method
Distance based method Distance based method
Distance based method
 
Phylogenetic studies
Phylogenetic studiesPhylogenetic studies
Phylogenetic studies
 
The tree of life
The tree of lifeThe tree of life
The tree of life
 
Phylogenetic trees
Phylogenetic treesPhylogenetic trees
Phylogenetic trees
 
Molecular taxonomy
Molecular taxonomyMolecular taxonomy
Molecular taxonomy
 
Genetics Introduction
Genetics IntroductionGenetics Introduction
Genetics Introduction
 
Parsimony analysis
Parsimony analysisParsimony analysis
Parsimony analysis
 
Phenetic versus phylogenetic systems
Phenetic versus phylogenetic systemsPhenetic versus phylogenetic systems
Phenetic versus phylogenetic systems
 
Introduction to Mendelian Genetics
Introduction to Mendelian GeneticsIntroduction to Mendelian Genetics
Introduction to Mendelian Genetics
 
Plang functional genome
Plang functional genomePlang functional genome
Plang functional genome
 
Basics of Cladistic Analysis Workshop
Basics of Cladistic Analysis WorkshopBasics of Cladistic Analysis Workshop
Basics of Cladistic Analysis Workshop
 
Introduction to DNA and Genetics
Introduction to DNA and GeneticsIntroduction to DNA and Genetics
Introduction to DNA and Genetics
 
IB Biology Genetics
IB Biology GeneticsIB Biology Genetics
IB Biology Genetics
 
B4FA 2012 Ghana: Fundamentals of Genetics - Eric Danquah
B4FA 2012 Ghana: Fundamentals of Genetics - Eric DanquahB4FA 2012 Ghana: Fundamentals of Genetics - Eric Danquah
B4FA 2012 Ghana: Fundamentals of Genetics - Eric Danquah
 
Some aspects of plant taxonomy
Some aspects of plant taxonomy Some aspects of plant taxonomy
Some aspects of plant taxonomy
 

2016 bioinformatics i_phylogenetics_wim_vancriekinge

  • 6. Phylogenetics Introduction Definitions Species concept Examples The Tree-of-life Phylogenetics Methodologies Algorithms Distance Methods Maximum Likelihood Maximum Parsimony Rooting Statistical Validation Conclusions Orthologous genes Horizontal Gene Transfer Phylogenomics Practical Approach: PHYLIP Weblems
  • 7. What is phylogenetics? Phylogeny (phylo = tribe + genesis). Phylogenetic trees are about visualising evolutionary relationships. They reconstruct the pattern of events that have led to the distribution and diversity of life. The purpose of a phylogenetic tree is to illustrate how a group of objects (usually genes or organisms) are related to one another. "Nothing in Biology Makes Sense Except in the Light of Evolution." Theodosius Dobzhansky (1900-1975)
  • 8. Trees
      • Diagram consisting of branches and nodes
      • Species tree (how are my species related?)
          – contains only one representative from each species
          – all nodes indicate speciation events
      • Gene tree (how are my genes related?)
          – normally contains a number of genes from a single species
          – nodes relate either to speciation or gene duplication events
  • 9. Clade: A set of species which includes all of the species derived from a single common ancestor
  • 10. Species Concepts from Various Authors
      D.A. Baum and K.L. Shaw - Exclusive groups of organisms, where an exclusive group is one whose members are all more closely related to each other than to any organisms outside the group.
      J. Cracraft - An irreducible cluster of organisms, diagnosably distinct from other such clusters, and within which there is a parental pattern of ancestry and descent.
      Charles Darwin - "From these remarks it will be seen that I look at the term species, as one arbitrarily given for the sake of convenience to a set of individuals closely resembling each other, and that it does not essentially differ from the term variety, which is given to less distinct and more fluctuating forms. The term variety, again, in comparison with mere individual differences, is also applied arbitrarily, and for mere convenience sake" (Origin of Species, 1st ed., p. 108).
      T. Dobzhansky - The largest and most inclusive reproductive community of sexual and cross-fertilizing individuals which share a common gene pool. And later: systems of populations, the gene exchange between which is limited or prevented by reproductive isolating mechanisms.
      M. Ghiselin - The most extensive units in the natural economy, such that reproductive competition occurs among their parts.
      D.M. Lambert - Groups of individuals that define themselves by a specific mate recognition system.
      J. Mallet - Identifiable genotypic clusters recognized by a deficit of intermediates, both at single loci and at multiple loci.
      E. Mayr - Groups of actually or potentially interbreeding natural populations which are reproductively isolated from other such groups.
      C.D. Michener - A group of organisms not itself divisible by phenetic gaps resulting from concordant differences in character states (except for morphs, such as sex, age, or caste), but separated by such phenetic gaps from other such units.
      H.E.H. Paterson - That most inclusive population of individual biparental organisms which share a common fertilization system.
      G.G. Simpson - A lineage of populations evolving with time, separately from others, with its own unique evolutionary role and tendencies.
      P.H.A. Sneath and R.R. Sokal - The smallest (most homogeneous) cluster that can be recognized upon some given criterion as being distinct from other clusters.
      A.R. Templeton - The most inclusive population of individuals having the potential for phenotypic cohesion through intrinsic cohesion mechanisms (genetic and/or demographic, i.e. ecological, exchangeability).
      E.O. Wiley - A single lineage of ancestor-descendant populations which maintains its identity from other such lineages and which has its own evolutionary tendencies and historical fate.
      S. Wright - A species in time and space is composed of numerous local populations, each one intercommunicating and intergrading with others.
  • 11. Species I. Definitions: Species = the basic unit of classification > Three different ways to recognize species:
  • 12. Plant Species. Definitions: > Three different ways to recognize species: 1) Morphological species = the smallest group that is consistently and persistently distinct (clusters in morphospace). Species are recognized initially on the basis of appearance; the individuals of one species look different from the individuals of another.
  • 13. Species. Definitions: > Three different ways to recognize species: 2) Biological species = a set of interbreeding or potentially interbreeding individuals that are separated from other species by reproductive barriers; individuals of different species are unable to interbreed.
  • 14. Species. Definitions: > Three different ways to recognize species: 3) Phylogenetic species = the boundary between reticulate relationships (among interbreeding individuals) and divergent relationships (between lineages with no gene exchange).
  • 15. Phylogenetic species are recognized by the pattern of ancestor-descendant relationships. (Figure labels: reticulate, divergent, boundary.)
  • 16. Species. Definitions: > Three different ways to recognize species: 4) Phylogenomics species = ability to transmit (and maintain) a (stable) gene pool. Addresses the Anopheles genome topology variations.
  • 17. • In the tree to the left, A and B share the most recent common ancestry. Thus, of the species in the tree, A and B are the most closely related. • The next most recent common ancestry is C with the group composed of A and B. Notice that the relationship of C is with the group containing A and B. In particular, C is not more closely related to B than to A. This can be emphasized by the following two trees, which are equivalent to each other: Branching Order in a Phylogenetic Tree
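The equivalence of two drawings of the same tree can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes trees are written as nested Python tuples of leaf names (a hypothetical encoding chosen for illustration) and compares the sets of clades they imply. Swapping the children at any node leaves the clade set unchanged, which is why A and B are equally close to C.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) in a nested-tuple tree."""
    found = set()

    def collect(node):
        if isinstance(node, str):              # a leaf is represented by its name
            return frozenset([node])
        members = frozenset().union(*(collect(child) for child in node))
        found.add(members)                     # every internal node defines one clade
        return members

    collect(tree)
    return found


# Two drawings of the same relationships: only the left/right order differs.
tree_1 = (("A", "B"), "C")
tree_2 = ("C", ("B", "A"))

# A genuinely different branching order: B and C are now sister taxa.
tree_3 = (("B", "C"), "A")

same_topology = clades(tree_1) == clades(tree_2)
different_topology = clades(tree_1) != clades(tree_3)
print(same_topology, different_topology)
```

Comparing clade sets rather than the tuples themselves is the design choice here: the tuples differ as soon as two branches are drawn in the other order, while the clades do not.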
  • 18. More definitions … • A common simplifying assumption is that the tree is bifurcating, meaning that each branch node has exactly two descendants. • The edges, taken together, are sometimes said to define the topology of the tree. (Figure labels: branch node / internal node; edge / branch; leaves / tips / external nodes.)
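The terms on this slide map directly onto a small data structure. The sketch below is illustrative only; the class name `Node` and the helper functions are assumptions made for this example, not something defined in the slides. It lists the leaves (tips), collects the edges that make up the topology, and tests the bifurcating assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node; a node without children is a leaf (tip, external node)."""
    name: str
    children: list = field(default_factory=list)

    def is_leaf(self):
        return not self.children


def leaves(node):
    """Names of all leaves below (and including) this node."""
    if node.is_leaf():
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def edges(node):
    """All (parent, child) edges; taken together they define the topology."""
    own = [(node.name, child.name) for child in node.children]
    return own + [edge for child in node.children for edge in edges(child)]


def is_bifurcating(node):
    """True if every branch (internal) node has exactly two descendants."""
    if node.is_leaf():
        return True
    return len(node.children) == 2 and all(is_bifurcating(c) for c in node.children)


# Rooted tree ((A, B), C) with two internal nodes.
root = Node("root", [Node("n1", [Node("A"), Node("B")]), Node("C")])

# A node with three descendants (a polytomy) violates the assumption.
polytomy = Node("root", [Node("A"), Node("B"), Node("C")])

print(leaves(root), len(edges(root)), is_bifurcating(root), is_bifurcating(polytomy))
```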
  • 19. Outgroups, rooted versus unrooted An unrooted reptilian phylogeny with an avian outgroup and the corresponding rooted phylogeny. The Ri represent modern reptiles; the Ai, inferred ancestors and the B a bird.
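Outgroup rooting can be sketched in a few lines. This example is an illustration under stated assumptions: the unrooted tree is stored as an adjacency dictionary (a hypothetical encoding), the labels R1 to R3, A1, A2 and B merely echo the slide's naming, and the particular topology is made up. The root is placed on the branch that leads to the outgroup.

```python
def root_with_outgroup(adjacency, outgroup):
    """Return a nested-tuple rooted tree with the root on the outgroup's branch."""
    (neighbor,) = adjacency[outgroup]          # a leaf has exactly one neighbor

    def build(node, parent):
        # Walk away from the root; everything not yet visited is a descendant.
        children = [n for n in adjacency[node] if n != parent]
        if not children:
            return node                        # reached a leaf
        return tuple(build(child, node) for child in children)

    return (build(neighbor, outgroup), outgroup)


# Unrooted tree: three reptiles (R1..R3), one bird (B) and two inferred
# ancestors (A1, A2). The topology is invented for this example.
unrooted = {
    "R1": ["A1"],
    "R2": ["A1"],
    "R3": ["A2"],
    "B": ["A2"],
    "A1": ["R1", "R2", "A2"],
    "A2": ["A1", "R3", "B"],
}

rooted = root_with_outgroup(unrooted, "B")
print(rooted)
```

The unrooted tree carries no direction of time; choosing the bird as outgroup is what turns A2 into the ancestor of all three reptiles in the rooted result.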
  • 21. Phylogenetic methods may be used to solve crimes, test purity of products, and determine whether endangered species have been smuggled or mislabeled: – Vogel, G. 1998. HIV strain analysis debuts in murder trial. Science 282(5390): 851-853. – Lau, D. T.-W., et al. 2001. Authentication of medicinal Dendrobium species by the internal transcribed spacer of ribosomal DNA. Planta Med 67:456-460. Examples
  • 23. – Epidemiologists use phylogenetic methods to understand the development of pandemics, patterns of disease transmission, and development of antimicrobial resistance or pathogenicity: • Basler, C.F., et al. 2001. Sequence of the 1918 pandemic influenza virus nonstructural gene (NS) segment and characterization of recombinant viruses bearing the 1918 NS genes. PNAS, 98(5):2746-2751. • Ou, C.-Y., et al. 1992. Molecular epidemiology of HIV transmission in a dental practice. Science 256(5060):1165-1171. • Bacillus anthracis: Examples
  • 25. • Conservation biologists may use these techniques to determine which populations are in greatest need of protection, and other questions of population structure: – Trepanier, T.L., and R.W. Murphy. 2001. The Coachella Valley fringe-toed lizard (Uma inornata): genetic diversity and phylogenetic relationships of an endangered species. Mol Phylogenet Evol 18(3):327-334. – Alves, M.J., et al. 2001. Mitochondrial DNA variation in the highly endangered cyprinid fish Anaecypris hispanica: importance for conservation. Heredity 87(Pt 4):463-473. • Pharmaceutical researchers may use phylogenetic methods to determine which species are most closely related to other medicinal species, thus perhaps sharing their medicinal qualities: – Komatsu, K., et al. 2001. Phylogenetic analysis based on 18S rRNA gene and matK gene sequences of Panax vietnamensis and five related species. Planta Med 67:461-465. Examples
  • 27. Some Important Dates in History
      Origin of the Universe                 15 billion yrs
      Formation of the Solar System          4.6 billion yrs
      First Self-replicating System          3.5 billion yrs
      Prokaryotic-Eukaryotic Divergence      2.0 billion yrs
      Plant-Animal Divergence                1.0 billion yrs
      Invertebrate-Vertebrate Divergence     0.5 billion yrs
      Mammalian Radiation Beginning          0.1 billion yrs
  • 32. What Sequence to Use ? • To infer relationships that span the diversity of known life, it is necessary to look at genes conserved through the billions of years of evolutionary divergence. • The gene must display an appropriate level of sequence conservation for the divergences of interest.
  • 33. • If there is too much change, then the sequences become randomized, and there is a limit to the depth of the divergences that can be accurately inferred. • If there is too little change (if the gene is too conserved), then there may be little or no change between the evolutionary branchings of interest, and it will not be possible to infer close (genus or species level) relationships. What Sequence to Use ?
  • 34. Carl Woese recognized the full potential of rRNA sequences as a measure of phylogenetic relatedness. He initially used an RNA sequencing method that determined about 1/4 of the nucleotides in the 16S rRNA (the best technology available at the time). This amount of data greatly exceeded anything else then available. Using newer methods, it is now routine to determine the sequence of the entire 16S rRNA molecule. Today, the accumulated 16S rRNA sequences (about 10,000) constitute the largest body of data available for inferring relationships among organisms. Ribosomal RNA Genes and Their Sequences
  • 35. What Sequence to Use ? Examples of genes in this category are those that encode the ribosomal RNAs (rRNAs). Most prokaryotes have three rRNAs, called the 5S, 16S and 23S rRNA.
      Name (a)   Size (nucleotides)   Location
      5S         120                  Large subunit of ribosome
      16S        1500                 Small subunit of ribosome
      23S        2900                 Large subunit of ribosome
      (a) The name is based on the rate at which the molecule sediments (sinks) in water. Bigger molecules sediment faster than small ones.
  • 36. Ribosomal RNA Genes and Their Sequences. The extraordinary conservation of rRNA genes can be seen in these fragments of the small subunit rRNA gene sequences from organisms spanning the known diversity of life:
      human                     ...GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTAAAGTTGCTGCAGTTAAAAAG...
      yeast                     ...GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTAAAGTTGTTGCAGTTAAAAAG...
      corn                      ...GTGCCAGCAGCCGCGGTAATTCCAGCTCCAATAGCGTATATTTAAGTTGTTGCAGTTAAAAAG...
      Escherichia coli          ...GTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCG...
      Anacystis nidulans        ...GTGCCAGCAGCCGCGGTAATACGGGAGAGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGCG...
      Thermotoga maritima       ...GTGCCAGCAGCCGCGGTAATACGTAGGGGGCAAGCGTTACCCGGATTTACTGGGCGTAAAGGG...
      Methanococcus vannielii   ...GTGCCAGCAGCCGCGGTAATACCGACGGCCCGAGTGGTAGCCACTCTTATTGGGCCTAAAGCG...
      Thermococcus celer        ...GTGGCAGCCGCCGCGGTAATACCGGCGGCCCGAGTGGTGGCCGCTATTATTGGGCCTAAAGCG...
      Sulfolobus solfataricus   ...GTGTCAGCCGCCGCGGTAATACCAGCTCCGCGAGTGGTCGGGGTGATTACTGGGCCTAAAGCG...
  • 38. • Rate of evolution = rate of mutation • The rate of evolution for any macromolecule is approximately constant over time (Neutral Theory of evolution) • For a given protein the rate of sequence evolution is approximately constant across lineages (Zuckerkandl and Pauling, 1965) • This would allow speciation and duplication events to be dated accurately based on molecular data Molecular Clock (MC)
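The dating idea on this slide can be written as a one-line calculation: if two sequences differ by d substitutions per site and each lineage changes at a constant rate r, the split happened roughly d / (2r) years ago. The sketch below is only an illustration under a strict-clock assumption; the function name and the numbers are hypothetical and are not taken from a real data set.

```python
def divergence_time(distance, rate):
    """Years since two lineages split, assuming a strict molecular clock.

    distance: substitutions per site separating the two sequences
    rate:     substitutions per site per year along a single lineage
    Both lineages accumulate changes after the split, hence the factor 2.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Hypothetical numbers: 2.6% divergence, 1.3e-9 substitutions/site/year.
years = divergence_time(0.026, 1.3e-9)
print(f"{years / 1e6:.1f} million years")
```

If the rate differs between lineages, the estimate is biased, which is why local clocks and rate tests are used in practice.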
  • 39. Novel trees using Hox genes
  • 41. • (a) A traditional phylogenetic tree and • (b) the new phylogenetic tree, each showing the positions of selected phyla. B, bilateria; AC, acoelomates; PC, pseudocoelomates; C, coelomates; P, protostomes; L, lophotrochozoa; E, ecdysozoa; D, deuterostomes.
  • 42. • Local and approximate molecular clocks are more reasonable – one amino acid substitution per 14.5 My – 1.3 × 10^-9 substitutions per nucleotide site per year – Relative rate test (see further) • For a tree ((A,B),C), measure the distances (A,C) and (B,C) and compare them Molecular Clock (MC)
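The relative rate test named on this slide can be sketched in a few lines: for a tree ((A,B),C), lineages A and B have had the same amount of time since their common ancestor, so under a clock d(A,C) and d(B,C) should be about equal. The code below is a minimal sketch that assumes an ungapped alignment and uses the plain proportion of differing sites as the distance; the sequences are hypothetical toy data.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites that differ (gaps are not handled)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq1, seq2) if a != b) / len(seq1)


def relative_rate_difference(seq_a, seq_b, seq_c):
    """Return d(A,C) - d(B,C) for the tree ((A,B),C).

    A value near zero is compatible with A and B evolving at similar
    rates; a clearly positive or negative value suggests unequal rates.
    """
    return p_distance(seq_a, seq_c) - p_distance(seq_b, seq_c)


# Hypothetical toy alignment, for illustration only.
A = "ACGTACGTAC"
B = "ACGTACGTTC"
C = "ACGAACGTGC"
print(relative_rate_difference(A, B, C))
```

A real test would also need a standard error for the difference; this sketch only computes the statistic itself.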
  • 43. Proteins evolve at highly different rates
      Protein                          Rate of Change (PAMs / 100 myrs)   Theoretical Lookback Time (myrs)
      Pseudogenes                      400                                45
      Fibrinopeptides                  90                                 200
      Lactalbumins                     27                                 670
      Lysozymes                        24                                 850
      Ribonucleases                    21                                 850
      Haemoglobins                     12                                 1500
      Acid proteases                   8                                  2300
      Cytochrome c                     4                                  5000
      Glyceraldehyde-P dehydrogenase   2                                  9000
      Glutamate dehydrogenase          1                                  18000
      PAM = number of Accepted Point Mutations per 100 amino acids.
  • 46. • align • select method (evolutionary model) –Distance –ML –MP • generate tree • validate tree 4-steps
  • 49. • Convert sequence data into a set of discrete pairwise distance values (n*(n-1)/2 of them for n sequences), arranged into a matrix. Distance methods fit a tree to this matrix. • The tree topology is constructed by using a cluster analysis method (such as UPGMA or NJ). Distance matrix methods (UPGMA, NJ, Fitch,...)
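The conversion step described above can be sketched as follows. The example assumes an ungapped alignment and uses the uncorrected proportion of differing sites as the distance; the taxon names and sequences are hypothetical toy data, and a real analysis would normally apply a substitution model.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of aligned sites that differ (gaps are not handled)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq1, seq2) if a != b) / len(seq1)


def distance_matrix(sequences):
    """Build a symmetric matrix (dict of dicts) of pairwise p-distances.

    `sequences` maps a taxon name to its aligned sequence.
    """
    names = list(sequences)
    matrix = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(sequences[a], sequences[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# Hypothetical toy alignment.
alignment = {
    "human": "ACGTACGT",
    "chimp": "ACGTACGA",
    "mouse": "ACGAACTA",
    "frog": "TCGAGCTA",
}
dm = distance_matrix(alignment)
n = len(alignment)
n_pairs = n * (n - 1) // 2  # number of distinct pairwise distances
for name, row in dm.items():
    print(name, [round(row[other], 3) for other in alignment])
```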
  • 53. Distance matrix methods (upgma, nj, Fitch,...)
  • 54. Distance matrix methods (upgma, nj, Fitch,...) CGT
  • 55. Distance matrix methods (upgma, nj, Fitch,...) Since we start with A,p(A)=1
  • 56. Distance matrix methods (upgma, nj, Fitch,...) D = evolutionary distance ~ time; F = dissimilarity ~ (1 – PX(t)); F ~ 1 – d
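The gap between observed dissimilarity and evolutionary distance can be made concrete with a standard correction. The slide does not name a model, so the Jukes-Cantor model used below is an assumption chosen for illustration: it treats all substitutions as equally likely and converts the observed fraction of differing sites p into an estimated number of substitutions per site d.

```python
import math


def jukes_cantor_distance(p):
    """Estimated substitutions per site from the observed proportion p of
    differing sites, under the Jukes-Cantor model (an assumption here)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


# The correction grows with p because multiple hits at one site hide changes.
for p in (0.05, 0.20, 0.40, 0.60):
    print(f"p = {p:.2f}  ->  d = {jukes_cantor_distance(p):.3f}")
```

For small p the corrected distance is close to p itself; as p approaches 0.75 the sequences look random and the estimate becomes unbounded.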
  • 57. Distance matrix methods (upgma, nj, Fitch,...)
  • 59. Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
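Since the worked figures for UPGMA did not survive in this transcript, here is a minimal sketch of the clustering procedure. It repeatedly merges the two closest clusters and recomputes distances as size-weighted means, so that every original pair of taxa counts equally. The function name, the tree representation (nested tuples with a node height) and the toy distances are all assumptions made for illustration.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA and return a nested-tuple tree.

    `dist` maps frozenset({a, b}) to the distance between taxa a and b.
    Each internal node is (left, right, height), where height is half the
    distance between the two clusters that were merged.
    """
    # clusters: id -> (subtree, number of leaves)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    d = {}
    for i, j in combinations(range(len(names)), 2):
        d[(i, j)] = dist[frozenset((names[i], names[j]))]
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (i, j), dij = min(d.items(), key=lambda item: item[1])
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # Distance from the merged cluster to each remaining cluster k.
        new_d = {}
        for k in clusters:
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            new_d[(k, next_id)] = (size_i * dik + size_j * djk) / (size_i + size_j)
        # Drop distances involving the merged clusters, then add the new ones.
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new_d)
        clusters[next_id] = ((tree_i, tree_j, dij / 2), size_i + size_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree


# Hypothetical ultrametric toy distances.
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 2,
    frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 6,
    frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 10,
    frozenset(("C", "D")): 10,
}
tree = upgma(names, dist)
print(tree)
```

UPGMA implicitly assumes a molecular clock; when rates differ between lineages, neighbour joining is usually the safer distance method.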
  • 63. Distance matrix methods: Summary http://www.bioportal.bic.nus.edu.sg/phylip/neighbor.html
  • 64. • The tree estimates the distance for each pair as the sum of branch lengths along the path from one sequence to the other. • Advantages: – easy to perform; – quick calculation; – suitable for sequences with high similarity scores. • Drawbacks: – the sequences are not considered as such (loss of information); – all sites are generally treated equally (differences in substitution rates are not taken into account); – not applicable to distantly divergent sequences. Distance matrix methods (upgma, nj, Fitch,...)
  • 67. • In this method, the bases (nucleotides or amino acids) of all sequences at each site are considered separately (as independent), and the log-likelihood of observing these bases is computed for a given topology by using a particular probability model. • This log-likelihood is summed over all sites, and the sum is maximized to estimate the branch lengths of the tree. Maximum likelihood
  • 69. • This procedure is repeated for all possible topologies, and the topology that shows the highest likelihood is chosen as the final tree. • Notes: – ML estimates the branch lengths of the final tree; – ML methods are usually consistent; – ML can be extended to allow differences between the rates of transition and transversion. • Drawbacks: – long computation time is needed to construct a tree. Maximum likelihood
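The core of the likelihood calculation can be shown on the smallest possible case: two sequences joined by a single branch. The sketch below sums per-site log-likelihoods and searches for the branch length that maximizes the sum. It assumes the Jukes-Cantor substitution model, which the slides do not specify, and the sequences are hypothetical; a real ML program does the same thing over every branch of every candidate topology.

```python
import math


def site_log_likelihood(a, b, d):
    """Log-likelihood of seeing bases a and b at one site when the two
    sequences are d substitutions per site apart (Jukes-Cantor)."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability the base is unchanged
    p_diff = 0.25 - 0.25 * decay   # probability of one specific other base
    return math.log(0.25 * (p_same if a == b else p_diff))


def log_likelihood(seq1, seq2, d):
    """Sites are treated as independent, so per-site log-likelihoods add."""
    return sum(site_log_likelihood(a, b, d) for a, b in zip(seq1, seq2))


def ml_branch_length(seq1, seq2, lo=1e-6, hi=5.0, tol=1e-8):
    """Golden-section search for the d that maximizes the log-likelihood."""
    ratio = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if log_likelihood(seq1, seq2, x1) < log_likelihood(seq1, seq2, x2):
            lo = x1
        else:
            hi = x2
    return (lo + hi) / 2


# Hypothetical toy sequences differing at 4 of 20 sites.
seq1 = "ACGTACGTACGTACGTACGT"
seq2 = "ACGGACGAACGTTCGTACCT"
d_hat = ml_branch_length(seq1, seq2)
print(f"ML branch length: {d_hat:.4f} substitutions per site")
```

For this two-sequence case the maximum coincides with the closed-form Jukes-Cantor distance, which makes a convenient sanity check.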
  • 72. Maximum Parsimony. Parsimony criterion • It consists of determining the minimum number of changes (substitutions) required to transform a sequence to its nearest neighbor. • The maximum parsimony algorithm searches for the minimum number of genetic events (nucleotide substitutions or amino-acid changes) to infer the most parsimonious tree from a set of sequences.
  • 73. Maximum Parsimony. Occam’s Razor: "Entia non sunt multiplicanda praeter necessitatem." William of Occam (1300-1349). The best tree is the one which requires the fewest substitutions.
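Counting the minimum number of substitutions a given tree requires is commonly done with Fitch's algorithm, which the slides do not name; it is used here as one standard way to apply the criterion. The sketch scores the three possible groupings of four taxa and keeps the one needing the fewest changes. Trees are written as nested pairs, and the alignment is hypothetical toy data.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    `tree` is a leaf name or a pair (left, right); `states` maps leaf
    names to the character observed at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: one substitution is needed below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all alignment columns."""
    length = len(next(iter(alignment.values())))
    total = 0
    for site in range(length):
        states = {name: seq[site] for name, seq in alignment.items()}
        total += fitch_site(tree, states)[1]
    return total


# Hypothetical toy alignment and the three groupings of four taxa.
alignment = {"A": "AAGT", "B": "AAGA", "C": "TTGA", "D": "TTCA"}
candidates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
scores = {tree: parsimony_score(tree, alignment) for tree in candidates}
best = min(scores, key=scores.get)
print(best, scores[best])
```

With more taxa the number of candidate trees grows very quickly, which is why practical programs use heuristic searches instead of scoring every topology.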
  • 74. • The best tree is the one which needs the fewest changes. • Problems: – if the evolutionary clock is not constant, the procedure generates results which can be misleading; – within practical computational limits, this often leads to the generation of tens or more "equally most parsimonious trees", which makes it difficult to justify the choice of a particular tree; – long computation time to construct a tree. Maximum Parsimony
  • 82. Maximum Parsimony: Branch Node A or B ?
  • 83. Maximum Parsimony: A requires 5 mutations
  • 84. Maximum Parsimony: B (and propagating A->B) requires only 4 mutations
  • 87. – There are at present no statistical methods which allow comparison of trees obtained from different phylogenetic methods; nevertheless, many studies have compared the relative consistency of the existing methods. Comparative evaluation of different methods
  • 88. – The consistency depends on many factors, among them the topology and branch lengths of the real tree, the transition/transversion rate and the variability of the substitution rates. – One expects that if the sequences have a strong phylogenetic relationship, different methods will produce the same phylogenetic tree. Comparative evaluation of different methods
  • 89. Comparison of methods • Inconsistency • Neighbour Joining (NJ) is very fast but depends on accurate estimates of distance. This is more difficult with very divergent data • Parsimony suffers from Long Branch Attraction. This may be a particular problem for very divergent data • NJ can suffer from Long Branch Attraction • Parsimony is also computationally intensive • Codon usage bias can be a problem for MP and NJ • Maximum Likelihood is the most reliable but depends on the choice of model and is very slow • Methods may be combined
  • 90. Rooting the Tree • In an unrooted tree the direction of evolution is unknown • The root is the hypothesized ancestor of the sequences in the tree • The root can either be placed on a branch or at a node • You should start by viewing an unrooted tree
  • 91. Automatic rooting • Many software packages will root trees automatically (e.g. mid-point rooting in NJPlot) • Sometimes two trees may look very different but, in fact, differ only in the position of the root • This normally involves assumptions… BEWARE!
  • 92. Rooting Using an Outgroup 1. The outgroup should be a sequence (or set of sequences) known to be less closely related to the rest of the sequences than they are to each other 2. It should ideally be as closely related as possible to the rest of the sequences while still satisfying condition 1 The root must be somewhere between the outgroup and the rest (either on the node or in a branch)
  • 93. How confident am I that my tree is correct? Bootstrap values Bootstrapping is a statistical technique that can use random resampling of data to determine sampling error for tree topologies
  • 94. Bootstrapping phylogenies • Characters are resampled with replacement to create many bootstrap replicate data sets • Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML etc.) • Agreement among the resulting trees is summarized with a majority-rule consensus tree • Frequencies of occurrence of groups, bootstrap proportions (BPs), are a measure of support for those groups
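The resampling loop described above can be sketched as follows. To keep the example short, tree building is replaced by a deliberately crude stand-in (finding the two taxa with the fewest differences), so the proportions reported are support values for that pairing only, not for a full tree. All function names and the toy alignment are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Stand-in for tree building: the two taxa with the fewest differences."""
    def differences(pair):
        a, b = pair
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    return min(combinations(sorted(alignment), 2), key=differences)


def bootstrap_proportions(alignment, n_replicates=200, seed=0):
    """Fraction of replicates in which each pair comes out as the closest pair."""
    rng = random.Random(seed)
    counts = Counter(
        closest_pair(bootstrap_alignment(alignment, rng)) for _ in range(n_replicates)
    )
    return {pair: count / n_replicates for pair, count in counts.items()}


# Hypothetical toy alignment.
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTTC",
    "C": "ACGAACTTGC",
    "D": "TCGAGCTTGC",
}
for pair, bp in sorted(bootstrap_proportions(toy).items()):
    print(pair, f"{bp:.2f}")
```

In a real analysis each replicate would be passed to the same tree-building method as the original data, and the replicate trees summarized in a majority-rule consensus.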
  • 95. Bootstrapping - an example. [Figure: majority-rule consensus tree from a parsimony bootstrap of ciliate SSUrDNA. Taxa in tree order: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Euplotes (8), Tetrahymena (9), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7). Bootstrap proportions shown on the branches: 100, 96, 84, 100, 100, 100.]
  • 96. • Bootstrapping is a very valuable and widely used technique (it is demanded by some journals) • BPs give an idea of how likely a given branch would be to remain unaffected if additional data, with the same distribution, became available • BPs are not the same as confidence intervals. There is no simple mapping between bootstrap values and confidence intervals, and there is no agreement about what constitutes a ‘good’ bootstrap value (> 70%, > 80% or > 85%?) • Some theoretical work indicates that BPs can be a conservative estimate of confidence intervals • If the estimated tree is inconsistent, all the bootstraps in the world won’t help you. Bootstrap - interpretation
  • 97. Jack-knifing • Jack-knifing is very similar to bootstrapping and differs only in the character resampling strategy • Jack-knifing is not as widely available or widely used as bootstrapping • Tends to produce broadly similar results
  • 98. At present only sampling techniques allow testing the topology of a phylogenetic tree. – Bootstrapping » It consists of drawing columns from a sample of aligned sequences, with replacement, until one gets a data set of the same size as the original one (usually some columns are sampled several times and others are left out). – Half-jackknife » This technique resamples half of the sequence sites considered and eliminates the rest. The final sample has half the initial number of sites, without duplication. Statistical evaluation of the obtained phylogenetic trees
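The half-jackknife differs from the bootstrap only in how columns are drawn: half of the sites are kept, each at most once. A minimal sketch, with a hypothetical function name and toy alignment, and with the kept columns left in their original order as a design choice:

```python
import random


def half_jackknife(alignment, rng):
    """Keep a random half of the alignment columns, without duplication."""
    length = len(next(iter(alignment.values())))
    kept = sorted(rng.sample(range(length), length // 2))
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


# Hypothetical toy alignment.
toy = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACTA"}
replicate = half_jackknife(toy, random.Random(0))
for name, seq in replicate.items():
    print(name, seq)
```

Each replicate is then analysed exactly like a bootstrap replicate, and group frequencies across replicates are read as support values.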
  • 99. Lineage.py
      # Finding the lineage of an organism
      # Staying with a plant example, let's now find the lineage of the
      # Cypripedioideae orchid subfamily. First, we search the Taxonomy database
      # for Cypripedioideae, which yields exactly one NCBI taxonomy identifier:
      from Bio import Entrez
      Entrez.email = "A.N.Other@example.com"  # Always tell NCBI who you are
      handle = Entrez.esearch(db="Taxonomy", term="Cypripedioideae")
      record = Entrez.read(handle)
      handle.close()
      print(record["IdList"])
      # Now, we use efetch to download this entry in the Taxonomy database,
      # and then parse it:
      handle = Entrez.efetch(db="Taxonomy", id="158330", retmode="xml")
      records = Entrez.read(handle)
      handle.close()
      # Again, this record stores lots of information:
      print(records[0].keys())
      # Keys listed in the original example output:
      # [u'Lineage', u'Division', u'ParentTaxId', u'PubDate', u'LineageEx',
      #  u'CreateDate', u'TaxId', u'Rank', u'GeneticCode', u'ScientificName',
      #  u'MitoGeneticCode', u'UpdateDate']
      # We can get the lineage directly from this record:
      print(records[0]["Lineage"])
  • 100. WWW resources for molecular phylogeny (1)
      • Compilations
        – A list of sites and resources: http://www.ucmp.berkeley.edu/subway/phylogen.html
        – An extensive list of phylogeny programs: http://evolution.genetics.washington.edu/phylip/software.html
      • Databases of rRNA sequences and associated software
        – The rRNA WWW Server - Antwerp, Belgium: http://rrna.uia.ac.be
        – The Ribosomal Database Project - Michigan State University: http://rdp.cme.msu.edu/html/
  • 101. WWW resources for molecular phylogeny (2)
      • Database similarity searches (Blast):
        – http://www.ncbi.nlm.nih.gov/BLAST/
        – http://www.infobiogen.fr/services/menuserv.html
        – http://bioweb.pasteur.fr/seqanal/blast/intro-fr.html
        – http://pbil.univ-lyon1.fr/BLAST/blast.html
      • Multiple sequence alignment
        – ClustalX: multiple sequence alignment with a graphical interface (for all types of computers): http://www.ebi.ac.uk/FTP/index.html and go to ‘software’
        – Web interface to the ClustalW algorithm for proteins: http://pbil.univ-lyon1.fr/ and press “clustal”
  • 102. WWW resources for molecular phylogeny (3)
      • Sequence alignment editor
        – SEAVIEW: for Windows and Unix: http://pbil.univ-lyon1.fr/software/seaview.html
      • Programs for molecular phylogeny
        – PHYLIP: an extensive package of programs for all platforms: http://evolution.genetics.washington.edu/phylip.html
        – CLUSTALX: beyond alignment, it also performs NJ
        – PAUP*: a high-performance commercial package: http://paup.csit.fsu.edu/index.html
        – PHYLO_WIN: a graphical interface, for Unix only: http://pbil.univ-lyon1.fr/software/phylowin.html
        – MrBayes: Bayesian phylogenetic analysis: http://morphbank.ebc.uu.se/mrbayes/
        – PHYML: fast maximum likelihood tree building: http://www.lirmm.fr/~guindon/phyml.html
        – WWW-interface at Institut Pasteur, Paris: http://bioweb.pasteur.fr/seqanal/phylogeny
  • 103. Weblems
      W6.1: The growth hormones in most mammals have very similar amino acid sequences. (The growth hormones of the Alpaca, Dog, Cat, Horse, Rabbit, and Elephant each differ from that of the Pig at no more than 3 positions out of 191.) Human growth hormone is very different, differing at 62 positions. The evolution of growth hormone accelerated sharply in the line leading to humans. By retrieving and aligning growth hormone sequences from species closely related to humans and our ancestors, determine where in the evolutionary tree leading to humans the accelerated evolution of growth hormone took place.
      W6.2: Humans are primates, an order that we, apes and monkeys share with lemurs and tarsiers. On the basis of the Beta-globin gene cluster of a human, a chimpanzee, an old-world monkey, a new-world monkey, a lemur, and a tarsier, derive a phylogenetic tree of these groups.
      W6.3: Primates are mammals, a class we share with marsupials and monotremes. Extant marsupials live primarily in Australia, except for the opossum, found also in North and South America. Extant monotremes are limited to two animals from Australia: the platypus and echidna. Using the complete mitochondrial genomes from human, horse (Equus caballus), wallaroo (Macropus robustus), American opossum (Didelphis virginiana), and platypus (Ornithorhynchus anatinus), draw an evolutionary tree, indicating branch lengths. Are monotremes more closely related to placental mammals or to marsupials?
      W6.4: Mammals are vertebrates, a subphylum that we share with fishes, sharks, birds and reptiles, amphibia, and primitive jawless fishes (example: lampreys). For the coelacanth (Latimeria chalumnae), the great white shark (Carcharodon carcharias), skipjack tuna (Katsuwonus pelamis), sea lamprey (Petromyzon marinus), frog (Rana pipiens), and Nile crocodile (Crocodylus niloticus), using sequences of cytochromes c and pancreatic ribonucleases, derive evolutionary trees of these species.