Journal Club @ UVigo 2011.07.22

Journal Club – Bayes Estimators for Phylogenetic Reconstruction
Syst. Biol. 60(4), 528–540, 2011, doi:10.1093/sysbio/syr021

Leonardo de O. Martins

University of Vigo

July 22, 2011

Leo Martins (Univ. Vigo) Journal Club 22/7

Outline

1 Distance as a penalty

2 Distances, everywhere

3 No phylogenetics, yet...

4 Trees as points in space

5 To the paper, then


Statistical Risk

The risk $\rho$ associated with a decision $\hat{\theta}$ is the expected loss of this decision
(which can be, for instance, an estimate of $\theta$).


$\rho(\hat{\theta}) = \int L(\theta, \hat{\theta}) \, P(\theta \mid \text{data}) \, d\theta$

(more properly called the posterior expected loss)


The loss function $L(\theta, \hat{\theta})$ is a penalty we give for "deciding" away from the
parameter. Examples are the squared loss and the absolute loss.


For some loss functions, we can work out analytically which decision is best
(i.e. the one that minimizes the risk, for any data).
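As a rough numerical illustration of the posterior expected loss, the sketch below approximates the risk by averaging a loss over draws from a posterior and then searches for the decision that minimizes it. The Beta(8, 4) posterior and the helper names (`risk`, `squared_loss`, `absolute_loss`) are assumptions made for this example only; they do not come from the paper.

```r
set.seed(42)

# Hypothetical posterior: theta | data ~ Beta(8, 4), represented by samples
theta <- rbeta(10000, 8, 4)

# Monte Carlo estimate of the posterior expected loss of a decision theta_hat
risk <- function(theta_hat, loss) mean(loss(theta, theta_hat))

squared_loss  <- function(theta, theta_hat) (theta - theta_hat)^2
absolute_loss <- function(theta, theta_hat) abs(theta - theta_hat)

# One-dimensional search for the decision with the smallest estimated risk
best_sq  <- optimize(risk, interval = c(0, 1), loss = squared_loss)$minimum
best_abs <- optimize(risk, interval = c(0, 1), loss = absolute_loss)$minimum

# Compare with the sample mean and the sample median of the posterior draws
print(c(best_sq = best_sq, mean = mean(theta)))
print(c(best_abs = best_abs, median = median(theta)))
```

Under these assumptions the two numerical minimizers should land close to the sample mean and the sample median respectively, up to the tolerance of `optimize()`.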



How to summarise a collection of objects?

scattered points

library(MASS)  # provides mvrnorm()
# Sigma must be symmetric: the slide had 0.8 and 0.9 off the diagonal, 0.8 is used for both here
x <- mvrnorm(n = 1000, mu = c(0, 0), Sigma = matrix(c(1, 0.8, 0.8, 1), 2, 2, byrow = TRUE))
plot(x[, 1], x[, 2], pch = ".", cex = 2, xlab = "x", ylab = "y")



centroid: minimizes the sum of squared Euclidean distances to all points
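A minimal sketch of the centroid as a distance minimizer, assuming the distance meant on the slide is the sum of squared Euclidean distances. The simulated cloud (base R, correlation 0.8) and the helper `sum_sq_dist` are illustrative choices, not the code that produced the original figure.

```r
set.seed(1)

# Simulated cloud with correlation 0.8 (an assumption mirroring the scatter plot)
Sigma <- matrix(c(1, 0.8, 0.8, 1), 2, 2)
pts <- matrix(rnorm(2000), ncol = 2) %*% chol(Sigma)

# The centroid is the column-wise mean
centroid <- colMeans(pts)

# Sum of squared Euclidean distances from a candidate point p to every point
sum_sq_dist <- function(p, pts) sum(sweep(pts, 2, p)^2)

# Numerical minimizer (Nelder-Mead), started away from the answer
fit <- optim(c(2, -2), sum_sq_dist, pts = pts)

print(rbind(numerical = fit$par, centroid = centroid))
```

The numerical minimizer should agree with `colMeans()` up to the optimizer's tolerance.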



regression line: minimizes the sum of squared vertical distances (residuals) to all points
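A small sketch of the same idea for a regression line, assuming the slide refers to ordinary least squares. The simulated data and the helper `rss` are made up for the example.

```r
set.seed(1)

# Hypothetical data with a linear trend
x <- rnorm(200)
y <- 0.8 * x + rnorm(200, sd = 0.6)

# Ordinary least squares fit
fit <- lm(y ~ x)

# Residual sum of squares for an arbitrary (intercept, slope) pair
rss <- function(b) sum((y - (b[1] + b[2] * x))^2)

best <- rss(coef(fit))

# Nudging the fitted coefficients in any direction increases the residual sum of squares
nudged <- c(rss(coef(fit) + c(0.05, 0)),
            rss(coef(fit) + c(0, 0.05)),
            rss(coef(fit) - c(0, 0.05)))
print(all(nudged > best))
```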



How to summarise the posterior distribution P(X)?



Posterior mean
Minimizes the expected loss under a squared loss function
$L(\theta, \hat{\theta}) = (\theta - \hat{\theta})^2$

(squared Euclidean distance)
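The step from the squared loss to the posterior mean can be written out as follows. This is the standard textbook argument, assuming the posterior has a finite variance so that differentiation under the integral is allowed.

```latex
\begin{align*}
\rho(\hat{\theta})
  &= \int (\theta - \hat{\theta})^2 \, P(\theta \mid \text{data}) \, d\theta \\
\frac{d\rho}{d\hat{\theta}}
  &= -2 \int (\theta - \hat{\theta}) \, P(\theta \mid \text{data}) \, d\theta = 0 \\
\Rightarrow \quad \hat{\theta}
  &= \int \theta \, P(\theta \mid \text{data}) \, d\theta
   = E[\theta \mid \text{data}]
\end{align*}
```

The second derivative is $2 > 0$, so this stationary point is the minimum.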



Posterior median
Minimizes the expected loss under a linear loss function
$L(\theta, \hat{\theta}) = |\theta - \hat{\theta}|$

(Manhattan distance)
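For the absolute loss the corresponding argument splits the integral at the decision. The sketch below assumes a continuous posterior density.

```latex
\begin{align*}
\rho(\hat{\theta})
  &= \int_{-\infty}^{\hat{\theta}} (\hat{\theta} - \theta) \, P(\theta \mid \text{data}) \, d\theta
   + \int_{\hat{\theta}}^{\infty} (\theta - \hat{\theta}) \, P(\theta \mid \text{data}) \, d\theta \\
\frac{d\rho}{d\hat{\theta}}
  &= P(\theta \le \hat{\theta} \mid \text{data}) - P(\theta > \hat{\theta} \mid \text{data}) = 0 \\
\Rightarrow \quad
  & P(\theta \le \hat{\theta} \mid \text{data}) = \tfrac{1}{2}
\end{align*}
```

That is, the minimizer leaves half of the posterior mass on each side, which is the definition of the posterior median.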



Posterior mode
a.k.a. Maximum A Posteriori (MAP) estimate.
Minimizes the expected loss under a delta loss function
$L(\theta, \hat{\theta}) = \begin{cases} 0, & \text{for } \hat{\theta} = \theta \\ 1, & \text{for } \hat{\theta} \neq \theta \end{cases}$
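For a discrete parameter such as a tree topology, the expected delta (0-1) loss of a candidate is simply one minus its posterior probability, so the most frequent sampled value minimizes it. The sketch below uses a made-up sample of three topologies written as labels; the labels and their counts are assumptions for illustration.

```r
# Hypothetical posterior sample of topologies (labels only, counts are made up)
sampled <- c(rep("((A,B),C,(D,E))", 40),
             rep("((A,B),E,(C,D))", 30),
             rep("((A,E),B,(C,D))", 30))

# Estimated posterior probability of each distinct topology
posterior <- table(sampled) / length(sampled)

# Expected 0-1 loss of choosing each topology: probability of being wrong
zero_one_risk <- 1 - posterior

# The MAP estimate is the candidate with the smallest expected 0-1 loss
map_tree <- names(which.min(zero_one_risk))
print(map_tree)
```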


Distances between trees
[Figure: two trees on leaves A, B, C, D and E that differ in the positions of C and E]
Trees from the article


RF distance
splits DE|ABC and CD|ABE each occur in only one of the trees
total: 2 branches
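The count above can be reproduced with a few lines of base R. The sketch assumes the two trees are ((A,B),C,(D,E)) and ((A,B),E,(C,D)), which is what the splits and quartets listed on these slides imply; each tree is stored as one side of each of its internal splits, and `canon`, `splits` and `rf_distance` are hypothetical helper names.

```r
taxa <- c("A", "B", "C", "D", "E")

# Assumed topologies, stored as one side of each non-trivial split
tree1 <- list(c("A", "B"), c("D", "E"))   # ((A,B),C,(D,E))
tree2 <- list(c("A", "B"), c("C", "D"))   # ((A,B),E,(C,D))

# Canonical label for a split: the side that does not contain the first taxon
canon <- function(side, taxa) {
  if (taxa[1] %in% side) side <- setdiff(taxa, side)
  paste(sort(side), collapse = "")
}
splits <- function(tree, taxa) vapply(tree, canon, character(1), taxa = taxa)

# Robinson-Foulds distance: number of splits found in exactly one of the trees
rf_distance <- function(t1, t2, taxa) {
  s1 <- splits(t1, taxa)
  s2 <- splits(t2, taxa)
  length(setdiff(s1, s2)) + length(setdiff(s2, s1))
}

print(rf_distance(tree1, tree2, taxa))   # 2 for these two trees
```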


Quartet distance
AC|DE and AE|CD
BC|DE and BE|CD
4 quartets are different
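A sketch of the quartet comparison, again assuming the trees are ((A,B),C,(D,E)) and ((A,B),E,(C,D)) as implied by the quartets listed. The helper `quartet_topology` is hypothetical. It reports two counts because conventions differ: the number of four-leaf subsets resolved differently, and the number of quartets present in exactly one tree, which matches the 4 quoted on the slide.

```r
taxa <- c("A", "B", "C", "D", "E")

# Assumed topologies, stored as one side of each non-trivial split
tree1 <- list(c("A", "B"), c("D", "E"))   # ((A,B),C,(D,E))
tree2 <- list(c("A", "B"), c("C", "D"))   # ((A,B),E,(C,D))

# Resolution of four leaves q induced by a tree, e.g. "AC|DE"
quartet_topology <- function(q, tree) {
  pairings <- list(c(1, 2, 3, 4), c(1, 3, 2, 4), c(1, 4, 2, 3))
  for (p in pairings) {
    left <- q[p[1:2]]
    right <- q[p[3:4]]
    for (side in tree) {
      in_left <- left %in% side
      in_right <- right %in% side
      # a split displays left|right if it puts the two pairs on opposite sides
      if ((all(in_left) && !any(in_right)) || (all(in_right) && !any(in_left))) {
        return(paste0(paste(left, collapse = ""), "|", paste(right, collapse = "")))
      }
    }
  }
  "unresolved"
}

quads <- combn(taxa, 4, simplify = FALSE)
q1 <- vapply(quads, quartet_topology, character(1), tree = tree1)
q2 <- vapply(quads, quartet_topology, character(1), tree = tree2)

n_subsets  <- sum(q1 != q2)                                     # 2 four-leaf subsets differ
n_quartets <- length(setdiff(q1, q2)) + length(setdiff(q2, q1)) # 4 quartets in one tree only
print(c(subsets = n_subsets, quartets = n_quartets))
```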


Path difference (number of speciations between each pair of leaves, compared across trees)
path from A to E is one edge longer in one tree than in the other
(...)
the overall difference is 6
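The total of 6 can be checked by counting edges between every pair of leaves. The sketch assumes unrooted trees ((A,B),C,(D,E)) and ((A,B),E,(C,D)), as implied by the splits listed on the slides, and uses the fact that in such a tree the path between two leaves has two pendant edges plus one edge per internal split separating them. `path_lengths` is a hypothetical helper.

```r
taxa <- c("A", "B", "C", "D", "E")

# Assumed topologies, stored as one side of each non-trivial split
tree1 <- list(c("A", "B"), c("D", "E"))   # ((A,B),C,(D,E))
tree2 <- list(c("A", "B"), c("C", "D"))   # ((A,B),E,(C,D))

# Number of edges on the path between every pair of leaves
path_lengths <- function(tree, taxa) {
  pairs <- combn(taxa, 2, simplify = FALSE)
  d <- vapply(pairs, function(p) {
    # a split separates the pair if exactly one of the two leaves is on this side
    separating <- vapply(tree, function(side) sum(p %in% side) == 1, logical(1))
    2 + sum(separating)
  }, numeric(1))
  names(d) <- vapply(pairs, paste, character(1), collapse = "")
  d
}

d1 <- path_lengths(tree1, taxa)
d2 <- path_lengths(tree2, taxa)

print(d1 - d2)              # A-E differs by one edge, as do five other pairs
print(sum(abs(d1 - d2)))    # 6
```

For these two trees every non-zero entry of the difference is plus or minus one, so the sum of absolute differences and the sum of squared differences coincide.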



If there is a distance, there is a Bayes estimator

For points in $\mathbb{R}^n$, we know that the mean minimizes the (squared) Euclidean
distance, etc.

For phylogenies:

there are several Euclidean distances
the mean does not work since a tree has restrictions

But some distances between trees also lead to "analytical" solutions:

the consensus tree minimizes the Robinson-Foulds distance between
the samples
quartet puzzling minimizes the quartet distance
the Buneman tree minimizes (I think) the dissimilarity map distance
some of these are hard to solve as well
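As an illustration of the consensus claim, the sketch below takes a made-up posterior sample of three topologies, builds the majority-rule consensus (splits present in more than half of the samples) and compares average Robinson-Foulds distances. Reading "consensus tree" as the majority-rule consensus is an assumption, and the topologies, counts and helper names are hypothetical.

```r
# Hypothetical posterior sample: each topology is a set of non-trivial splits
topologies <- list(
  T1 = c("AB|CDE", "DE|ABC"),
  T2 = c("AB|CDE", "CD|ABE"),
  T3 = c("AE|BCD", "CD|ABE")
)
counts <- c(T1 = 4, T2 = 3, T3 = 3)
sampled <- rep(topologies, times = counts)   # list of 10 sampled trees

# Robinson-Foulds distance between two split sets
rf <- function(a, b) length(setdiff(a, b)) + length(setdiff(b, a))

# Average RF distance from a candidate tree to all samples
mean_rf <- function(tree) mean(vapply(sampled, rf, numeric(1), b = tree))

# Majority-rule consensus: splits seen in more than half of the samples
split_freq <- table(unlist(sampled)) / length(sampled)
consensus <- names(split_freq)[as.vector(split_freq > 0.5)]

# Compare the consensus with each sampled topology and with the star tree
candidates <- c(topologies, list(consensus = consensus, star = character(0)))
print(sapply(candidates, mean_rf))
```

In this toy sample the most frequent topology is T1, yet the consensus coincides with T2 and has the smaller average distance, which shows that the MAP tree and the RF-based estimate need not agree.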


How do they find, then, the Bayes estimates?

like many other software packages: hill-climbing on the space of possible
topologies
their input data is the posterior distribution of trees from MrBayes
starting tree can be NJ, MAP tree, ML...
apply branch-swap (NNI) to current optimal tree, then verify distance
to all samples
the distance used is the path difference (matrix subtraction)
don't need to recalculate distance to all samples, just to the matrix with
average values
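The reason one averaged matrix is enough can be seen with a short identity. This sketch assumes the loss is the squared Euclidean norm of the path-difference vector; the symbols $d(T)$, $T_i$, $N$ and $\bar{d}$ are notation introduced here, not taken from the slides.

```latex
% d(T): vector of pairwise path lengths of tree T
% T_1, ..., T_N: sampled trees,  \bar{d} = \frac{1}{N} \sum_{i=1}^{N} d(T_i)
\begin{align*}
\sum_{i=1}^{N} \lVert d(T) - d(T_i) \rVert^2
  &= \sum_{i=1}^{N} \lVert (d(T) - \bar{d}) + (\bar{d} - d(T_i)) \rVert^2 \\
  &= N \, \lVert d(T) - \bar{d} \rVert^2
   + \sum_{i=1}^{N} \lVert \bar{d} - d(T_i) \rVert^2
\end{align*}
```

The cross term vanishes because $\sum_i (\bar{d} - d(T_i)) = 0$, and the last sum does not depend on $T$. Under this assumption, ranking candidate trees by their distance to $\bar{d}$ gives the same ordering as ranking them by their total distance to all samples.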

