Is your phylogeny informative?

Measuring the power of comparative methods

Carl Boettiger
University of California, Davis
SICB 2011

Feel free to share live

Two Questions

Right Model? Sufficient Data?

Currently we worry that we don’t have the
right models.

We should worry that we don’t have the
power to know if we have the right models.

Right Model?

Have we crammed enough biology into the tree?

(diagram shows tree under weight of so many models)

Sufficient Data?

Is the data-set big enough?

... informative about
the things we want to
estimate?

A model of “phylogenetic signal”

Are closely related
species really more
similar?

(axis from 0, “No signal”, to 1, “strong signal” (Brownian motion))
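
One common way to write such a model down, sketched here under the assumption that the 0-to-1 scale is Pagel's lambda (the notation V, mu and sigma^2 is introduced for illustration and is not taken from the slides): the trait values at the tips are multivariate normal, and lambda rescales only the covariances that species share through the tree.

```latex
\mathbf{x} \sim \mathcal{N}\!\left(\mu \mathbf{1},\ \sigma^2 \mathbf{V}_\lambda\right),
\qquad
(\mathbf{V}_\lambda)_{ij} =
\begin{cases}
V_{ii} & \text{if } i = j \\
\lambda\, V_{ij} & \text{if } i \neq j
\end{cases}
```

Here V_ij is the branch length shared by species i and j, and sigma^2 is the Brownian rate. Setting lambda = 1 leaves V unchanged (Brownian motion on the tree), while lambda = 0 removes all covariance between species, so relatedness carries no signal.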

A simulation-based test of this method


(figure: the true value and the estimates, marked on the same axis from 0, “No signal”, to 1, “strong signal” (Brownian motion))

What went wrong?
The method was supposed to address the
problem of adequate data...

Instead, it has fallen prey
to that same problem:

Insufficient signal to
estimate signal

“Has the pendulum swung too far?”

Bigger tree can estimate lambda better

← has over 300 species (vs 23)

(axis from 0, “No signal”, to 1, “strong signal”)

Not just size that matters....

Original 23 taxa data is quite informative about estimate of diversification rate, sigma

Is the phylogeny informative?

Depends on the question

Depends on the amount of data
So as we add more complex models, do we have the data to justify them?
How would we know?

Process vs Pattern

Models represent mechanisms

More complex models fit the data
better
So can we tell when we've matched
the mechanism and when we've
only matched the pattern?

Yes!

Phylogenetic Monte Carlo

Simulate from simple model →
← Fit both models to this data

LogLik(complex), LogLik(simple)

(figure: histogram; x-axis: likelihood ratio, y-axis: frequency)

Simulate from complex model →
← Fit both models to this data

(figure: histograms for data simulated under each model; x-axis: likelihood ratio, y-axis: frequency)

A reliable way to justify the complex model

← Ratio observed for Anoles data

Prob of false positive: 0.0115
Power at 95% confidence: 0.99

Greater overlap indicates less information,
less power to distinguish
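
The simulate-and-refit procedure can be sketched in a few lines of Python. This is a toy illustration, not the PMC package: the two "models" are plain normal distributions on i.i.d. data (mean fixed at 0 versus a free mean) standing in for phylogenetic models on a tree, and every function name here is hypothetical. What it keeps is the logic: fit both models to the observed data, simulate from each fitted model, refit both models to every simulated data set, and compare the two resulting distributions of the likelihood ratio.

```python
import math
import random


def loglik(data, mean, sd):
    """Log-likelihood of i.i.d. normal data."""
    n = len(data)
    ss = sum((x - mean) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sd ** 2) - ss / (2 * sd ** 2)


def fit_simple(data):
    """Toy 'simple' model: mean fixed at 0, sd fitted by maximum likelihood."""
    sd = math.sqrt(sum(x * x for x in data) / len(data))
    return {"mean": 0.0, "sd": sd}


def fit_complex(data):
    """Toy 'complex' model: mean and sd both fitted by maximum likelihood."""
    mean = sum(data) / len(data)
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
    return {"mean": mean, "sd": sd}


def simulate(params, n, rng):
    """Draw a data set of size n from a fitted model."""
    return [rng.gauss(params["mean"], params["sd"]) for _ in range(n)]


def likelihood_ratio(data):
    """Fit both models to the same data; return 2 * (LogLik(complex) - LogLik(simple))."""
    simple, complex_ = fit_simple(data), fit_complex(data)
    return 2 * (loglik(data, **complex_) - loglik(data, **simple))


def monte_carlo_comparison(data, n_sims=500, level=0.95, seed=1):
    rng = random.Random(seed)
    n = len(data)
    observed = likelihood_ratio(data)
    simple, complex_ = fit_simple(data), fit_complex(data)
    # Distribution of the ratio when the simple model is true.
    null = sorted(likelihood_ratio(simulate(simple, n, rng)) for _ in range(n_sims))
    # Distribution of the ratio when the complex model is true.
    alt = [likelihood_ratio(simulate(complex_, n, rng)) for _ in range(n_sims)]
    # Chance of a ratio at least as large as the observed one under the simple model.
    p_value = sum(lr >= observed for lr in null) / n_sims
    # Power: share of complex-model simulations beyond the simple model's cutoff.
    threshold = null[min(n_sims - 1, int(level * n_sims))]
    power = sum(lr > threshold for lr in alt) / n_sims
    return {"observed": observed, "p_value": p_value, "power": power}


rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(50)]  # made-up data, for illustration only
result = monte_carlo_comparison(data)
print(result)
```

The `p_value` plays the role of the probability of a false positive, and `power` measures how little the two simulated distributions overlap. Because both distributions come from simulation rather than from an asymptotic chi-squared approximation, the same recipe carries over to models that are not nested, although this toy pair happens to be.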

And we can identify cases where the tree
really has no information at all

Preferring the simpler model isn't always a
lack of power

Avoid overfitting

No trouble with non-nested models

Compared to what we already do?

AIC has no notion of power

AIC chooses the wrong model despite good power!

About as good as flipping a coin...

Larger trees can identify more subtle
differences between models

Conclusions

We've been worried about the right model.
First we must worry if we have enough data to
tell.

Without this, we will often choose wrong
models.

Future lies in big trees!

PMC: an R package for Phylogenetic Monte Carlo
https://github.com/cboettig/Comparative-Phylogenetics/

Acknowledgements
● Graham Coop
● Peter Ralph
● Wainwright Lab
