This document discusses using phylogenetic diversity estimated from genomics to assess the structure of biological communities. It defines phylogenetic diversity and introduces PhyloH, a program for measuring phylogenetic diversity. It provides an example use case applying PhyloH to analyze the microbiomes of bee larvae and their parasites to examine the factors influencing beta diversity across bee hives, cells, and species.
Giacinto Donvito – Grid and Cloud Infrastructures for Bioinformatics Research
BiPday 2014 -- Vicario Saverio
1. FUNCTIONAL AND PHYLOGENETIC DIVERSITY
ESTIMATED BY GENOMICS AS A WAY TO ASSESS
STRUCTURE OF THE COMMUNITIES
Saverio Vicario
Institute of Biomedical Technology – National Council of
Research, Bari, Italy (CNR-ITB)
2. Outline
• Biodiversity as a variable of state of the ecosystem
• Defining the type of diversity used and the necessity
of a clear statistical framework
• PhyloH, a program for measuring phylogenetic
diversity
• A limited use case
3. Biodiversity as a variable of state of the ecosystem
• Ecosystem services are the output of communities
• The capacity of a system to maintain its output when
confronting external perturbation is its robustness
(i.e. resilience)
• The capacity of a system to respond to prolonged
external change is its evolvability
Both features are relevant to maintaining ecosystems and
their outputs
4. System structure influences Robustness and
Evolvability
[Figure: alternative system structures labelled by robustness (R) and evolvability (E), with functional redundancy highlighted. Source: Levin and Lubchenco]
5. Functional redundancy
A community with species or taxa that perform the
same function using very different methods (possessing
very different Bauplan or genetic architecture).
It is expected that under changed conditions functionally
redundant members will react differently
Example: night predators of flying insects
6. When Phylogenetic structure matters
An index of how different organisms are is the
phylogenetic structure. Depending on the kind of
data used to reconstruct it (single gene, multiple genes,
time calibrated or not) and on the evolutionary history of
the organisms, the approach can be more or less
efficient in prediction.
To measure functional redundancy, it would be enough to have a
sample from the communities and its subdivision
8. Limit of Shannon applied to biodiversity
Field 1: species frequencies 3/9, 3/9, 3/9; H = 1.1 nats, D = 3.0 equally abundant species
Field 2: species frequencies 3/9, 3/9, 3/9; H = 1.1 nats, D = 3.0 equally abundant species
Same average Shannon surprise!
But adding a dragonfly species increases the biodiversity of the
observations much more than adding another butterfly species
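The numbers on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming each field holds three species with three individuals each, as the 3/9 frequencies suggest; the function names are illustrative and not part of any tool mentioned in the talk.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy in nats, computed from raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def effective_number(counts):
    """Diversity of order q=1: the exponential of the Shannon entropy."""
    return math.exp(shannon_entropy(counts))

# Assumed counts: three species, three individuals each, in both fields.
field_1 = [3, 3, 3]
field_2 = [3, 3, 3]

for name, counts in (("Field 1", field_1), ("Field 2", field_2)):
    h = shannon_entropy(counts)
    d = effective_number(counts)
    print(f"{name}: H = {h:.2f} nats, D = {d:.1f}")
```

Both fields give the same H and D because Shannon entropy only sees the frequencies, not how related the species are, which is the limitation the slide points out.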
9. Phylogenetic entropy of Allen/Chao
T1 T2 T3
Observing the sample at different levels of
taxonomic resolution
T1 -> I got 9 insects (q5=1)
H1=0 D1=1
T2 -> I got 6 Lepidoptera and 3 Odonata
(q4=p2+p3)
H2=0.636 D2=1.889
T3-> I got 3 for each of the 3 species
H3=1.10 D3=3.0
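The three entropies above follow from pooling the same nine individuals at coarser or finer taxonomic levels. The sketch below assumes two Lepidoptera species and one Odonata species with three individuals each; the species labels are hypothetical placeholders.

```python
import math
from collections import Counter

# Hypothetical sample: species -> (order, count).
sample = {
    "butterfly_sp1": ("Lepidoptera", 3),
    "butterfly_sp2": ("Lepidoptera", 3),
    "dragonfly_sp1": ("Odonata", 3),
}

def entropy(counts):
    """Shannon entropy in nats of a Counter of group sizes."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)

def pooled(sample, group_of):
    """Pool the counts using a function that maps (species, order) to a group label."""
    counts = Counter()
    for species, (order, n) in sample.items():
        counts[group_of(species, order)] += n
    return counts

levels = {
    "T1": pooled(sample, lambda species, order: "Insecta"),  # everything is one group
    "T2": pooled(sample, lambda species, order: order),      # pooled by order
    "T3": pooled(sample, lambda species, order: species),    # full species resolution
}

H = {name: entropy(counts) for name, counts in levels.items()}
D = {name: math.exp(h) for name, h in H.items()}

for name in levels:
    print(f"{name}: H = {H[name]:.3f} nats, D = {D[name]:.3f}")
```

A phylogenetic entropy combines such per-level values weighted by how much branch length each level spans; those lengths are not given on the slide, so the sketch stops at the per-level values.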
11. α, β, γ Diversities
• I agree that:
– for q=1, α entropy is equal to Shannon entropy, while β
entropy is the mutual information, i.e. the Kullback-Leibler
divergence, between the S and E vectors
– to pass from entropy to diversity it is sufficient to
exponentiate, for q=1
S=Species E=Environment
L. splendens Grassland
L. splendens Grassland
L. splendens Grassland
L. splendens Grassland
O. horribilis Grassland
O. horribilis Forest
… …
This framework puts the partitioning well
within the frame of Information Theory.
1. Beta diversity is measured in equivalent
numbers of samples.
2. In the case of N perfectly different samples
with unequal counts, this measure does
not reach N but exp(H(E)).
3. In the case of perfectly identical samples with
unequal counts, this measure is exactly 1
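Points 2 and 3 can be checked numerically. The sketch below treats each read as a (species, environment) pair, as in the S/E table on this slide, and computes beta entropy as the mutual information I(S;E) = H(S) + H(E) - H(S,E). The read counts are invented for illustration.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy in nats of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def beta_entropy(reads):
    """Mutual information between species and environment labels."""
    species = [s for s, e in reads]
    envs = [e for s, e in reads]
    return entropy(species) + entropy(envs) - entropy(reads)

# Two perfectly different samples with unequal counts (8 vs 2 reads).
different = [("L. splendens", "Grassland")] * 8 + [("O. horribilis", "Forest")] * 2

# Two samples with identical composition but unequal counts (8 vs 2 reads).
identical = (
    [("L. splendens", "Grassland")] * 4 + [("O. horribilis", "Grassland")] * 4
    + [("L. splendens", "Forest")] * 1 + [("O. horribilis", "Forest")] * 1
)

d_beta_different = math.exp(beta_entropy(different))
d_beta_identical = math.exp(beta_entropy(identical))
exp_h_e = math.exp(entropy([e for s, e in different]))

print(f"different samples: D_beta = {d_beta_different:.3f}, exp(H(E)) = {exp_h_e:.3f}")
print(f"identical samples: D_beta = {d_beta_identical:.3f}")
```

With 8 and 2 reads the two perfectly different samples give a beta diversity below 2, equal to exp(H(E)), while the identical samples give 1 regardless of the unequal counts.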
12. Experimental design and entropy partitioning
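[Diagram relating species S, replicates R and environments E]
$H_{\alpha R} = H(S \mid R)$ = biological noise
$H_{\beta R \mid E} = H(S \mid E) - H(S \mid R, E)$ = experimental noise
$H_{\beta E} = H(S) - H(S \mid E)$ = signal
$H_{\alpha R} + H_{\beta R \mid E} + H_{\beta E} = H(S)$
$H_{\alpha R} + H_{\beta R \mid E} = H_{\alpha E} = H(S \mid E)$ = noise
$H_{\alpha E} + H_{\beta E} = H(S)$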
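The partition on this slide can be verified on a toy data set. The sketch below assumes replicates are nested within environments (each replicate label belongs to one environment), so that H(S|R,E) equals H(S|R). The read counts and the replicate names RL2 and RG2 as used here are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical reads: (species, replicate, environment).
reads = (
    [("L. splendens", "RG1", "G")] * 6 + [("O. horribilis", "RG1", "G")] * 2
    + [("L. splendens", "RG2", "G")] * 5 + [("O. horribilis", "RG2", "G")] * 3
    + [("L. splendens", "RL1", "L")] * 2 + [("O. horribilis", "RL1", "L")] * 6
    + [("L. splendens", "RL2", "L")] * 1 + [("O. horribilis", "RL2", "L")] * 7
)

def entropy(items):
    """Shannon entropy in nats of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def conditional_entropy(reads, key):
    """H(S | K) = H(S, K) - H(K), where key(read) extracts the label K."""
    joint = entropy([(r[0], key(r)) for r in reads])
    return joint - entropy([key(r) for r in reads])

h_s = entropy([r[0] for r in reads])
h_s_given_e = conditional_entropy(reads, lambda r: r[2])
h_s_given_re = conditional_entropy(reads, lambda r: (r[1], r[2]))

h_alpha_r = h_s_given_re                       # biological noise
h_beta_r_given_e = h_s_given_e - h_s_given_re  # experimental noise
h_beta_e = h_s - h_s_given_e                   # signal

print(f"H_alphaR = {h_alpha_r:.4f}")
print(f"H_betaR|E = {h_beta_r_given_e:.4f}")
print(f"H_betaE = {h_beta_e:.4f}")
print(f"sum = {h_alpha_r + h_beta_r_given_e + h_beta_e:.4f}, H(S) = {h_s:.4f}")
```

The three terms telescope, so their sum equals H(S) for any data set; what changes with the data is how the total splits between noise and signal.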
13. Problems of methods
They use rarefaction to
assess significance.
Not surprisingly, the effect
observed is small and not
significant.
14. S=Species R=Replicates E=Environment
L. splendens RG1 G
L. splendens RG1 G
L. splendens RG2 G
L. splendens RG2 G
L. splendens RG3 G
O. horribilis RL1 L
… … …
Significance level evaluation
• Permuting the annotations (replicate and
environment) of the reads allows estimation of the
Dbeta expected if no differentiation across
environments exists, given the observed level of
variation and sampling effort
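A permutation test of this kind might look as follows. This is only a sketch of the general idea, not the procedure implemented in PhyloH: it assumes the (replicate, environment) annotation of each read is shuffled as a unit relative to the species labels, and it uses invented read counts, an arbitrary number of permutations, and a fixed seed for reproducibility.

```python
import math
import random
from collections import Counter

def entropy(items):
    """Shannon entropy in nats of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def d_beta(species, envs):
    """Beta diversity across environments: exp of the mutual information I(S;E)."""
    joint = entropy(list(zip(species, envs)))
    return math.exp(entropy(species) + entropy(envs) - joint)

def permutation_p_value(species, annotations, n_perm=199, seed=0):
    """Share of permuted data sets whose Dbeta is at least the observed one."""
    rng = random.Random(seed)
    observed = d_beta(species, [env for _, env in annotations])
    shuffled = list(annotations)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # each replicate label travels with its environment
        permuted = d_beta(species, [env for _, env in shuffled])
        if permuted >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical, fully differentiated data: one species per environment.
species = ["L. splendens"] * 40 + ["O. horribilis"] * 40
annotations = (
    [("RG1", "G")] * 20 + [("RG2", "G")] * 20
    + [("RL1", "L")] * 20 + [("RL2", "L")] * 20
)

observed, p_value = permutation_p_value(species, annotations)
print(f"observed Dbeta = {observed:.3f}, p = {p_value:.3f}")
```

Because the null distribution is built from the same reads and the same sample sizes, it reflects the observed variation and sampling effort without discarding data as rarefaction does.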
15. Power of permutation procedure
• Simulated data set of 500
taxa over a phylogeny. 12
related taxa differ in their
relative frequency by 0.01
between the two environments,
while the rest of the taxa have
the same relative frequency.
16. PhyloH features in measuring phylogenetic diversity
that differ from entropart (R):
• Beta entropy is a summation of
differences
• Each difference refers to a branch in the
phylogeny, so it is possible to plot the
contribution of each branch on the tree
and spot relevant patterns
• Works also on non-time-scaled (non-ultrametric) trees
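To illustrate how a beta entropy can be split into per-branch terms, the sketch below computes one contribution per branch of a small tree. It is not PhyloH's code: the tree ((A,B),C), its branch lengths, the read counts, and the choice of weighting each branch's information term by its length are all assumptions made for the example.

```python
import math

# Hypothetical tree ((A,B),C): branch name -> (length, tips below the branch).
# Root-to-tip distances are unequal (2.0 for A and B, 2.5 for C), so the
# tree is not ultrametric.
branches = {
    "A": (1.0, {"A"}),
    "B": (1.0, {"B"}),
    "C": (2.5, {"C"}),
    "AB": (1.0, {"A", "B"}),
}

# Hypothetical read counts per (tip, environment).
counts = {("A", "E1"): 10, ("B", "E1"): 10, ("C", "E2"): 20}

def branch_contributions(branches, counts):
    """Length-weighted contribution of each branch to the beta entropy.

    For branch b the term is L_b * sum_e p(b,e) * ln(p(b,e) / (p(b) * p(e))),
    where p(b,e) pools the reads of all tips below b.
    """
    total = sum(counts.values())
    envs = sorted({env for _, env in counts})
    p_env = {
        env: sum(c for (_, e), c in counts.items() if e == env) / total
        for env in envs
    }
    result = {}
    for name, (length, tips) in branches.items():
        p_be = {
            env: sum(c for (t, e), c in counts.items() if t in tips and e == env) / total
            for env in envs
        }
        p_b = sum(p_be.values())
        info = sum(
            p * math.log(p / (p_b * p_env[env])) for env, p in p_be.items() if p > 0
        )
        result[name] = length * info
    return result

contributions = branch_contributions(branches, counts)
for name, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"branch {name}: {value:.4f}")
```

Each term is non-negative, so the branches can be ranked or coloured on a tree plot by how much they contribute to the differentiation between environments, and nothing in the calculation requires equal root-to-tip distances.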
18. Use Case
Towards a better understanding of Apis mellifera and Varroa destructor microbiomes:
introducing ‘phyloh’ as a novel phylogenetic diversity analysis tool
A. Sandionigi, S. Vicario, E. M. Prosdocimi, A. Galimberti, E. Ferri, A. Bruno, B. Balech,
V. Mezzasalma and M. Casiraghi
Article first published online: 19 NOV 2014 | DOI: 10.1111/1755-0998.1234
Molecular Ecology Resources
19. Use Case
• 42 microbiomes from 21 pairs of bee larvae and their
parasites across 7 beehives (2-4 replicates per beehive)
Explaining factor            # States   Mutual Information   Beta Diversity / Effective number of samples   % Overlap*
Beehives (across bees)       7          0.256                1.29                                           0.87
Cell (bees and varroa)       21         0.353                1.42                                           0.88
Species (bees and varroa)    2          0.0603               1.06                                           0.92
* (ln(C)-ln(D))/ln(C) = Horn index, the percentage of non-unique species in each sample
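The last two columns of the table can be recomputed from the first two numeric columns. The sketch below assumes that C in the footnote is the number of states and D the beta diversity, so that ln(D) is the mutual information; this reading is an interpretation of the footnote, supported by the fact that it reproduces the reported values.

```python
import math

# (explaining factor, number of states C, mutual information in nats), from the table.
rows = [
    ("Beehives (across bees)", 7, 0.256),
    ("Cell (bees and varroa)", 21, 0.353),
    ("Species (bees and varroa)", 2, 0.0603),
]

def beta_diversity(mutual_information):
    """Effective number of samples: exponential of the mutual information."""
    return math.exp(mutual_information)

def overlap(n_states, mutual_information):
    """(ln(C) - ln(D)) / ln(C), with ln(D) equal to the mutual information."""
    return (math.log(n_states) - mutual_information) / math.log(n_states)

for factor, n_states, mi in rows:
    d = beta_diversity(mi)
    o = overlap(n_states, mi)
    print(f"{factor}: D_beta = {d:.2f}, overlap = {o:.2f}")
```

The beta diversities and the first two overlap values match the table to two decimals. For the Species row the unrounded mutual information gives about 0.91, while using the rounded D = 1.06 gives the reported 0.92.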