Methods Ecol Evol. 2013 Jun 1;4(6):566-572.
doi: 10.1111/2041-210X.12042.

The mean and variance of phylogenetic diversity under rarefaction

David A Nipperess et al. Methods Ecol Evol. 2013.

Abstract

Phylogenetic diversity (PD) depends on sampling depth, which complicates the comparison of PD between samples of different depth. One approach to dealing with differing sample depth for a given diversity statistic is to rarefy, that is, to take a random subset of a given size from the original sample. Exact analytical formulae for the mean and variance of species richness under rarefaction have existed for some time, but no such solution exists for PD.

We have derived exact formulae for the mean and variance of PD under rarefaction. We confirm that these formulae are correct by comparing the exact mean and variance with those calculated by repeated random (Monte Carlo) subsampling of a dataset of stem counts of woody shrubs of Toohey Forest, Queensland, Australia. We also demonstrate the application of the method using two examples: identifying hotspots of mammalian diversity in Australasian ecoregions, and characterising the human vaginal microbiome.

There is a very high degree of correspondence between the analytical and random subsampling methods for calculating the mean and variance of PD under rarefaction, although the Monte Carlo method requires a large number of random draws to converge on the exact solution for the variance.

Rarefaction of mammalian PD of ecoregions in Australasia to a common standard of 25 species reveals very different rank orderings of ecoregions, indicating quite different hotspots of diversity from those obtained for unrarefied PD. The application of these methods to the vaginal microbiome shows that a classical score used to quantify bacterial vaginosis is correlated with the shape of the rarefaction curve.

The analytical formulae for the mean and variance of PD under rarefaction are both exact and more efficient than repeated subsampling. Rarefaction of PD allows for many applications where comparisons of samples of different depth are required.
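
The species-richness case mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard hypergeometric argument: a species with c individuals out of n is absent from a random subsample of size k with probability C(n - c, k) / C(n, k). The function names and the abundance vector are made up for illustration and are not taken from the paper.

```python
import random
from math import comb


def expected_richness(counts, k):
    """Exact mean number of species in a random subsample of k individuals."""
    n = sum(counts)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and the total count")
    total = comb(n, k)
    # A species with c individuals is missed with probability C(n - c, k) / C(n, k).
    return sum(1 - comb(n - c, k) / total for c in counts)


def monte_carlo_richness(counts, k, draws=2000, seed=0):
    """Estimate the same mean by repeated random subsampling without replacement."""
    rng = random.Random(seed)
    # One pool entry per individual, labelled by species index.
    pool = [species for species, c in enumerate(counts) for _ in range(c)]
    return sum(len(set(rng.sample(pool, k))) for _ in range(draws)) / draws


if __name__ == "__main__":
    counts = [10, 5, 3, 1, 1]  # hypothetical abundances, not data from the paper
    for k in (1, 5, 10, 20):
        print(k, expected_richness(counts, k), monte_carlo_richness(counts, k))
```

The exact value needs one pass over the species, whereas the subsampling estimate only approaches it as the number of draws grows.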

Keywords: alpha diversity; phylogenetic diversity; rarefaction; sampling depth.

Figures

Figure 1
A hypothetical phylogenetic tree illustrating key concepts in the formulation of the rarefaction of phylogenetic diversity. The tree is populated with marks (indicated by stars) which represent observations of particular points on the tree in a sample. Marks might commonly be placed only at the leaves (tips) of the tree but allowing marks to occur anywhere provides for more flexible applications. Multiple marks indicate multiple observations: for example, several individuals of a species. The tree can then be broken up into snips, which are the edge segments between marks and/or internal nodes. For each snip i, there are two sets of marks, Ci and Di, which name the set of marks that are on the proximal (towards the root) side of i versus those on the distal (towards the leaves) side of i.
Figure 2
Comparison of analytical value (curve) with Monte Carlo calculation with 2,000 samples (points) for the mean of rooted PD under rarefaction.
Figure 3
Comparison of analytical value (curve) with Monte Carlo calculation with 2,000 samples (points) for the variance of rooted PD under rarefaction.
Figure 4
Phylogenetic diversity of mammal faunas for terrestrial ecoregions on the Australian continental shelf. Phylogenetic diversity is calculated (a) for all species present and (b) as an expected value after rarefaction to 25 species. Ecoregions are coloured from light blue (low values) to dark red (high values). The three highest ranked ecoregions in each case are indicated by number.
Figure 5
Rarefaction curve of samples from Srinivasan et al. (2012). The Nugent score is a diagnostic score for bacterial vaginosis (BV), with 0 being “normal” and 10 being classified as BV.
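
The snip and distal-set definitions in the Figure 1 caption suggest how a mean of rooted PD under rarefaction could be computed. The sketch below is an assumption-laden illustration, not the paper's published formula: it assumes that a snip i contributes its length to rooted PD whenever at least one mark in D_i is drawn, so that it is missed with probability C(n - |D_i|, k) / C(n, k). The toy tree, the marks, and all function names are hypothetical. Because every mark sits on a leaf here, the snips coincide with the tree's edges.

```python
import random
from math import comb

# Hypothetical toy tree: child -> (parent, edge length). "root" has no entry.
TREE = {"A": ("root", 1.0), "x": ("A", 2.0), "y": ("A", 3.0), "z": ("root", 4.0)}
# Hypothetical sample: one entry per mark (observation), all placed on leaves.
MARKS = ["x", "x", "y", "z", "z", "z"]


def path_to_root(node):
    """Return the edges (named by their child node) between node and the root."""
    edges = []
    while node in TREE:
        edges.append(node)
        node = TREE[node][0]
    return edges


def rooted_pd(marks):
    """Rooted PD: total length of all edges on a path from any mark to the root."""
    covered = {e for m in marks for e in path_to_root(m)}
    return sum(TREE[e][1] for e in covered)


def expected_pd(marks, k):
    """Mean rooted PD of a random k-subset of marks, computed without resampling."""
    n = len(marks)
    total = comb(n, k)
    # |D_i| for each edge i: the number of marks on its distal side.
    distal = {e: 0 for e in TREE}
    for m in marks:
        for e in path_to_root(m):
            distal[e] += 1
    # Edge i is missed only if all k marks come from the n - |D_i| other marks.
    return sum(length * (1 - comb(n - distal[e], k) / total)
               for e, (_, length) in TREE.items())


def monte_carlo_pd(marks, k, draws=2000, seed=0):
    """Estimate the same mean by repeated random subsampling without replacement."""
    rng = random.Random(seed)
    return sum(rooted_pd(rng.sample(marks, k)) for _ in range(draws)) / draws


if __name__ == "__main__":
    for k in range(1, len(MARKS) + 1):
        print(k, expected_pd(MARKS, k), monte_carlo_pd(MARKS, k))
```

Plotting the two columns against k would give a toy analogue of the curve-versus-points comparison described for Figure 2. The variance is not covered by this sketch.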
