Virus Evol. 2018 Jan 8;4(1):vex042.
doi: 10.1093/ve/vex042. eCollection 2018 Jan.

TreeTime: Maximum-likelihood phylodynamic analysis

Pavel Sagulenko et al. Virus Evol. 2018.

Abstract

Mutations that accumulate in the genome of cells or viruses can be used to infer their evolutionary history. In the case of rapidly evolving organisms, genomes can reveal their detailed spatiotemporal spread. Such phylodynamic analyses are particularly useful for understanding the epidemiology of rapidly evolving viral pathogens. As the number of genome sequences available for different pathogens has increased dramatically in recent years, phylodynamic analysis with traditional methods has become challenging because these methods scale poorly with growing datasets. Here, we present TreeTime, a Python-based framework for phylodynamic analysis using an approximate Maximum Likelihood approach. TreeTime can estimate ancestral states, infer evolution models, reroot trees to maximize temporal signals, and estimate molecular clock phylogenies and population size histories. The runtime of TreeTime scales linearly with dataset size.

Keywords: molecular clock phylogenies; phylodynamics; python.

Figures

Figure 1.
Illustration of TreeTime’s time tree inference algorithm. Terminal nodes in the tree are associated with either exact dates or date ranges (node $c_2$ in this example). These temporal constraints are convolved with the distribution $b_{c_i}(\tau)$ of the branch length $\tau$ leading to node $c_i$ to yield $C_{c_i}(t)$. At the internal node $n$, the messages from children $c_1$ and $c_2$ are multiplied and contribute to $H_n(t|C_n)$. The latter is further passed down to the parent by convolving with $b_n(\tau)$.
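The message passing described in this caption can be sketched on a discrete time grid. The following is a minimal illustration only, not TreeTime's actual implementation: the grid size, the sampling dates, the branch length distributions, and all function names are assumptions invented for the example. It builds the child messages by convolving each tip's date constraint with a branch length distribution, multiplies them at the internal node, and passes the result on to the parent.

```python
def delta(n_grid, index):
    """Exact sampling date: all probability mass on one grid point."""
    return [1.0 if i == index else 0.0 for i in range(n_grid)]


def uniform(n_grid, lo, hi):
    """Date range: equal mass on grid points lo..hi (inclusive)."""
    width = hi - lo + 1
    return [1.0 / width if lo <= i <= hi else 0.0 for i in range(n_grid)]


def message_to_parent(child_dist, branch_dist):
    """Convolve a node's date distribution with its branch length distribution.

    msg[t] = sum over tau of branch_dist[tau] * child_dist[t + tau],
    i.e. the parent sits at time t and the child tau grid steps later.
    """
    n = len(child_dist)
    msg = [0.0] * n
    for t in range(n):
        for tau, b in enumerate(branch_dist):
            if t + tau < n:
                msg[t] += b * child_dist[t + tau]
    return msg


def combine(messages):
    """Multiply child messages pointwise and normalize to sum to 1."""
    prod = [1.0] * len(messages[0])
    for m in messages:
        prod = [p * x for p, x in zip(prod, m)]
    total = sum(prod)
    if total == 0.0:
        raise ValueError("child constraints are incompatible on this grid")
    return [p / total for p in prod]


# Hypothetical example: 10 grid points, branch lengths of 0..3 grid steps.
N_GRID = 10
branch = [0.0, 0.25, 0.5, 0.25]           # assumed b(tau) for every branch

c1 = delta(N_GRID, 6)                     # tip with an exact date
c2 = uniform(N_GRID, 7, 9)                # tip with a date range

msg_c1 = message_to_parent(c1, branch)    # message from child c1
msg_c2 = message_to_parent(c2, branch)    # message from child c2
h_n = combine([msg_c1, msg_c2])           # date distribution of node n
msg_n = message_to_parent(h_n, branch)    # passed on to the parent of n

best_index = max(range(N_GRID), key=lambda i: h_n[i])
print("most likely grid point for node n:", best_index)
```

With these made-up inputs, node n can only sit at grid points 4 or 5, because it must precede the tip dated at grid point 6 by at least one step and still lie within three steps of the earliest date allowed for the second tip.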
Figure 2.
Estimation of the evolutionary rate from simulated data. TreeTime and LSD (following tree reconstruction with FastTree) underestimated the rate when branch lengths were long but returned accurate estimates for low diversity samples. The graph shows median values; error bars indicate the inter-quartile distances.
Figure 3.
Estimation of the TMRCA from simulated data. TreeTime, LSD, and BEAST estimated the time of the MRCA within 10% accuracy at low diversity, but TreeTime and LSD tended to overestimate the date of the root when branch lengths are long. The graph shows median values, error bars indicate the inter-quartile distances.
Figure 4.
Method comparison on LSD test data. TreeTime (TT) showed accuracy comparable to or better than BEAST (strict clock: BSMC; relaxed clock: BRMC), LSD (linear dating: LD; quadratic programming dating: QPD), or RTT regression when run on simulated data provided by To et al. (2016). Both panels use the tree set 750_11_10; the top and bottom panels show runs on alignments generated with a strict and a relaxed molecular clock, respectively.
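For readers unfamiliar with the RTT (root-to-tip) regression baseline named in this caption, the idea can be sketched as an ordinary least-squares fit of root-to-tip distance against sampling date: under a strict clock the slope estimates the evolutionary rate, and the date at which the fitted line reaches zero distance estimates the TMRCA. The sketch below is illustrative only. The function name and the data are invented for the example and are not taken from the paper or from any of the tools compared.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of distance = rate * (date - tmrca).

    Returns (rate, tmrca), where rate is the slope and tmrca is the
    date at which the fitted line crosses zero distance.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (date, distance) pairs")
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    s_tt = sum((t - mean_t) ** 2 for t in dates)
    s_td = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    if s_tt == 0.0:
        raise ValueError("all tips share one date: no temporal signal")
    rate = s_td / s_tt
    if rate == 0.0:
        raise ValueError("zero slope: TMRCA is undefined")
    # Solve 0 = mean_d + rate * (tmrca - mean_t) for tmrca.
    tmrca = mean_t - mean_d / rate
    return rate, tmrca


# Made-up, noise-free data: rate 0.002 substitutions/site/year, root in 1995.
tip_dates = [2000.0, 2002.0, 2004.0, 2006.0]
tip_distances = [0.002 * (t - 1995.0) for t in tip_dates]

rate, tmrca = root_to_tip_regression(tip_dates, tip_distances)
print(f"rate = {rate:.4f} per site per year, TMRCA = {tmrca:.1f}")
```

The regression treats tips as independent observations and ignores the shared ancestry encoded in the tree, which is one reason it serves as a simple baseline in comparisons like the one in this figure.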
Figure 5.
Reconstruction of fluctuating population sizes by TreeTime. The graph shows simulated population size trajectories (dashed lines) and the inference by TreeTime as solid lines of the same color. Different lines vary in the bottleneck sizes of 10% (red), 20% (green), and 50% (blue) of the average population size. The top panel shows data for fluctuations with period 0.5 N, the bottom panel 2 N. The average population size is N = 300.
Figure 6.
Sensitivity to the dataset size. TreeTime and BEAST returned consistent estimates of the rate of evolution (A) and the TMRCA (B) when analyzing alignments of Influenza A/H3N2 HA sequences of various sizes. LSD showed a systematic drift.
Figure 7.
Sensitivity to missing information. The inter-quartile range of the error of estimated tip dates decreases from 0.7 to 0.5 years as the fraction of known dates increases from 5 to 90% (see inset).
Figure 8.
EBOV phylodynamic analysis. The top panel shows a molecular clock phylogeny of EBOV sequences obtained in West Africa from 2014 onward. The lower panel shows the estimate of the coalescent population size along with its confidence intervals. The estimate suggests an exponential increase until late 2014 followed by a gradual decrease leading to almost complete eradication by 2016. Ebola case counts, as reported by the WHO (2016), agree quantitatively with the estimate.

References

    1. Aris-Brosou S., Yang Z., Huelsenbeck J. (2002) ‘Effects of Models of Rate Evolution on Estimation of Divergence Dates With Special Reference to the Metazoan 18s Ribosomal RNA Phylogeny’, Systematic Biology, 51: 703.
    2. Britton T., Anderson C. L., Jacquet D. et al. (2007) ‘Estimating Divergence Times in Large Phylogenetic Trees’, Systematic Biology, 56: 741.
    3. Drummond A. J., Ho S. Y. W., Phillips M. J. et al. (2006) ‘Relaxed Phylogenetics and Dating With Confidence’, PLOS Biology, 4: e88.
    4. Drummond A. J., Suchard M. A., Xie D. et al. (2012) ‘Bayesian phylogenetics with BEAUti and the BEAST 1.7’, Molecular Biology and Evolution, 29: 1969.
    5. Dudas G., Carvalho L. M., Bedford T. et al. (2017) ‘Virus Genomes Reveal Factors That Spread and Sustained the Ebola Epidemic’, Nature, 544: 309.