Computational protocol: Measuring Asymmetry in Time-Stamped Phylogenies

Protocol publication

[…] To illustrate the bias of standard metrics, we simulated two sets of trees: one with homochronous sampling (tips sampled at the same time) and one with heterochronous sampling (tips sampled at different times). These were generated using Serial SimCoal under a coalescent model with an effective population size of 10^4, with 100 tips sampled in the present for the homochronous set, and over 10 time points, each 1000 generations apart, for the heterochronous set. [...]

A single tip-dated phylogeny is required as input for our permutation approach. Such trees can be obtained by a number of methods, but for viral datasets BEAST is the most common choice. Before implementing the permutation test, the observed trees were checked for polytomies, which were resolved into randomly ordered dichotomies with zero branch lengths. Negative branch lengths were set to zero.

Tree files were available in Newick format for the Ebola virus and influenza A virus datasets, and are available in the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.v7817. For the within-host HIV dataset, the sequences were aligned using MUSCLE v3.8.31 and the maximum clade credibility (MCC) tree was obtained using BEAST 1.8 with a GMRF Bayesian Skyride coalescent model. The GTR model of nucleotide substitution was used with an uncorrelated log-normal relaxed clock, and a discretised gamma distribution with four categories was used to model rate heterogeneity across the sequence. For the log-normal relaxed clock parameters, a uniform prior between 0.0 and 1.0 × 10^100 was assumed for the mean, and an exponential prior with mean 1/3 for the standard deviation. A uniform (Dirichlet) prior was used for the nucleotide frequencies. The MCMC was run for 1 billion iterations, with a 10% burn-in period and samples saved every 10,000 iterations.

The within-host HIV skyride plot was obtained from the observed tree in R using an approximate approach that employs an integrated nested Laplace approximation. […]
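
The tree clean-up described above (resolving polytomies into random dichotomies joined by zero-length branches, and setting negative branch lengths to zero) can be sketched as follows. This is a minimal illustration only: the protocol does not state which software performed this step, and the `Node` class and the function names `zero_negative_branches`, `resolve_polytomies` and `tip_depths` are hypothetical names introduced here. Pairs of children are chosen uniformly at random, which is one plausible reading of "randomly ordered".

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical tree node; `length` is the branch leading to this node."""
    name: str = ""
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def zero_negative_branches(root: Node) -> None:
    """Set every negative branch length in the tree to zero (in place)."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.length < 0:
            node.length = 0.0
        stack.extend(node.children)


def resolve_polytomies(root: Node, rng: random.Random) -> None:
    """Turn each node with more than two children into a random series of
    bifurcations connected by zero-length branches (in place)."""
    stack = [root]
    while stack:
        node = stack.pop()
        while len(node.children) > 2:
            # Draw two children at random and join them under a new
            # internal node whose own branch has length zero.
            a = node.children.pop(rng.randrange(len(node.children)))
            b = node.children.pop(rng.randrange(len(node.children)))
            node.children.append(Node(length=0.0, children=[a, b]))
        stack.extend(node.children)


def tip_depths(root: Node) -> Dict[str, float]:
    """Root-to-tip distance for every tip, used to check the clean-up."""
    depths: Dict[str, float] = {}
    stack = [(root, 0.0)]
    while stack:
        node, depth = stack.pop()
        if not node.children:
            depths[node.name] = depth
        for child in node.children:
            stack.append((child, depth + child.length))
    return depths


# Toy example: a five-way polytomy with one negative branch.
example = Node("root", 0.0, [
    Node("A", 1.0), Node("B", 1.0), Node("C", -0.2),
    Node("D", 0.5), Node("E", 2.0),
])
zero_negative_branches(example)
resolve_polytomies(example, random.Random(1))
print(sorted(tip_depths(example).items()))
```

Because the inserted internal branches have length zero, the root-to-tip distance of each tip, and hence its sampling time, is unchanged by the resolution; only the branching order within the former polytomy is randomised.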

Pipeline specifications

Software tools: SIMCOAL, BEAST, MUSCLE
Applications: Phylogenetics, Population genetic analysis, Nucleotide sequence alignment
Diseases: Infection, HIV Infections