Computational protocol: A mixed relaxed clock model

Protocol publication

[…] The nucleotide dataset of Meredith et al. [] was first restricted to placental mammals and then reduced to 105 taxa so as to better match the diversified sampling assumption. Specifically, a cut-off time of 25 Myr was chosen. Then, based on the time-calibrated phylogeny published by Meredith et al. [], only one terminal taxon was retained for each lineage crossing the cut-off time. For the tip-dating analyses, the 105-taxon dataset was combined with the eutherian fossils from the analysis of O'Leary et al. [], excluding Hapalops, which is younger than 25 Myr. This resulted in a total of 138 eutherian species. Altogether, this taxon sampling can be seen as approximating a uniform fossil sampling rate between the origin of Eutheria and 25 Myr, followed by a fossil sampling rate of 0 between 25 Myr and the present.

For both node- and tip-dating analyses, the phylogeny was constrained a priori. For the node-dating analyses, the phylogeny of extant placental mammals was constrained based on the nucleotide-based phylogeny of Meredith et al. []. For the tip-dating analyses, the phylogeny was obtained as follows. First, the phylogeny of Meredith et al. [] was used as the backbone for the phylogeny spanning extant taxa. Then, restrictions on possible fossil placements were defined based on expert knowledge, in the form of a series of clade constraints (electronic supplementary material, table S5). Finally, a Bayesian total-evidence analysis [] was conducted under the white noise clock, subject to all of the constraints defined above and using the morphological character matrix of O'Leary et al. []. All this was done using the RevBayes programming environment []. The posterior consensus tree of two independent analyses was computed and then used as the fixed tree topology for all remaining analyses presented in this article.

To conduct the constrained total-evidence analysis, the matrix of morphological characters of O'Leary et al.
[] was extended with missing entries so as to encompass the 138 taxa considered here. The resulting character matrix was partitioned into several components, based on the number of distinct states represented in each column of the data matrix. Only the components corresponding to 2, 3 and 4 distinct states were used. Together, they account for 3212 out of the 4541 characters of the original data matrix. For each component, a Jukes–Cantor model was used, with the appropriate number of states. Note that the likelihood was not corrected for unobserved site patterns [], as this option was not available in RevBayes at the time of the analysis. The prior constraints on fossil placement were used for two reasons: first, to restrict the search space and, second, to override some of the problematic fossil placements observed in the unconstrained analyses using the morphological character matrix of O'Leary et al. [].

For the node-dating analyses, all of the fossil calibrations used by Meredith et al. [] that remain valid under the present taxon subsampling were used. They are reported in the electronic supplementary material, table S6, and were implemented as hard bounds. For the tip-dating analysis, uncertainty about fossil ages was accounted for, thus addressing the potentially important problem raised by O'Reilly et al. []. This was implemented by allowing the tips corresponding to fossils to move during the Markov chain Monte Carlo (MCMC), within the intervals defined in the electronic supplementary material, table S7. As a result, the likelihood is summed over all possible serially sampled time-calibrated phylogenies that are compatible with these interval constraints.

[...] The nucleotide sequences were assumed to evolve according to a general time-reversible process, using the standard parametrization in terms of relative exchange rates and equilibrium frequencies.
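As an illustration of that parametrization, the following minimal Python sketch builds a general time-reversible rate matrix from six relative exchange rates and four equilibrium frequencies. It is not taken from the protocol or from any of the software it uses: the function name, the numerical values and the normalisation to one expected substitution per unit time are all assumptions made for the example.

```python
from itertools import combinations

NUCLEOTIDES = "ACGT"


def gtr_rate_matrix(exchange, freqs):
    """Return a 4x4 GTR rate matrix as a list of rows (order A, C, G, T).

    exchange: relative exchange rate for each unordered pair, e.g. {"AC": 1.0, ...}
    freqs:    equilibrium frequency of each nucleotide (should sum to 1)
    """
    n = len(NUCLEOTIDES)
    q = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        rate = exchange[NUCLEOTIDES[i] + NUCLEOTIDES[j]]
        # Off-diagonal entry: exchange rate times frequency of the target state.
        q[i][j] = rate * freqs[NUCLEOTIDES[j]]
        q[j][i] = rate * freqs[NUCLEOTIDES[i]]
    for i in range(n):
        # The diagonal is still 0 here, so this makes each row sum to zero.
        q[i][i] = -sum(q[i])
    # Assumed convention: rescale so the expected substitution rate is 1.
    mean_rate = -sum(freqs[NUCLEOTIDES[i]] * q[i][i] for i in range(n))
    return [[entry / mean_rate for entry in row] for row in q]


# Hypothetical parameter values, chosen only for illustration.
exchange = {"AC": 1.0, "AG": 4.0, "AT": 0.8, "CG": 1.2, "CT": 4.5, "GT": 1.0}
freqs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
Q = gtr_rate_matrix(exchange, freqs)
```

Time reversibility shows up as detailed balance: for every pair of states, the frequency of the first times its rate into the second equals the frequency of the second times its rate into the first.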
Rate variation was accommodated neither across sites nor across partitions. Preliminary runs, using the node-dating formalism already implemented in PhyloBayes [], with and without rate variation among sites, yielded very similar divergence time estimates (mean Euclidean distance of 0.8 Myr between the vectors of posterior mean divergence times from the two analyses, maximum deviation of 4 Myr across all nodes), suggesting that accounting for rate variation across sites is not critical in the present context. […]
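A comparison of this kind can be sketched in a few lines of Python. The text does not specify how the mean Euclidean distance was normalised, so the sketch below reports several candidate summaries side by side. The function name and the node ages are hypothetical and are not results from the study.

```python
import math


def compare_node_ages(ages_a, ages_b):
    """Summarise the differences between two vectors of posterior mean node ages (Myr)."""
    if len(ages_a) != len(ages_b):
        raise ValueError("both analyses must report the same set of nodes")
    diffs = [a - b for a, b in zip(ages_a, ages_b)]
    n = len(diffs)
    sum_sq = sum(d * d for d in diffs)
    return {
        "euclidean": math.sqrt(sum_sq),            # plain Euclidean distance
        "rms": math.sqrt(sum_sq / n),              # root mean square difference
        "mean_abs": sum(abs(d) for d in diffs) / n,  # mean absolute per-node difference
        "max_abs": max(abs(d) for d in diffs),     # maximum deviation across nodes
    }


# Hypothetical posterior mean ages for four nodes under the two settings.
with_site_variation = [90.0, 75.0, 60.0, 40.0]
without_site_variation = [91.0, 75.0, 58.0, 40.0]
summary = compare_node_ages(with_site_variation, without_site_variation)
```

Reporting a maximum alongside an average, as the text does, guards against a small mean difference hiding a large shift at a single node.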

A mixed relaxed clock model

2016 Philosophical Transactions of the Royal Society B: Biological Sciences
PMCID: 4920333
PMID: 27325829
DOI: 10.1098/rstb.2015.0132

Pipeline specifications

Software tools: node dating, RevBayes, PhyloBayes
Application: Phylogenetics