Computational protocol: A Phylogeny and Timescale for the Evolution of Pseudocheiridae (Marsupialia: Diprotodontia) in Australia and New Guinea

Protocol publication

[…] We identified alignment-ambiguous regions using SOAP v1.2a4 (Löytynoja and Milinkovitch ) with gap opening (11–19) and gap extension (3–11) penalties in steps of two (25 total alignments). Prior to the removal of the alignment-ambiguous regions, the resultant alignment was manually re-aligned with Se-Al (Rambaut ) to keep gaps in synchrony with codons. We removed entire codon positions from the alignment in cases where partial codons were included in the ambiguous region. Twenty-one base pairs were identified and removed from the BRCA1 alignment. The remaining 6928 sites were retained for subsequent analyses (ApoB = 760 sites; BRCA1 = 2477 sites; ENAM = 900 sites; IRBP = 1276 sites; Rag1 = 544 sites; vWF = 971 sites). The aligned nexus file is given in . [...] Bootstrap compatibility tests (de Queiroz ; Teeling et al. ) were performed with RAxML 7.0.4 (Stamatakis ) to assess the appropriateness of combining individual gene segments into a concatenated data set. Bootstrap analyses for each gene segment included 500 bootstrap replicates, and either the GTR + Γ (ApoB, BRCA1, ENAM) or GTR + I + Γ (IRBP, Rag1, vWF) model of sequence evolution (see Phylogenetic analyses). All bootstrap analyses were started from randomized MP starting trees, employed the fast hill-climbing algorithm, and estimated free model parameters. No conflicting nodes were found at or above the 90% bootstrap support level. PAUP 4.0b10 (Swofford ) was used to implement the partition homogeneity test (Farris et al. ) with six partitions corresponding to each of the gene segments, 1000 replicates, and ten taxon input orders per replicate. The results of the partition homogeneity test were significant (p = 0.003). However, Cunningham (), Barker and Lutzoni (), and Darlu and Lecointre () suggested that a significance threshold of 0.05 might be too conservative for the partition homogeneity test; Cunningham () suggested using a critical alpha value between 0.01 and 0.001. 
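The SOAP penalty grid and the per-gene site counts reported above can be checked with a short script. This is only an illustrative sketch: the variable names and the `charset_ranges` helper are hypothetical, and the gene order within the concatenation is assumed to follow the order in which the genes are listed in the text, which the protocol does not state.

```python
from itertools import product

# Gap opening 11-19 and gap extension 3-11, both in steps of two.
gap_open = range(11, 20, 2)
gap_ext = range(3, 12, 2)
penalty_grid = list(product(gap_open, gap_ext))  # one entry per alignment

# Sites retained per gene after removing alignment-ambiguous regions.
site_counts = {
    "ApoB": 760,
    "BRCA1": 2477,
    "ENAM": 900,
    "IRBP": 1276,
    "Rag1": 544,
    "vWF": 971,
}


def charset_ranges(counts):
    """Return 1-based inclusive (start, end) columns for each gene.

    Assumes genes are concatenated in dictionary order (an assumption,
    not something the protocol specifies).
    """
    ranges = {}
    start = 1
    for gene, n_sites in counts.items():
        ranges[gene] = (start, start + n_sites - 1)
        start += n_sites
    return ranges


total_sites = sum(site_counts.values())
ranges = charset_ranges(site_counts)

if __name__ == "__main__":
    print(len(penalty_grid), "penalty combinations")
    print(total_sites, "sites in the concatenation")
    for gene, (start, end) in ranges.items():
        print(f"{gene}: {start}-{end}")
```

The five-by-five grid gives the 25 alignments mentioned in the text, and the per-gene counts sum to the 6928 retained sites.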
In contrast to Cunningham () and others, Dowton and Austin () showed that the partition homogeneity test sometimes results in estimates of congruence that are too high. Given the conflicting results reported above for the bootstrap compatibility and partition homogeneity tests, and the uncertainty associated with interpreting the results of partition homogeneity tests, we chose to combine the six individual gene segments into a single concatenation based on the results of bootstrap compatibility tests. [...] Maximum likelihood (ML), Bayesian, and maximum parsimony (MP) analyses were performed with RAxML 7.0.4 (Stamatakis ), MrBayes v3.1.1 (Huelsenbeck and Ronquist ; Ronquist and Huelsenbeck ), and PAUP 4.0b10 (Swofford ), respectively. Gaps were treated as missing data in all analyses. ML bootstrap analyses employed 500 replicates. MP bootstrap analyses implemented 1000 replicates, 10 randomized taxon input orders per replicate, and tree-bisection and reconnection branch swapping. For the ML and Bayesian analyses we performed partitioned analyses, which allowed each gene to have its own parameters, and non-partitioned analyses, which treated the concatenation as a single gene. In addition, we performed an ML analysis on each gene segment. Best-fit models of molecular evolution were chosen using the Akaike Information Criterion as implemented in Modeltest 3.06 (Posada and Crandall ). Models chosen were GTR + Γ (ApoB); TVM + Γ (BRCA1); HKY + Γ (ENAM); TVM + I + Γ (IRBP); TIM + I + Γ (Rag1); TrNef + I + Γ (vWF); GTR + I + Γ (concatenation). If the model of sequence evolution suggested by Modeltest was not available in MrBayes, we used the next most general model. RAxML only implements the GTR substitution matrix, and the results of Modeltest were only used to inform the possible inclusion of a rate-heterogeneity parameter (Γ or Γ + I). 
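Because RAxML keeps the GTR matrix regardless of the Modeltest result, only the rate-heterogeneity part of each selected model carries over. The sketch below shows that reduction. `GTRGAMMA` and `GTRGAMMAI` are model names accepted by RAxML's `-m` option, but whether the authors passed exactly these strings is an assumption, and the `raxml_model` helper is hypothetical.

```python
# Models selected by Modeltest (AIC), as listed in the protocol text.
# "G" stands for the gamma parameter, "I" for invariant sites.
modeltest_choice = {
    "ApoB": "GTR+G",
    "BRCA1": "TVM+G",
    "ENAM": "HKY+G",
    "IRBP": "TVM+I+G",
    "Rag1": "TIM+I+G",
    "vWF": "TrNef+I+G",
    "concatenation": "GTR+I+G",
}


def raxml_model(model):
    """Map a Modeltest model to a RAxML model name.

    The substitution matrix is discarded (RAxML always uses GTR);
    only the rate-heterogeneity terms are kept.
    """
    rate_terms = model.split("+")[1:]
    if "G" not in rate_terms:
        raise ValueError(f"no gamma term in {model!r}; not covered by this sketch")
    return "GTRGAMMAI" if "I" in rate_terms else "GTRGAMMA"


raxml_choice = {gene: raxml_model(m) for gene, m in modeltest_choice.items()}

if __name__ == "__main__":
    for gene, name in raxml_choice.items():
        print(f"{gene}: {modeltest_choice[gene]} -> {name}")
```

Under this mapping ApoB, BRCA1, and ENAM use GTR + Γ while IRBP, Rag1, and vWF use GTR + I + Γ, matching the models reported for the per-gene bootstrap analyses.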
ML analyses were started from randomized MP starting trees, employed the fast hill-climbing algorithm, and estimated free model parameters. Bayesian analyses used default settings for priors, random starting trees, eight Markov chains (seven hot and one cold) sampled every 1000 generations, and were terminated once the average standard deviation of the split frequencies for the simultaneous analyses fell below 0.01 (at least 5 million generations). [...] SIMMAP (version 1.0 B2.3.2; Bollback ) and MacClade 4.08 (Maddison and Maddison ) were used to estimate ancestral states for geographic provenance of origin for Pseudocheiridae [0: Australia and Tasmania; 1: New Guinea and surrounding islands], and maximum elevation [0: ≤ 1500 m; 1: between 1500 and 3000 m; and 2: ≥ 3000 m (ordered character)]. MacClade uses parsimony to infer the minimal number of steps needed to explain the distribution of characters found in the terminal branches. SIMMAP implements the methods of Nielsen () and Huelsenbeck et al. () for stochastically mapping discrete mutations onto phylogenies. SIMMAP uses a Bayesian approach to calculate a posterior probability distribution that accommodates uncertainty in ancestral states, evolutionary rates, and the phylogeny. The rate prior is described by the parameters α and β, which specify the mean (α/β) and variance (α/β²). We used 50 discrete categories to approximate the gamma distribution and rescaled branch lengths before applying the prior to maintain branch length proportionality. We used three sets of morphological priors (α = 1 and β = 1; α = 3 and β = 2; α = 5 and β = 5) to investigate the robustness of our estimates. An additional parameter, the bias parameter, is required for analyses with binary characters. The bias parameter prior was specified with α = 1, which specifies an uninformative prior with equal prior probabilities. 
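To make the three rate priors easier to compare, the mean and variance implied by each (α, β) pair can be computed directly from the formulas given above. This is a minimal sketch: the function name is hypothetical, and it does not reproduce SIMMAP's 50-category discretization or its branch length rescaling.

```python
def gamma_mean_var(alpha, beta):
    """Mean and variance of a gamma prior with shape alpha and rate beta."""
    mean = alpha / beta
    variance = alpha / beta**2
    return mean, variance


# The three (alpha, beta) prior settings listed in the protocol.
prior_sets = [(1, 1), (3, 2), (5, 5)]
prior_summary = {ab: gamma_mean_var(*ab) for ab in prior_sets}

if __name__ == "__main__":
    for (alpha, beta), (mean, var) in prior_summary.items():
        print(f"alpha={alpha}, beta={beta}: mean={mean:.2f}, variance={var:.2f}")
```

The first and third settings share a mean of 1 but differ in spread (variance 1 versus 0.2), while the reported setting, α = 3 and β = 2, has mean 1.5 and variance 0.75.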
We used all post burn-in trees from one run of the partitioned Bayesian analysis (12212 trees) with ten draws from the prior distribution; acrobatids, Tarsipes, and petaurids were excluded from the SIMMAP analyses, but the pseudocheirid trees remained rooted. The three different combinations of priors gave posterior probabilities that differed by no more than 0.1199 (ancestral areas) or 0.1007 (elevation) at a given node, and we only report the results for analyses with α = 3 and β = 2. […]
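For comparison with the stochastic mapping approach, the parsimony step counting described for MacClade can be illustrated for an unordered binary character such as geographic provenance. The sketch below uses Fitch's bottom-up pass, a standard parsimony algorithm for unordered characters. It is not MacClade's implementation, it does not handle the ordered elevation character, and the tree, taxon labels, and state assignments are entirely hypothetical.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a binary tree given as nested tuples.

    Returns (candidate_states_at_this_node, minimum_number_of_steps).
    Leaves are taxon labels (strings) looked up in `states`.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_steps + right_steps
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1


# Hypothetical five-taxon tree; 0 = Australia, 1 = New Guinea.
toy_tree = ((("A", "B"), "C"), ("D", "E"))
toy_states = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}

root_states, n_steps = fitch(toy_tree, toy_states)

if __name__ == "__main__":
    print("root state set:", root_states)
    print("minimum steps:", n_steps)
```

On this toy tree a single change is enough to explain the tip states, and the root is reconstructed as state 0.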

Pipeline specifications