Computational protocol: Systematics of the Osteocephalus buckleyi species complex (Anura, Hylidae) from Ecuador and Peru

Protocol publication

[…] We estimated phylogenetic relationships among species of Osteocephalus based on newly generated sequence data for five mitochondrial loci (12S rRNA, CO1, 16S, ND1, control region) and one nuclear gene (POMC), for a total of up to 4170 bp. To expand the species sampling, we also included sequences from GenBank. All samples are listed in . For the outgroup, we included one sample of Trachycephalus jordani and one of Trachycephalus typhonius (based on and ). The completeness of the sequences varied considerably among individuals (especially for samples from GenBank, which typically lacked three or more loci). Nevertheless, we included samples with missing data because analyses of both empirical and simulated matrices have shown that taxa with missing sequences can be accurately placed in model-based phylogenetic analyses if the number of characters is large, as in our matrix (for a review see ).
Preliminary sequence alignment was done in MAFFT 6.814b with the L-INS-i algorithm (). The sequence matrix was imported into Mesquite (version 2.72; ) and the ambiguously aligned regions were adjusted manually to produce a parsimonious alignment (i.e., one that minimizes the number of informative sites). In protein-coding loci, DNA sequences were translated to amino acids in Mesquite to aid the manual alignment.
Phylogenetic trees were obtained using Bayesian inference. Because our dataset includes several loci, it is unlikely to fit a single model of nucleotide substitution. Thus, we partitioned the data to analyze each partition under a separate model. The best model for each partition was chosen with jModelTest version 0.1.1 () using the Akaike Information Criterion with sample size correction (AICc) as the optimality measure. We also evaluated three different partition strategies: (i) a single partition, (ii) six partitions (one per locus), and (iii) twelve partitions (one for each codon position in each protein-coding locus plus one for each non-protein-coding locus).
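The model-selection criterion and the three partition strategies described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' workflow (they used jModelTest): the function names, the partition labels, and the log-likelihood and parameter-count values are all hypothetical. The split into protein-coding (CO1, ND1, POMC) and non-protein-coding (12S, 16S, control region) loci follows from the loci named in the text.

```python
# Hypothetical sketch: AICc-based model choice and the three partition strategies.

CODING = ["CO1", "ND1", "POMC"]               # protein-coding loci
NONCODING = ["12S", "16S", "control_region"]  # rRNA genes and control region


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Akaike Information Criterion with small-sample correction.

    log_likelihood: maximized ln L of the model
    k: number of free parameters
    n: sample size (here assumed to be the alignment length in sites)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_model(candidates: dict, n: int) -> str:
    """Return the model name with the lowest AICc.

    candidates maps a model name to a (log_likelihood, k) tuple.
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n))


def partition_strategies() -> dict:
    """Build the three strategies: 1, 6, and 12 partitions."""
    by_locus = NONCODING + CODING
    by_codon = list(NONCODING) + [
        f"{locus}_pos{pos}" for locus in CODING for pos in (1, 2, 3)
    ]
    return {"single": ["all_loci"], "by_locus": by_locus, "by_codon": by_codon}


# Invented values for one partition of 800 sites (illustration only).
candidates = {
    "JC": (-5230.0, 1),
    "HKY+G": (-5105.0, 6),
    "GTR+I+G": (-5101.0, 11),
}
chosen = best_model(candidates, n=800)
strategies = partition_strategies()
```

With these made-up numbers the richer GTR+I+G model has a slightly higher likelihood but loses to HKY+G once the parameter penalty is applied, which is the trade-off AICc is meant to capture.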
The best partition strategy was chosen by estimating Bayes factors using a threshold of 10 as evidence in favor of the more complex partition (). Each Bayesian analysis consisted of two parallel runs of Metropolis-coupled Markov chain Monte Carlo for 5 × 10^6 generations. Each run had four chains with a temperature of 0.05. The prior for the rate matrix was a uniform Dirichlet and all topologies were equally probable a priori. Convergence to a stationary distribution was considered reached when the average standard deviation of split frequencies between runs fell below 0.05. We also used Tracer ver. 1.5 () to visually inspect convergence and stationarity of the runs. The first 50% of the sampled generations were discarded as burn-in and the remaining samples were used to estimate the Bayesian tree, posterior probabilities and other model parameters. Phylogenetic analyses were carried out in MrBayes 3.2.1 ().
Because the only nuclear gene analyzed had low variability and few informative sites, it was concatenated with the mitochondrial genes into a single matrix. We recognize the advantages of species-tree methods (e.g., ) but could not use them given the insufficient number of nuclear genes sampled. We encourage the application of those methodologies in future phylogenetic inferences in Osteocephalus. […]
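The Bayes-factor comparison and the 50% burn-in rule can be illustrated with a short Python sketch. It rests on two assumptions that the text does not state: that the threshold of 10 applies to 2 ln(Bayes factor), a common convention, and that each strategy is summarized by a single marginal log-likelihood estimate. The function names and numeric values are hypothetical and do not come from the study.

```python
# Hypothetical sketch: choosing a partition strategy by Bayes factors
# and discarding burn-in samples.


def two_ln_bf(lnml_complex: float, lnml_simple: float) -> float:
    """2 ln(Bayes factor) of the complex strategy over the simpler one,
    computed from marginal log-likelihoods."""
    return 2.0 * (lnml_complex - lnml_simple)


def choose_partition(strategies, threshold: float = 10.0) -> str:
    """Pick a strategy from a list ordered from simplest to most complex.

    strategies: list of (name, marginal_log_likelihood) tuples.
    A more complex strategy replaces the current choice only if
    2 ln BF exceeds the threshold (assumed scale, see lead-in).
    """
    best_name, best_lnml = strategies[0]
    for name, lnml in strategies[1:]:
        if two_ln_bf(lnml, best_lnml) > threshold:
            best_name, best_lnml = name, lnml
    return best_name


def discard_burnin(samples, fraction: float = 0.5):
    """Drop the first `fraction` of sampled generations as burn-in."""
    start = int(len(samples) * fraction)
    return samples[start:]


# Invented marginal log-likelihoods (illustration only).
example = [("single", -21000.0), ("by_locus", -20500.0), ("by_codon", -20497.0)]
selected = choose_partition(example)

# Ten sampled generations; the first five are treated as burn-in.
kept = discard_burnin(list(range(10)))
```

In this made-up example the six-partition strategy is strongly favored over a single partition (2 ln BF = 1000), whereas the twelve-partition strategy improves on it by only 2 ln BF = 6 and is therefore not adopted.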

Pipeline specifications

Software tools: MAFFT, Mesquite, jModelTest, MrBayes
Application: Phylogenetics