Computational protocol: Reconstructing the phylogeny of Blattodea: robust support for interfamilial relationships and major clades

Protocol publication

[…] The taxon sample consists of 103 Blattodea taxa (ingroup) and 26 outgroup taxa (Table ). The molecular data set consists of five genes: the mitochondrial 12S (390 nucleotides, nt), 16S (430 nt) and COII (730 nt), and the nuclear 28S (600 nt) and H3 (330 nt); the total length of the aligned molecular data set is 2831 nt. GenBank sequences from previous works on Blattodea were used when available, but some problematic sequences were not used in this study, e.g. Supella longipalpa. For Mantodea and others see Table . New sequences and their GenBank numbers are listed in Table . In our study, names of chimeric taxa (i.e. Gryllus, Mantophasmatidae and Oligotomidae) follow Djernæs et al.

Sequences were aligned with the online version of MAFFT 7 (http://mafft.cbrc.jp/alignment/server/). For ribosomal genes (12S, 16S and 28S), sequence direction was adjusted according to the first sequence because some ribosomal gene sequences from GenBank were reversed, and the Q-INS-i algorithm was selected. For protein-coding genes (COII, H3), the G-INS-i algorithm was used. Other parameters were left at their default values. Alignments of the protein-coding genes (COII, H3) were inspected visually and manually corrected in Mega6 after translation into amino acids; few gaps were detected, and alignment was straightforward. Alignments of the ribosomal sequences (12S, 16S and 28S) were inspected visually and manually adjusted in Mega6. Poorly aligned characters were removed, but these were few.

Subsequent analyses were performed on the combined dataset using maximum likelihood (ML) in RAxML 7.7.1 and Bayesian inference (BI) in MrBayes 3.2.

The molecular data set was divided into 9 partitions: one each for 12S, 16S and 28S, and one per codon position (pos1–3) for each of COII and H3. For ML, the GTRGAMMA model was selected for the combined dataset and 1000 bootstrap replicates were performed.
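The nine-partition scheme described above can be written out as a RAxML-style partition file. The following is a minimal sketch, not the authors' actual file: the concatenation order of the genes is an assumption, each protein-coding block is assumed to start at codon position 1, and the lengths are the rounded per-gene values quoted in the text (they add up to 2480 nt rather than 2831 nt, so the real alignment coordinates will differ). The function name `partition_lines` is hypothetical.

```python
from typing import List, Tuple

# (gene name, approximate length in nt, is protein-coding)
# Lengths are the rounded values quoted in the text; the order is assumed.
GENES: List[Tuple[str, int, bool]] = [
    ("12S", 390, False),
    ("16S", 430, False),
    ("28S", 600, False),
    ("COII", 730, True),
    ("H3", 330, True),
]


def partition_lines(genes: List[Tuple[str, int, bool]]) -> List[str]:
    """Build RAxML-style partition definitions for a concatenated alignment.

    Ribosomal genes get one partition each; protein-coding genes get one
    partition per codon position, written with the 'start-end\\3' stride
    notation. Assumes every coding block begins at codon position 1.
    """
    lines = []
    start = 1
    for name, length, coding in genes:
        end = start + length - 1
        if coding:
            for pos in (1, 2, 3):
                lines.append(f"DNA, {name}_pos{pos} = {start + pos - 1}-{end}\\3")
        else:
            lines.append(f"DNA, {name} = {start}-{end}")
        start = end + 1
    return lines


if __name__ == "__main__":
    for line in partition_lines(GENES):
        print(line)
```

With real alignment coordinates substituted, the printed lines could be saved to a text file and used as a partition file for a partitioned analysis.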
For BI, PartitionFinder v.1.1.1 was used to choose models, with model selection based on the BIC. For the 9 partitions, PartitionFinder selected the following models: GTR+I+G for 12S, 16S, COII_pos1, COII_pos2 and 28S; TVM+G for COII_pos3; GTR+G for H3_pos1; JC+I for H3_pos2; and TVM+I+G for H3_pos3. Two independent sets of Markov chains were run, each with one cold and three heated chains, for 1 × 10^7 generations, and every 1000th generation was sampled. Convergence was inferred when the standard deviation of split frequencies fell below 0.01. The burninfrac parameter of sump and sumt was set to 25% and contype was set to allcompat. [...]

We performed divergence date analyses based on the combined mitochondrial, nuclear and histone dataset of Blattodea and 26 outgroups (see Table ). For this analysis, the molecular clock was calibrated using eight minimum age constraints based on termite, cockroach and mantid fossils, as shown in Table . Analyses were performed using a relaxed molecular-clock model in the Bayesian phylogenetic program BEAST 1.8.0. Rate variation among branches was modeled using uncorrelated lognormal relaxed clocks, with a single model for all genes. A Yule speciation process was used for the tree prior, and posterior distributions of parameters, including the tree, were estimated using MCMC sampling. We performed two replicate MCMC runs, with the tree and parameter values sampled every 5000 steps over a total of 50 million generations. A maximum clade credibility tree was obtained using TreeAnnotator within the BEAST software package with a burn-in of 1000 trees. Acceptable sample sizes and convergence to the stationary distribution were checked using Tracer 1.5. […]
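The sampling settings above imply how many posterior samples each run retains after burn-in. A minimal arithmetic sketch follows, assuming the counts are per run and ignoring the initial state that some programs also record; the helper name `retained_samples` is hypothetical and is not part of MrBayes or BEAST.

```python
from typing import Optional, Tuple


def retained_samples(
    generations: int,
    sample_every: int,
    burnin_fraction: Optional[float] = None,
    burnin_count: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (total samples, samples kept after burn-in) for one MCMC run.

    Burn-in is given either as a fraction of the samples (as with
    burninfrac) or as an absolute number of samples (as with a
    burn-in of N trees).
    """
    total = generations // sample_every
    if burnin_fraction is not None:
        discarded = int(total * burnin_fraction)
    else:
        discarded = burnin_count or 0
    return total, total - discarded


if __name__ == "__main__":
    # MrBayes settings from the text: 1e7 generations, every 1000th, 25% burn-in
    print(retained_samples(10_000_000, 1000, burnin_fraction=0.25))
    # BEAST settings from the text: 5e7 generations, every 5000th, 1000 trees burn-in
    print(retained_samples(50_000_000, 5000, burnin_count=1000))
```

Under these assumptions both analyses record about 10,000 samples per run; the 25% burn-in leaves 7,500 of them, and the 1000-tree burn-in (10% of a single run) leaves 9,000.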

Pipeline specifications

Software tools: MAFFT, MEGA, MrBayes, RAxML, PartitionFinder, BEAST
Application: Phylogenetics