Computational protocol: Improved Phylogenetic Analyses Corroborate a Plausible Position of Martialis heureka in the Ant Tree of Life

Protocol publication

[…] Single genes were aligned separately using the local L-INS-i algorithm of MAFFT version 6.717. The L-INS-i algorithm is an iterative progressive algorithm that outperformed other methods in benchmark tests. Each of the three sequence alignments (18S, 28S, and EF1aF2) was screened for randomised sections with ALISCORE using all possible pairwise comparisons and a window size of w = 6. Within ALISCORE, gaps were treated as ambiguous characters. Randomised sections (28S rRNA: 725 base positions (bp); 18S rRNA: 14 bp) were excluded with ALICUT. In the EF1aF2 alignment, no randomised positions were detected. Single genes were concatenated using FASconCAT version 1.0. The concatenated supermatrix of the masked approach included 4,315 characters, while the unmasked supermatrix comprised 5,054 characters; the difference of 739 characters matches the 725 + 14 excluded positions. All alignments (FASTA format) and the respective character partitions are provided with the original publication and are freely available from http://www.zfmk.de.
[...] We computed NeighborNetworks with SplitsTree 4.10 to visualise the data structure of the unmasked and masked alignments. NeighborNetworks were calculated from uncorrected p-distances for the unmasked alignment and for the masked alignment used in the masked-partitioned analyses. NeighborNetwork graphs give an indication of noise, signal-like patterns, and conflicts within a multiple sequence alignment.
[...] We estimated a Maximum Likelihood (ML) topology for the unmasked supermatrix and for the masked supermatrix in non-partitioned analyses with RAxML, using RAxMLHPC-PTHREADS version 7.2.6. A third topology was reconstructed from the masked supermatrix with four partitions, following the setup described for the Bayesian analyses in Rabeling et al. (2008), using RAxMLHPC-HYBRID version 7.2.6. The first partition included the 18S and the second partition the 28S. The third partition comprised the 1st and 2nd codon positions of EF1aF2, and the fourth partition included the 3rd codon position of EF1aF2.
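The masking, concatenation, and four-partition setup described above can be illustrated with a minimal Python sketch. It is not the ALICUT or FASconCAT implementation: the taxon names, toy sequences, masked columns, and function names are all hypothetical, and the coding gene is assumed to start in frame at its first column. The partition lines follow the `DNA, name = start-end\3` convention used in RAxML partition files.

```python
# Toy sketch: mask flagged columns, concatenate genes, and build
# RAxML-style partition lines. All sequences and names are hypothetical.


def mask_columns(alignment, drop):
    """Return a copy of `alignment` without the 0-based columns in `drop`."""
    drop = set(drop)
    return {
        taxon: "".join(base for i, base in enumerate(seq) if i not in drop)
        for taxon, seq in alignment.items()
    }


def concatenate(genes):
    """Join per-gene alignments into one supermatrix.

    Returns the supermatrix and the 1-based inclusive column range of each
    gene. Assumes every gene alignment contains the same taxa.
    """
    taxa = sorted(next(iter(genes.values())))
    supermatrix = {taxon: "" for taxon in taxa}
    ranges = {}
    start = 1
    for gene, alignment in genes.items():
        length = len(alignment[taxa[0]])
        for taxon in taxa:
            supermatrix[taxon] += alignment[taxon]
        ranges[gene] = (start, start + length - 1)
        start += length
    return supermatrix, ranges


def partition_lines(ranges, coding_gene):
    """Format ranges as partition lines, splitting `coding_gene` by codon position.

    Assumes the coding gene starts in frame at its first column.
    """
    lines = []
    for gene, (first, last) in ranges.items():
        if gene == coding_gene:
            lines.append(
                f"DNA, {gene}_pos12 = {first}-{last}\\3, {first + 1}-{last}\\3"
            )
            lines.append(f"DNA, {gene}_pos3 = {first + 2}-{last}\\3")
        else:
            lines.append(f"DNA, {gene} = {first}-{last}")
    return lines


# Hypothetical toy alignments; columns 2 and 3 of the 28S toy alignment
# stand in for sections flagged as randomised.
genes = {
    "18S": {"A": "ACGTAC", "B": "ACGTTC"},
    "28S": mask_columns({"A": "GGNNCC", "B": "GG--CC"}, drop=[2, 3]),
    "EF1aF2": {"A": "ATGGCC", "B": "ATGGCA"},
}
supermatrix, ranges = concatenate(genes)
lines = partition_lines(ranges, coding_gene="EF1aF2")
print("\n".join(lines))

# Consistency check on the character counts reported in the text.
assert 5054 - (725 + 14) == 4315
```

The final assertion only checks the arithmetic reported in the text: removing the 725 and 14 masked positions from the 5,054-character unmasked supermatrix gives the 4,315 characters of the masked supermatrix.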
We identified the correct reading frame and excluded the first position of the EF1aF2 alignment. Therefore, the EF1aF2 alignment was 1 bp shorter (516 bp) than that described in Rabeling et al. (2008). We conducted rapid bootstrap analyses and a thorough search for the best ML tree using GTR+Γ with 5,000 bootstrap replicates. We evaluated the number of necessary bootstrap replicates a posteriori for each data set according to the bootstopping criterion based on the Weighted Robinson-Foulds (WRF) distance, applied in RAxML 7.2.6 to the extended majority-rule (MRE) consensus tree. We chose a cutoff value of 0.01 to ensure a sufficient number of bootstrap replicates. In the final trees, clades with bootstrap support (bs) below 50% were considered unresolved. All analyses were performed on HPC Linux clusters of the ZFMK, Bonn, Germany. Trees were edited with the software TreeGraph 2.
To test alternative placements of Martialinae and Leptanillinae as suggested by Rabeling et al. (2008), we exchanged the positions of Martialinae and Leptanillinae in our best trees (unmasked, masked-unpartitioned, and masked-partitioned). We compared alternative tree topologies by performing an approximately unbiased (AU) test for each data set. For this purpose, we optimised branch lengths for the alternative topologies and subsequently calculated per-site log-likelihood scores using RAxML 7.2.6. AU tests were performed with CONSEL, version v0.1i.
[...] Bayesian phylogenies were calculated using MrBayes for the three data sets also used in our ML analyses. Topologies were inferred from (i) the unmasked superalignment, (ii) the masked superalignment, non-partitioned, and (iii) the masked superalignment with four partitions, following Rabeling et al. (2008) and our ML analyses. Like Rabeling et al., we used MrBayes v3.2 (an unreleased version of MrBayes; the source code was downloaded from the current version system in January 2011).
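The bootstrap-support rule used for the ML trees (clades below 50% support treated as unresolved) can be illustrated with a small Python sketch. It is not how RAxML computes or maps support: each tree is reduced here to a set of clades, each clade being a frozenset of taxon labels, and the labels, replicates, and function names are hypothetical.

```python
from collections import Counter


def clade_support(bootstrap_trees):
    """Percentage of bootstrap replicates in which each clade occurs.

    Each replicate is given as a collection of clades (frozensets of taxa).
    """
    counts = Counter()
    for clades in bootstrap_trees:
        counts.update(set(clades))  # count each clade once per replicate
    n = len(bootstrap_trees)
    return {clade: 100.0 * k / n for clade, k in counts.items()}


def resolved_clades(best_tree_clades, support, threshold=50.0):
    """Keep only clades of the best tree whose support reaches `threshold`."""
    return {c for c in best_tree_clades if support.get(c, 0.0) >= threshold}


# Hypothetical clades over toy taxa A-E.
ab = frozenset("AB")
abc = frozenset("ABC")
de = frozenset("DE")

# Four hypothetical bootstrap replicates.
replicates = [{ab, de}, {ab, de}, {ab, abc}, {ab, de}]

support = clade_support(replicates)
kept = resolved_clades({ab, abc, de}, support)

for clade in sorted(kept, key=sorted):
    print(sorted(clade), support[clade])
```

In this toy case the clade {A, B, C} occurs in one of four replicates (25%) and would therefore be collapsed, while {A, B} and {D, E} are retained.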
Convergence of the parameters of the Bayesian analyses was assessed with the software Tracer v1.5. We chose the sequence evolution model GTR+Γ for all three data sets (i)–(iii) to keep the results comparable with our ML analyses. Parameters of the model (i.e., base frequencies, transition/transversion ratio, and rate variation shape parameter) were unlinked across partitions. Following Rabeling et al., Metropolis coupling was used with eight chains per analysis and a temperature increment of 0.05. For analyses (i) and (ii), we ran 30 million generations with a sample frequency of 200. For analysis (iii), we ran 28,130,500 generations with a sample frequency of 100. After checking all analyses for parameter convergence in Tracer v1.5, we discarded a burn-in of 10% for each analysis. Majority-rule consensus trees with posterior probabilities were then calculated within MrBayes from all remaining sampled trees. All analyses were performed on HPC Linux clusters of the ZFMK, Bonn, Germany. Trees were edited with the software TreeGraph 2. […]
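As a rough check on the MCMC settings above, the following Python sketch works out how many samples each run produces and how many remain after the 10% burn-in. It rests on two assumptions that are not stated in the text: the initial state (generation 0) is not counted as a sample, and the chains are heated with the incremental scheme heat_i = 1 / (1 + T * i), with i = 0 for the cold chain. The function names are hypothetical.

```python
def sample_counts(generations, sample_freq, burnin_percent=10):
    """Return (total, discarded, kept) sample counts for one run.

    Assumption: the generation-0 state is not counted as a sample.
    """
    total = generations // sample_freq
    discarded = total * burnin_percent // 100
    return total, discarded, total - discarded


def chain_heats(n_chains, temp):
    """Heat of each chain under incremental heating (assumed scheme).

    Chain 0 is the cold chain with heat 1.0.
    """
    return [1.0 / (1.0 + temp * i) for i in range(n_chains)]


# Analyses (i) and (ii): 30 million generations, sampled every 200.
run_i_ii = sample_counts(30_000_000, 200)

# Analysis (iii): 28,130,500 generations, sampled every 100.
run_iii = sample_counts(28_130_500, 100)

# Eight chains with a temperature increment of 0.05.
heats = chain_heats(8, 0.05)

print(run_i_ii, run_iii)
print([round(h, 4) for h in heats])
```

Under these assumptions, analyses (i) and (ii) each yield 150,000 samples of which 135,000 are kept, and analysis (iii) yields 281,305 samples of which 253,175 are kept.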

Pipeline specifications