Computational protocol: Complex distribution of EFL and EF-1α proteins in the green algal lineage

[…] New EFL and EF-1α sequences were added to existing amino acid alignments. The two proteins were analysed separately because their relationship to one another has been examined previously, and because separate analyses allow more unambiguously alignable characters to be included (425 for EFL and 407 for EF-1α). An alignment of nine concatenated proteins from algae and plants was also constructed using actin, alpha-tubulin, beta-tubulin, RbcS, Rps10, Rps13, Rpl3, Rpl11, and Rpl13, for a total of 1,206 unambiguously alignable amino acid characters (with no missing data). The trebouxiophytes are represented by a composite of sequences from three species: actin, Rps10, and Rpl11 are from Helicosporidium sp.; beta-tubulin, Rpl3, Rpl13, and Rps13 are from Prototheca wickerhamii; and alpha-tubulin, RbcS, and TufA are from Chlorella vulgaris.

Trees were inferred using distance, maximum likelihood, and Bayesian methods. Bayesian trees were inferred using MrBayes 3.1 with the WAG substitution model, site-to-site rate variation modelled on a gamma distribution with 8 variable rate categories and one category of invariable sites, three heated chains and one cold chain, and 1,000,000 generations with sampling every 1,000 generations. Log likelihoods were plotted and showed a rapid plateau after only five samples, so a burn-in of 40 trees was removed before constructing the consensus (constructing a consensus of all trees resulted in the same topology). Maximum likelihood branch lengths for the consensus topology were calculated using ProML 3.6 with the JTT model, 8 gamma rate categories, and one category of invariable sites. ProML trees were inferred using the same settings, and 100 bootstrap replicates were analysed with the gamma shape parameter alpha and the proportion of invariable sites estimated using Tree-Puzzle 5.2.
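The burn-in arithmetic described above can be illustrated with a minimal Python sketch. It is not part of the original protocol and does not call MrBayes. The function names are hypothetical, and the sketch assumes that the starting state of the chain is sampled along with every 1,000th generation.

```python
def sampled_generations(n_generations, sample_every, include_start=True):
    """Return the generation numbers at which the chain is sampled."""
    gens = list(range(sample_every, n_generations + 1, sample_every))
    # Assumption: the starting state is recorded as an extra first sample.
    return ([0] if include_start else []) + gens


def discard_burnin(samples, burnin):
    """Drop the first `burnin` samples and return the remainder."""
    if burnin < 0 or burnin > len(samples):
        raise ValueError("burnin must be between 0 and the number of samples")
    return samples[burnin:]


# Settings taken from the text: 1,000,000 generations, sampled every 1,000,
# with the first 40 sampled trees discarded before building the consensus.
gens = sampled_generations(1_000_000, 1_000)
kept = discard_burnin(gens, 40)
print(len(gens), len(kept), kept[0])
```

Under that assumption the chain yields 1,001 samples, of which 961 remain after the burn-in is removed, and the first retained sample comes from generation 40,000.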
ML trees and 1,000 bootstrap trees were also inferred using PhyML 2.4.4 with the WAG model, 8 gamma rate categories, and one category of invariable sites (when p-inv was not zero), with parameters estimated from the data. Distances were also calculated using Tree-Puzzle with the same settings (and parameters estimated from the data), and trees were constructed using WEIGHBOR 1.0.1a. One hundred bootstrap replicates were carried out using puzzleboot, which gave results similar to those from maximum likelihood (not shown). […]
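The bootstrap replicates mentioned above rest on resampling alignment columns with replacement. The following Python sketch shows that general idea only. It is not the implementation used by PHYLIP, PhyML, or puzzleboot, and the function name, taxon labels, and toy sequences are made up for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    `alignment` maps a taxon name to its aligned sequence; all sequences
    must have the same length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # Draw as many column indices as there are sites, with replacement.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols)
            for taxon, seq in alignment.items()}


# Toy alignment with invented taxa and sequences (not data from the study).
toy = {
    "taxon_A": "MGKEKTHINI",
    "taxon_B": "MGKEKLHVNI",
    "taxon_C": "MSKEKTHINV",
}
rng = random.Random(1)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
```

Each replicate has the same taxa and the same number of sites as the input, and a tree would then be inferred from every replicate to obtain support values.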

Pipeline specifications

Software tools: PHYLIP, TREE-PUZZLE, PhyML
Application: Phylogenetics
Organisms: Langat virus