Computational protocol: Widespread distribution of a unique marine protistan lineage

[…] blastn and preliminary phylogenetic analysis were used to identify biliphyte-like sequences in the GenBank non-redundant (May 2006; March 2007) and CAMERA (April 2007) databases. It became apparent that several deposited clones had previously been deposited under other accession numbers; here, only one sequence version per clone is used. Several short, originally unidentified sequences from a previous publication were not used in the alignment; none of these fell in BP2. Check_Chimera (RDP II) and manual screening of alignments were used to identify and remove likely chimeras before the final alignment was built. The alignment used for the final phylogenetic analyses was generated in ClustalW, manually adjusted, and masked so that only bases within unambiguously aligned regions considered homologous were used. The substitution model, number of rate categories, proportion of invariable sites, transition/transversion ratio (TiTv) and gamma distribution parameter were determined with Modeltest. Phylogenetic analysis was performed using maximum likelihood, neighbour-joining distance and parsimony methods, all within the PHYLIP modules. For maximum likelihood, 100 bootstrap data sets were analysed with global rearrangements, randomized input order, 10 jumbles and six rate categories (fraction of invariable sites = 0.2332; TiTv = 1.9433; 1/α = 1.451258). The same alignment and parameters were used for neighbour-joining distance (1000 replicates, 10 jumbles) and parsimony analysis (100 replicates, 2 jumbles). […]
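
The masking and bootstrapping steps described above can be illustrated with a minimal Python sketch. This is not the authors' code: the protocol states only that the analyses were run within the PHYLIP modules, which perform this kind of resampling internally. The function names (mask_alignment, bootstrap_replicates), the toy sequences, the mask string and the seed are all hypothetical and chosen for illustration. The sketch assumes the alignment is a mapping from clone name to an aligned sequence, and that the mask is a string of "1" (keep) and "0" (drop) flags, one per column.

```python
import random


def mask_alignment(alignment, mask):
    """Keep only the alignment columns flagged '1' in the mask."""
    lengths = {len(seq) for seq in alignment.values()}
    if lengths != {len(mask)}:
        raise ValueError("all sequences and the mask must have the same length")
    keep = [i for i, flag in enumerate(mask) if flag == "1"]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Yield pseudo-replicate alignments by sampling columns with replacement."""
    rng = random.Random(seed)  # fixed seed so the toy example is reproducible
    n_cols = len(next(iter(alignment.values())))
    for _ in range(n_replicates):
        # Draw as many column indices as the alignment has columns.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        yield {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Hypothetical toy alignment and mask; not data from the study.
toy = {
    "cloneA": "ACGT-ACGTA",
    "cloneB": "ACGTTACG-A",
    "cloneC": "ACCT-ACGTA",
}
toy_mask = "1111011101"  # drops the two columns that contain gaps

masked = mask_alignment(toy, toy_mask)
replicates = list(bootstrap_replicates(masked, n_replicates=100))
```

Each replicate has the same number of columns as the masked alignment, and the same column indices are applied to every sequence, so site patterns stay intact across clones. Trees would then be inferred from each replicate and summarized as bootstrap support values.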

