Computational protocol: Speciation patterns and processes in the zooplankton of the ancient lakes of Sulawesi Island, Indonesia

[…] We surveyed five lakes on the island of Sulawesi, Indonesia: Lakes Tondano and Poso, and the three major lakes of the Malili lake system (Matano, Mahalona, and Towuti). Three to five sites were sampled per lake for a total of 19 sites (; Fig. A). Although copepods may be present down to 100 m depth in Lake Matano (Sabo et al. ), zooplankton samples were collected from the surface layer by 10 m vertical tows using a 62-μm-mesh plankton net (1 m diameter) and immediately preserved in 95% ethanol. The key of Reddy () was used for taxonomic identification. Because polyteny has been identified as a potential source of cryptic speciation in copepods (McLaren et al. ), we estimated the nuclear DNA content (genome size) of each population using the Feulgen image analysis densitometry method described by Hardie et al. (). Genome size was estimated for four individuals from each lake population using a minimum of 20 and a maximum of 50 nuclei. Optical densities were converted into picograms using chicken (Gallus gallus domesticus) blood as a standard. Means and standard errors were calculated from the four individuals of each lake population. A one-way ANOVA and Tukey post hoc comparisons were performed with the STATISTICA v. 8.0 software package to test for differences among lake populations. […] Sequences for each marker were aligned and quality-checked using CodonCode Aligner v.2.0.6 (CodonCode Corporation, Dedham, MA). Neighbor-joining (NJ) and Bayesian inference (BI) phylogenetic reconstructions were conducted for three data sets: COI, ITS1, and concatenated 18S + 28S. NJ phylogenetic reconstructions were performed in MEGA v. 4 (Tamura et al. ) using the TrN substitution model and 10³ bootstrap replicates. 
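The genome-size conversion and the comparison among lakes described above can be sketched in a few lines of Python. This is only an illustration, not the STATISTICA workflow used in the study: the DNA content assigned to the chicken standard (2.5 pg), the function names, and all numerical inputs are assumptions introduced here, and the Tukey post hoc step is not reproduced.

```python
from math import sqrt
from statistics import mean, stdev

# Assumed DNA content of the chicken standard nuclei, in picograms.
# The text does not state this value; substitute the one used in practice.
CHICKEN_STANDARD_PG = 2.5


def od_to_pg(sample_od, standard_od, standard_pg=CHICKEN_STANDARD_PG):
    """Scale a sample optical density by the standard's optical density."""
    return sample_od / standard_od * standard_pg


def mean_and_se(values):
    """Return the mean and the standard error of the mean."""
    return mean(values), stdev(values) / sqrt(len(values))


def one_way_anova_f(groups):
    """Return the F statistic and its degrees of freedom for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical per-individual genome sizes (pg); four individuals per lake.
    lakes = {
        "lake_a": [1.0, 1.1, 0.9, 1.0],
        "lake_b": [1.2, 1.3, 1.1, 1.2],
    }
    for name, sizes in lakes.items():
        m, se = mean_and_se(sizes)
        print(f"{name}: {m:.2f} +/- {se:.2f} pg")
    print(one_way_anova_f(list(lakes.values())))
```

The F statistic would still need to be compared against the F distribution with the returned degrees of freedom to obtain a p-value.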
BI reconstructions were conducted with MrBayes v.3.1 (Ronquist and Huelsenbeck ) using the best-fit substitution models as determined by Modeltest v.3.7 (Posada and Crandall ), and consisted of four replicate runs with four chains of 10⁷ generations, discarding the first 25% as burn-in. The calanoid copepod Leptodiaptomus siciloides, collected from Lake Erie, Ontario, Canada, was used to root all trees.

We explored gene flow among the three Malili lake populations (Matano, Mahalona, and Towuti) with coalescent simulations using the full COI alignment and the longest nonrecombining stretch of DNA from the ITS1 alignment in the program IMa2 (Hey ). We conducted five final runs of 10⁶ generations using priors 5× larger than those estimated via the user guidelines. Log-likelihood ratio tests were performed to infer migration between lakes.

Relationships among the COI haplotypes were further examined by constructing a statistical parsimony haplotype network at the 95% connection limit in TCS v.1.21 (Clement et al. ). The number of haplotypes (Nh), haplotype diversity (h), nucleotide diversity (π), Tajima's D, and Fu's FS were calculated for the COI data with DnaSP v.5 (Librado and Rozas ). Tajima's D statistic (Tajima ) was used to test for neutral evolution or demographic change in each of the major COI clades. Significantly negative D values indicate strong selection or a population bottleneck, whereas positive D values indicate balancing selection (Tajima ). Population demographic changes were further investigated using Fu's FS (Fu ) and pairwise mismatch distributions (Rogers and Harpending ), computed with 10⁴ permutations in Arlequin v. 3.5 (Excoffier and Lischer ). Statistically significant negative FS values and unimodal mismatch distributions indicate an excess of recent mutations and population expansion events (Fu ). Positive FS values indicate a deficiency of alleles or overdominant selection, and multimodal distributions indicate demographic equilibrium. 
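To make the summary statistics above concrete, the following sketch computes haplotype diversity, the mean number of pairwise differences (nucleotide diversity is this value divided by alignment length), the number of segregating sites, and Tajima's D from a small alignment. It is not the DnaSP implementation: the function names and the toy sequences are hypothetical, and the code assumes equal-length, gap-free sequences.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def segregating_sites(seqs):
    """Count alignment columns that contain more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D; returns None when there are no segregating sites."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    s = segregating_sites(seqs)
    if s == 0:
        return None
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Toy alignment for illustration only; not data from the study.
    toy = ["AAAA", "AAAT", "AATT", "AATT"]
    k = mean_pairwise_differences(toy)
    print("h  =", haplotype_diversity(toy))
    print("pi =", k / len(toy[0]))
    print("S  =", segregating_sites(toy))
    print("D  =", tajimas_d(toy))
```

Significance of D is not assessed here; in practice it is evaluated against a null distribution, as the dedicated programs do.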
Further details regarding phylogenetic and coalescent analyses can be found in . […]

