Computational protocol: Phylogeography of social polymorphism in a boreo-montane ant

Protocol publication

[…] Four of the ten microsatellite loci used were developed specifically for L. acervorum (GA1, GA2, GT1, GT2; []). The remaining loci are cross-amplifications from other ants (GT218, GT223; [], 2MS67; [], 2MS46; Suefuji et al., unpublished; 2MS46fwd: 5′-GCTCACTACTATGCTGCCAGC-3′, 2MS46rev: 5′-TCCGTCTATCCCTTCCTGCAA-3′, L18; [] and Myrt3; []). PCR conditions were as follows: initial denaturation for 5 min at 95 °C; 35 cycles of 60 s at 95 °C, 45 s at the locus-specific annealing temperature of 45–60 °C, and elongation for 45 s at 72 °C; and a locus-specific final extension step of 30–120 s at 72 °C. Total reaction volume was 20 μL with 1 μL DNA template. PCR products were either analysed on an ABI PRISM 310 automated sequencer (GA1, GA2, GT218, GT223) and subsequently genotyped using GENESCAN 3.1 (Applied Biosystems) or sent to GATC Biotech AG (GT1, GT2, L18, Myrt3a, 2MS67, 2MS46 (II)) and subsequently genotyped using Peak Scanner Software v1.0 (Applied Biosystems).

Each microsatellite locus was tested for deviation from Hardy-Weinberg equilibrium (HWE) using exact HW tests []. To test independence between loci, we assessed linkage disequilibrium (LD) for all locus pairs across populations (Fisher’s method). Both tests were run in GENEPOP 4.2.2 []. Evidence for null alleles was assessed and corrected for using MICRO-CHECKER 2.2.3 []. We found evidence for null alleles in 14 out of 70 microsatellite locus × population pairs, of which only three had null allele frequencies larger than 0.2 (maximum: 0.243, see Additional file ). Subsequent analysis of both datasets (original and null-allele corrected) yielded only marginal differences in global and pairwise FST values and microsatellite-tree topologies (null-allele corrected global FST = 0.074; see Fig. , Table  and Additional file ). We therefore performed all downstream analyses with the original microsatellite dataset unless explicitly stated otherwise.

Fig.
2 [...]

Number of alleles (k), allelic richness (A), number of private alleles (AP), and expected (HE) and observed (HO) heterozygosities were calculated per population and locus using FSTAT [] and GENALEX 6.5 []. The genealogical relationships among study populations were analysed with neighbour-joining trees in POPULATIONS 1.2.31 [] using DA distance []. Bootstrap values were obtained from 2000 replicates over loci. To study population structure in more detail, we estimated overall and pairwise F-statistics using GENEPOP, and used GENALEX to calculate allelic-diversity corrected FST analogues (Hedrick’s standardized GST, GST corrected for small sample sizes (G″ST), and Jost’s D; []). To test the hypothesis that genetic differentiation is equal or higher within than between regions (defined here either as geographic regions, i.e. Iberian Peninsula (IB) and Pyrenees (PY), or as high-skew vs. low-skew populations), we conducted an analysis of molecular variance (AMOVA) using ARLEQUIN v3.5.1.2 []. Significance of results was evaluated over 10,000 replicates.

We applied Bayesian clustering to assess fine-scale population genetic structure (and to identify distinct populations) based on multi-locus genotypic data. Cluster analysis was run in STRUCTURE 2.3.4 [], allowing individuals to have mixed ancestry (admixture model), without using sampling locations as prior information and without correlated allele frequencies among populations. As recommended by ([], and cf. STRUCTURE documentation), we ran a first analysis including microsatellite data from all locations and, in a second analysis, excluded samples from the most divergent population (PY II). The number of potential population clusters (K) varied from 1 to 10 with ten runs per value of K; burn-in and sampling period were set to 300,000 generations (first analysis) and to 200,000 generations (second analysis).
For each analysis, the optimal K value was assessed following the ΔK method as described by [] and implemented in STRUCTURE HARVESTER web-v0.6.93 []. DISTRUCT v1.1 [] was used to display the output graphically. Additionally, we used a Mantel test [] to evaluate genetic isolation by distance in the microsatellite data, with significance evaluated by 999 matrix permutations in GENALEX. The genetic distance matrix among populations was calculated as linearized FST values [FST/(1 − FST)] in FSTAT, whereas geographical distances among populations in kilometres were calculated in GENALEX (based on the coordinates shown in Table ).

Because of the patchy distribution of L. acervorum in “mountainous islands” in Central Spain, we tested for the occurrence of bottlenecks in each sampled population. First, we used Wilcoxon’s signed-rank test, which tests for an excess of heterozygosity, as implemented in BOTTLENECK 1.2.02 []. Second, we calculated the M-ratio statistic [] to test for a severe reduction in effective population size. Finally, we used the coalescent-based Bayesian MCMC algorithm implemented in MIGRATE-N v 3.6.10 [] to infer demographic parameters, analyse demographic changes over time and estimate pairwise migration rates. For details on all methods (including parameter settings) see Additional file .

[...]

The primers C1-J-2183 and A8-N-3914 [] were used to amplify a 1641 bp long mitochondrial DNA fragment, starting within the COI gene, including the complete COII sequence, and ending at the very beginning of the ATPase 8 gene. PCR was carried out in a total reaction volume of 15 μL using the BIO-X-ACT Short Mix (Bioline) and 1 μL DNA template. PCR conditions consisted of an initial denaturation for 4 min at 94 °C; 38 cycles of 75 s at 94 °C, 75 s at 50 °C (annealing) and 150 s at 72 °C (elongation); and a final extension step of 5 min at 72 °C.
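As an illustration of the isolation-by-distance test described above for the microsatellite data, the following minimal Python sketch linearizes pairwise FST values as FST/(1 − FST) and runs a one-tailed Mantel permutation test against geographic distance. It is not the GENALEX or FSTAT implementation; the function names, the FST values and the distances in kilometres are all hypothetical and serve only to show the mechanics.

```python
import random
from itertools import combinations
from math import sqrt


def linearize_fst(fst):
    """Return the linearized distance FST / (1 - FST)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("FST must lie in [0, 1)")
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(gen, geo, permutations=999, seed=0):
    """One-tailed Mantel test between two symmetric distance matrices.

    Rows and columns of `gen` are permuted together; the p-value is the
    share of permutations (plus the observed one) with r >= observed r.
    """
    n = len(gen)
    pairs = list(combinations(range(n), 2))
    geo_vec = [geo[i][j] for i, j in pairs]

    def corr(order):
        return pearson([gen[order[i]][order[j]] for i, j in pairs], geo_vec)

    identity = list(range(n))
    r_obs = corr(identity)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if corr(order) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical pairwise FST and geographic distances (km) for five populations.
fst = [
    [0.00, 0.05, 0.08, 0.12, 0.15],
    [0.05, 0.00, 0.04, 0.10, 0.13],
    [0.08, 0.04, 0.00, 0.06, 0.09],
    [0.12, 0.10, 0.06, 0.00, 0.03],
    [0.15, 0.13, 0.09, 0.03, 0.00],
]
geo_km = [
    [0, 40, 90, 160, 210],
    [40, 0, 55, 130, 180],
    [90, 55, 0, 80, 125],
    [160, 130, 80, 0, 50],
    [210, 180, 125, 50, 0],
]
gen_dist = [[linearize_fst(v) for v in row] for row in fst]
r, p = mantel(gen_dist, geo_km, permutations=999, seed=1)
print(f"Mantel r = {r:.3f}, one-tailed p = {p:.3f}")
```

With only five populations there are just 120 distinct row and column orderings, so the smallest attainable p-value is limited; the 999 permutations mirror the setting reported in the text rather than a requirement of the method.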
PCR products were sent to LGC Genomics for purification and Sanger sequencing.

Chromatograms were assessed and edited using CHROMAS LITE 2.1.1 (Technelysium) and subsequently concatenated in BIOEDIT []. Sequences were aligned both manually and automatically using the CLUSTAL W algorithm as implemented in BIOEDIT. All sequences could be aligned unambiguously, and no indels were found except in three sequences from the eastern Pyrenees (PY I) that contained a 1 bp deletion. Since this deletion occurred in a noncoding region only, it was considered valid and was treated as a fifth mutational state in the network analysis. The absence of unusual stop codons, which would be potential evidence for pseudogenes (numts, e.g. []), was checked in the ARTEMIS GENOME BROWSER using the invertebrate mitochondrial codon table [].

[...]

To evaluate genealogical relationships among sampled populations, we reconstructed haplotype networks using the statistical parsimony algorithm as implemented in TCS []. The reconstructed network contains two extremely divergent haplotypes: the three sequences with the 1 bp deletion and a highly divergent sequence from FR III. These haplotypes are about as divergent as a reference sequence (same primer pair and PCR conditions) from the closely related socially parasitic species Leptothorax kutteri (Fig. ). Because of their unclear taxonomic status, these sequences were removed from downstream analysis.

Fig. 3

To quantify genetic polymorphism, we calculated the following standard diversity indices using DNASP v5.10.01 []: number of haplotypes (h), number of private haplotypes (hP), number of segregating sites (S), haplotype diversity (H) and nucleotide diversity (π), for each locality, for each geographic region, and for the whole mtDNA dataset excluding the extremely divergent haplotypes. We estimated genetic differentiation with and without specifying groups (i.e.
geographic regions as defined above), following [], by calculating ΦST values from mean pairwise differences (NST) and from haplotype frequencies (GST) in ARLEQUIN. Comparing the two statistics makes it possible to test whether the mtDNA sequence dataset contains a signal of phylogeographic structure (if NST >> GST) beyond that in haplotype frequencies alone (see [] for detailed discussion). AMOVA with pairwise comparisons between regions was carried out to test the hypothesis that genetic differentiation is equal or higher within regions (ΦSC) than between regions (ΦCT; regions as defined for the microsatellite data). When both measures are equal, their ratio (ΦSC/ΦCT) is expected to be one. Consequently, if the ratio is larger than one, genetic differentiation is larger within than between regions (which in the case of the PY-IB comparison would also mean no stronger genetic differentiation between social forms). Significance of AMOVA results was evaluated over 10,000 replicates. […]
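The ΦSC/ΦCT comparison can be made concrete with a small Python sketch. It assumes the standard hierarchical AMOVA definitions of the Φ-statistics in terms of three variance components (among regions, among populations within regions, within populations). The function names and the variance components below are hypothetical and are not taken from the ARLEQUIN output of this study.

```python
from typing import Dict


def phi_statistics(var_among_regions: float,
                   var_among_pops_within_regions: float,
                   var_within_pops: float) -> Dict[str, float]:
    """Compute hierarchical AMOVA Phi-statistics from variance components.

    PhiCT = s_a / s_total           (between regions)
    PhiSC = s_b / (s_b + s_c)       (among populations within regions)
    PhiST = (s_a + s_b) / s_total   (among all populations)
    """
    s_a = var_among_regions
    s_b = var_among_pops_within_regions
    s_c = var_within_pops
    total = s_a + s_b + s_c
    if total <= 0 or (s_b + s_c) <= 0:
        raise ValueError("variance components must sum to a positive value")
    return {
        "phi_ct": s_a / total,
        "phi_sc": s_b / (s_b + s_c),
        "phi_st": (s_a + s_b) / total,
    }


def within_to_between_ratio(phi_sc: float, phi_ct: float) -> float:
    """Return PhiSC / PhiCT; a value above one means differentiation is
    larger within regions than between regions."""
    if phi_ct <= 0:
        # AMOVA estimates can be zero or negative; the ratio is then
        # not interpretable in the sense used in the text.
        raise ValueError("PhiCT must be positive for the ratio to be meaningful")
    return phi_sc / phi_ct


# Hypothetical variance components for one pairwise regional comparison.
phi = phi_statistics(0.5, 1.0, 8.5)
ratio = within_to_between_ratio(phi["phi_sc"], phi["phi_ct"])
print(f"PhiCT = {phi['phi_ct']:.3f}, PhiSC = {phi['phi_sc']:.3f}, ratio = {ratio:.2f}")
```

The guard on ΦCT is a design choice of this sketch: variance-component estimates from AMOVA can be zero or slightly negative, and in that case the ratio loses the interpretation given above.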

Pipeline specifications