Computational protocol: Can novel genetic analyses help to identify low-dispersal marine invasive species?

Protocol publication

[…] Populations within a species’ native range can often be assigned to distinct phylogeographic lineages whose ranges are linked to biogeography (Teske et al.). Introduced populations, on the other hand, while often comprising alleles that are also present in the native habitat, tend to have different allele frequencies (Golani et al.) or a combination of alleles from several regional lineages (Roman). We explored genetic relationships among populations, and their relationship with geography, using both population-level and individual-level analyses.

Tests for genetic structure among pairs of populations were conducted in GenAlEx v6.5 (Peakall and Smouse) using the statistics G″ST (Meirmans and Hedrick) and Dest (Jost). G″ST is an unbiased estimator of F′ST (FST standardized by the maximum value it can obtain; Hedrick), while Dest is the unbiased estimator of Jost's D (actual population differentiation). Both statistics are particularly suitable for microsatellite data because they are not affected by the high levels of polymorphism typical of these markers. Significance was tested using 999 permutations.

In addition, we used three approaches that do not incorporate information on each individual's population membership. First, a neighbour-joining (NJ) tree (Saitou and Nei) was constructed in PHYLIP (Felsenstein) from Rousset's â indices among pairs of individuals (Rousset), which were calculated in SPAGeDi (Hardy and Vekemans). Rousset's â index is an individual-level analog of the population-level FST/(1 − FST) ratio (Rousset). We used the reduced data set in this case. Second, we tested for differentiation among individuals using factorial correspondence analyses (FCA) in GENETIX v4.05 (Belkhir et al. –2004). This multivariate method can be applied to any type of data and is thus particularly suitable for data sets that are potentially affected by departures from HWE or by LD, so we applied it to the complete data set.
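As a rough illustration of the pairwise differentiation tests described earlier in this section, the sketch below computes Jost's D for a single locus and attaches a permutation p-value. It is a simplification under stated assumptions: it uses plain D from observed allele frequencies rather than the sample-size-corrected Dest, it omits G″ST, and the shuffling scheme is a generic one, not GenAlEx's documented procedure. The function names and the toy genotypes are hypothetical.

```python
import random
from collections import Counter


def allele_freqs(genotypes):
    """Allele frequencies from diploid genotypes given as (allele, allele) tuples."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def jost_d(pops):
    """Jost's D for one locus; pops is a list of populations (lists of genotypes).

    Assumption: no small-sample correction, so this is D, not Dest.
    """
    n = len(pops)
    freqs = [allele_freqs(pop) for pop in pops]
    alleles = set().union(*freqs)
    # Mean within-population expected heterozygosity
    h_s = sum(1 - sum(f.get(a, 0.0) ** 2 for a in alleles) for f in freqs) / n
    # Expected heterozygosity of the equally weighted pooled populations
    h_t = 1 - sum((sum(f.get(a, 0.0) for f in freqs) / n) ** 2 for a in alleles)
    return (h_t - h_s) / (1 - h_s) * n / (n - 1)


def permutation_p(pop_a, pop_b, n_perm=999, seed=0):
    """Observed D and a one-sided p-value from shuffling individuals between sites."""
    rng = random.Random(seed)
    observed = jost_d([pop_a, pop_b])
    pooled = list(pop_a) + list(pop_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = jost_d([pooled[:len(pop_a)], pooled[len(pop_a):]])
        if permuted >= observed:
            hits += 1
    # The observed value counts as one of the permutations
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical microsatellite allele sizes at two sites with no shared alleles
site_a = [(150, 150)] * 10
site_b = [(154, 154)] * 10
d_obs, p_value = permutation_p(site_a, site_b)
print(d_obs, p_value)
```

With 999 permutations the smallest attainable p-value is 0.001, which is why that number of permutations is a common choice.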
Genetic differentiation among populations, if present, is graphically displayed by plotting individuals in multidimensional space. Third, we used STRUCTURE v2.3.2.1 (Pritchard et al.) to determine the most likely number of distinct genetic clusters (K) to which individuals of P. doppelgangera could be assigned (reduced data set only). As genetic structure was found among most pairs of sites (see Results) and the data set was thus highly informative, we used the admixture model without location priors and set allele frequencies to be independent among populations, with default settings for all advanced parameters. For each value of K (1–10), we ran five replicates, each with a burn-in of 10⁵ MCMC iterations followed by 10⁶ recorded iterations. In addition to identifying the K with the highest likelihood, we estimated the statistic ΔK (Evanno et al.), which selects the value of K at which the most rapid change in likelihood is found between successive values of K. Maximum L(K) and ΔK were both plotted with STRUCTURE HARVESTER (Earl and von Holdt). [...]

Extended Bayesian Skyline Plots (EBSPs; i.e., Bayesian Skyline Plots based on more than one locus) were used to explicitly reconstruct each population's effective population size over time. To our knowledge, this is the first time this method has been used to reconstruct population size trends in an animal at the scale of decades rather than millennia, because until recently, no software was available to construct such plots with microsatellite data. The EBSPs were constructed in BEAST v1.74 (Drummond et al.), and settings were based on recommendations by Chieh-Hsi Wu (BEAST developer, University of Auckland, New Zealand). The site models of the different loci were linked, but the clock models and partition trees were not. For the substitution model, we specified equal rates, linear mutation bias and a two-phase model.
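The ΔK statistic used with the STRUCTURE runs described earlier in this section can be sketched as follows. This is a minimal illustration, assuming the common reading in which ΔK is the absolute second-order difference of the mean ln P(X|K) across replicates, divided by the standard deviation of ln P(X|K) at that K. The function name and the log-likelihood values are hypothetical, not results from the study.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Return {K: deltaK} from lnp, a dict mapping K to replicate ln P(X|K) values.

    deltaK is undefined for the smallest and largest K tested, because it
    needs the likelihoods at both K - 1 and K + 1.
    """
    means = {k: mean(values) for k, values in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 not in means or k + 1 not in means:
            continue
        spread = stdev(lnp[k])
        if spread == 0:
            continue  # deltaK is undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result


# Hypothetical ln P(X|K) values, three replicates per K
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-789.0, -800.0, -780.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK cannot be evaluated at K = 1, it is usually read alongside the raw L(K) curve, as was done here by plotting both.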
For the strict clock model, a mutation rate of 4.0 × 10⁻⁴ (with a 95% confidence interval of 1.3 × 10⁻⁴ to 1.3 × 10⁻³) was specified based on the mutation rate estimate of the MSVAR analyses (see Results), as no published rates for ascidians are available. While this estimate was recovered irrespective of the priors specified (Online appendix), the mutation rates estimated by this program are not always reliable (e.g., Faurby et al.). Although we consider this particular estimate to be plausible, we also discuss our results in the light of a different choice of mutation rate (see Discussion). A linear model was specified for the coalescent tree prior, and ploidy was set to autosomal nuclear. Default priors were used for model parameters and statistics, except that the prior for the demographic population mean was set to uniform, with an initial value of 2500 and upper and lower bounds of 50,000 and 100, respectively, based on the results of an exploratory BEAST run with a constant size tree prior using a combination of samples from the two sites in Victoria. We specified a chain length of 8 × 10⁸ and a logging frequency of 1 × 10⁶, and ran the program on the BIOPORTAL server (Kumar et al.). Each run was repeated twice with the same settings to ensure that searches converged on similar values. As the pooling of samples from multiple sources can considerably affect the Skyline Plots (Heller et al.), we excluded populations that showed evidence for admixture from these and the following analyses (ABC). [...]

The program DIYABC v2.0 (Cornuet et al.) was used to test different hypotheses concerning the populations' effective population sizes before and after a period of demographic expansion (all populations underwent expansions, see Results). If the non-Tasmanian populations were recently founded, then one would expect these to have undergone severe bottlenecks.
In contrast, long-established populations, although undergoing demographic changes, would be expected to have much larger sizes prior to demographic expansion. Although recent natural or human-mediated intra-Tasmanian colonization events are likely, and some habitats may have become depleted and then recolonized from nearby sources, we hypothesized that there would be well-established Tasmanian populations in particularly suitable habitats whose numbers remained comparatively large over long periods of time. DIYABC implements Approximate Bayesian Computation (ABC), a Bayesian method in which the posterior distributions of the model parameters of interest are determined by a measure of similarity between observed and simulated data rather than by each parameter's likelihood (Nielsen and Beaumont). For each population, we determined support for two demographic scenarios. Scenario 1: the effective population size increased from a small number of individuals (1–99) during the past 1000 years to a larger present population size (100–10,000 individuals). Scenario 2: the same settings were specified, except that the starting population size was larger (100–10,000 individuals), though constrained to be smaller than the present population size. Scenario 1 thus represents a founder effect that would be expected if a small number of adults are introduced to a new area by means of a vector (e.g., floating wood or the hull of a ship), while Scenario 2 merely represents an increase in population size. Summary statistics included the mean number of alleles, mean genetic diversity, and mean size variance. a shows details on priors and mutation models. […]
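The logic of comparing the two scenarios by ABC can be sketched with a toy rejection sampler. This is not DIYABC's implementation. It assumes a coalescent with one instantaneous size change, a strict stepwise mutation model at the rate of 4.0 × 10⁻⁴ quoted above, one generation per year, and a scenario posterior taken as the share of each scenario among the simulations closest to the observed summary statistics. All function names, the number of loci, the sample size, and the observed values are hypothetical.

```python
import random
from statistics import mean, pstdev, variance

MU = 4.0e-4  # per-generation mutation rate quoted in the text


def poisson(rng, lam):
    """Poisson draw: count unit-rate exponential arrivals before time lam."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_locus(rng, n_genes, n_now, n_anc, t_change):
    """Allele sizes for n_genes gene copies under a single backward size change."""
    lineages = [[i] for i in range(n_genes)]  # sampled copies below each lineage
    sizes = [0] * n_genes                     # allele size relative to the root
    now, n_e, recent = 0.0, n_now, True
    while len(lineages) > 1:
        k = len(lineages)
        wait = rng.expovariate(k * (k - 1) / (4.0 * n_e))  # diploid population
        coalesce = True
        if recent and now + wait > t_change:
            # Reached the size change first: switch to the ancestral size
            wait, coalesce = t_change - now, False
            n_e, recent = n_anc, False
        for lineage in lineages:
            steps = poisson(rng, MU * wait)
            shift = sum(rng.choice((-1, 1)) for _ in range(steps))
            for gene in lineage:
                sizes[gene] += shift
        now += wait
        if coalesce:
            i, j = rng.sample(range(k), 2)
            lineages[i] += lineages[j]
            del lineages[j]
    return sizes


def summaries(loci):
    """Mean number of alleles, mean gene diversity, mean allele-size variance."""
    def diversity(s):
        n = len(s)
        return n / (n - 1) * (1 - sum((s.count(a) / n) ** 2 for a in set(s)))
    return (mean(len(set(s)) for s in loci),
            mean(diversity(s) for s in loci),
            mean(variance(s) for s in loci))


def draw_params(rng, scenario):
    """Priors from the text; the past size must be below the present size."""
    while True:
        n_now = rng.randint(100, 10000)
        n_anc = rng.randint(1, 99) if scenario == 1 else rng.randint(100, 10000)
        if n_anc < n_now:
            return n_now, n_anc, rng.uniform(1, 1000)


def abc_model_choice(observed, n_sims=2000, n_keep=100, n_loci=5, n_genes=20, seed=1):
    """Share of each scenario among the n_keep simulations nearest to observed."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        scenario = rng.choice((1, 2))
        params = draw_params(rng, scenario)
        loci = [simulate_locus(rng, n_genes, *params) for _ in range(n_loci)]
        sims.append((scenario, summaries(loci)))
    # Scale each statistic by its spread across all simulations
    scales = [pstdev(stats[i] for _, stats in sims) or 1.0 for i in range(3)]

    def distance(stats):
        return sum(((a - b) / s) ** 2 for a, b, s in zip(stats, observed, scales))

    kept = sorted(sims, key=lambda sim: distance(sim[1]))[:n_keep]
    share = sum(1 for scenario, _ in kept if scenario == 1) / len(kept)
    return {1: share, 2: 1 - share}


# Hypothetical observed summary statistics for one population
print(abc_model_choice((3.0, 0.4, 0.5), n_sims=500, n_keep=50))
```

A production analysis would use far more simulations and the full set of priors and mutation models; the sketch only shows how similarity between simulated and observed summary statistics replaces an explicit likelihood.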

Pipeline specifications

Software tools: GenAlEx, PHYLIP, SPAGeDi, Structure Harvester, BEAST, DIYABC
Applications: Phylogenetics, Population genetic analysis
Organisms: Homo sapiens