Computational protocol: Historical Invasion Records Can Be Misleading: Genetic Evidence for Multiple Introductions of Invasive Raccoons (Procyon lotor) in Germany

Protocol publication

[…] We analysed the population genetic structure using two Bayesian genetic clustering algorithms. First, we analysed the data in STRUCTURE 2.3.4 []. We estimated the number of genetic subpopulations (K) by performing ten independent runs for each of K = 1–12, with 10⁶ Markov chain Monte Carlo (MCMC) iterations after a burn-in period of 10⁵ iterations, using the model with correlated allele frequencies and assuming admixture. ALPHA, the Dirichlet parameter for the degree of admixture, was allowed to vary between runs. After deciding on the most probable number of subpopulations based on the log-likelihood values (and their convergence) associated with each K, as well as on the ΔK method by Evanno et al. [], we calculated each individual’s percentage of membership (q), averaging q over different runs of the same K. To facilitate geographical representation, the average q values for each administrative district (‘Landkreis’) were calculated and mapped using ArcGIS 10.1 (ESRI Inc., Redlands, CA, USA). Second, we also analysed our data using the ‘clustering of individuals’ algorithm implemented in BAPS v.6.0 [], which infers the number of genetic clusters in a data set. We performed ten runs for each of K = 2–12.

For the subsequent analyses, populations were pre-defined by placing samples into the STRUCTURE cluster for which they showed the highest percentage of membership (q). We represented the results from K = 7, averaging q over the eight runs with the highest log-likelihood values (see ). We tested for the significance of heterozygote deficiency or excess [] with the Markov chain method in Genepop 4.1.4 [], with 10,000 dememorization steps, 500 batches and 10,000 subsequent iterations.
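
The ΔK criterion mentioned above can be illustrated with a short sketch. It assumes the form commonly used in downstream tools, in which the absolute second-order difference of the mean ln P(D) across runs is divided by the standard deviation of ln P(D) at the same K. The function name `evanno_delta_k` and the toy likelihood values are hypothetical and are not taken from the study; ΔK is undefined for the smallest and largest K tested, so it cannot by itself support K = 1.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta_K} for every K that has runs at K-1, K and K+1.

    lnp_by_k maps K to the list of ln P(D) values, one per STRUCTURE run.
    """
    delta = {}
    for k, runs in sorted(lnp_by_k.items()):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # delta K is undefined for the smallest and largest K
        spread = stdev(runs)
        if spread == 0:
            continue  # identical runs: the ratio is undefined, so skip this K
        second_diff = (
            mean(lnp_by_k[k + 1]) - 2 * mean(runs) + mean(lnp_by_k[k - 1])
        )
        delta[k] = abs(second_diff) / spread
    return delta


# Hypothetical ln P(D) values (three runs per K), for illustration only.
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-850.0, -851.0, -849.0],
    4: [-846.0, -850.0, -848.0],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
```

In practice the ΔK peak would be read alongside the raw log-likelihood curve and its convergence across runs, as described in the text, rather than used as the sole criterion.
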
Pairs of loci were tested for linkage disequilibrium using an exact test based on a Markov chain method as implemented in Genepop 4.1.4. The false discovery rate technique was used to eliminate false assignment of significance by chance [].

We visualised the genetic differentiation among the samples with a Factorial Correspondence Analysis (FCA) in Genetix 4.05.2 [] and performed genetic exclusion tests in the program GENECLASS 2.0 [] to test the hypothesis that individuals assigned to a specific cluster but visualised as outliers in the FCA were in fact individuals that had recently been introduced to the population. Exclusion probabilities were calculated with the Monte Carlo method of Paetkau et al. [] by simulating 10,000 multi-locus genotypes and by setting the threshold for exclusion of individuals to 0.001 []. The level of genetic differentiation between the genetic clusters was quantified with F_ST [] in GenAlEx version 6.501 [] and by an analysis of molecular variance (AMOVA) using 9,999 permutations.

We tested the data set for isolation-by-distance (IBD) by analysing genetic relatedness between pairs of individuals as a function of geographical distance, using the program SPAGeDi 1.2 []. The slope of this relationship offers a convenient measure of the degree of spatial genetic structuring. As suggested by Vekemans & Hardy [], the Loiselle kinship coefficient (F_ij) [] was chosen as a pairwise estimator of genetic relatedness, as it is a relatively unbiased estimator with low sampling variance. The slope was tested for a significant difference from zero by 10,000 permutations of locations of individuals.
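
The logic of the slope test described above can be sketched as follows. This is not SPAGeDi's implementation: it assumes that a pairwise kinship matrix has already been computed, that coordinates are planar (projected), that kinship is regressed on the natural logarithm of distance, and that the test is two-sided. The function names and the toy data are hypothetical.

```python
import math
import random


def ibd_slope(kinship, coords):
    """Least-squares slope of pairwise kinship on ln(distance), pairs i < j."""
    xs, ys = [], []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d > 0:  # pairs sampled at the same location have no ln(distance)
                xs.append(math.log(d))
                ys.append(kinship[i][j])
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def slope_permutation_test(kinship, coords, n_perm=10000, seed=0):
    """Shuffle locations among individuals and compare slopes (two-sided)."""
    rng = random.Random(seed)
    observed = ibd_slope(kinship, coords)
    shuffled = list(coords)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(ibd_slope(kinship, shuffled)) >= abs(observed):
            hits += 1
    # Count the observed arrangement as one of the possible permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: six individuals on a transect, kinship declining with ln(distance).
toy_coords = [(float(i), 0.0) for i in range(6)]
toy_kinship = [
    [0.0 if i == j else 0.2 - 0.05 * math.log(abs(i - j)) for j in range(6)]
    for i in range(6)
]
slope, p_value = slope_permutation_test(toy_kinship, toy_coords, n_perm=999)
```

Permuting locations rather than pairwise values keeps the dependence among pairs that share an individual, which is why the kinship matrix is left fixed and only the coordinates are shuffled.
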
We performed an analysis on the whole data set, as well as on pairs of individuals assigned to the same STRUCTURE cluster only (using cluster-specific allele frequencies).

We used GenAlEx to estimate the number of alleles (A), observed heterozygosity (H_O) and unbiased expected heterozygosity (uH_E) for each STRUCTURE cluster and the number of private alleles (pA) in a cluster. Allelic richness (A_R) was calculated using Fstat 2.9.3.2 []. Estimates were based on a minimum sample size of 13 diploid individuals. Relatedness coefficients were calculated in COANCESTRY 1.0.1.2 [], which provides the Triadic Maximum Likelihood estimator TrioML based on Wang [], estimating pairwise relatedness (r) by the use of a third individual as a reference, thus reducing the chance of genes identical in state being mistakenly inferred as identical by descent. We estimated effective population sizes (N_e) of each genetic cluster using the linkage disequilibrium method in the program NeEstimator v.2.01 [], estimated 95% confidence intervals using jackknifing and excluded rare alleles with frequencies less than 0.02. […]
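
The per-locus diversity statistics named above can be made concrete with a minimal sketch. It assumes the usual definitions: observed heterozygosity is the proportion of heterozygous genotypes, and unbiased expected heterozygosity is (2N / (2N − 1)) · (1 − Σ pᵢ²) for N typed diploid individuals. The function name `locus_diversity`, the genotype encoding and the allele sizes are hypothetical; the study's estimates came from GenAlEx, not from this code.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summarise one locus from diploid genotypes given as (allele, allele).

    Missing genotypes are passed as None and ignored.
    """
    typed = [g for g in genotypes if g is not None]
    n = len(typed)
    allele_counts = Counter(allele for g in typed for allele in g)
    freqs = [count / (2 * n) for count in allele_counts.values()]
    h_obs = sum(1 for a, b in typed if a != b) / n
    h_exp = 1.0 - sum(p * p for p in freqs)
    # Small-sample correction applied to the expected heterozygosity.
    uh_exp = h_exp * (2 * n) / (2 * n - 1)
    return {"A": len(allele_counts), "Ho": h_obs, "He": h_exp, "uHe": uh_exp}


# Hypothetical microsatellite genotypes (allele sizes in bp), one missing.
toy_genotypes = [(100, 100), (100, 102), (102, 104), (100, 104), None]
stats = locus_diversity(toy_genotypes)
```

Cluster-level values would then be averages of these per-locus quantities over all loci, computed separately for the individuals assigned to each STRUCTURE cluster.
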

Pipeline specifications

Software tools: BAPS, Genepop, GeneClass, GenAlEx, SPAGeDi, NeEstimator
Applications: Phylogenetics, Population genetic analysis
Organisms: Procyon lotor