Computational protocol: Genetic variability and structure of the water vole Arvicola amphibius across four metapopulations in northern Norway

Protocol publication

[…] The number of alleles (na), allelic richness corrected for minimum sample size (AR), and observed (HO) and expected (HE) heterozygosities for each locus, as well as inbreeding coefficients (FIS) for each population and overall FST, were calculated in FSTAT (Goudet). We tested for departure from Hardy–Weinberg equilibrium (HW) and for linkage disequilibrium (LD) using exact tests based on a Markov chain algorithm implemented in the program GENEPOP 3.4 (Raymond and Rousset). Evidence of scoring errors due to stuttering, large allele dropout and null alleles was checked with the software Microchecker 2.2.3 (Van Oosterhout et al.).

We compared pairwise genetic distances to pairwise geographic distances (Appendix) to investigate isolation by distance among populations (Slatkin). GENEPOP 3.4 (Raymond and Rousset) was used to estimate pairwise FST (Weir and Cockerham) among the sampled populations (see Appendix). These estimates were then related to geographic distance using the software R (R Development Core Team). The R package ECODIST (Goslee and Urban) was used to carry out Mantel tests (Mantel), which allow for inter-dependence of data points in the analyses (e.g. Underwood). In the Mantel tests, a Pearson's rank correlation coefficient was calculated and statistical significance was estimated from 1000 permutations. Because the study area is two-dimensional, we used the transformed FST (i.e. FST/(1 − FST)) and the decimal logarithm of geographic distance in the analyses (see Rousset).

The R package HIERFSTAT (Goudet) was used to determine the relevant unit of population structure according to two hierarchical levels: island and locality. The significance of these levels was tested with a G-based randomization test implemented in the package, with the number of randomizations set to 10,000.

STRUCTURE 2.3.3 (Pritchard et al.; Pritchard and Wen) was used to estimate the spatial structure of the genetic data. STRUCTURE considers multilocus genotypes and attempts to minimize linkage disequilibrium and Hardy–Weinberg disequilibrium by estimating the number of populations (K) on the basis of individual data. In STRUCTURE, we performed five independent runs for each K = 1–20 (burn-in of 100,000 followed by 500,000 MCMC repetitions) using the admixture model, correlated allele frequencies and no prior information on sampling locality. Furthermore, we followed the procedure described by Evanno et al. to identify the principal hierarchical level of structure in our data. Clustering was performed and individuals were assigned to groups using q-values.

To detect the number of first-generation immigrants among the 7 clusters of islands estimated by STRUCTURE, we used the software Geneclass 2.0 (Piry et al.) with the settings recommended when not all source populations are sampled (direct likelihood L_home). We simulated 10,000 individuals with Monte Carlo resampling, following the algorithm of Paetkau et al. and the frequencies-based method (Paetkau et al.). The type I error threshold was set to P < 0.01, and the default frequency of missing alleles to 0.01. […]
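As an illustration of the per-locus diversity statistics named at the start of the protocol, the sketch below computes HO, an expected heterozygosity with a small-sample correction, and a simple FIS as 1 − HO/HE for one locus in one population. It is not FSTAT's implementation: FSTAT's estimators differ in detail, and the function name, input layout and genotypes here are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (HO, HE, FIS) for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    HE is 1 - sum(p^2) scaled by 2n/(2n - 1); FIS is the simple 1 - HO/HE,
    which is an illustrative estimator, not the one FSTAT reports.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies over the 2n gene copies.
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    he = (1.0 - sum(f * f for f in freqs)) * 2 * n / (2 * n - 1)
    fis = 1.0 - ho / he
    return ho, he, fis


# Made-up microsatellite genotypes (allele sizes), not data from the study.
toy_genotypes = [(100, 102), (100, 100), (102, 104),
                 (100, 102), (104, 104), (100, 104)]
ho, he, fis = heterozygosity(toy_genotypes)
```

A positive FIS from this function indicates fewer heterozygotes than expected under Hardy–Weinberg proportions; with samples this small the value is only indicative.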
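The isolation-by-distance analysis described above can be sketched as a plain Mantel permutation test on FST/(1 − FST) against log10 distance. This is a standard-library reimplementation for illustration, not the ECODIST code. It uses the product-moment (Pearson) correlation and a one-sided p-value, and the function names, seed and input matrices are assumptions.

```python
import math
import random


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lower_triangle(m, order=None):
    """Lower-triangle entries of a square matrix, optionally with rows and
    columns relabelled together by the permutation `order`."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(1, n) for j in range(i)]


def mantel(fst, dist_km, n_perm=1000, seed=1):
    """Return (r, one-sided p) for FST/(1 - FST) versus log10(distance)."""
    n = len(fst)
    gen = [[0.0 if i == j else fst[i][j] / (1.0 - fst[i][j])
            for j in range(n)] for i in range(n)]
    geo = [[0.0 if i == j else math.log10(dist_km[i][j])
            for j in range(n)] for i in range(n)]
    y = lower_triangle(geo)
    r_obs = pearson(lower_triangle(gen), y)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        # Shuffle population labels of the genetic matrix only.
        rng.shuffle(order)
        if pearson(lower_triangle(gen, order), y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Toy input: five hypothetical sites on a line (km). Not the study data.
positions = [1.0, 3.0, 7.0, 15.0, 40.0]
dist_km = [[abs(a - b) for b in positions] for a in positions]


def toy_fst(d):
    # Fabricated so that FST/(1 - FST) rises linearly with log10(distance).
    t = 0.02 + 0.01 * math.log10(d)
    return t / (1.0 + t)


fst = [[0.0 if i == j else toy_fst(dist_km[i][j]) for j in range(5)]
       for i in range(5)]
r, p = mantel(fst, dist_km)
```

Permuting rows and columns together keeps each population's set of pairwise values intact, which is how the test accounts for the non-independence of pairwise comparisons.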
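The Evanno et al. step can be illustrated with a short ΔK calculation over replicate STRUCTURE runs. The sketch takes the absolute second difference of the mean ln P(D) across adjacent K values and divides it by the standard deviation of ln P(D) among runs at that K, which is one common way of computing the statistic. The function name and the log-likelihood values are fabricated for illustration.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Compute Delta K for each interior K.

    lnp: dict mapping K -> list of ln P(D) values from replicate runs.
    Delta K is undefined for the smallest and largest K tested, and for
    any K whose replicate runs have zero variance.
    """
    ks = sorted(lnp)
    m = {k: mean(lnp[k]) for k in ks}
    out = {}
    for k in ks:
        if k - 1 in m and k + 1 in m:
            sd = stdev(lnp[k])
            if sd > 0:
                out[k] = abs(m[k + 1] - 2 * m[k] + m[k - 1]) / sd
    return out


# Fabricated ln P(D) values for five runs at K = 1..5. Not the study's output.
toy_lnp = {
    1: [-5000, -5001, -4999, -5000, -5002],
    2: [-4600, -4602, -4598, -4601, -4599],
    3: [-4200, -4201, -4199, -4202, -4198],
    4: [-4190, -4180, -4200, -4185, -4195],
    5: [-4185, -4170, -4200, -4160, -4210],
}
dk = delta_k(toy_lnp)
best_k = max(dk, key=dk.get)
```

The K with the largest ΔK is usually read as the uppermost level of structure. Because the statistic needs both neighbours, it cannot support K = 1 or the largest K tested.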

Pipeline specifications

Software tools: Genepop, GeneClass
Application: Population genetic analysis
Organisms: Neovison vison