Computational protocol: Contrasting patterns of genetic variation in core and peripheral populations of highly outcrossing and wind pollinated forest tree species

Protocol publication

[…] Genotypic disequilibrium between pairs of loci was tested within each population and across all populations with Fisher’s exact test using ARLEQUIN 3.11 (). The allelic diversity of the studied loci and within-population genetic variation were estimated from the following parameters: the number of alleles per locus (Al), the mean number of alleles per population (Np), the mean number of effective alleles per population (Ne), the mean number of private alleles per population (Pa), the observed heterozygosity (Ho) and the unbiased expected heterozygosity (He), all of which were computed using GenAlEx 6 (). Because microsatellites are known to be susceptible to genotyping errors (), the null allele frequency for each locus was estimated using the EM algorithm implemented in FreeNA (). We used FSTAT v 2.9.3 () to estimate gene diversity (Gd), rarefied allelic richness (AR22) for a minimum sample size of 22 individuals, and the inbreeding coefficient (Fis). Deviation of genotypic frequencies from Hardy–Weinberg equilibrium (HWE) was also assessed using inbreeding coefficients (Fis; ) corrected for null alleles (FisNull), estimated for each population with the Bayesian method implemented in INEST 2.0 (). The estimation used the IIM model with 100 000 MCMC iterations, storing every 100th value, and a burn-in period of 10 000. A Bayesian procedure based on the Deviance Information Criterion (DIC) was used to determine the statistical significance of the inbreeding component by comparing the full model with the random mating model (which assumes Fis = 0). [...] To estimate the proportion of the total genetic variation due to differentiation among populations, an Analysis of Molecular Variance (AMOVA) based on two distance methods (FST and RST) was conducted using ARLEQUIN 3.11.
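The within-population parameters named above (Ne, Ho, He, Fis) follow standard definitions, which can be illustrated with a minimal sketch. The Python function below is not the GenAlEx or FSTAT implementation; the function name `locus_diversity`, the tuple-based genotype format and the output keys are assumptions made for illustration. It handles a single locus in a single population and does not account for null alleles or missing data.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summary statistics for one locus in one population (illustrative sketch).

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Individuals with missing data are assumed to have been removed beforehand.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Allele frequencies from the 2n gene copies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)

    na = len(freqs)                      # number of alleles observed
    ne = 1.0 / sum_p2                    # effective number of alleles
    ho = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity
    he = (2 * n / (2 * n - 1)) * (1.0 - sum_p2)       # unbiased expected heterozygosity
    fis = 1.0 - ho / he if he > 0 else float("nan")   # inbreeding coefficient

    return {"Na": na, "Ne": ne, "Ho": ho, "He": he, "Fis": fis}
```

For example, `locus_diversity([(100, 100), (100, 102), (102, 102), (100, 102)])` describes four individuals typed at a locus with two equally frequent alleles. A positive Fis from such a calculation can reflect either inbreeding or null alleles, which is why the protocol also estimates a null-allele-corrected coefficient.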
Because null alleles were present, the global, pairwise and within-geographic-region FST values were additionally calculated using FreeNA, which applies the ENA (Excluding Null Alleles) method to correct the positive bias that null alleles induce in FST estimates (FSTENA). Bootstrap 95 % confidence intervals (CI) were calculated for the global FSTENA values using 2000 replicates across loci. The statistical significance of the FST values was verified with ARLEQUIN 3.11.

To evaluate how well the stepwise mutation model (SMM) accounts for differentiation among populations and geographical regions, which in turn indicates whether phylogeographical structure exists, the computed FST and RST values were compared. To test whether the difference between RST and permuted pRST (which corresponds to FST) was significant, the permutation test proposed by () and implemented in the program SpaGeDi 1.3d () was used.

Population genetic structure at microsatellite markers can arise from isolation by distance (IBD), range expansions, diffusion of genes through space during migratory events and/or allelic surfing (). A Mantel test (1967) was therefore applied to evaluate the spatial processes driving population structure by comparing the matrices of pairwise geographic distances (logarithmic scale) and pairwise genetic distances (measured as FST/(1−FST)). The statistical significance of the correlation was calculated for all populations and for sets of populations located along latitudinal and longitudinal transects using 9999 permutations in GenAlEx 6. [...] Principal Coordinates Analysis (PCoA) was applied to visualize the patterns of genetic structure among the populations using a pairwise FSTENA matrix in GenAlEx 6. Phylogenetic relationships between the populations were investigated using POPTREEW (). The phylogenetic tree was constructed from allele frequency data using the neighbour-joining (NJ) method.
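The isolation-by-distance test described above can be sketched in a few lines. The code below is a simplified, assumption-laden illustration rather than the GenAlEx procedure: the function names (`ibd_matrices`, `pearson`, `mantel`), the one-tailed p-value and the random seed are choices made here, and distances are assumed to be in kilometres and strictly positive between distinct populations.

```python
import math
import random


def ibd_matrices(fst, km):
    """Transform pairwise FST and distances (km) for an IBD test.

    Genetic distance is FST/(1 - FST); geographic distance is ln(km).
    Diagonal entries are set to 0 and are never used by the test.
    """
    n = len(fst)
    gen = [[0.0 if i == j else fst[i][j] / (1.0 - fst[i][j]) for j in range(n)]
           for i in range(n)]
    geo = [[0.0 if i == j else math.log(km[i][j]) for j in range(n)]
           for i in range(n)]
    return gen, geo


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(mat_a, mat_b, permutations=9999, seed=1):
    """One-tailed Mantel test; returns (observed r, permutation p-value)."""
    n = len(mat_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    a = [mat_a[i][j] for i, j in pairs]
    r_obs = pearson(a, [mat_b[i][j] for i, j in pairs])

    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Relabel populations: rows and columns of mat_b are permuted together.
        b = [mat_b[order[i]][order[j]] for i, j in pairs]
        if pearson(a, b) >= r_obs - 1e-12:
            hits += 1
    return r_obs, hits / (permutations + 1)
```

A typical call would be `gen, geo = ibd_matrices(fst, km)` followed by `r, p = mantel(gen, geo, permutations=9999)`. Permuting population labels, rather than individual matrix cells, preserves the dependence among pairwise distances that share a population, which is the reason a Mantel test is used instead of an ordinary correlation test.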
The NJ method allows faithful depiction of genetic structure for some populations that have an isolation-by-distance population structure (). Nei's standard genetic distance (DST) () was chosen as the distance measure for constructing the phylogeny. The statistical robustness of the branches was evaluated with 1000 bootstrap replicates.

The assignment of individuals and populations to genetically distinct groups was conducted using the Bayesian clustering method implemented in STRUCTURE 2.3.4 (; ; ). This program uses a Markov chain Monte Carlo (MCMC) algorithm to assign individuals to a given number of genetic clusters (K) without considering sampling origins, assuming that each cluster is in Hardy–Weinberg (H–W) and linkage equilibrium (LE). The admixture model with correlated allele frequencies was used; it allows for mixed recent ancestry of individuals and assigns a proportion of each individual's genome to each inferred cluster. Moreover, because all the microsatellite loci used in this study were affected by null alleles (see Results section), the recessive alleles option was chosen. Twenty independent runs were performed for each K, from K = 1 to 24, with burn-in lengths of 500 000 and 100 000 iterations. The probability distributions of the data (LnP[D]) and the ΔK values () were visualized in the STRUCTURE HARVESTER Web application (). Following Bayesian clustering, the hierarchical distribution of genetic variation was characterized using an analysis of molecular variance (AMOVA). A three-level AMOVA was conducted in ARLEQUIN 3.11, and significance was assessed with 10 000 random permutations. […]
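To make the ΔK summary of the replicate STRUCTURE runs concrete, the sketch below computes it from LnP(D) values under a commonly used definition: the absolute second-order difference of the mean LnP(D) across successive K, divided by the standard deviation of LnP(D) among runs at that K. The function name `delta_k` and the dictionary input format are assumptions for illustration, and the values reported by STRUCTURE HARVESTER may differ in detail.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta-K for each interior K (illustrative sketch).

    lnp_by_k: dict mapping K (int) to a list of LnP(D) values, one per
    replicate run; at least two runs per K are needed for the standard deviation.
    Returns a dict mapping K to Delta-K. The smallest and largest K are skipped
    because the second-order difference is undefined there.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in means or (k + 1) not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # Delta-K is undefined when all runs give the same LnP(D)
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result
```

With twenty runs per K from K = 1 to 24, as in the protocol, this yields ΔK for K = 2 to 23; the K with the largest ΔK is usually read as the uppermost level of structure. Because ΔK cannot be evaluated at K = 1, the LnP(D) distribution still needs to be inspected to judge whether any structure is present at all.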

Pipeline specifications