Computational protocol: Spatial genetic structure in Beta vulgaris subsp. maritima and Beta macrocarpa reveals the effect of contrasting mating system, influence of marine currents, and footprints of postglacial recolonization routes

Protocol publication

[…] As pointed out by Buttler (), only tepal characters are relevant criteria to distinguish between B. vulgaris subsp. maritima and B. macrocarpa. However, sampling was carried out at the beginning of the flowering season, when B. vulgaris subsp. maritima shows large phenotypic plasticity and is morphologically similar to B. macrocarpa. Nevertheless, as B. macrocarpa is a self-fertilizing species, the detection of multilocus fixed-homozygote genotypes can help to determine a posteriori which individuals belong to B. macrocarpa. Because (1) the two species may hybridize (Kishima et al. ; Lange and Debock ; Bartsch and Ellstrand ) and (2) B. macrocarpa tetraploid forms with fixed heterozygosity can occur (Letschert ), we also discriminated individuals on the basis of their probability of belonging to one or the other species, given the allele frequency differences in multilocus genotypes. To do so, we used the model-based Bayesian algorithm implemented in STRUCTURE and described in Pritchard et al. (). Bayesian clustering assumes that each individual has admixed ancestral origins in gene pools that have correlated allele frequencies because of migration or shared ancestry (Falush et al. ). Using the whole data set, we assessed the number of potential clusters (K) from 30 runs along a range of K varying from 1 to 50, with 2 × 10⁶ Markov chain Monte Carlo (MCMC) replications following a burn-in period of 100,000 iterations and without any prior information on the putative population affiliation of individuals. Among the 30 replicates, the 15 with the highest likelihood were retained for subsequent analyses. The ad hoc statistic ΔK was then calculated to determine the most likely number of clusters K (Evanno et al. ).

The Bayesian clustering described in Pritchard et al. () assigns individuals by creating groups within which Hardy–Weinberg (HW) disequilibrium and linkage disequilibrium (LD) are minimized.
Nonetheless, inbreeding may induce departures from HW expectations and LD among loci, which can lead to biased or erroneous inferences. To circumvent this, we carried out two parallel extensive runs, one keeping and one removing B. macrocarpa individuals, to verify the consistency of clustering results.

Additional analyses were conducted using spatially explicit Bayesian clustering, that is, incorporating spatial trends and autocorrelation in the prior distribution on individual admixture coefficients, as described in Chen et al. () and Durand et al. (). Using TESS version 2.3 (Grenoble INP -TIMC-IMAG, Faculty of Medicine, La Tronche, France), we ran 1 × 10⁶ MCMC iterations (100,000 burn-in) from K = 1 to K = 20 (25 replicates per K value) using the BYM admixture model described in Durand et al. () and the geographic distances between individuals. Of the 25 replicates per K tested, the 20 with the lowest deviance information criterion (DIC) were retained. We set the Dirichlet parameter of the allele frequency model, the trend degree to 1.0, and the admixture and spatial interaction parameters to default values. To identify which K provides the best fit to the genetic data, average DIC values were plotted against K.

Similarity coefficients between runs and the average matrices of individual membership proportions were estimated using CLUMPP version 1.1.2 (Jakobsson and Rosenberg ). Clusters were displayed using DISTRUCT version 1.1 (Rosenberg ).

[...] Genetic variation was examined per locus and per population with descriptive statistics: the total and mean number of alleles (At and An, respectively), allelic richness (Ar), and level of gene diversity (HE), all using FSTAT version (Goudet ). Allelic richness was estimated using the rarefaction approach proposed by El Mousadik and Petit () and based on a minimal population size of eight diploid individuals.
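
As a rough illustration of the rarefaction idea behind allelic richness, the following Python sketch computes the expected number of distinct alleles in a random subsample of gene copies at one locus. It is not FSTAT's code: the function name and the allele counts are hypothetical, and the subsample of 16 gene copies assumes that eight diploid individuals contribute two copies each.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, n_genes):
    """Expected number of distinct alleles in a random subsample of
    n_genes gene copies drawn without replacement at one locus.

    allele_counts: observed copies of each allele in the population sample.
    n_genes: standardized subsample size, in gene copies (assumption:
             2 x the minimal number of diploid individuals).
    """
    total = sum(allele_counts)
    if n_genes < 1 or n_genes > total:
        raise ValueError("n_genes must lie between 1 and the number of sampled gene copies")
    denom = comb(total, n_genes)
    # For each allele, 1 - P(allele is absent from the subsample).
    # math.comb returns 0 when fewer than n_genes copies of other alleles exist.
    return sum(1 - comb(total - count, n_genes) / denom for count in allele_counts)


# Hypothetical locus: 10 diploid individuals (20 gene copies), three alleles.
example_counts = [10, 6, 4]
print(rarefied_allelic_richness(example_counts, 16))
```

Standardizing every population to the same subsample size is what makes richness values comparable across populations sampled with unequal effort.
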
For additional information on population genetic uniqueness, estimates of private allelic richness (ArP) were computed following a rarefaction procedure (n = 8) using ADZE software (Szpiech et al. ).

Genotypic linkage disequilibrium among all locus-pair combinations was assessed prior to other analyses using a Markov chain approximation of Fisher's exact test, based on the contingency tables for all pairs of loci in each population, as implemented in GENEPOP version (Raymond and Rousset ). Departures from HW equilibrium for each microsatellite locus and over all loci were quantified using the intrapopulation fixation index (FIS). Statistical significance of FIS values was subsequently assessed using the randomization procedure provided by FSTAT. Sequential Bonferroni corrections for multiple comparisons were applied following Rice ().

Levels of genetic variation (Ar, ArP, and HE) were compared between the genetically distinct groups identified through the Bayesian clustering and the multivariate analyses described later. Statistical significance of differences was tested using a one-way analysis of variance followed by Tukey's multiple-comparisons test. To assess whether a northward-declining diversity gradient occurred in B. vulgaris subsp. maritima, linear regressions were applied to multilocus genetic diversity estimates and spatial attributes of populations: latitude and geographic distance measured along the coastline from the southernmost sampling site. Linear regressions were performed on separate population data sets to take into account the presence of genetic discontinuities. A potential bias could arise because levels of allelic diversity often varied among loci. Therefore, we also fitted linear mixed models of genetic diversity as a function of geographic location, with loci as random intercepts. [...]
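
The sequential Bonferroni correction mentioned above for the linkage disequilibrium and FIS tests can be sketched in a few lines of Python. This is a minimal sketch of the step-down rule as it is commonly implemented, not the code used in the study; the function name, the default alpha of 0.05, and the example P-values are assumptions made for illustration.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test, True where the test stays significant
    after a sequential (step-down) Bonferroni correction.

    P-values are ranked from smallest to largest. The smallest is compared
    with alpha / m, the next with alpha / (m - 1), and so on. Testing stops
    at the first P-value that exceeds its threshold; that test and all
    larger P-values are declared non-significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break
    return significant


# Hypothetical P-values for four tests.
print(sequential_bonferroni([0.001, 0.04, 0.03, 0.012]))
```

Compared with a plain Bonferroni correction, which compares every P-value with alpha / m, the sequential version relaxes the threshold at each step and therefore retains more power.
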
We also analyzed our data set using a spatially explicit multivariate method: the spatial principal component analysis (sPCA) described in Jombart et al. (). The advantage of sPCA is that it makes no assumptions about mating system, population structure, or allele frequency models (Jombart et al. ). Basically, the method seeks independent synthetic variables that maximize the product of the genetic variance among individuals or populations and the spatial autocorrelation of a set of allelic frequencies, measured with Moran's index (Moran's I). Spatial information used for the computation of Moran's I is stored in a binary spatial weighting matrix derived from a neighborhood graph. The Gabriel graph was selected to perform the sPCA because it showed the best fit with our genetic data (see ). To evaluate the consistency of the observed spatial patterns, the significance of global and local spatial structures was assessed using permutation tests as described in Jombart et al. (). The main output is a geographic map of entity scores that allows visual assessment of spatial genetic structure. To draw a comprehensive synthetic representation, the scores of the first three principal components were each mapped onto a separate color channel and displayed simultaneously, as in Menozzi et al. (). All the analyses were performed using the ADEGENET package implemented in R (Jombart ; R Development Core Team ). […]
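
To make the spatial autocorrelation component of the sPCA more concrete, the sketch below computes Moran's I for a single variable from a binary neighbor matrix. It is written in plain Python for illustration only and is not the ADEGENET implementation; the function name, the four-site chain of neighbors, and the example values are hypothetical, and a real analysis would derive the matrix from a neighborhood graph such as the Gabriel graph.

```python
def morans_i(values, weights):
    """Moran's I for one variable observed at n sites.

    values: list of n numbers (for example, the frequency of one allele).
    weights: n x n matrix (list of lists); weights[i][j] is 1 if sites
             i and j are neighbors in the graph and 0 otherwise.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    den = sum(d * d for d in dev)          # sum of squared deviations
    if s0 == 0 or den == 0:
        raise ValueError("Moran's I is undefined without neighbors or without variance")
    # Cross-products of deviations, summed over neighboring pairs only.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * (num / den)


# Hypothetical transect of four sites, each linked to its adjacent sites.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))    # smooth gradient along the transect
print(morans_i([1, -1, 1, -1], chain))  # neighbors alternate in sign
```

Positive values indicate that neighboring sites resemble each other (a global structure, in sPCA terms), whereas negative values indicate that neighbors differ more than expected at random (a local structure).
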

Pipeline specifications

Software tools: CLUMPP, DISTRUCT, Genepop, adegenet
Application: Population genetic analysis
Organisms: Beta vulgaris subsp. maritima, Beta macrocarpa