Computational protocol: Genetic diversity and structure of an endangered desert shrub and the implications for conservation

Protocol publication

[…] Seven microsatellite loci produced consistent PCR products near the expected size range and were separated by capillary electrophoresis on an ABI 3730xl (Applied Biosystems) at the University of Wisconsin DNA-sequencing facility. Allele lengths were analysed with Geneious v7.0 using the package Plugin, with Gene-flo 625 (Chimerx) as the internal lane standard.

Genetic diversity indices, including the number of alleles per locus corrected for sample size (A), Nei’s gene diversity index (h), observed heterozygosity (HO) and unbiased expected heterozygosity (HE) within each locus and population, were calculated using GenAlEx 6.5. Heterozygote deficiency within populations (FIS) at two significance levels (P = 0.05 and 0.001) and F-statistics (FIS, FIT and FST) within each locus over all populations were calculated with FSTAT 1.2. Deviation from Hardy-Weinberg expectations and linkage disequilibrium between pairs of microsatellite loci were tested with GENEPOP 3.4, using 1000 allelic permutations among individuals and a significance level of P = 0.05. Null alleles were tested with MICRO-CHECKER 2.2. A Bonferroni-type correction was applied to all tests when estimating significance.

Populations that have experienced recent bottlenecks generally exhibit a significant excess of heterozygosity, indicating departure from mutation-drift equilibrium. Tests implemented in the programme BOTTLENECK 1.2.02 were conducted under the infinite allele model (IAM), the stepwise mutation model (SMM) and the two-phase model (TPM), with 10 000 replicates. The Wilcoxon sign-rank test was used to estimate the significance level.
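As an illustration of the within-population diversity indices named above, the following minimal Python sketch computes observed heterozygosity (HO) and unbiased expected heterozygosity (HE) for one locus in one population. It is not the GenAlEx implementation. The function name `heterozygosity` and the example genotypes are hypothetical, and the sample-size correction 2n/(2n - 1) is assumed to be the one meant by "unbiased".

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (HO, unbiased HE) for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("at least two genotyped individuals are required")

    # Observed heterozygosity: proportion of individuals with two different alleles.
    ho = sum(a != b for a, b in genotypes) / n

    # Allele frequencies over the 2n gene copies in the sample.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    sum_p_squared = sum((count / total) ** 2 for count in counts.values())

    # Unbiased expected heterozygosity (assumed small-sample correction 2n / (2n - 1)).
    he = (2 * n / (2 * n - 1)) * (1 - sum_p_squared)
    return ho, he


# Hypothetical allele sizes (bp) for four individuals at one microsatellite locus.
example = [(150, 150), (150, 154), (154, 154), (150, 154)]
ho, he = heterozygosity(example)
print(f"HO = {ho:.3f}, HE = {he:.3f}")
```

A value of HO well below HE across loci is the pattern that the heterozygote-deficiency (FIS) tests described above are designed to detect.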
Tests for a shifted versus normal L-shaped distribution of allele frequencies were also performed.

Migration-drift equilibrium (gene flow model vs drift model) was tested with the software 2MOD, which estimates the relative likelihoods of the two models using a Markov chain Monte Carlo (MCMC) procedure. The procedure was run for 100 000 iterations, and the first 10 % of points in the output were discarded to avoid dependence on the initial starting values.

To assess the partitioning of total genetic variation among and within populations, analysis of molecular variance (AMOVA) was implemented in ARLEQUIN v.3.01, using 1000 permutations. Pairwise population differentiation measures (FST) were calculated from the variance components. To illustrate relationships among the populations, the FST matrix was used to construct a neighbour-joining (NJ) network in MEGA 6.0. The genetic distance matrix (FST) was also used to perform principal coordinate (PCO) analysis, implemented in GenAlEx 6.5.

Population structure was examined further using a Bayesian approach that assigns sampled individuals to inferred groups, implemented in STRUCTURE 2.2. To determine the optimal number of genetically distinct groups (K), we ran a burn-in period of 10 000 MCMC iterations, followed by a run length of 10 000 iterations. For each value of K (K = 2–10), three independent runs were performed to check convergence and homogeneity among runs. We used deltaK to select the best K. Each run yielded a log-likelihood value, Ln Pr(X/K), which had a corresponding deltaK. The highest Ln Pr(X/K) corresponded to the highest deltaK, and this maximum was taken as the optimal number of genetically distinct clusters. DeltaK was calculated in Structure Harvester. The probabilities of ancestor assignment were calculated for each pre-defined population.
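The deltaK criterion used above can be sketched in a few lines of Python. This is a minimal illustration, not the Structure Harvester code: it assumes deltaK is the absolute second-order difference of the mean Ln Pr(X/K) across runs, divided by the standard deviation of Ln Pr(X/K) among runs at that K. The function name `delta_k` and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has results for both K-1 and K+1.

    lnp_by_k: dict mapping K to a list of Ln Pr(X/K) values, one per independent run.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # deltaK is undefined at the smallest and largest K tested
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # identical runs give a zero denominator; skipped in this sketch
        result[k] = second_difference / spread
    return result


# Hypothetical Ln Pr(X/K) values from three runs per K (not data from the study).
runs = {
    2: [-1000.0, -1002.0, -998.0],
    3: [-900.0, -901.0, -899.0],
    4: [-890.0, -892.0, -888.0],
    5: [-885.0, -886.0, -884.0],
}
scores = delta_k(runs)
best_k = max(scores, key=scores.get)
print(scores, "best K:", best_k)
```

Because the calculation needs both neighbouring values of K, no deltaK is available for the first and last K in the tested range.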
Based on the STRUCTURE results, populations were grouped and are hereafter referred to as groups. Genetic diversity indices, including the number of alleles per locus corrected for sample size (A), observed heterozygosity (HO) and unbiased expected heterozygosity (HE) corrected for sample size, were calculated within the groups.

To examine whether genetic distance was significantly related to geographical distance, a Mantel test was performed using the programme IBD v.1.52. Pairwise estimates of FST, representing the genetic distances between populations, were calculated in GenAlEx 6.5, and the geographic distances between locations were calculated in GEODIS 2.5. Geographic distances were natural-log transformed before the correlation was calculated. The significance test was based on 10 000 permutations. […]
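The isolation-by-distance test described above can be illustrated with a small permutation-based Mantel test in Python. This is a sketch under stated assumptions, not the IBD v.1.52 implementation: it uses a Pearson correlation between pairwise FST and natural-log geographic distance, permutes population labels in the genetic matrix, and reports a one-tailed P-value. The names `mantel_test`, `upper_triangle` and `pearson`, and all distances, are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Flatten the upper triangle of a symmetric matrix after reordering populations."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(genetic, geographic, permutations=10000, seed=0):
    """Return (r, one-tailed P) for genetic distance vs ln(geographic distance)."""
    n = len(genetic)
    identity = list(range(n))
    # Natural-log transform of the off-diagonal geographic distances.
    log_geo = [[math.log(d) if i != j else 0.0 for j, d in enumerate(row)]
               for i, row in enumerate(geographic)]
    geo_vector = upper_triangle(log_geo, identity)
    r_observed = pearson(upper_triangle(genetic, identity), geo_vector)

    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of the genetic matrix
        r_permuted = pearson(upper_triangle(genetic, order), geo_vector)
        if r_permuted >= r_observed - 1e-12:
            hits += 1
    # The observed arrangement is counted as one of the possible outcomes.
    return r_observed, (hits + 1) / (permutations + 1)


# Hypothetical example: six populations along a transect (positions in km).
positions = [1, 2, 4, 8, 16, 32]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.02 * math.log(d) if d > 0 else 0.0 for d in row] for row in geographic]
r, p = mantel_test(genetic, geographic, permutations=999, seed=1)
print(f"r = {r:.3f}, P = {p:.4f}")
```

Permuting whole population labels, rather than shuffling individual matrix cells, keeps the dependence among pairwise distances intact, which is the reason a Mantel test is used instead of an ordinary correlation test.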

Pipeline specifications