Computational protocol: Hybridization of cultivated Vitis vinifera with wild V. californica and V. girdiana in California

Protocol publication

[…] The uniqueness of all 190 genotypes was confirmed, and the polymorphic information content of each locus was calculated (Botstein et al.) using the Microsatellite Toolkit (Park). The probability of identity was calculated using the FAMOZ software package (Gerber et al.). For each of the 19 microsatellite loci, the number of alleles, allele frequencies, observed and expected heterozygosity, and the fixation index were calculated using GenAlEx 6.0 (Peakall and Smouse). Allelic richness was calculated in FSTAT (Goudet), which applies rarefaction so that samples of different sizes can be compared (El Mousadik and Petit). [...] Model‐based Bayesian analysis implemented in the software package STRUCTURE (Pritchard et al.) was used to determine the approximate number of genetic clusters (K) within the full data set and to assign individuals to the most appropriate cluster. All simulations were run under the assumptions that individuals may have admixed ancestry and that allele frequencies are correlated (Falush et al.). Simulations were run with K set as a prior from one to ten. After multiple trials, a burn‐in of 80,000 iterations followed by 100,000 iterations for data collection proved sufficient to produce results that were consistent among eight runs for likely values of K. The most likely value of K was determined from the average estimated ln probability of the data, ln Pr(X|K), as described in the STRUCTURE documentation, and by calculating ∆K (Evanno et al.). Bar graphs from STRUCTURE were prepared using STRUCTURE PLOT (Ramasamy et al.). STRUCTURE was also used to generate the posterior probability that individuals have mixed ancestry (the “GENSBACK” option with K = 3 and M = 0.05). For this analysis, assignment to one of the three sample groups was given as a prior.
The results indicate whether an individual has mixed ancestry within the three preceding generations (G = 3) or whether the individual is best assigned to another sample group.

Additional tests to investigate possible cryptic structure within the CAL samples were performed with a burn‐in of 50,000 iterations and 250,000 iterations for data collection; eight runs were simulated for each value of K from 1 to 7. As before, admixed ancestry and correlated allele frequencies were assumed. To facilitate visualization of these results, most CAL samples were placed in one of three subgroups based on collection location. The “Wine Country” subgroup contained 37 samples primarily from Napa County, with a few from the adjacent counties of Lake, Solano, and Yolo. The 35 samples in the “Remote” subgroup were primarily from Shasta County, with a few samples from the adjacent counties of Siskiyou and Tehama. The third subgroup, “Other”, contained samples from scattered locations throughout the range of V. californica. This location information was not used as a prior for STRUCTURE analysis. The proportion of each individual attributed to each inferred cluster (Q) was averaged over the eight runs. Genetic structure within and among the Wine Country and Remote subgroups was also investigated using principal coordinate analysis (PCoA) computed in GenAlEx, using the codominant genotypic distance of Smouse and Peakall. […]
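
The ∆K criterion mentioned above can be illustrated with a short sketch. It assumes the commonly used formulation in which ∆K is the absolute second difference of the mean ln Pr(X|K) across replicate runs, divided by the standard deviation of ln Pr(X|K) at that K. The function name `delta_k` and the input layout (a dictionary mapping K to a list of per-run values) are hypothetical choices for illustration, and the numbers are invented, not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} from replicate ln Pr(X|K) values.

    lnp_by_k maps each K to a list of ln Pr(X|K) estimates, one per run.
    Delta K is only defined for K values that have both neighbours
    (K - 1 and K + 1) and a non-zero standard deviation across runs.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    sds = {k: stdev(values) for k, values in lnp_by_k.items()}

    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # first and last K have no second difference
        if sds[k] == 0:
            continue  # ratio undefined when all runs agree exactly
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sds[k]
    return result


# Invented illustrative values (two runs per K), not data from the study.
example = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-50.0, -52.0],
    4: [-49.0, -51.0],
}
scores = delta_k(example)
best_k = max(scores, key=scores.get)
```

In this toy input the ln probability rises steeply up to K = 3 and then flattens, so `best_k` is the K with the sharpest change in slope. With real STRUCTURE output, each list would hold the values from the eight replicate runs at that K.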

Pipeline specifications

Software tools: GenAlEx, strplot
Application: Population genetic analysis
Organisms: Vitis vinifera