Computational protocol: Pedigree-Based Analysis in a Multiparental Population of Octoploid Strawberry Reveals QTL Alleles Conferring Resistance to Phytophthora cactorum

Protocol publication

[…] For the discovery population sets and validation sets in both years, 30–60 mg of unexpanded leaf tissue from each individual was collected into 96-well plates and frozen at −80 °C until extraction. DNA extraction was performed using the E-Z 96 Plant DNA Kit (Omega Bio-Tek, Norcross, GA) with only minor modifications for the 2013–2014 samples, and the same kit or a modified CTAB method for the 2014–2015 samples. Prior to DNA extraction, frozen samples were ground twice for 2 min with a Fisher Scientific PowerGen high-throughput homogenizer (Pittsburgh, PA), with a 30 min refreezing at −80 °C between grindings.
From the discovery population sets, 551 individuals were genotyped in the 2013–2014 season and 580 in the 2014–2015 season. Immediate parents and other pedigree-connected individuals were also included. From the validation sets, 245 individuals were submitted for genotyping each year. Genotyping was performed using the Affymetrix IStraw90 Axiom Array. Genotyping errors were detected within full-sib families by comparing each SNP diplotype of each individual with the parental genotypes; inconsistent calls were replaced with “no call.” Similarly, when the SNP diplotypes of a parent were not consistent with those of its progeny, the correct parent was inferred where possible using a Microsoft Excel-based tool, and the parentage was corrected accordingly. If the correct parent was not obvious, the incorrect parent was replaced with a virtual “undetermined” parent. In addition, markers with inheritance errors in linkage group (LG) 7D, as reported in the “mconsistency” file of the FlexQTL outputs, were removed, and the data were reanalyzed in FlexQTL until no errors were observed. [...]
Weekly plant collapse evaluations from the discovery populations were used to calculate the area under the disease progress curve (AUDPC) for each individual over 28 wk in both seasons.
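The excerpt does not state how AUDPC was computed or which scoring scale was used for plant collapse. A common choice is the trapezoidal rule over the evaluation dates, and the following is a minimal sketch under that assumption. The function name `audpc` and the example scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def audpc(times: Sequence[float], scores: Sequence[float]) -> float:
    """Area under the disease progress curve by the trapezoidal rule.

    times  : evaluation times (e.g. weeks), strictly increasing
    scores : disease score recorded at each time (scale is an assumption)
    """
    if len(times) != len(scores):
        raise ValueError("times and scores must have the same length")
    if len(times) < 2:
        raise ValueError("at least two evaluations are needed")

    total = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Mean score over the interval multiplied by the interval length.
        total += (scores[i] + scores[i + 1]) / 2.0 * dt
    return total


# Hypothetical plant scored weekly for four weeks.
weeks = [0, 1, 2, 3]
collapse_scores = [0, 1, 2, 2]
print(audpc(weeks, collapse_scores))  # 4.0
```

With 28 weekly evaluations per plant, the same function would be called once per individual to produce the phenotype passed to the QTL analysis.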
In addition to the progeny, marker data for pedigree-connected cultivars and selections up to two generations back from the direct parents of the individuals in the discovery population sets were included in the analysis (Figure S1 and Figure S2).
The QTL analysis was performed using a Markov chain Monte Carlo (MCMC)-based Bayesian analysis in FlexQTL, using a model with additive QTL effects and a maximum of 15 QTL. The prior number of QTL was set to 1 or 3, and genome-wide analyses were performed twice for each prior. Each analysis was started from a different seed so that the runs were independent, using simulation chain lengths of 100,000 iterations with a thinning value of 100. The effective sample size in the parameter file was set to 101 to ensure convergence with an effective chain size of at least 101. Each of the two runs met the recommended convergence criterion (effective chain samples, or ECS, ≥ 100 for each of the parameters mean, variance of the error, number of QTL, and variance for the number of QTL). Full FlexQTL parameters and the values chosen for the present analyses are provided in Table S1.
Twice the natural log of the Bayes factor (2lnBF10), generated from the genome-wide FlexQTL analysis, was used to determine the total number of QTL (2lnBF10 ≥ 5) as well as QTL positions on individual LGs. Once the genome-wide total number of QTL was established, further analyses were conducted on individual LGs containing QTL with strong evidence (2lnBF10 ≥ 5), including only those QTL present in at least two of the four runs. Individual LGs were analyzed to examine whether the phenotypic variation explained would be similar to that in the genome-wide analysis. These analyses were performed in triplicate with simulation chain lengths of 10,000 iterations and a thinning value of 10, with other parameters as detailed in Table S1.
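The selection rule described above (strong evidence, 2lnBF10 ≥ 5, in at least two of the four genome-wide runs) can be sketched as a simple filter. This is an illustration only: it does not read FlexQTL output files, the function names are hypothetical, and the 2lnBF values in the example are made up. It also assumes that the "four runs" are the two priors times two replicate runs each.

```python
from collections import Counter
from typing import Dict, List

STRONG_EVIDENCE = 5.0  # 2lnBF10 threshold stated in the text
MIN_RUNS = 2           # QTL must appear in at least two of the four runs


def stored_samples(chain_length: int, thinning: int) -> int:
    """Number of MCMC samples kept after thinning."""
    return chain_length // thinning


def linkage_groups_to_follow_up(
    runs: List[Dict[str, float]],
    threshold: float = STRONG_EVIDENCE,
    min_runs: int = MIN_RUNS,
) -> List[str]:
    """Return LGs whose 2lnBF10 meets the threshold in enough runs.

    runs : one dict per genome-wide run, mapping LG name -> 2lnBF10
    """
    counts: Counter = Counter()
    for run in runs:
        for lg, two_ln_bf in run.items():
            if two_ln_bf >= threshold:
                counts[lg] += 1
    return sorted(lg for lg, n in counts.items() if n >= min_runs)


# Made-up 2lnBF10 values for four runs (2 priors x 2 replicates).
example_runs = [
    {"7D": 12.1, "3A": 5.4},
    {"7D": 10.8, "3A": 2.0},
    {"7D": 11.5},
    {"7D": 9.9, "6B": 5.2},
]
print(linkage_groups_to_follow_up(example_runs))  # ['7D']
print(stored_samples(100_000, 100))               # 1000
print(stored_samples(10_000, 10))                 # 1000
```

Both the genome-wide setting (100,000 iterations, thinning 100) and the single-LG setting (10,000 iterations, thinning 10) store 1,000 samples per chain.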
Narrow-sense heritability ($h^2$) was calculated from statistics in the FlexQTL software outputs using the formula
$$h^2 = \frac{V_P - V_E}{V_P}$$
where $V_P$ is the phenotypic variance of the trait and $V_E$ is the residual error variance (FlexQTL output). The proportion of phenotypic variation explained (PVE) by a particular QTL was calculated using the formula
$$\mathrm{PVE} = \left(\frac{\mathrm{wAV}_t}{V_P}\right) \times 100$$
where $\mathrm{wAV}_t$ is the weighted additive variance of the trait, adjusted for the portion of the variance explained by the QTL at a particular chromosomal position (obtained after PostQTL analysis), and $V_P$ is the total phenotypic variance of the trait.
To examine recombination patterns in the tested germplasm in and around the identified QTL regions, pairwise linkage disequilibrium (LD; $r^2$) among SNP markers was determined with Haploview software using the four-gamete method. Marker haploblocks and haplotypes for the identified QTL regions were constructed from FlexQTL outputs. Markers were selected within the QTL intervals defined by FlexQTL and according to the LD values between markers. The same set of markers was used to determine haplotypes in both the discovery population sets and the validation sets. The “mhaplotype” files of the QTL discovery populations were examined first, because the presence of multiple individuals within full-sib families and the circular diallel crossing design allowed the consistency of SNP haplotypes to be checked. […]
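The heritability and PVE formulas given in this section reduce to two short calculations. The sketch below assumes the variance components have already been read from the FlexQTL and PostQTL outputs. The function names and the numeric values are hypothetical and serve only to show the arithmetic.

```python
def narrow_sense_heritability(v_p: float, v_e: float) -> float:
    """h^2 = (V_P - V_E) / V_P."""
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    return (v_p - v_e) / v_p


def pve_percent(w_av_t: float, v_p: float) -> float:
    """PVE = (wAV_t / V_P) * 100, expressed as a percentage."""
    if v_p <= 0:
        raise ValueError("phenotypic variance must be positive")
    return w_av_t / v_p * 100.0


# Made-up variance components, for illustration only.
v_p = 100.0    # total phenotypic variance
v_e = 40.0     # residual error variance
w_av_t = 25.0  # weighted additive variance attributed to one QTL

print(narrow_sense_heritability(v_p, v_e))  # 0.6
print(pve_percent(w_av_t, v_p))             # 25.0
```

In this example the QTL accounts for 25% of the phenotypic variance, while all additive effects together account for 60%.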

Pipeline specifications

Software tools: FlexQTL, Haploview
Applications: WGS analysis, GWAS
Organisms: Arabidopsis thaliana, Homo sapiens