Computational protocol: Genome-Wide Association Mapping Uncovers Fw1, a Dominant Gene Conferring Resistance to Fusarium Wilt in Strawberry

Protocol publication

[…] For DNA isolation, newly emerging leaves were harvested from field-grown plants of the germplasm accessions and shade house-grown seedlings of S1 populations. Leaf tissue was placed into 1.1 ml tubes, freeze-dried in a Benchtop Pro (VirTis SP Scientific, Stone Bridge, NY), and ground using stainless steel beads in a Mini 1600 (SPEX Sample Prep, Metuchen, NJ). Genomic DNA (gDNA) was extracted from powdered leaf samples using the E-Z 96 Plant DNA Kit (Omega Bio-Tek, Norcross, GA, USA) according to the manufacturer’s instructions. To enhance the quality of the DNA and reduce polysaccharide carry-through, the protocol was modified with a Proteinase K treatment, a separate RNase treatment, an additional spin, and heated incubation steps during elution. DNA quantification was performed using QuantiFluor dye (Promega, Madison, WI) on a Synergy HTX (Biotek, Winooski, VT).

SNP genotyping with the Affymetrix IStraw35 Axiom Array was performed by Affymetrix (Santa Clara, CA) on a GeneTitan HT Microarray System using gDNA samples that passed quality and quantity control standards. SNP genotypes were automatically called with the Affymetrix Axiom Analysis Suite software (v1.1.1.66, Affymetrix, Santa Clara, CA). Samples with a call rate greater than 94% were retained. The quality metrics output by the Affymetrix Axiom Analysis Suite, custom R scripts, and the R-package ‘SNPRelate’ were used to filter SNPs; 14,408 SNPs with high-quality bi-allelic clusters and < 5% missing data were selected for subsequent analyses. The R-packages ‘SNPRelate’ and ‘GWASTools’ were used to generate genotypic input files for GWAS from raw genotyping reads. [...]

Type III analysis of variance (ANOVA) was performed using a mixed linear model for the lattice experiment design with incomplete and complete blocks as random effects and entries as fixed effects. Statistical analyses were performed using the R-packages ‘lme4’ and ‘car’.
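The two genotype filters stated above (samples with a call rate above 94%, SNPs with less than 5% missing data) can be illustrated with a minimal Python sketch. This is not the authors' code, which relied on the Axiom Analysis Suite, custom R scripts, and ‘SNPRelate’. The function names, the dosage encoding (0/1/2, with None for a no-call), and the order of the two filters (samples first, then SNPs) are assumptions made for illustration, and the cluster-quality filter is not modeled.

```python
from typing import List, Optional, Sequence, Tuple

Genotype = Optional[int]  # hypothetical encoding: 0, 1, 2 = allele dosage; None = no call


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of non-missing genotype calls in a row or column."""
    return sum(c is not None for c in calls) / len(calls)


def filter_genotypes(
    matrix: List[List[Genotype]],
    min_sample_call_rate: float = 0.94,  # "call rate greater than 94%"
    max_snp_missing: float = 0.05,       # "< 5% missing data"
) -> Tuple[List[int], List[int]]:
    """Return indices of retained samples (rows) and SNPs (columns).

    Assumption: samples are filtered first, and SNP missingness is then
    computed over the retained samples only. The source does not state the order.
    """
    kept_samples = [
        i for i, row in enumerate(matrix) if call_rate(row) > min_sample_call_rate
    ]
    n_snps = len(matrix[0]) if matrix else 0
    kept_snps = []
    for j in range(n_snps):
        column = [matrix[i][j] for i in kept_samples]
        if column and (1.0 - call_rate(column)) < max_snp_missing:
            kept_snps.append(j)
    return kept_samples, kept_snps
```

Calling `filter_genotypes(matrix)` with the default thresholds applies the cut-offs quoted in the text; the returned index lists can then be used to subset the genotype matrix.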
The recovery of intra-block error information from the lattice experiment designs was negligible and failed to increase efficiency relative to randomized complete block (RCB) experiment designs; hence, we used linear models for RCB experiment designs in subsequent analyses. Least square means for entries were estimated using the R-package ‘lsmeans’, with complete blocks as a random effect and entries as a fixed effect, and were subsequently used as phenotypic input for GWAS. REML variance components and broad-sense heritabilities were estimated using the R-package ‘lme4’, with entries, complete blocks, and years as random effects.

Because the germplasm collection we studied included numerous closely related individuals, we investigated and accounted for population structure using principal components analysis in related samples (PC-AiR) with ‘GENESIS’ (http://bioconductor.org/packages/release/bioc/vignettes/GENESIS/inst/doc/pcair.html). The p-value inflation factors (λ), ignoring population structure, were 3.09 for the 2016 and 3.69 for the 2017 GWAS experiments. We subsequently used the first three principal components (PCs) from PC-AiR as input for calculating the kinship matrix, which was done using PC-Relate in ‘GENESIS’. The resultant kinship matrix was used in a mixed linear model GWAS analysis, assuming a Gaussian distribution of the dependent variable. Wald tests were performed as implemented in ‘GENESIS’ using default parameters. Because an octoploid reference genome was unavailable, we used a diploid reference genome for F. vesca for GWAS, plotting p-values against physical positions (Mb). SNP probe sequences from the Affymetrix IStraw35 Axiom Array were physically mapped to the diploid reference genome using the Burrows-Wheeler Aligner (BWA v.0.7.15).
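For readers unfamiliar with the p-value inflation factor (λ) reported for these GWAS experiments, the following Python sketch shows the conventional definition: the median of the observed 1-df chi-square statistics divided by the median of the chi-square distribution with 1 df under the null (about 0.455). The source does not say how λ was computed, so this definition and the function name `inflation_factor` are assumptions for illustration.

```python
from statistics import NormalDist, median
from typing import Iterable

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(p_values: Iterable[float]) -> float:
    """Genomic inflation factor from two-sided p-values in (0, 1].

    Each p-value is converted to a 1-df chi-square statistic via the
    squared standard-normal quantile, then the median is compared with
    its expectation under the null hypothesis.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Values near 1 indicate that the test statistics follow their null distribution; values well above 1, such as those reported when population structure is ignored, indicate inflation.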
The ancestry-adjusted p-value inflation factors were 0.75 for the 2016 and 0.85 for the 2017 GWAS experiments, which suggested that the population structure corrections in the mixed linear model GWAS were effective. [...]

Because the parents and grandparents of the S1 populations were outbred, genetic mapping was performed using the full-sib mapping algorithm of JoinMap 4.1, which uses a general maximum-likelihood (ML) algorithm for simultaneously estimating linkage and linkage phases in full-sib families. Because we selfed a single individual descended from two outbred parents (grandparents of the S1 offspring), heterozygous loci were expected to segregate 1 AA: 2 AB: 1 BB in the Fronteras and Portola S1 populations, where A is the allele inherited from one grandparent and B is the allele inherited from the other grandparent. S1 individuals were genotyped with the Affymetrix IStraw35 Axiom Array. SNPs that produced co-dominant (bi-allelic) segregation patterns, as identified using the Affymetrix Axiom Analysis Suite, were selected for subsequent analyses. For genetic mapping, we identified and selected 5,673 SNPs in the Fronteras S1 population and 7,345 SNPs in the Portola S1 population. Numerous SNPs were in complete LD across the genome. To reduce the dimensions of the data and order loci more robustly, co-segregating SNPs were assigned to bins and one SNP from each bin was selected for inclusion in the analysis. Once linkage phases were estimated, SNPs were recoded according to the inferred linkage phase, analogous to an F2 population developed from a cross between inbred parents. Loci were grouped using a minimum logarithm of odds (LOD) threshold of 8.0 and ordered using the multi-point ML algorithm in JoinMap 4.1 with default parameters and three rounds of locus ordering. Genetic distances were calculated using the Kosambi mapping function.
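The Kosambi mapping function mentioned above converts a recombination fraction r into a map distance, d = 25 ln[(1 + 2r)/(1 − 2r)] in centimorgans. The Python sketch below illustrates the function and its inverse only. It is not a reimplementation of JoinMap 4.1, and the function names are hypothetical.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in centimorgans for a recombination fraction 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse mapping: recombination fraction for a distance in centimorgans."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)
```

For small r the distance is close to 100·r cM, while larger recombination fractions map to proportionally longer distances because the function allows for crossover interference.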
By cross-referencing previously mapped iStraw35 and iStraw90 SNPs, linkage groups identified in the present study were aligned with 28 previously described linkage groups. The linkage group numbers and orientations in those studies trace their origin to earlier work.

We assigned S1 offspring to resistant and susceptible phenotypic classes and tested the hypothesis of the segregation of a single gene using standard goodness-of-fit statistics. Offspring with Fusarium wilt scores < 2.5 were classified as resistant, whereas offspring with Fusarium wilt scores ≥ 2.5 were classified as susceptible. The observed segregation ratios were tested for goodness-of-fit to the expected segregation ratio of three resistant to one susceptible using Chi-square statistics with the R-function ‘chisq.test’.

Linkage groups were scanned for quantitative trait loci (QTL) using the interval mapping function in MapQTL 6. Several tightly linked SNPs on linkage group 2C, previously identified by GWAS, co-segregated with a QTL that was targeted in subsequent analyses. Significant SNP loci in the haploblock were individually used as fixed effects (independent variables) in linear model analyses to estimate additive (a) and dominance (d) effects, degree of dominance (d/a), and the proportion of the phenotypic variance associated with the additive and dominance effects of the SNP locus. SNPs were physically mapped in the diploid reference genome. We used linkage phases of SNP markers to infer the haplotypes of the parents (Fronteras and Portola). The inferred haplotypes were supported by the three-generation pedigree of Fronteras. The 05C165P001 parent of Fronteras was susceptible to Fusarium wilt and homozygous for the eight most significant SNPs, whereas the 04C018P004 parent was resistant to Fusarium wilt and heterozygous for the eight most significant SNPs (File S2). […]
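The single-gene test described above (classify offspring at a wilt score of 2.5, then test the counts against a 3 resistant : 1 susceptible ratio) can be sketched in Python as follows. The authors used the R function ‘chisq.test’; this sketch is only an illustration of the same goodness-of-fit calculation with 1 degree of freedom, and the function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def classify(scores: Iterable[float], threshold: float = 2.5) -> Tuple[int, int]:
    """Count resistant (score < threshold) and susceptible (score >= threshold) offspring."""
    scores = list(scores)
    resistant = sum(s < threshold for s in scores)
    return resistant, len(scores) - resistant


def chi_square_3_to_1(n_resistant: int, n_susceptible: int) -> Tuple[float, float]:
    """Chi-square goodness-of-fit statistic and p-value for a 3:1 ratio (1 df)."""
    n = n_resistant + n_susceptible
    observed = (n_resistant, n_susceptible)
    expected = (0.75 * n, 0.25 * n)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A large p-value is consistent with segregation of a single dominant gene, whereas a small p-value indicates that the observed counts depart from the 3:1 expectation.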

Pipeline specifications

Software tools: SNPRelate, GWASTools, BWA, JoinMap, MapQTL
Applications: WGS analysis, GWAS
Organisms: Fungi, Fusarium oxysporum, Fragaria vesca