Computational protocol: QTL mapping and candidate genes for resistance to Fusarium ear rot and fumonisin contamination in maize

Protocol publication

[…] Genomic DNA was extracted from fresh leaves using a NucleoSpin Plant II kit (Macherey-Nagel, Germany) following the manufacturer’s instructions. DNA was quantified using PicoGreen (Invitrogen, Carlsbad, CA) and normalized to 10 μL of 10 ng/μL (100 ng total) in 96-well plates. The protocol [] was followed to construct two libraries, containing 94 and 63 progenies, respectively, plus the two parental lines (i.e., a total of 96 and 65 samples per library), for sequencing on the Illumina HiSeq2000 platform (Illumina Inc., San Diego, CA). Each pool was run on a single flow cell lane using a 100 bp paired-end module on an Illumina HiSeq2000 instrument at Parco Tecnologico Padano, Lodi, Italy.

Raw 100 bp reads from the two Illumina HiSeq lanes were processed with FastQC to check the overall quality of the sequence data. The reads were processed with a custom demultiplexer to remove the barcode adapters and assign each read to the corresponding sample. Next, the raw sequencing data for each sample were processed with Trimmomatic [] to remove low-quality bases and sequencing adapters, using the parameters ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36. The filtered reads were mapped with BWA MEM [] to the Zea mays genome (AGP v3.20) downloaded from the Ensembl database, and the resulting BAM files were sorted and indexed using SamTools v.0.1.19 []. The sorted BAM files were processed with Freebayes v9.9.2 [], using parameters -m 30 -q 20 -R 0 -S 0, to perform variant calling across all the samples. Filtering was applied to exclude INDELs and retain only SNPs that were polymorphic between the parental lines and had less than 30% missing data. SNP names were abbreviated to the chromosome number followed by the physical position on the reference genome (AGP v3.20).

[...] Genetic linkage map construction was based on a dataset of 149 genotypes (95 SSR markers and 1,700 SNPs), excluding genotypes missing >30% marker data.
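The SNP filtering criteria described above (polymorphic between the parental lines, less than 30% missing data) can be sketched as a small predicate. This is a minimal illustration, not the authors' script: the function name `keep_snp`, the use of `None` for a missing call, the VCF-style genotype strings, and the choice to compute the missing fraction over progeny only are all assumptions.

```python
from typing import List, Optional

Call = Optional[str]  # e.g. "0/0", "0/1", "1/1"; None marks a missing call (assumed encoding)


def keep_snp(parent_a: Call, parent_b: Call,
             progeny_calls: List[Call],
             max_missing: float = 0.30) -> bool:
    """Return True if a SNP passes both filters named in the protocol.

    Assumptions (not stated in the source):
    - both parents must have a call, and "polymorphic" means their calls differ;
    - the missing-data fraction is computed over the progeny only.
    """
    if parent_a is None or parent_b is None:
        return False
    if parent_a == parent_b:
        return False  # monomorphic between parents: uninformative for mapping
    if not progeny_calls:
        return False
    n_missing = sum(call is None for call in progeny_calls)
    return n_missing / len(progeny_calls) < max_missing


# Hypothetical example: 1 missing call out of 4 progeny (25% missing).
example = keep_snp("0/0", "1/1", ["0/0", "1/1", None, "0/1"])
```

In practice the calls would come from the Freebayes VCF after INDELs have been removed; the parsing step is left out here on purpose.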
For each marker, the alleles of the CO354 and CO441 parents were encoded as A and B, respectively, in the data matrix used for linkage map construction and QTL analysis. Map construction was performed using the regression mapping algorithm of JoinMap 4.1 [], using linkages with a recombination frequency smaller than 0.5 and a LOD larger than 0 and keeping other default settings. The recombination frequencies were transformed into genetic distances in centiMorgans (cM) using Kosambi’s mapping function. Markers exhibiting segregation distortion were included in the map, whereas markers that mapped to positions inconsistent with the reference genome were excluded. Finally, 342 SNPs and 41 SSRs were clustered into ten LGs.

QTL analysis was performed using the MapQTL 6.0 software [] for each phenotypic dataset (i.e., trait in 1 year, sowing time and inoculation technique). Following a permutation test for each trait (1,000 permutations), genome-wide LOD scores corresponding to P = 0.05 were taken as significance thresholds for the detected QTLs. According to this criterion, the estimated LOD threshold for the FER trait was 4.3 for 2012B_F, 4.1 for 2011A-B_F and 2012A_F, and 4.2 in the other cases. The genome-wide significance threshold for the FB1 contamination trait was 4.1 for 2011A_T, 2011B_F-T and 2012A_F, 4.2 for 2011A_F and 2012B_T-F, and 3.9 for 2012A_T. The estimated LOD significance threshold for DTS was 4.2 in 2012, 4.1 in 2011A and 4.3 in 2011B. In a first analysis, the Interval Mapping approach was used to estimate the genomic interval of each QTL and its contribution to the phenotypic variance. Automatic Cofactor Selection (ACS) was then used to identify markers significantly associated with QTLs as candidate cofactors. Multiple-QTL Mapping (MQM) was carried out to resolve multiple QTLs occurring in the same LG.
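For reference, the Kosambi transformation mentioned above converts a recombination frequency r into a map distance of 25 · ln((1 + 2r) / (1 − 2r)) cM. The sketch below implements this standard formula and its inverse; the function names are illustrative and it is not a reproduction of JoinMap's internals.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in cM for a recombination frequency r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse transform: recombination frequency for a distance in cM."""
    if d_cm < 0:
        raise ValueError("distance must be non-negative")
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


# A recombination frequency of 0.10 corresponds to roughly 10.1 cM.
distance = kosambi_cm(0.10)
```

For small r the distance in cM is close to 100 · r; the function grows without bound as r approaches 0.5, which is why linkages at r = 0.5 cannot be converted.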
When a QTL associated with an ACS-validated cofactor marker showed a LOD lower than the significance threshold determined by the permutation test, this QTL was nevertheless considered “significant” if (a) another significant QTL was detected at the same position for the corresponding phenotypic trait in another year, sowing time or inoculation technique and (b) its LOD was less than 0.5 below the threshold. Additive and dominance effects were calculated at the cofactor position by MapQTL, according to the formulas (mu_A - mu_B)/2 and mu_H - [(mu_A + mu_B)/2], respectively, where mu_A, mu_B and mu_H are the estimated means of the distribution of the quantitative trait associated with the “A” genotype (CO354), “B” genotype (CO441) and “h” genotype, respectively. The maps of QTL positions, showing 1- and 2-LOD confidence intervals, were drawn using MapChart 2.1 software [].

The overlap of the 2-LOD confidence intervals of QTLs for the same trait defined the new integrated limits of the QTL, hereafter referred to as the “integrated QTL”. The nomenclature for the integrated QTLs followed a modified version of the rules proposed in [], and each QTL was designated with the code “qTc-LG.1”, where: q = quantitative trait; Tc = trait code (FER/FB1/DTS); LG = linkage group number; 1 = first chronological QTL for this trait reported on this LG, when more QTLs were detected in the same LG for the same trait. […]
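The effect formulas and the relaxed significance rule described above can be written out as follows. This is a hedged sketch rather than MapQTL's code: the function and parameter names are hypothetical, and treating a LOD exactly equal to the threshold as significant is an assumption, since the text only covers LODs lower than the threshold.

```python
def additive_effect(mu_a: float, mu_b: float) -> float:
    """(mu_A - mu_B) / 2, with A = CO354 and B = CO441 genotype means."""
    return (mu_a - mu_b) / 2.0


def dominance_effect(mu_a: float, mu_b: float, mu_h: float) -> float:
    """mu_H - (mu_A + mu_B) / 2, with H the heterozygous genotype mean."""
    return mu_h - (mu_a + mu_b) / 2.0


def is_significant(lod: float, threshold: float,
                   acs_cofactor: bool,
                   colocated_significant_qtl: bool,
                   margin: float = 0.5) -> bool:
    """Apply the permutation threshold, then the relaxed rule from the text.

    A sub-threshold QTL is accepted only if it is tied to an ACS-validated
    cofactor, (a) a significant QTL for the same trait sits at the same
    position in another dataset, and (b) the LOD falls short of the
    threshold by less than `margin`.
    """
    if lod >= threshold:  # assumption: equality counts as significant
        return True
    return (acs_cofactor
            and colocated_significant_qtl
            and (threshold - lod) < margin)


# Hypothetical genotype means, for illustration only.
a = additive_effect(10.0, 6.0)
d = dominance_effect(10.0, 6.0, 9.0)
```

With this sign convention, a positive additive effect means the CO354 allele is associated with a higher trait value than the CO441 allele.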

Pipeline specifications

Software tools: JoinMap, MapQTL
Application: WGS analysis
Organisms: Fusarium verticillioides, Zea mays