Computational protocol: A genome-wide SNP scan accelerates trait-regulatory genomic loci identification in chickpea

Protocol publication

[…] The 96-plex GBS libraries were constructed by digesting the genomic DNA of 92 chickpea accessions (association panel) with ApeKI and ligating the digested DNA to adapters containing one of 96 unique barcodes. The libraries were pooled and sequenced (100-bp single end) on an Illumina HiSeq2000 following Elshire et al. and Spindel et al. The high-quality FASTQ sequence reads (phred score >10) were de-multiplexed based on their unique barcodes, and the individual sequence reads of the 92 accessions were mapped to the reference draft genome sequences of desi (ICC 4958) and kabuli (CDC Frontier) chickpea using Bowtie v2.1.0. The sequence reads that remained unaligned with the desi and kabuli reference genomes were further analyzed individually using the de novo genotyping approach of STACKS v1.0. The sequence reads aligned and unaligned with each of the desi and kabuli genomes were processed using the reference-based GBS pipeline and the de novo genotyping approach of STACKS, respectively, to identify valid and high-quality SNPs (no sequencing errors, minimum sequence read depth of 10 and SNP base quality ≥20) in the 92 accessions. The structural and functional annotation of SNPs identified by the reference-based GBS approach in various coding (synonymous and non-synonymous SNPs) and non-coding sequence components of genes and genomes (chromosomes/pseudomolecules and scaffolds) was performed using the available desi (CGAP v1.0) and kabuli genome annotations. [...] The genome-wide SNP genotyping information (MAF ≥ 5%), robust field phenotyping data of three seed and pod yield-contributing traits (PN, SN and SW), ancestry coefficient data (Q matrix derived from population structure at the optimal population number) and relative kinship matrix (K) generated from the 92 chickpea accessions (association panel) were analysed using the general linear model (GLM, Q model)- and mixed linear model (MLM, Q + K model)-based approaches of TASSEL.
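The MAF ≥ 5% criterion applied to the SNP genotyping information above can be illustrated with a short sketch. This is not the filtering code used in the study (the protocol does not say which tool applied the filter); it is a minimal example that assumes biallelic SNPs, diploid genotypes written as two-letter strings, and missing calls stored as None. The function names and the toy SNP identifiers are hypothetical.

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one SNP.

    genotypes: list of diploid calls such as "AA", "AG", "GG";
    None marks a missing call (assumed encoding, not from the protocol).
    """
    counts = Counter()
    for call in genotypes:
        if call is None:
            continue  # skip missing genotypes
        counts.update(call)  # count each of the two alleles
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no data or monomorphic site
    # Assuming a biallelic SNP, the minor allele is the second most common.
    return sorted(counts.values(), reverse=True)[1] / total

def filter_by_maf(snps, threshold=0.05):
    """Keep SNPs whose minor allele frequency is at least `threshold`."""
    return {
        snp_id: calls
        for snp_id, calls in snps.items()
        if minor_allele_frequency(calls) >= threshold
    }

# Toy data with hypothetical SNP identifiers.
toy_snps = {
    "snp_a": ["AA"] * 8 + ["GG"] * 2,        # MAF = 4/20
    "snp_b": ["CC"] * 19 + ["CT"],           # MAF = 1/40, below threshold
    "snp_c": ["TT"] * 10,                    # monomorphic
    "snp_d": ["AA", None, "AG", "GG", "GG"], # one missing call
}
kept = filter_by_maf(toy_snps, threshold=0.05)
print(sorted(kept))
```

Only SNPs passing this kind of filter would be carried forward into the association models; rare variants are typically excluded because their effect estimates are unstable in a panel of 92 accessions.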
Additionally, principal component analysis (PCA) integrated with the efficient mixed-model association (EMMA; P + K, K and Q + K models) and P3D/compressed mixed linear model (CMLM) interfaces of GAPIT was utilised for GWAS. The distribution of observed −log10(P) values for the SNP marker-trait associations was compared individually with the expected distribution using the quantile-quantile plots of GAPIT. The P-value threshold of significance for each trait was corrected for multiple comparisons based on the false discovery rate (FDR cut-off ≤ 0.05). By combining the four model-based outputs of TASSEL and GAPIT, the SNP loci in the target genomic (gene) regions (significant LD regions) that contributed significantly to phenotypic variation of the three agronomic traits, with the highest R² (magnitude of marker-trait association) and lowest FDR-adjusted P-values (threshold P < 2 × 10⁻⁴), were identified. For large-scale validation of the identified SNP marker-trait associations, the high-throughput genotyping data (Illumina GoldenGate assay) of 96 SNPs (including strongly trait-associated SNPs) in 211 chickpea minicore accessions were correlated with their field phenotyping information for the three agronomic traits under study, following the aforementioned GWAS methods. [...] For genetic linkage map construction, 384 SNPs (physically mapped across eight chromosomes) showing polymorphism between two parental accessions (ICC 6013 and ICC 7346) were genotyped in 283 F4 segregating mapping individuals using a MALDI-TOF SNP genotyping assay following Saxena et al. The SNP genotyping information was analysed in JoinMap 4.1 at a higher LOD (logarithm of odds) threshold (>4.0) using the Kosambi mapping function.
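For readers unfamiliar with the Kosambi mapping function named above, the sketch below shows the standard formula that converts a recombination fraction r into a map distance in centimorgans, d = 25 · ln((1 + 2r) / (1 − 2r)), together with its inverse. It illustrates the formula only; it does not reproduce JoinMap's internal computation, and the function names are hypothetical.

```python
import math

def kosambi_cm(r):
    """Map distance in cM from recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def kosambi_r(d_cm):
    """Recombination fraction from a Kosambi map distance in cM (inverse)."""
    return 0.5 * math.tanh(d_cm / 50.0)

# Example: distances for a few recombination fractions.
for r in (0.01, 0.10, 0.25):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For small r the distance is close to 100 · r cM; as r approaches 0.5 the distance grows without bound, which reflects the function's allowance for crossover interference.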
The SNPs were mapped on eight linkage groups (LGs) of an intra-specific chickpea genetic map, and the LGs were designated LG1 to LG8 according to the corresponding physical positions (bp) of the SNPs on chromosomes, as determined in our study. For QTL mapping, the genotyping data of parental polymorphic SNPs genetically mapped on the eight LGs and the field phenotyping information (PN, SN and SW) of 283 F4 mapping individuals and parental genotypes were analysed using the composite interval mapping (CIM) function (LOD > 4.0 with 1000 permutations and p ≤ 0.05) of MapQTL 6. The percentage of phenotypic variation explained (PVE) by significant QTLs (R²) was estimated to identify and map the novel major genomic regions harbouring QTLs associated with the three agronomic traits in chickpea. […]
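The idea behind a permutation-derived LOD threshold and the R²-based PVE estimate can be sketched as follows. This is a deliberately simplified stand-in: it uses single-marker linear regression rather than composite interval mapping (no cofactors, no interval positions), and it is not MapQTL's algorithm. Genotypes are assumed to be coded as 0/1/2 allele counts; all function names and the toy data are hypothetical.

```python
import math
import random

def marker_lod(genos, phenos):
    """LOD score and R^2 for regressing phenotype on one marker (0/1/2 coding)."""
    n = len(phenos)
    mean_x = sum(genos) / n
    mean_y = sum(phenos) / n
    sxx = sum((x - mean_x) ** 2 for x in genos)
    syy = sum((y - mean_y) ** 2 for y in phenos)
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # monomorphic marker or constant phenotype
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genos, phenos))
    r2 = min(sxy * sxy / (sxx * syy), 1.0 - 1e-12)  # guard against log10(0)
    lod = -(n / 2.0) * math.log10(1.0 - r2)  # LOD = (n/2) * log10(RSS0 / RSS1)
    return lod, r2

def permutation_threshold(marker_genos, phenos, n_perm=1000, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of the genome-wide maximum LOD
    obtained after shuffling phenotypes relative to genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenos)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        max_lods.append(max(marker_lod(g, shuffled)[0] for g in marker_genos))
    max_lods.sort()
    idx = max(0, min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1))
    return max_lods[idx]

# Toy example: 20 individuals, one associated marker and one unrelated marker.
linked = [0] * 10 + [2] * 10
unlinked = [0, 2] * 10
phenos = [10.0 + (i % 3) for i in range(10)] + [20.0 + (i % 3) for i in range(10)]

lod, r2 = marker_lod(linked, phenos)
threshold = permutation_threshold([linked, unlinked], phenos, n_perm=200)
print(f"LOD = {lod:.2f}, PVE = {100 * r2:.1f}%, permutation threshold = {threshold:.2f}")
```

In this single-marker setting the PVE and the LOD are linked by R² = 1 − 10^(−2·LOD/n), so a QTL is declared significant when its LOD exceeds the permutation threshold, and its R² is then reported as the percentage of phenotypic variation explained.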

Pipeline specifications

Software tools: GAPIT, JoinMap, MapQTL
Applications: WGS analysis, GWAS
Organisms: Cicer arietinum