Computational protocol: New Genetic Loci Associated with Preharvest Sprouting and Its Evaluation Based on the Model Equation in Rice

Protocol publication

[…] Twenty-one samples (japonica, 14; indica/tongil, 7; Table) with similar flowering times (early August) were selected from the diverse rice genetic resources as representatives of PHS resistance at 42 DAF (PHS-resistant representatives: PHS < 20%; PHS-susceptible representatives: PHS > 40%), because environmental noise under field conditions might reduce the resolution for detecting PHS-associated genetic loci. The samples were re-sequenced on the Illumina HiSeq 2500 platform to identify genome-wide variations and to detect genetic signals associated with PHS.
Genomic DNA was extracted from the 21 rice samples using a Gentra Puregene Cell Kit for Plants (Qiagen, Hilden, Germany). Library construction and sequencing were performed following Illumina’s official protocol with a 101 bp paired-end read length. Trimmomatic-0.33 was used to remove adapters and low-quality reads. The adapter-free, trimmed reads were mapped to the reference genome of Oryza sativa L. (IRGSP-1.0.27) using Bowtie2 with default parameters. The mapped reads were assigned to read groups and sorted using Picard version 1.138. Picard was also used to remove potential PCR duplicates and to repair mismatches between each read and its mate pair. Genome Analysis Toolkit (GATK) version 3.4.46 was used for the subsequent local realignment, base quality recalibration, and variant calling steps. Local realignment of reads was carried out to correct misalignments caused by the presence of InDels. Base quality recalibration was performed to compensate for errors in the reported base qualities, based on empirical measurements. For variant calling, the GATK tools “UnifiedGenotyper” and “SelectVariants” were used.
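The read-processing steps described above can be sketched as a list of command lines. This is a minimal illustration, not the authors' actual commands: it only builds the commands and does not run them. All file names, jar paths, read-group values, the known-sites VCF needed by base recalibration, and the Trimmomatic thresholds are hypothetical placeholders, because the protocol does not state them. It also handles one sample at a time, whereas the study may have called variants jointly.

```python
# Sketch only: builds (does not execute) per-sample command lines.
# Every file name, jar path, and threshold below is a hypothetical placeholder.

def build_pipeline(sample, ref="IRGSP-1.0.fa", index="IRGSP-1.0",
                   known_sites="known_sites.vcf"):
    s = sample
    trimmomatic = ["java", "-jar", "trimmomatic-0.33.jar"]
    picard = ["java", "-jar", "picard.jar"]            # Picard 1.x KEY=value syntax
    gatk = ["java", "-jar", "GenomeAnalysisTK.jar", "-R", ref]  # GATK 3.x syntax
    return [
        # 1. Remove adapters and low-quality reads (paired-end mode);
        #    the trimming settings are illustrative, not taken from the protocol
        trimmomatic + ["PE", f"{s}_R1.fastq.gz", f"{s}_R2.fastq.gz",
                       f"{s}_R1.paired.fq.gz", f"{s}_R1.unpaired.fq.gz",
                       f"{s}_R2.paired.fq.gz", f"{s}_R2.unpaired.fq.gz",
                       "ILLUMINACLIP:TruSeq3-PE.fa:2:30:10",
                       "SLIDINGWINDOW:4:15", "MINLEN:36"],
        # 2. Map trimmed read pairs with Bowtie2 default parameters
        ["bowtie2", "-x", index, "-1", f"{s}_R1.paired.fq.gz",
         "-2", f"{s}_R2.paired.fq.gz", "-S", f"{s}.sam"],
        # 3. Assign read groups and sort by coordinate
        picard + ["AddOrReplaceReadGroups", f"I={s}.sam", f"O={s}.rg.bam",
                  "SORT_ORDER=coordinate", f"RGID={s}", f"RGLB={s}",
                  "RGPL=illumina", f"RGPU={s}", f"RGSM={s}"],
        # 4. Remove potential PCR duplicates
        picard + ["MarkDuplicates", f"I={s}.rg.bam", f"O={s}.dedup.bam",
                  f"M={s}.dup_metrics.txt", "REMOVE_DUPLICATES=true"],
        # 5. Repair mate-pair information and index the BAM for GATK
        picard + ["FixMateInformation", f"I={s}.dedup.bam",
                  f"O={s}.fixed.bam", "CREATE_INDEX=true"],
        # 6. Local realignment around InDels (two GATK steps)
        gatk + ["-T", "RealignerTargetCreator", "-I", f"{s}.fixed.bam",
                "-o", f"{s}.intervals"],
        gatk + ["-T", "IndelRealigner", "-I", f"{s}.fixed.bam",
                "-targetIntervals", f"{s}.intervals",
                "-o", f"{s}.realigned.bam"],
        # 7. Base quality recalibration (needs a known-sites VCF; hypothetical here)
        gatk + ["-T", "BaseRecalibrator", "-I", f"{s}.realigned.bam",
                "-knownSites", known_sites, "-o", f"{s}.recal.table"],
        gatk + ["-T", "PrintReads", "-I", f"{s}.realigned.bam",
                "-BQSR", f"{s}.recal.table", "-o", f"{s}.recal.bam"],
        # 8. Variant calling, then selection of SNPs
        gatk + ["-T", "UnifiedGenotyper", "-I", f"{s}.recal.bam",
                "-glm", "BOTH", "-o", f"{s}.raw.vcf"],
        gatk + ["-T", "SelectVariants", "-V", f"{s}.raw.vcf",
                "-selectType", "SNP", "-o", f"{s}.snps.vcf"],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("sample01"):
        print(" ".join(cmd))
```

Keeping each step as an argument list makes it straightforward to pass to `subprocess.run` later. The reference FASTA would also need an index and a sequence dictionary before the GATK steps can run.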
Finally, the identified variants were filtered with the GATK “VariantFiltration” tool based on the following exclusion criteria: (1) more than 4 reads with a mapping quality of zero (MQ0 > 4) and MQ0/(1.0 × DP) higher than 0.1, where DP is the unfiltered read depth; (2) FS higher than 200, to reduce false positives; and (3) a Phred-scaled quality score lower than 30.
To identify SNPs specific to the PHS groups (PHS-resistant and PHS-susceptible), Fisher’s exact test as implemented in SnpSift was applied to the resulting genotype count data, in contrast to the linear or mixed models generally used for genome-wide association studies (GWAS). The test used the PHS group assignments and the genotype data arranged into 2 × 2 contingency tables under the dominant and recessive models. To minimize the false-positive rate, Bonferroni correction for multiple testing was applied. SNPs were considered significant when the p-value adjusted for the total number of tests was below 0.05. Significantly enriched SNPs were annotated with SnpEff. […]
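The filtering criteria and the association test above can be illustrated with a short, self-contained sketch. It is not SnpSift's or GATK's implementation. It assumes that criterion (1) is a joint condition and that a variant is excluded when any one of the three criteria is met. The function names and the example genotype counts are hypothetical.

```python
from math import comb


def is_filtered_out(qual, fs, mq0, dp):
    """Return True if a variant meets any exclusion criterion from the text."""
    # Criterion (1), assumed joint: MQ0 > 4 AND MQ0 / (1.0 * DP) > 0.1
    excess_mq0 = mq0 > 4 and dp > 0 and mq0 / (1.0 * dp) > 0.1
    # Criterion (2): FS > 200; criterion (3): Phred-scaled quality < 30
    return excess_mq0 or fs > 200 or qual < 30


def collapse(counts, model):
    """Collapse (hom_ref, het, hom_alt) counts into (carrier, non_carrier)."""
    hom_ref, het, hom_alt = counts
    if model == "dominant":      # one alternate allele is enough
        return het + hom_alt, hom_ref
    if model == "recessive":     # two alternate alleles are required
        return hom_alt, hom_ref + het
    raise ValueError(f"unknown model: {model}")


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value: raw p times the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical genotype counts (hom_ref, het, hom_alt) for one SNP
    resistant, susceptible = (0, 1, 9), (10, 1, 0)
    n_tests = 1_000_000  # hypothetical total number of tests
    for model in ("dominant", "recessive"):
        a, b = collapse(resistant, model)
        c, d = collapse(susceptible, model)
        p = fisher_exact_two_sided(a, b, c, d)
        print(model, p, bonferroni(p, n_tests) < 0.05)
```

With only 21 samples, the smallest attainable exact p-value is bounded by the group sizes, so whether any SNP can pass a Bonferroni threshold depends strongly on the total number of tests.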

Pipeline specifications

Software tools: Trimmomatic, Bowtie2, Picard, GATK, SnpSift, SnpEff
Databases: IRGSP
Application: GWAS
Organisms: Oryza sativa, Oryza sativa Japonica Group