Computational protocol: Japanese genome‐wide association study identifies a significant colorectal cancer susceptibility locus at chromosome 10p14

[…] In phase 1, the 577 CRC cases and 571 controls were genotyped using the Affymetrix GeneChip Human Mapping 500K Array Set according to the manufacturer's protocols. The equal number of patient and control samples enabled us to analyze the genotype and phenotype independently.

In phase 2.1 (fast‐track second screening), we focused on the 100 highest‐ranked SNP from the whole array. From those 100 SNP, we excluded SNP with a minor allele frequency (MAF) <0.1 and selected tag SNP to avoid overlap and allelic imbalance, leaving 62 SNP. These were confirmed by PCR in phase 2.1 by screening a subset of 1181 cases and 1617 controls.

In phase 2.2 (full second screening), 480 CRC and 480 control samples were genotyped at the 1536 best SNP (allelic P < 0.013) using the Illumina GoldenGate Assay. When multiple SNP displayed strong linkage disequilibrium (LD) with each other (r² > 0.8), only the most closely associated SNP was chosen, to avoid redundancy among the 1536 SNP. Samples with a genotype call rate <0.98 were excluded from the association analysis, as were SNP with a call rate <0.98, SNP in Hardy–Weinberg disequilibrium in the controls (P < 1.0 × 10⁻⁴), and SNP with a MAF <0.05.

For the 21 SNP that showed an allelic P < 0.01 in phase 2.2, genotyping with the TaqMan method was performed in 789 CRC cases and 1624 controls in phase 3. [...] Genotype data cleaning and pairwise identity‐by‐descent (IBD) analysis were carried out using the PLINK software (version 1.06). We used the Haploview software (v3.2) to establish the LD structure on chromosome 10p14. […]
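
The phase 2.2 SNP exclusion criteria and the allelic test can be illustrated with a minimal Python sketch. It is not the authors' pipeline (they used PLINK): the function names, the 0/1/2 genotype coding with `None` for a missing call, the use of a chi-square approximation for the Hardy–Weinberg test, and the choice to compute call rate and MAF over cases and controls combined are all assumptions made for illustration. Only the thresholds (0.98, 1.0 × 10⁻⁴, 0.05) come from the text.

```python
import math

def chi2_1df_pvalue(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def call_rate(genotypes):
    """Fraction of non-missing calls (genotypes are 0/1/2 or None)."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def allele_counts(genotypes):
    """Return (A, B) allele counts, where each genotype is the number of B alleles."""
    called = [g for g in genotypes if g is not None]
    b = sum(called)
    return 2 * len(called) - b, b

def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele; assumes at least one called genotype."""
    a, b = allele_counts(genotypes)
    return min(a, b) / (a + b)

def hwe_pvalue(genotypes):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions.

    Assumption: a 1-df chi-square approximation, not an exact test.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    q = sum(called) / (2 * n)  # frequency of allele B
    p = 1.0 - q
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [called.count(k) for k in (0, 1, 2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2_1df_pvalue(chi2)

def allelic_pvalue(cases, controls):
    """Chi-square test on the 2x2 table of allele counts (cases vs controls)."""
    a1, b1 = allele_counts(cases)
    a0, b0 = allele_counts(controls)
    denom = (a1 + b1) * (a0 + b0) * (a1 + a0) * (b1 + b0)
    if denom == 0:  # monomorphic SNP or an empty group: no evidence of association
        return 1.0
    n = a1 + b1 + a0 + b0
    return chi2_1df_pvalue(n * (a1 * b0 - b1 * a0) ** 2 / denom)

def snp_passes_qc(cases, controls, min_call_rate=0.98, min_maf=0.05, hwe_alpha=1e-4):
    """Apply the three SNP-level filters; HWE is checked in controls only."""
    everyone = cases + controls
    return (call_rate(everyone) >= min_call_rate
            and minor_allele_frequency(everyone) >= min_maf
            and hwe_pvalue(controls) >= hwe_alpha)
```

A SNP would then be carried forward to phase 3 only if `snp_passes_qc(cases, controls)` holds and `allelic_pvalue(cases, controls)` falls below 0.01, mirroring the selection of the 21 SNP described above. Sample-level call-rate filtering and LD-based pruning are left out of this sketch.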

