Computational protocol: Weighted SNP Set Analysis in Genome-Wide Association Study

Protocol publication

[…] We apply the three methods to a real GWAS dataset on genetic susceptibility to non-small cell lung cancer (NSCLC). The population has been described in detail previously. The dataset includes 1,473 NSCLC cases and 1,962 controls. DNA was extracted from whole blood and genotyped on the Affymetrix 6.0 Quad chip. A total of 570,373 SNPs passed general quality control (QC). We extracted two regions from the dataset. The first is a 67 kb region in 5p15.33 comprising 8 SNPs in a window extending 20 kb upstream and downstream of the CLPTM1L gene; 4 of these SNPs have a minor allele frequency (MAF) below 20%. This gene has been reported to be associated with smoking behavior and NSCLC. The second is a region of about 208.4 kb in 6p21.32–21.33 comprising 15 SNPs and covering the genes TNXB, FKBPL and PPT2; 12 of these SNPs have a MAF below 20%. PPT2 has been associated with pulmonary function, and TNXB gene expression has been reported to be associated with lung squamous cell cancer. FKBPL has been proposed as a novel prognostic and predictive biomarker. Each of the two regions is then analyzed by wPCA, LKM and PCA.

Datasets are generated using R packages (version 2.13.0) and PLINK. Analyses of the simulated datasets are performed using R packages. The SKAT package is used to conduct the LKM analysis. […]
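The region extraction described above (SNPs in a window of 20 kb on either side of a gene, with a count of those whose MAF is below 20%) can be illustrated with a minimal sketch. This is not the authors' code, which relied on R and PLINK. The `Snp` class, the function names, the toy coordinates and the toy genotypes are all hypothetical, and the sketch assumes genotypes are coded as 0, 1 or 2 copies of one allele and that the window includes the gene body.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Snp:
    """Hypothetical container for one SNP (not a PLINK or SKAT type)."""
    snp_id: str
    position: int                        # base-pair position on the chromosome
    genotypes: Sequence[Optional[int]]   # 0/1/2 allele counts, None if missing


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """Return the MAF, ignoring missing genotype calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Frequency of the counted allele; each sample carries two alleles.
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p)


def snps_in_window(snps: List[Snp], gene_start: int, gene_end: int,
                   flank: int = 20_000) -> List[Snp]:
    """Keep SNPs between gene_start - flank and gene_end + flank (inclusive)."""
    lo, hi = gene_start - flank, gene_end + flank
    return [s for s in snps if lo <= s.position <= hi]


def count_low_maf(snps: List[Snp], threshold: float = 0.20) -> int:
    """Count SNPs whose MAF is strictly below the threshold."""
    return sum(1 for s in snps
               if minor_allele_frequency(s.genotypes) < threshold)


# Toy data: invented positions and genotypes for four samples.
toy_snps = [
    Snp("snpA", 85_000, [0, 0, 1, 0]),
    Snp("snpB", 110_000, [1, 2, 1, 0]),
    Snp("snpC", 146_000, [2, 2, 2, 1]),
    Snp("snpD", 150_000, [0, 1, 0, 0]),   # falls outside the window
]

# Hypothetical gene spanning 100,000 to 127,000 bp.
region = snps_in_window(toy_snps, gene_start=100_000, gene_end=127_000)
n_low = count_low_maf(region)
print(len(region), "SNPs in window,", n_low, "with MAF below 20%")
```

With real data the same selection would more likely be done directly in PLINK or R; the sketch only makes the window and MAF bookkeeping explicit.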

Pipeline specifications

Software tools: PLINK, SKAT
Application: GWAS
Diseases: Carcinoma, Non-Small-Cell Lung, Ichthyosis Bullosa of Siemens