Computational protocol: An integrated expression phenotype mapping approach defines common variants in LEP, ALOX15 and CAPNS1 associated with induction of IL-6

[…] LD estimates and haplotypes were computed using Haploview 3.3.2. For volunteers of Caucasian ancestry, genotyped SNPs with a minor allele frequency (MAF) greater than 5% were used to infer the underlying haplotypic structure, with LD blocks predicted by the Confidence Intervals algorithm as previously described. An accelerated expectation-maximization algorithm was used to obtain highly accurate maximum-likelihood estimates of the population frequencies of the phased haplotypes from unphased input. [...] PLINK was the primary software used to analyse the data sets in this study. Statistics were verified using SPSS, R and SNPTEST, and concordant results were obtained. Standard QC measures excluded individuals with more than 10% missing genotypes (MIND > 0.1), SNPs with more than 10% missing calls (GENO > 0.1) and SNPs with MAF < 0.01. After frequency and genotyping pruning, and after removing individuals with low genotyping success rates, the total genotyping rate in the remaining individuals was more than 98.8% for the Illumina humanCVD beadchip. PLINK was used to perform quantitative trait analysis, which generates a P-value using the Wald test. For each SNP, PLINK generates a phenotypic mean for each of the three genotypic states and compares these means using the Wald test statistic to generate a P-value. The Wald test is especially useful in this instance because it does not require the data to fit a normal distribution. The covariates age, ethnicity and sex were included as additional terms in the PLINK analysis to further interrogate observed associations. Non-parametric statistics and log transformation were applied where data were not normally distributed; otherwise, where data passed tests of normality, unpaired t-tests were used to analyse expression data for specific SNPs. Permutation analysis was performed using a label-swapping procedure, which swaps phenotypes while retaining the genotype linkage structure.
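The QC thresholds and the per-SNP Wald test described in the paragraph above can be illustrated with a minimal Python sketch. This is not PLINK's implementation. The function names, the in-memory genotype layout (allele counts 0, 1, 2, with None for a missing call) and the order in which the filters are applied are assumptions made for illustration. The P-value uses the large-sample normal approximation to keep the example dependency-free, whereas a t distribution would normally be used for small samples.

```python
import math


def qc_filter(genotypes, mind=0.1, geno=0.1, maf=0.01):
    """Return (kept_samples, kept_snps) as index lists.

    genotypes[i][j] is the allele count (0, 1 or 2) of sample i at SNP j,
    or None for a missing call. Hypothetical layout, not PLINK's file format.
    """
    n_snps = len(genotypes[0])
    # Assumed order: drop poorly genotyped samples first, then filter SNPs.
    samples = [i for i, row in enumerate(genotypes)
               if sum(g is None for g in row) / n_snps <= mind]
    snps = []
    for j in range(n_snps):
        calls = [genotypes[i][j] for i in samples]
        observed = [g for g in calls if g is not None]
        if not observed:
            continue  # no genotype was called at this SNP
        missing_rate = 1 - len(observed) / len(calls)
        freq = sum(observed) / (2 * len(observed))
        minor_freq = min(freq, 1 - freq)
        if missing_rate <= geno and minor_freq >= maf:
            snps.append(j)
    return samples, snps


def wald_test(dosages, phenotypes):
    """Regress a quantitative trait on allele count; return (beta, se, z, p)."""
    pairs = [(g, y) for g, y in zip(dosages, phenotypes)
             if g is not None and y is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete observations")
    mean_g = sum(g for g, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((g - mean_g) ** 2 for g, _ in pairs)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in pairs)
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in pairs)
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        return beta, se, math.inf, 0.0  # perfect fit, degenerate case
    z = beta / se  # Wald statistic: estimate divided by its standard error
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return beta, se, z, p
```

A typical use would call `qc_filter` once on the genotype matrix and then apply `wald_test` to each retained SNP column against the IL-6 expression values.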
An adaptive algorithm was used, which speeds computation by dropping, at each iteration, SNPs that are less likely to be significant. Analysis was performed with one million permutations using a within-cluster algorithm that swaps labels only within each population cluster. The analysis was repeated without any restrictions. The empirical P-value is robust with respect to normality of phenotype and multiple testing issues. Haplotypic association analysis was also performed using the haplotypes defined as described earlier. A linear regression model was used to estimate the contribution of the most strongly associated SNPs to IL-6 expression. Potential confounding factors (age, sex and ethnicity) were then added to the core model. Factors that did not contribute significantly to the fit of the model on likelihood ratio testing (significance threshold P < 0.05) were excluded. Linear regression analysis was carried out using STATA v10 software. […]
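The within-cluster label-swapping procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the PLINK routine: it runs a fixed number of permutations (the adaptive early-stopping step is omitted), the function names are hypothetical, and the empirical P-value uses the common (R + 1) / (N + 1) convention, where R is the number of permuted statistics at least as extreme as the observed one. Because genotypes are never shuffled, the linkage structure between SNPs is left intact.

```python
import random


def permute_within_clusters(phenotypes, clusters, rng):
    """Shuffle phenotype labels separately inside each population cluster."""
    permuted = list(phenotypes)
    by_cluster = {}
    for idx, cluster in enumerate(clusters):
        by_cluster.setdefault(cluster, []).append(idx)
    for indices in by_cluster.values():
        values = [phenotypes[i] for i in indices]
        rng.shuffle(values)
        for i, value in zip(indices, values):
            permuted[i] = value
    return permuted


def abs_covariance(dosages, phenotypes):
    """Absolute genotype-phenotype cross-product.

    Permutation only reorders the phenotypes, so their overall variance is
    fixed and this statistic ranks permutations the same way as the absolute
    regression slope or its Wald statistic would.
    """
    mean_g = sum(dosages) / len(dosages)
    return abs(sum((g - mean_g) * y for g, y in zip(dosages, phenotypes)))


def empirical_p(dosages, phenotypes, clusters, n_perm=10000, seed=0):
    """Empirical P-value for one SNP from within-cluster permutations."""
    rng = random.Random(seed)  # seeded for reproducibility
    observed = abs_covariance(dosages, phenotypes)
    hits = 0
    for _ in range(n_perm):
        permuted = permute_within_clusters(phenotypes, clusters, rng)
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs_covariance(dosages, permuted) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Passing the same cluster label for every individual gives the unrestricted version of the test, in which labels may be swapped across the whole sample.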

Pipeline specifications

Software tools: Haploview, UNPHASED, PLINK, SNPTEST
Application: GWAS
Chemicals: Lipopolysaccharides, Nucleotides