Computational protocol: Genome-wide association reveals genetic effects on human Aβ42 and τ protein levels in cerebrospinal fluids: a case control study

Protocol publication

[…] Quality control for the genotyping data was performed using PLINK[] as follows. 498,205 SNPs were retained after excluding SNPs with minor allele frequency (MAF) < 5%, call rate < 98%, or significant deviation in the Hardy-Weinberg equilibrium test (p ≤ 10⁻⁶). All samples had a genotyping call rate > 95% and were retained. We then examined population stratification by visual inspection of the first two dimensions from principal components analysis, using SmartPCA from EIGENSTRAT[,]. Self-reported ethnicity and racial identities for ADNI subjects were used to highlight samples in the PCA plot and are summarized in Additional file . 390 samples were retained after SmartPCA excluded 20 samples as outliers. We computed the top five principal component coordinates using SmartPCA to correct for stratification in association analysis. SmartPCA removed all but one of the Asian samples and retained Black/African Americans (Additional file ). Visual inspection suggested that the first principal component (PC0), which explains the most variance in the data, separates the Caucasians and non-Caucasians reasonably well (setting the threshold PC0 ≤ 0.01 can exclude all non-Caucasians). Finally, we excluded 52 non-Caucasian samples as outliers; the genomic control variance inflation factor λ was 1.00983, suggesting minimal population admixture in the final sample used for association analysis. We performed association analysis using age and APOE ε4 genotype as covariates and did not incorporate principal components. Quantile-quantile plots for each of the three test groups with log10-transformed levels of the three CSF biomarkers (Additional file and Additional file ) suggested that population stratification introduced negligible bias in the genetic associations (Additional file ). Finally, for all analyses described in the following methods sections, the study sample of 390 individuals with three CSF biomarkers was used after removing 20 outliers.
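
The genomic control inflation factor λ mentioned above is conventionally the median of the observed 1-df association χ² statistics divided by the median expected under the null (about 0.455). The following is a minimal sketch of that calculation, not the authors' code: the function name `genomic_inflation_factor` is hypothetical, and it assumes the input is a list of two-sided p-values from 1-df tests.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549),
# obtained as the square of the 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation_factor(p_values):
    """Return lambda = median(observed chi-square) / median(expected chi-square).

    Hypothetical helper: assumes two-sided p-values from 1-df association tests.
    """
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # Convert the two-sided p-value back to a 1-df chi-square statistic.
        z = _STD_NORMAL.inv_cdf(p / 2.0)
        chi2_stats.append(z * z)
    return median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced p-values mimic a null distribution with no inflation.
    n = 1001
    null_p = [(i + 0.5) / n for i in range(n)]
    print(round(genomic_inflation_factor(null_p), 4))
```

A value close to 1, such as the 1.00983 reported here, indicates that the test statistics are not systematically inflated; values well above 1 would point to residual stratification or other confounding.
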
The levels of the three CSF biomarkers were log10-transformed. [...] We carried out gene ontology analysis of the SNP association results using ALIGATOR (Association LIst Go AnnoTatOR) [] to find gene-ontology terms enriched with significant SNPs. We ran ALIGATOR with a SNP p-value cutoff of < 10⁻³, 5000 replicate gene lists and 1000 permutations. We examined the top associated SNPs, as well as nearby SNPs in linkage disequilibrium (LD) with them that are associated with gene expression in published eQTL studies [-]. […]
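
As an illustration of the association model described in this protocol (log10-transformed biomarker level regressed on genotype, with age and APOE ε4 genotype as covariates), here is a minimal ordinary-least-squares sketch. It is not the authors' pipeline. It assumes an additive model with SNP dosage coded 0/1/2 and APOE ε4 coded as an allele count, neither of which is stated in the source, and the function names `solve_linear_system` and `fit_additive_model` are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_additive_model(levels, dosages, ages, apoe4_counts):
    """Fit log10(level) = b0 + b1*dosage + b2*age + b3*apoe4 by least squares.

    Returns [b0, b1, b2, b3]. The coding of dosage (0/1/2) and APOE e4
    (allele count) is an assumption made for this sketch.
    """
    y = [math.log10(v) for v in levels]
    rows = [[1.0, float(g), float(a), float(e)]
            for g, a, e in zip(dosages, ages, apoe4_counts)]
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

In practice each SNP would be fitted in turn and the test statistic for the dosage coefficient retained; this sketch only returns the coefficient estimates and omits standard errors and p-values.
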

Pipeline specifications

Software tools: PLINK, EIGENSOFT, GenePath, ALIGATOR
Databases: ADNI
Applications: Population genetic analysis, GWAS
Organisms: Homo sapiens
Diseases: Alzheimer Disease
Chemicals: Nitroprusside