Computational protocol: Trans-Ethnic Fine-Mapping of Lipid Loci Identifies Population-Specific Signals and Allelic Heterogeneity That Increases the Trait Variance Explained

Protocol publication

[…] We applied multiple linear regression models, assuming an additive mode of inheritance, to test for association between genotypes and HDL-C, LDL-C, or log-transformed triglycerides. We performed each test of association separately in each of the 11 groups prior to meta-analysis. We constructed principal components (PCs) using the software EIGENSOFT. We used age and sex as covariates in each individual cohort; other cohort-specific covariates, including age², enrollment site, socioeconomic status, and principal components, varied across studies. The European samples include type 2 diabetes (T2D) cases and unaffected controls; to avoid confounding due to T2D status, samples were analyzed separately as Finnish T2D patients, Finnish unaffected individuals, Norwegian T2D patients, and Norwegian unaffected individuals.

We first conducted the meta-analysis within the African Americans, East Asians, and Europeans separately. We then performed trans-ethnic meta-analyses by combining the statistics of each of the 11 participating groups to assess the association with the SNPs at the 58 lipid loci.

At loci that exhibited evidence of association at P < 10⁻⁴, we next performed a series of sequential conditional analyses by adding the most strongly associated SNP into the regression model as a covariate and testing all remaining regional SNPs for association. We continued these sequential conditional analyses until the strongest SNP showed a conditional P value > 10⁻⁴ and had no annotation or literature evidence that suggested a functional role.

For single SNP analyses, we applied PLINK for population-based studies. We used the R package GWAF for the family-based study of HyperGEN. We applied an inverse variance-weighted fixed-effect meta-analysis implemented in METAL.

Unless otherwise noted, linkage disequilibrium estimates were obtained from the 1000 Genomes Project November 2010 release.
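
The inverse variance-weighted fixed-effect combination described above can be illustrated with a minimal Python sketch. This is not METAL's code: the function name `fixed_effect_meta` and the example effect sizes are hypothetical, and the p-value assumes a normal approximation for the combined z-score.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-group estimates for one SNP by inverse-variance weighting.

    betas: per-group regression coefficients (hypothetical inputs)
    ses:   matching standard errors, all > 0
    Returns (pooled beta, pooled SE, z-score, two-sided p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each group is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se

    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical effect sizes from three groups for a single SNP.
    print(fixed_effect_meta([0.10, 0.20, 0.15], [0.10, 0.10, 0.05]))
```

Groups with smaller standard errors dominate the pooled estimate, which is why the larger cohorts contribute most to a combined trans-ethnic result under this model.
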
SNP positions correspond to hg18.

We performed haplotype analysis at the LDL-C locus TOMM40-APOE-APOC4 in 5,593 unrelated African Americans from the PAGE consortium, using the ‘haplo.stats’ R package. Haplotypes and haplotype frequencies were estimated using the R function ‘haplo.em’. The association between haplotypes and LDL-C was assessed using the R function ‘haplo.glm’. An additive model was assumed, in which the regression coefficient β represents the expected change in LDL-C level with each additional copy of the specific haplotype compared with the reference haplotype, which was set as the A-A (trait increasing-increasing) haplotype.

We created the regional association plots using LocusZoom. To plot the association results in Europeans and East Asians, we used the LocusZoom-implemented LD estimates from the 1000 Genomes Project (June 2010) CEU and CHB+JPT samples, whose LD structures are similar to those of our samples of European and East Asian ancestry. To plot the regional association in African Americans, we supplied LD calculated from the genotype data of the PAGE African American samples, because their LD patterns may differ from those of the pre-computed LD sources implemented in LocusZoom.

We evaluated the proportion of variance explained by a single SNP or any given locus by including the SNP or set of SNPs in a linear regression model with all covariates used in the association analysis and calculating the R² for this full model. From the variance explained by the full model we then subtracted the variance explained by a basic model that included only the covariates. We performed these analyses using SAS version 9.2 (SAS Institute, Cary, NC, USA). […]
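
The variance-explained calculation (R² of the full model minus R² of the covariate-only model) can be sketched in plain Python as follows. The authors used SAS; this is only an illustration, and the helper names `r_squared` and `variance_explained` as well as the toy data are hypothetical.

```python
def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least squares fit of y on an intercept plus predictors."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in p] for p in predictors]
    k = len(cols)
    # Normal equations: (X'X) coef = X'y.
    xtx = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(u * v for u, v in zip(cols[i], y)) for i in range(k)]
    coef = _solve(xtx, xty)
    fitted = [sum(coef[j] * cols[j][i] for j in range(k)) for i in range(n)]
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_resid / ss_total


def variance_explained(y, covariates, snps):
    """R^2(covariates + SNPs) minus R^2(covariates only)."""
    return r_squared(y, covariates + snps) - r_squared(y, covariates)


if __name__ == "__main__":
    # Toy data: one covariate (age) and one SNP coded as allele dosage 0/1/2.
    age = [40, 50, 60, 45, 55, 65]
    dosage = [0, 1, 2, 1, 0, 2]
    trait = [1 + 0.1 * a + 2 * g for a, g in zip(age, dosage)]
    print(variance_explained(trait, [age], [dosage]))
```

Passing several dosage columns in `snps` gives the locus-level figure, so the gain from adding conditionally independent signals at a locus can be read off by comparing the single-SNP and multi-SNP results.
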

Pipeline specifications

Software tools: EIGENSOFT, PLINK, LocusZoom
Applications: Population genetic analysis, GWAS
Chemicals: Cholesterol, Triglycerides