Computational protocol: Genome-wide association study of coronary artery calcified atherosclerotic plaque in African Americans with type 2 diabetes

Protocol publication

[…] Local ancestry estimation was performed using LAMP-ANC and HAPMIX [, ]. A linkage disequilibrium (LD) pruning algorithm was applied with an R-squared threshold of 0.8 to select a subset of SNPs among those that met the above QC criteria. Observed data at these SNPs were then combined with HapMap phase 3 genotypes obtained from Yoruban and CEPH samples; the HapMap samples were used as anchoring populations and were not included in the analysis. The estimation process was run twice in AA-DHS, once with LAMP-ANC and once with HAPMIX. Results were comparable; Spearman correlation estimates between the two methods ranged from 0.88 to 0.97. Local admixture estimation in JHS was performed with LAMP-ANC. The global ancestry proportion estimates were obtained by averaging the local ancestry estimates across the genome. These global estimates were used as covariates in the association models and are reported in Tables  and . [...] Imputation was performed using IMPUTE2 with phased haplotypic data obtained from Shapeit2 []. The imputation effort used all SNPs that passed the QC filters. Imputation was based on 3,436,913 and 733,318 autosomal SNPs in AA-DHS and JHS, respectively. The multi-ethnic 1000 Genomes Phase I integrated variant set release (v3) was used as the reference panel []. Imputation was performed separately for each study. Statistical analyses were performed on imputed SNPs that had a certainty score above 90%, an info score above 50%, and MAF greater than 1%. [...] Analyses were run using Log(CAC + 1) and CAC dichotomized as presence (CAC ≥ 10) vs. absence (CAC < 10). The value of 1 added to the observed CAC score allowed for the inclusion of all subjects, even those with a CAC score of zero. Analyzing both outcomes is justified by the assumption that factors governing the presence of CAC may differ from those influencing the amount of CAC once calcification is initiated []. 
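The outcome coding, the global ancestry averaging, and the imputation filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, "Log" is taken to be the natural logarithm, local ancestry is assumed to be coded as the proportion of African ancestry at each marker (0, 0.5, or 1), and the genome-wide average is assumed to be a simple unweighted mean over markers. None of these details are confirmed by the protocol.

```python
import math

def log_cac(cac):
    """Continuous outcome: log(CAC + 1), so a CAC score of zero maps to 0.
    Assumption: natural logarithm."""
    return math.log(cac + 1.0)

def cac_present(cac, threshold=10.0):
    """Binary outcome: 1 if CAC >= 10 (presence), 0 otherwise (absence)."""
    return 1 if cac >= threshold else 0

def global_ancestry(local_estimates):
    """Global ancestry as the mean of per-marker local ancestry estimates.
    Assumption: unweighted mean over markers."""
    return sum(local_estimates) / len(local_estimates)

def passes_imputation_qc(certainty, info, maf):
    """Keep imputed SNPs with certainty > 90%, info > 50%, and MAF > 1%."""
    return certainty > 0.90 and info > 0.50 and maf > 0.01
```

For example, a subject with a CAC score of zero gets a continuous outcome of 0 and a binary outcome of 0, and therefore stays in both analyses.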
Age, gender, global African ancestry proportion, diabetes duration, hemoglobin A1c, body mass index, smoking status, and use of lipid-lowering medication were included as covariates in the model to test for association between each SNP and CAC. Analyses were run separately in each study using the same outcome definitions, based on the 90 HU CAC score in AA-DHS and the 130 HU score in JHS. For the continuous outcome, linear mixed models were fitted using Genome-wide Efficient Mixed Model Analysis (GEMMA) software []. Generalized estimating equations were implemented to test for associations with the binary outcome. All analyses adjusted for familial relationships estimated using the Relatedness Estimation in Admixed Populations (REAP) software []. SNPs were tested for association using the likelihood ratio test for the overall two-degree-of-freedom mode-of-inheritance model. If the overall test of association was significant, the three a priori genetic models (dominant, additive, and recessive) were explored, and the model with the best fit for each SNP was used. Correction for this maximization was applied to account for the correlation between tests and to maintain the type 1 error rate [–]. This approach is consistent with Fisher’s protected least significant difference multiple comparison procedure. Sample-size-weighted meta-analysis was performed to compare and combine results observed from each study. Penalized regression with the L1 norm (LASSO) was used to identify the SNP with the strongest effect size when LD caused several SNPs to display strong association with the outcome. A cross-validation approach was used to determine the shrinkage parameter for each region. SNP selection was performed only in the AA-DHS subset, the larger of the two studies, to limit confounding effects. Joint tests of association between local ancestry and genotypes with CAC were also performed. 
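As an illustration of the sample-size-weighted meta-analysis step, the sketch below combines per-study Z-scores with weights proportional to the square root of each study's sample size. This is the common Stouffer-type weighting and is an assumption here; the protocol does not state the exact formula or the software used. The function names are hypothetical, and the Z-scores are assumed to be signed with respect to the same reference allele in both studies.

```python
import math
from statistics import NormalDist

def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-study Z-scores with weights w_i = sqrt(N_i):
    Z = sum(w_i * z_i) / sqrt(sum(w_i^2))."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))
```

With two equally sized studies that each report Z = 2, the combined statistic is 2 times the square root of 2, so concordant evidence is reinforced, while Z-scores of opposite sign cancel.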
The model used for testing for association between local ancestry and CAC was similar to the one described for the genetic association tests, with local ancestry replacing the observed or imputed genotypes. If $T_L$ and $T_G$ denote the test statistics associated with the local ancestry and genotypic association with CAC, the joint test of association with local ancestry and genotype at each marker was calculated as $T_L^2 + T_G^2$, which follows a chi-square distribution with 2 degrees of freedom []. An alternative test based on the maximum of $T_G$ and $T_L$ was also computed, assuming these tests follow a bivariate normal distribution with a non-zero correlation. The empirical correlation was computed using the variance-covariance matrix for 2 correlated score tests []. Results from these tests are shown in Additional file : Table S3. […]
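The joint test described above can be sketched as follows. The sketch assumes that the local ancestry and genotype statistics are standard normal Z-statistics under the null and are treated as independent, which is what the 2-degree-of-freedom chi-square reference distribution requires. The function names are hypothetical. For 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def joint_statistic(t_local, t_geno):
    """Joint test statistic: sum of the squared local-ancestry and
    genotype Z-statistics."""
    return t_local ** 2 + t_geno ** 2

def chi2_2df_pvalue(stat):
    """Upper-tail p-value of a chi-square distribution with 2 degrees
    of freedom, which equals exp(-stat / 2) exactly."""
    return math.exp(-stat / 2.0)

def joint_test_pvalue(t_local, t_geno):
    """P-value of the joint local-ancestry and genotype test."""
    return chi2_2df_pvalue(joint_statistic(t_local, t_geno))
```

The maximum-based alternative is not covered by this sketch, because it depends on the estimated correlation between the two statistics.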

Pipeline specifications

Applications: Population genetic analysis; GWAS
Diseases: Cardiovascular Diseases; Diabetes Mellitus; Diabetes Mellitus, Type 2; Heart Diseases; Atherosclerosis