Computational protocol: Genetic Risk Score of NOS Gene Variants Associated with Myocardial Infarction Correlates with Coronary Incidence across Europe

Protocol publication

[…] Genotyping rate, allele frequencies, and deviations from Hardy-Weinberg equilibrium (HWE) were calculated using PLINK software. SNPs with a genotyping rate lower than 0.75, or not polymorphic in any sample, were removed from the analysis. Individuals with a genotyping rate lower than 0.75, or not genetically homogeneous compared with individuals of the same population group, were also removed from the analysis. Missing genotypes were inferred using MACH 1.0 software, taking as reference the rest of the genotypes ascertained in the same population. Linkage disequilibrium was calculated and visualized using Haploview software.
Datasets from the MIGen study already included 27 of the 78 SNPs in the NOS regions. To obtain the same genetic information across samples, SNPs not directly genotyped in the MIGen samples were imputed with two different imputation programs, MACH 1.0 and IMPUTE2. In both imputations the computational effort was controlled by performing 200 algorithm iterations when phasing and imputing data sets, and by considering 300 haplotypes as templates when phasing observed genotypes. This imputation effort is four times higher than the standard effort recommended by the software developers. Phased chromosomes from the most similar 1000 Genomes Project samples were used as reference panels: the FIN sample for the FINRISK case-controls, the TSI sample for the ATVB and Regicor case-controls, and the CEU sample for the MDCS case-controls.
As a control for validating the genotyping strategy of this study (SNPs selected as representative of the common variation in the NOS regions), we imputed in our population sample from Central Italy (CIT) all the variation described in the TSI sample from the 1000 Genomes Project in the three chromosomal regions studied. We then checked the imputation quality indices with respect to allele frequency thresholds. [...] A two-step analysis of association and prediction was performed with the PredictABEL R package.
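The SNP-level quality filters described at the start of this protocol (genotyping rate, polymorphism, HWE) can be illustrated with a minimal Python sketch. This is not PLINK's implementation: the function names are hypothetical, genotypes are assumed to be coded as 0/1/2 allele counts with `None` for missing calls, and the HWE check uses a simple 1-d.f. chi-square test as a stand-in, which is not necessarily the test PLINK applies.

```python
from math import erfc, sqrt

MIN_CALL_RATE = 0.75  # genotyping-rate threshold quoted in the protocol


def call_rate(genotypes):
    """Fraction of non-missing calls (missing genotypes are None)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def allele_frequency(genotypes):
    """Frequency of the counted allele among non-missing genotypes."""
    called = [g for g in genotypes if g is not None]
    return sum(called) / (2 * len(called))


def keep_snp(genotypes):
    """Keep a SNP only if its call rate is at least 0.75 and it is polymorphic."""
    if call_rate(genotypes) < MIN_CALL_RATE:
        return False
    freq = allele_frequency(genotypes)
    return 0.0 < freq < 1.0


def hwe_chi2_pvalue(genotypes):
    """P value of a 1-d.f. chi-square test of observed vs. HWE-expected counts."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    if p in (0.0, 1.0):
        return 1.0  # monomorphic: no deviation can be tested
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(k) for k in (0, 1, 2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2))
```

The same call-rate function could be applied per individual (across SNPs) to mirror the sample-level filter; the population-homogeneity check is not covered by this sketch.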
These analyses were performed in duplicate, in the MACH imputed data set and in the IMPUTE2 imputed data set. In the first step, associations were tested by logistic regression analysis in the three case-control samples with the largest sample size: FINRISK from northern Europe, and ATVB and Regicor from southern Europe. The other two case-control samples (MDCS and NDBC) were kept as cross-validating samples for the subsequent prediction step. In the association analyses, only SNPs with a pairwise LD measure (r²) lower than 0.8 and imputation quality indexes (r² for MACH 1.0 and i for IMPUTE2) higher than 0.6 in all three case-control samples used and in both imputation methods were included. Estimates of beta coefficients for each SNP were obtained using multivariate logistic regression analyses and adjusted for age, gender and the remaining genetic variables. To obtain a single robust estimate of the level of association for each genetic marker, a meta-analysis of the three previous association analyses (n = 4318) was conducted with the METAL software.
In the second step, NOS genetic risk scores for MI were computed in all five case-control samples. Risk scores were constructed using allele dosages of risk alleles with a low P value (p < 0.1) identified in both meta-analyses, from the MACH and IMPUTE2 data sets. Thus, homozygotes for the reference allele were coded as 0 and homozygotes for the risk allele as 2. The risk SNPs were pruned to a pairwise LD (r²) lower than 0.2 to obtain a set of unequivocally independent SNPs for calculating the risk scores. This LD pruning was performed with Tagger, implemented in Haploview, preferentially picking the SNPs with the lowest P value. To check the epidemiological relevance of the estimated risk scores, predictive models were constructed based only on these NOS risk scores in all five case-control samples.
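The SNP selection and scoring in the second step can be sketched as follows. This is an illustration under stated assumptions, not the Tagger or PredictABEL procedure: the greedy pruning loop is a simple stand-in for Tagger, LD is approximated by the squared Pearson correlation of dosages, the score is assumed to be an unweighted sum of risk-allele dosages, and SNPs are ranked by the larger of their two meta-analysis P values. All function and parameter names are hypothetical.

```python
P_MAX = 0.1   # P value threshold quoted in the protocol
R2_MAX = 0.2  # LD pruning threshold quoted in the protocol


def r_squared(x, y):
    """Squared Pearson correlation between two dosage vectors (LD proxy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def select_risk_snps(pvalues, dosages):
    """Pick SNPs with p < 0.1 in both meta-analyses, then prune by LD.

    pvalues: {snp: (p_mach, p_impute2)}
    dosages: {snp: [risk-allele dosage per individual, 0..2]}
    """
    passing = [s for s, (p1, p2) in pvalues.items() if p1 < P_MAX and p2 < P_MAX]
    # Assumption: rank by the larger (more conservative) of the two P values.
    passing.sort(key=lambda s: max(pvalues[s]))
    kept = []
    for snp in passing:
        # Lowest P value first, so a better-supported SNP wins within an LD block.
        if all(r_squared(dosages[snp], dosages[k]) < R2_MAX for k in kept):
            kept.append(snp)
    return kept


def risk_scores(selected, dosages):
    """Unweighted score: sum of risk-allele dosages over the selected SNPs."""
    n_individuals = len(next(iter(dosages.values())))
    return [sum(dosages[s][i] for s in selected) for i in range(n_individuals)]
```

With dosages rather than hard genotype calls, imputed individuals can take fractional values between 0 and 2, which the sum handles without change.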
These models were used to assess the fraction of interindividual variance in MI affection status explained by the NOS risk score, measured by Nagelkerke's R². Moreover, the discrimination accuracy of the NOS risk score between patients and healthy controls was estimated as the area under the receiver operating characteristic (ROC) curve (AUC). […]
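The two summary measures named above can be computed directly from their standard definitions. The sketch below is not the PredictABEL code: it assumes the fitted case probabilities come from a logistic model fitted elsewhere, that they lie strictly between 0 and 1, and that outcomes are coded 1 for cases and 0 for controls. Function names are hypothetical.

```python
from math import exp, log


def log_likelihood(y, prob):
    """Bernoulli log-likelihood of outcomes y given case probabilities prob."""
    return sum(log(p) if yi else log(1 - p) for yi, p in zip(y, prob))


def nagelkerke_r2(y, fitted):
    """Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    prevalence = sum(y) / n
    ll_null = log_likelihood(y, [prevalence] * n)  # intercept-only model
    ll_model = log_likelihood(y, fitted)
    cox_snell = 1 - exp(2 * (ll_null - ll_model) / n)
    max_cox_snell = 1 - exp(2 * ll_null / n)
    return cox_snell / max_cox_snell


def auc(y, score):
    """Probability that a random case scores above a random control.

    Ties count as one half (Mann-Whitney formulation of the ROC area).
    """
    cases = [s for yi, s in zip(y, score) if yi]
    controls = [s for yi, s in zip(y, score) if not yi]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))
```

An AUC of 0.5 corresponds to no discrimination between patients and controls, and a Nagelkerke's R² of 0 to a model that explains nothing beyond the sample prevalence.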

Pipeline specifications

Software tools: PLINK, Haploview, IMPUTE, Tagger
Application: GWAS
Organisms: Homo sapiens
Diseases: Coronary Artery Disease, Myocardial Infarction
Chemicals: Nitric Oxide