Computational protocol: Genetic Variants on Chromosome 1p13.3 Are Associated with Non-ST Elevation Myocardial Infarction and the Expression of DRAM2 in the Finnish Population

Protocol publication

[…] The genotypes for the GWAS came from an initial dataset of 2,234 individuals from the Corogene study, 1,579 individuals selected as controls for the Corogene study sample, and an additional 141 DILGOM participants. The samples were genotyped using the Illumina HumanHap 610-Quad SNP Array at the Wellcome Trust Sanger Institute, Cambridge, United Kingdom. We set genotypes with a clustering probability < 95% to missing and excluded SNPs with a genotyping success rate < 95%, a minor allele frequency (MAF) < 1%, or P < 10⁻⁶ in an exact test of Hardy-Weinberg equilibrium. We also excluded individuals with a genotyping success rate < 95% and individuals whose reported gender differed from their genotype-determined gender. We then estimated pairwise identity-by-descent for all pairs in the sample and excluded one individual from each pair of closely related individuals. Finally, we retained only autosomal SNPs.
To augment the dataset with imputed variants, we pre-phased it with SHAPEIT v1 [] and used IMPUTE v2.2.2 [] with the 1000 Genomes Project integrated variant set (release v3, March 2012) for genotype imputation. Variants with an imputation score < 0.6 or MAF < 5% were excluded. For the analysis of the case-control sample using only directly genotyped SNPs, we retained only the MI cases and controls and applied the same filters as before to the remaining sample, with the MAF cutoff set at 5%. Additionally, we removed SNPs with P < 0.05 in Pearson's χ² test for differential genotype missingness between cases and controls. A total of 1,579 MI cases (962 NSTEMI, 614 STEMI, 3 MI with unspecified ST-segment status), 1,576 controls, 485,919 genotyped SNPs and 5,968,900 imputed variants passed quality control.
Replication sample I was genotyped using the Sequenom MassARRAY iPLEX platform at the Institute for Molecular Medicine Finland (FIMM), Helsinki, Finland, following standard procedures. Genotype quality was assessed by genotyping 74 samples in duplicate, with 100% concordance for both SNPs.
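As an illustration of the SNP-level filters described above for the discovery GWAS, the following minimal sketch computes the genotyping success rate and MAF for one SNP and applies the 95% and 1% thresholds. It is not the authors' pipeline: the function names and the encoding (0, 1 or 2 allele copies, `None` for a missing call) are assumptions made for this example, and the Hardy-Weinberg exact test and the sample-level filters are left out.

```python
from typing import Optional, Sequence

# Assumed encoding: number of copies of the counted allele, None = missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency computed over called genotypes only."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # two alleles per individual
    return min(freq, 1.0 - freq)


def passes_snp_qc(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.95,  # genotyping success rate threshold
    min_maf: float = 0.01,        # MAF threshold of the initial filter
) -> bool:
    """Apply the call-rate and MAF filters (the HWE test is not included)."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )
```

For the case-control analysis of directly genotyped SNPs, the same function could be called with `min_maf=0.05` to reflect the stricter cutoff.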
A total of 1,131 participants (390 NSTEMI, 174 STEMI, 567 controls) were successfully genotyped. To form replication sample II, we combined the genotypes of the FINRISK participants of the discovery sample with those of individuals genotyped on a variety of genome-wide genotyping arrays for previous studies (Table B in ), imputed using the same methods as for the discovery GWAS. This yielded both genotypic and phenotypic information for 16,627 FINRISK participants who were not used as controls in the discovery and replication I samples.
Gene expression was measured from blood leukocytes using the Illumina HT-12 expression array as described previously []. Briefly, each sample was measured in duplicate, and the probe signal intensities were background-corrected and normalized so that the signal intensity distributions were the same for all samples on all arrays. The correlation between the two technical replicates was measured, and 9 samples were excluded due to poor correlation. Finally, we log2-transformed the corrected and normalized probe intensities.
[...] We calculated the genome-wide principal components for the discovery sample with EIGENSTRAT v4.2 [] from 87,046 directly genotyped SNPs in approximate linkage equilibrium. For all association tests we used an additive genetic model. We analyzed the association of genetic variants with the MI phenotypes by logistic regression, using PLINK v1.07 [] for the directly genotyped SNPs and SNPTEST v2.4.0 [] for the imputed variants. We compared either all MI patients or only STEMI or NSTEMI patients with the same set of controls, using age, gender and the first 10 genomic principal components as covariates. For replication sample I, whose controls were matched for geographical area but not for age or gender, we did not use age or gender as covariates, as recommended [].
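To make the additive genetic model concrete, the sketch below fits a logistic regression of case status on allele dosage (0, 1 or 2 copies) by Newton-Raphson and returns the per-allele log-odds ratio. It is only an illustration under simplifying assumptions: it uses a single SNP and no covariates, whereas the analyses above were run in PLINK and SNPTEST with age, gender and principal components. The function name and the toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_additive_logistic(
    dosages: Sequence[float],
    is_case: Sequence[int],
    n_iter: int = 25,
) -> Tuple[float, float]:
    """Fit logit P(case) = b0 + b1 * dosage; return (b0, b1).

    b1 is the log-odds ratio per additional allele copy (additive model).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # negative Hessian (information matrix)
        for x, y in zip(dosages, is_case):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break  # no variation in dosage; the slope is not identifiable
        # Newton step: solve the 2x2 system (information matrix) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


if __name__ == "__main__":
    # Toy data (not from the study): case odds of 0.5, 1 and 2 at dosage 0, 1, 2.
    dosages = [0] * 150 + [1] * 200 + [2] * 150
    is_case = [0] * 100 + [1] * 50 + [0] * 100 + [1] * 100 + [0] * 50 + [1] * 100
    _, beta = fit_additive_logistic(dosages, is_case)
    print(f"per-allele odds ratio: {math.exp(beta):.3f}")
```

In the toy data the case odds double with each extra allele copy, so the fitted per-allele odds ratio comes out at 2. For imputed variants the dosage can be fractional, which is why the function accepts floats.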
We combined the results of the logistic regression models with GWAMA v2.1 [] using inverse-variance weighted fixed- and random-effects meta-analysis models. To compare association statistics between different MI types, we used a Bayesian model comparison method described in detail elsewhere []. For the prospective replication sample II, we used the Cox proportional hazards model implemented in the "survival" package [] for R v3.0.2 []. We stratified the analysis by study year, geographical region and genotyping batch, with gender, systolic blood pressure, blood pressure medication, total cholesterol, HDL-cholesterol, current smoking and prevalent diabetes as covariates. To analyze the association of genetic variants with gene expression levels, we used linear regression in SNPTEST v2.4.0 with the probe intensities as the dependent variables. […]
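The inverse-variance weighted fixed-effects combination mentioned above can be sketched in a few lines. This is the standard textbook estimator, not GWAMA's code; the function name is an assumption for this example, and the random-effects variant is not shown. Each study contributes its log-odds ratio weighted by the inverse of its squared standard error.

```python
import math
from typing import Sequence, Tuple


def fixed_effects_meta(
    betas: Sequence[float],
    standard_errors: Sequence[float],
) -> Tuple[float, float, float]:
    """Inverse-variance weighted fixed-effects meta-analysis.

    Returns the pooled effect, its standard error and a two-sided
    p-value from the normal approximation.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return pooled_beta, pooled_se, p_value


if __name__ == "__main__":
    # Hypothetical per-study log-odds ratios and standard errors.
    beta, se, p = fixed_effects_meta([0.1, 0.3], [0.1, 0.2])
    print(f"pooled beta = {beta:.3f}, SE = {se:.4f}, p = {p:.3g}")
```

Because the weights are inverse variances, the more precise study dominates: in the hypothetical example the pooled estimate (0.14) lies much closer to the first study's 0.1 than to the second study's 0.3.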

Pipeline specifications

Application: GWAS
Organisms: Homo sapiens
Diseases: Coronary Artery Disease, Myocardial Infarction