Computational protocol: Shared genetic aetiology between cognitive functions and physical and mental health in UK Biobank (N=112 151) and 24 GWAS consortia

Protocol publication

[…] GWAS analyses were performed on the three UK Biobank cognitive test scores and on the Educational Attainment data to use the summary results for LD score regression. Details of the GWAS procedures are provided in the . [...]
The UK Biobank genotyping data required recoding from numeric (1, 2) allele coding to standard ACGT format before being used in polygenic profile scoring analyses. This was achieved using a bespoke programme developed by one of the present authors (DCML), details of which are provided in the .
Polygenic profiles were created for 24 health-related phenotypes (see Table 3, and ) in all genotyped participants using PRSice. PRSice calculates the sum of alleles associated with the phenotype of interest across many genetic loci, weighted by their effect sizes estimated from a GWAS of that phenotype in an independent sample. Before creating the scores, SNPs with a minor allele frequency <0.01 were removed and clumping was used to obtain SNPs in linkage equilibrium with an r² < 0.25 within a 200 bp window. Multiple scores were then created for each phenotype containing SNPs selected according to the significance of their association with the phenotype. The GWAS summary data for each of the 24 health-related phenotypes were used to create five polygenic profiles in the UK Biobank participants, at thresholds of P<0.01, P<0.05, P<0.1, P<0.5 and all SNPs.
Correlation coefficients were calculated between each of the UK Biobank cognitive phenotypes. The associations between the polygenic profiles and the target phenotype were examined in regression models (linear regression for the continuous cognitive traits and logistic regression for the binary education variable), adjusting for age at measurement, sex, genotyping batch and array, assessment centre and the first 10 genetic principal components to adjust for population stratification.
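The weighted allele sum and the five P-value thresholds described above can be illustrated with a minimal Python sketch. This is not PRSice's implementation: the function name `polygenic_score`, the SNP identifiers, and all numeric values are hypothetical, and the minor allele frequency filter and clumping step are assumed to have been applied already.

```python
from typing import Dict, Optional

def polygenic_score(
    dosages: Dict[str, float],
    weights: Dict[str, float],
    p_values: Dict[str, float],
    threshold: Optional[float] = None,
) -> float:
    """Sum effect-allele counts weighted by GWAS effect sizes.

    SNPs whose GWAS P value is not below `threshold` are skipped;
    threshold=None keeps every SNP (the "all SNPs" profile).
    """
    score = 0.0
    for snp, dosage in dosages.items():
        if snp not in weights:
            continue  # SNP absent from the GWAS summary data
        if threshold is not None and p_values[snp] >= threshold:
            continue  # SNP fails the significance threshold
        score += weights[snp] * dosage
    return score

# The five profiles named in the text; None stands for "all SNPs".
THRESHOLDS = [0.01, 0.05, 0.1, 0.5, None]

# Hypothetical GWAS summary data (effect size and P value per SNP).
weights = {"rs1": 0.20, "rs2": -0.10, "rs3": 0.05}
p_values = {"rs1": 0.004, "rs2": 0.03, "rs3": 0.40}

# Hypothetical effect-allele counts (0, 1 or 2) for one participant.
dosages = {"rs1": 2, "rs2": 1, "rs3": 1}

# One score per threshold for this participant.
scores = {t: polygenic_score(dosages, weights, p_values, t) for t in THRESHOLDS}
```

Each participant ends up with one score per threshold, and each of those scores would then enter the regression models as a predictor alongside the covariates.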
We corrected for multiple testing across all polygenic profile scores at all significance thresholds for associations with all cognitive phenotypes (470 tests) using the false discovery rate method. We conducted sensitivity analyses as follows. Where the original findings were false discovery rate significant, UK Biobank participants with cardiovascular disease (N=5300) were then removed from analyses of coronary artery disease, those with diabetes (N=5800) were removed from type 2 diabetes analyses, and those with hypertension (N=26 912) were removed from systolic blood pressure analyses. See for further details of these sensitivity analyses. Four multivariate regression models were then performed, including all 24 polygenic profile scores and the covariates described above. […]
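The excerpt does not say which false discovery rate procedure was used. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice; the function name `benjamini_hochberg`, the 0.05 level, and the example P values are hypothetical. In the study, the input would be the 470 P values from the polygenic profile associations.

```python
from typing import List

def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag which tests are significant under Benjamini-Hochberg FDR control.

    Finds the largest rank k such that p_(k) <= (k / m) * alpha and
    declares the k smallest P values significant.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes

    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant

# Hypothetical P values from four association tests.
example_p = [0.035, 0.5, 0.001, 0.03]
flags = benjamini_hochberg(example_p, alpha=0.05)
```

In this example the test with P = 0.03 misses its own rank threshold (0.025) but is still declared significant, because the next-ranked test (P = 0.035) passes its threshold (0.0375). That step-up behaviour is what distinguishes the procedure from checking each test against its own threshold.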

Pipeline specifications

Software tools: LDSC, PRSice
Databases: UK Biobank
Application: GWAS
Organisms: Homo sapiens
Diseases: Alzheimer Disease, Coronary Artery Disease, Stroke