Computational protocol: Sex-specific glioma genome-wide association study identifies new risk locus at 3p21.31 in females, and finds sex-differences in risk at 8q24.21


Protocol publication

[…] GICC cases and controls were genotyped on the Illumina Oncoarray. The array included 37,000 beadchips customized to include previously identified glioma-specific candidate single nucleotide polymorphisms (SNPs). SFAGS-GWAS cases and some controls were genotyped on Illumina's HumanCNV370-Duo BeadChip, and the remaining controls were genotyped on the Illumina HumanHap300 and HumanHap550. MDA-GWAS cases were genotyped on the Illumina HumanHap610, and controls on the Illumina HumanHap550 (CGEMS breast) or HumanHap300 (CGEMS prostate). GliomaScan cases were genotyped on the Illumina 660W, while controls were selected from cohort studies and were genotyped on the Illumina 370D, 550K, 610Q, or 660W (see Rajaraman et al. for specific details of genotyping). Details of DNA collection and processing are available in previous publications. Individuals with a call rate (CR) < 99% were excluded, as were all individuals of non-European ancestry (< 80% estimated European ancestry using the FastPop procedure developed by the GAMEON consortium). Apparent first-degree relative pairs were identified using estimated identity by descent (IBD) ≥ 0.5, and one member of each pair was removed: the control was removed from a case-control pair; otherwise, the individual with the lower call rate was excluded. SNPs with a call rate < 95% were excluded, as were those with a minor allele frequency (MAF) < 0.01 or displaying significant deviation from Hardy-Weinberg equilibrium (HWE) (p < 1 × 10⁻⁵). Additional details of quality control procedures have been previously described in Melin et al.
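The SNP-level exclusion criteria above (call rate < 95%, MAF < 0.01, HWE p < 1 × 10⁻⁵) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `snp_qc` and its inputs (genotype counts for one SNP) are hypothetical, and the HWE p-value is computed with a 1-degree-of-freedom chi-square goodness-of-fit test, which is an assumption because the text does not state which HWE test was used.

```python
import math


def snp_qc(n_aa, n_ab, n_bb, n_missing,
           min_call_rate=0.95, min_maf=0.01, hwe_p_threshold=1e-5):
    """Apply call-rate, MAF, and HWE filters to one SNP.

    n_aa, n_ab, n_bb are genotype counts; n_missing is the number of
    samples with no call. Thresholds default to those quoted in the text.
    """
    n_called = n_aa + n_ab + n_bb
    n_total = n_called + n_missing
    if n_called == 0:
        return {"call_rate": 0.0, "maf": 0.0, "hwe_p": None, "passes": False}

    call_rate = n_called / n_total
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)

    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (
        n_called * freq_a ** 2,
        2 * n_called * freq_a * (1 - freq_a),
        n_called * (1 - freq_a) ** 2,
    )
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Upper-tail probability of a chi-square variable with 1 df
    # (assumed test; an exact test is a common alternative).
    hwe_p = math.erfc(math.sqrt(chi2 / 2))

    passes = (call_rate >= min_call_rate
              and maf >= min_maf
              and hwe_p >= hwe_p_threshold)
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passes": passes}


if __name__ == "__main__":
    print(snp_qc(360, 480, 160, 0))   # genotypes in exact HWE proportions
    print(snp_qc(500, 0, 500, 0))     # no heterozygotes: strong HWE deviation
```

In practice these filters are usually applied with dedicated tools such as PLINK; the sketch only makes the thresholds concrete.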
All datasets were imputed separately using SHAPEIT v2.837 and IMPUTE v2.3.2, with a merged reference panel consisting of data from phase three of the 1,000 Genomes Project and the UK10K.
TCGA cases were genotyped on the Affymetrix Genomewide 6.0 array using DNA extracted from whole blood (see previous manuscript for details of DNA processing) and underwent standard GWAS QC; duplicate and related individuals within datasets were excluded. Ancestry outliers were identified in TCGA using principal components analysis in plink 1.9. The resulting files were imputed using Eagle 2 and Minimac3 as implemented on the Michigan Imputation Server, using the Haplotype Reference Consortium Version r1.1 2016 as the reference panel. Somatic characterization of TCGA cases was obtained from the final dataset used for the TCGA pan-glioma analysis, and classification schemes were adopted from Eckel-Passow et al. and Ceccarelli et al. [...]
The data were analyzed using sex-stratified logistic regression models in SNPTEST for all SNPs on autosomal chromosomes within 500 kb of previously identified risk loci and/or found to be nominally significant (p < 5 × 10⁻⁴) in a previous meta-analysis (Fig. ). Sex-specific betas (βM and βF), standard errors (SEM and SEF), and p-values (pM and pF) were generated using sex-stratified logistic regression models adjusted for the number of principal components found to differ significantly between cases and controls within each study in a previous meta-analysis. Genomic inflation factors were calculated after excluding SNPs with MAF < 0.05, INFO score < 0.7, or a significant violation of Hardy-Weinberg equilibrium in controls (p < 5 × 10⁻⁸). Males: GICC: λadjusted = 1.04; SFAGS-GWAS: λadjusted = 1.01; MDA-GWAS: λadjusted = 1.02; GliomaScan: λadjusted = 1.01. Females: GICC: λadjusted = 1.03; SFAGS-GWAS: λadjusted = 1.02; MDA-GWAS: λadjusted = 1.04; GliomaScan: λadjusted = 1.01. [...]
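For readers unfamiliar with genomic inflation factors, the following Python sketch shows the standard unadjusted calculation: each p-value is converted to a 1-degree-of-freedom chi-square statistic, and the median is divided by the theoretical median (about 0.4549). The function name `genomic_inflation` is hypothetical, and the excerpt does not describe how the reported λadjusted values were adjusted, so no adjustment is attempted here.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 df: (Phi^-1(0.75))^2, about 0.4549.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return the unadjusted genomic inflation factor (lambda).

    p_values must be two-sided p-values in the interval (0, 1].
    """
    # A two-sided p-value p corresponds to |z| = -Phi^-1(p / 2);
    # squaring z gives the 1-df chi-square statistic.
    chi2 = [_STD_NORMAL.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced p-values mimic a null distribution, so lambda is near 1.
    null_like = [(i + 0.5) / 1000 for i in range(1000)]
    print(round(genomic_inflation(null_like), 3))
```

Values close to 1, such as those reported above, indicate little systematic inflation of the test statistics.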
X and Y chromosome data were available from the GICC set only. Males and females were imputed separately for the X chromosome using the previously described merged reference panel. X chromosome data were analyzed using a logistic regression model with the SNPTEST ‘newml’ method, assuming complete inactivation of one allele in females and treating males as homozygous females (Fig. ). For SNPs prioritized in the combined model, sex-specific effect estimates were generated using stratified logistic regression models. Y chromosome data were analyzed using logistic regression in SNPTEST (Fig. ). Figures were generated using LocusZoom and R 3.3.2 with GenABEL, qqman, and ggplot. […]
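The X chromosome coding described above, in which males are treated as homozygous females, can be illustrated with a short Python sketch. This shows only the dosage convention, not SNPTEST's internal implementation; the function name `x_dosage` and the `'M'`/`'F'` sex labels are hypothetical.

```python
def x_dosage(sex, allele_count):
    """Code an X chromosome genotype as an effect-allele dosage on a 0-2 scale.

    sex: 'M' or 'F' (hypothetical labels).
    allele_count: copies of the effect allele carried
                  (0 or 1 for males, 0, 1, or 2 for females).
    """
    if sex == "M":
        if allele_count not in (0, 1):
            raise ValueError("males carry 0 or 1 copies on the X chromosome")
        # Hemizygous males are coded like homozygous females: 0 or 2.
        return 2 * allele_count
    if sex == "F":
        if allele_count not in (0, 1, 2):
            raise ValueError("females carry 0, 1, or 2 copies")
        return allele_count
    raise ValueError("sex must be 'M' or 'F'")


if __name__ == "__main__":
    for sex, count in [("M", 0), ("M", 1), ("F", 1), ("F", 2)]:
        print(sex, count, x_dosage(sex, count))
```

Under this coding, a male carrier and a homozygous female carrier receive the same dosage, which matches the assumption of complete inactivation of one allele in females.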

Pipeline specifications