Computational protocol: Reverse Pathway Genetic Approach Identifies Epistasis in Autism Spectrum Disorders

Protocol publication

[…] Data preparation, quality control, and imputation were conducted as described previously by Mitra, I. et al. First, SNPs were filtered using PLINK for Hardy-Weinberg equilibrium (HWE), call rate, minor allele frequency (MAF), and Mendel errors, separately in each ASD dataset. Next, imputation was performed separately for each dataset using IMPUTE2, following the recommended pipeline. Lastly, the ASD datasets were combined, and the following quality control steps were performed: SNPs with severe departure from HWE (P < 1.0 × 10^-6) in Caucasian founders were removed; SNPs were removed if their MAF differed (P < 1.0 × 10^-6) in Caucasians between datasets; and SNPs were removed if they had MAF < 1% in Caucasians or MAF < 2% in the combined dataset. We excluded chromosome X. The final dataset for the analysis included 4,471,807 autosomal SNPs. [...]

We performed a multidimensional scaling analysis of genome-wide pairwise identity-by-state (IBS) distances in PLINK for all individuals in the dataset, and used the first five dimensions as covariates in the analysis to correct for population stratification and batch effects. To compare the RASopathy and control sibling groups accurately, we scaled the T-scores within each group (CFC, CS, NF1, NS, and sibling) to a mean of 0 and a variance of 1, and excluded outlier values more than 3 standard deviations (SD) from the mean. For each group (CFC, CS, NF1, NS, sibling), we performed QTL mapping in PLINK by linear regression, using the scaled SRS scores as a quantitative trait (--linear). This resulted in the multiple linear regression model Y = b0 + b1*ADD + b2*COV1 + b3*COV2 + … + b6*COV5 + e, where ADD is the additive allele dosage and COV1 to COV5 are the five multidimensional scaling dimensions. To analyze the RASopathy groups (CFC, CS, NF1, and NS) together with greater statistical power, we used METASOFT to conduct a random effects meta-analysis using Han and Eskin's random effects model.
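The within-group scaling, outlier exclusion, and per-SNP regression described above can be sketched as follows. This is a minimal illustration, not the PLINK implementation: it assumes NumPy is available, the function names `scale_and_trim` and `fit_additive_model` are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np


def scale_and_trim(scores, max_sd=3.0):
    """Z-score one group's scores, then flag values beyond max_sd SDs.

    Returns the retained z-scores and a boolean mask over the input.
    """
    scores = np.asarray(scores, dtype=float)
    # Assumption: sample SD (ddof=1); the source only states "variance was 1".
    z = (scores - scores.mean()) / scores.std(ddof=1)
    keep = np.abs(z) <= max_sd
    return z[keep], keep


def fit_additive_model(y, dosage, covariates):
    """Ordinary least squares for Y = b0 + b1*ADD + b2*COV1 + ... + e.

    dosage     : allele counts per individual (0, 1, or 2)
    covariates : array of shape (n_individuals, n_covariates)
    Returns the coefficient vector [b0, b1, b2, ...].
    """
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), dosage, covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta
```

In this sketch, `scale_and_trim` would be applied once per group (CFC, CS, NF1, NS, sibling), and `fit_additive_model` once per SNP within each group, with the five multidimensional scaling dimensions passed as `covariates`. PLINK additionally reports test statistics and P values, which are omitted here.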
We also report Cochran's Q statistic, calculated using METASOFT, to assess heterogeneity between RASopathy groups. The data underlying the top six potential modifiers are represented graphically by boxplot (MAF > 0.05) or distribution (MAF ≤ 0.05). […]
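For readers unfamiliar with Cochran's Q, the sketch below shows the textbook inverse-variance formulation applied to per-group effect estimates. It is not METASOFT's code; the function name `cochran_q` and its inputs (one regression coefficient and one standard error per RASopathy group) are assumptions made for illustration.

```python
def cochran_q(betas, ses):
    """Cochran's Q for k per-group effect estimates and standard errors.

    Returns (Q, degrees of freedom, I^2), where I^2 is the share of
    variation attributed to heterogeneity rather than sampling error.
    """
    # Inverse-variance weights, as in a fixed effects meta-analysis.
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2
```

Under the null hypothesis of no heterogeneity, Q approximately follows a chi-squared distribution with k - 1 degrees of freedom, so with four RASopathy groups it would be compared against a chi-squared distribution with 3 degrees of freedom.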

Pipeline specifications

Software tools: PLINK, IMPUTE, METASOFT
Applications: WGS analysis, GWAS
Organisms: Homo sapiens
Diseases: Genetic Diseases, Inborn