Computational protocol: The Integration of Epistasis Network and Functional Interactions in a GWAS Implicates RXR Pathway Genes in the Immune Response to Smallpox Vaccine

Protocol publication

[…] Simulated Evaporative Cooling (EC) is a machine-learning algorithm that combines main-effect contributions with interaction effects to prioritize variants. The algorithm iteratively removes the least important SNPs, in a process analogous to the physical cooling of a gas by evaporation of atoms. Part of the EC simulation process involves balancing the main-effect and interaction contributions to each predictor's score. Whereas standard linear regression for variant prioritization ignores interaction effects, EC uses ReliefF to calculate the interaction component for each SNP, providing a more robust indication of a variant's utility in a downstream epistasis network. The other half of EC is Random Jungle, an implementation of Random Forest, which is used to calculate the main-effect contribution to the SNP importance scores. By filtering variants before constructing the epistasis network, we substantially reduce the computational burden of computing pairwise interactions in Step 2 without discarding variants likely to be integral to that network. One could use a univariate filter instead, but, as mentioned in the introduction, we hypothesize that including interactions will be important for characterizing the full pathway for a phenotype.

[…] Using the reGAIN matrix from Step 2, we employed an eigenvector centrality algorithm called SNPrank to prioritize the variants whose aggregate interactions and main effects contributed most to the variance in the antibody titer. This additional interaction and main-effect filtering step removes more irrelevant SNPs while adjusting for covariates. To assess the possible biological activity of the variants implicated in our epistasis network, variants with the highest centrality were mapped to their corresponding genes and used for enrichment analysis. […]
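The two filtering ideas in this excerpt can be illustrated with a minimal sketch. It is not the EC, Random Jungle, inbix, or SNPrank implementation: the function names (`rescale`, `iterative_filter`, `eigenvector_centrality`), the `weight` parameter, and the toy scores are all hypothetical. The first function drops the lowest-scoring SNP one at a time using an assumed equal-weight blend of a main-effect score and an interaction score (stand-ins for the Random Jungle and ReliefF scores). The second computes plain eigenvector centrality by power iteration on a small symmetric matrix, assumed here to hold main effects on the diagonal and pairwise interactions off the diagonal; the actual SNPrank weighting may differ.

```python
from typing import Callable, Dict, List, Sequence

Scores = Dict[str, float]
Scorer = Callable[[List[str]], Scores]


def rescale(scores: Scores) -> Scores:
    """Min-max rescale to [0, 1] so the two score types are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}


def iterative_filter(snps: Sequence[str], main_effect: Scorer,
                     interaction: Scorer, keep: int,
                     weight: float = 0.5) -> List[str]:
    """Drop the lowest-scoring SNP, re-score, and repeat until `keep` remain.

    `weight` balances main-effect against interaction scores; this linear
    blend is an assumption for illustration, not the published EC update.
    """
    remaining = list(snps)
    while len(remaining) > keep:
        m = rescale(main_effect(remaining))
        i = rescale(interaction(remaining))
        combined = {s: weight * m[s] + (1.0 - weight) * i[s] for s in remaining}
        remaining.remove(min(remaining, key=combined.__getitem__))
    return remaining


def eigenvector_centrality(matrix: Sequence[Sequence[float]],
                           iterations: int = 1000,
                           tol: float = 1e-10) -> List[float]:
    """Power iteration for the leading eigenvector of a nonnegative matrix.

    The result is normalized to sum to 1.
    """
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        total = sum(w)
        if total == 0.0:
            raise ValueError("matrix has no positive entries")
        w = [x / total for x in w]
        if max(abs(a - b) for a, b in zip(w, v)) < tol:
            return w
        v = w
    return v


# Toy scores (hypothetical): rs2 has a weak main effect but strong interactions.
MAIN = {"rs1": 0.9, "rs2": 0.1, "rs3": 0.2, "rs4": 0.0}
INTERACTION = {"rs1": 0.0, "rs2": 0.8, "rs3": 0.1, "rs4": 0.1}

kept = iterative_filter(
    list(MAIN),
    main_effect=lambda snps: {s: MAIN[s] for s in snps},
    interaction=lambda snps: {s: INTERACTION[s] for s in snps},
    keep=2,
)

# Toy 3-SNP matrix (hypothetical): diagonal = main effects,
# off-diagonal = pairwise interaction strengths.
TOY_MATRIX = [
    [0.2, 0.9, 0.1],
    [0.9, 0.3, 0.8],
    [0.1, 0.8, 0.1],
]
centrality = eigenvector_centrality(TOY_MATRIX)
```

In this toy run the combined filter retains rs1 and rs2, whereas ranking on `MAIN` alone would keep rs1 and rs3, which is the kind of loss the excerpt argues a univariate filter risks. In the centrality example, the second SNP ranks highest because it interacts strongly with both of the others.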

Pipeline specifications

Software tools: EC, Inbix, Random Jungle, SNPrank
Application: GWAS
Organisms: Variola virus