
BEAM / Bayesian Epistasis Association Mapping

A method for genome-wide case-control studies. BEAM treats the disease-associated markers and their interactions via a Bayesian partitioning model and computes, via Markov chain Monte Carlo, the posterior probability that each marker set is associated with the disease. Testing it on an age-related macular degeneration genome-wide association data set, we demonstrate that the method is significantly more powerful than existing approaches and that genome-wide case-control epistasis mapping with many thousands of markers is both computationally and statistically feasible.

MBS / Multiple Beam Search

Discovers genome-phenome relationships using Bayesian networks (BNs). MBS employs an extended greedy search to learn directed acyclic graph (DAG) models that contain two or more predictors in an epistatic interaction. It can be used to learn from data the interactive relationships among a subset of predictors that together can have a causal effect on a clinical feature. This tool has been tested on genome-wide association study (GWAS) data and successfully discovered epistatic interactions of single nucleotide polymorphisms (SNPs) that have a causal effect on late-onset Alzheimer disease (LOAD).


LinDen

Performs epistasis detection based on a hierarchical representation of linkage disequilibrium (LD). LinDen reduces the number of tests performed in epistasis detection. It uses correlations between the genotypes of neighboring loci to construct groups that hierarchically represent LD trees and derives representative genotypes for these LD groups. It uses these representative genotypes both to score the potential interaction between any pair of loci in the respective groups and to filter out pairs of locus groups that are not promising.


Biofilter

Provides a convenient single interface for accessing multiple publicly available human genetic data sources that have been compiled in the supporting database of the Library of Knowledge Integration (LOKI). Biofilter is software that annotates genomic location- or region-based data, filters such data on biological criteria, and generates predictive models for gene-gene, single nucleotide polymorphism (SNP)-SNP, or copy number variant (CNV)-CNV interactions based on biological information, prioritizing the models to be tested by biological relevance.

TS-GSIS / Two Stage-Grouped Sure Independence Screening

Permits the study of single nucleotide polymorphism (SNP)–SNP interactions with or without marginal effects. TS-GSIS provides valid variable selection for the analysis of quantitative and disease traits under various types of correlation, minor allele frequency (MAF) and trait dispersion. This method can (i) determine whether SNPs jointly form a candidate model with associations, (ii) determine the size of the candidate model automatically, (iii) discover significant SNP–SNP interactions without individual marginal SNP effects, (iv) allow direct inference and easy interpretation on the biologically meaningful gene, (v) identify SNPs in a gene when they jointly contribute to the trait y and (vi) reduce the search space for identification of the interaction effects.


GCORE

An efficient family-based gene-gene interaction test for trios (i.e., two parents and one affected sib). GCORE compares interlocus correlations at two SNPs between the transmitted and non-transmitted alleles. We used simulation studies to compare statistical properties such as type I error rates and power for GCORE with those of several other family-based interaction tests under various scenarios. We applied GCORE to a family-based GWAS for autism consisting of approximately 2,000 trios. GCORE tested a total of 22,471,383,013 interaction pairs in the GWAS within 36 hours without large-scale computing resources, demonstrating that the test is practical for genome-wide gene-gene interaction analysis in trios.


W-test

A general measure for epistasis testing. W-test is fast, model-free, and powerful. We have demonstrated that the W-test has robust power for linear and non-linear genetic models over a range of genetic environments. The method is especially advantageous for low-frequency variants and retains power when the sample size is small. The proposed method tests the distributional differences between cases and controls, using the sum of squared log odds ratios over the complete cell distribution in a contingency table. The cell distribution formed by a pair of markers has a total probability of one in the control group and in the case group, respectively. Under this constraint the cell proportions reflect distributional differences, which are tested cell by cell using the odds ratio.
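
As a rough illustration of the statistic described above, the following Python sketch sums squared log odds ratios over the cells of a two-marker contingency table. The function name `sum_squared_log_odds`, the 0.5 continuity correction, and the absence of any standard-error scaling or reference distribution are assumptions made for illustration; this is not the published W-test implementation.

```python
import math


def sum_squared_log_odds(case_counts, control_counts, correction=0.5):
    """Sum of squared per-cell log odds ratios (illustrative only).

    case_counts / control_counts: counts for the same genotype
    combinations (e.g. 9 cells for a SNP pair), in the same order.
    `correction` is an assumed continuity correction that avoids log(0).
    """
    if len(case_counts) != len(control_counts):
        raise ValueError("case and control tables must list the same cells")
    n_case = sum(case_counts)
    n_ctrl = sum(control_counts)
    total = 0.0
    for a, c in zip(case_counts, control_counts):
        # 2x2 table for this cell: in-cell vs. out-of-cell, case vs. control
        case_odds = (a + correction) / (n_case - a + correction)
        ctrl_odds = (c + correction) / (n_ctrl - c + correction)
        total += math.log(case_odds / ctrl_odds) ** 2
    return total
```

Identical case and control distributions give a value of zero; the value grows as the two cell distributions diverge.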

ETMA / Epistasis Test in Meta-Analysis

A Markov chain Monte Carlo-based method using genotype summary data to obtain consistent estimates of epistasis effects in meta-analysis. We defined a series of conditions to generate simulation data and tested the power and type I error rates in ETMA, individual data analysis and conventional meta-regression-based method. ETMA not only successfully facilitated consistency of evidence but also yielded acceptable type I error and higher power than conventional meta-regression.

epiNEM / Epistatic NEMs

Takes double knockouts into account and infers more complex signalling pathway networks. epiNEM incorporates logical functions that describe interactions between regulators. The epiNEM method can be applied to all datasets that measure multi-parametric phenotypes for combinatorial perturbations. This tool is designed to use large knock-out screens to identify hidden signalling genes as modulators of the signal and to explain the corresponding data. It helps to understand mediators of complex phenotypes of genetic interactions.

ClinGen Pathogenicity Calculator / Clinical Genome Resource Pathogenicity Calculator

Assesses pathogenicity of Mendelian germline sequence variants. ClinGen Pathogenicity Calculator allows users to enter the applicable American College of Medical Genetics and Genomics/Association for Molecular Pathology-style evidence tags for a specific allele, with links to supporting data for each tag, and to generate a guideline-based pathogenicity assessment for the allele. The software is modular, equipped with robust application program interfaces and available as a cloud-hosted web service, thus facilitating both stand-alone use and integration with existing variant curation and interpretation systems.


GCORE-sib

An efficient gene-gene interaction test for discordant sib pairs (DSPs), which is suitable for genome-wide interaction analysis of single nucleotide polymorphism (SNP) pairs in DSPs. We used simulations to demonstrate that GCORE-sib has correct type I error rates and power comparable to that of the regression-based interaction test. We also showed that GCORE-sib can run more than 10 times faster than the regression-based test. Finally, GCORE-sib was applied to a GWAS dataset with approximately 2,000 discordant sib pairs, and it finished testing 19,368,078,382 pairs of SNPs within 6 days.

MBS-IGain / Multiple Beam Search with Information Gain

Identifies interactive effects in high-dimensional datasets. MBS-IGain employs information gain, rather than the score, to determine whether to add a predictor on a given beam. It uses the score to end its search on each beam. This tool was applied to a real genome-wide association study (GWAS) dataset for late-onset Alzheimer disease (LOAD). It identifies predictors that are interacting rather than merely identifying high-scoring models.
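
To make the idea of an information-gain criterion concrete, here is a minimal Python sketch that measures how much a candidate predictor reduces the conditional entropy of the target when added to the predictors already on a beam. The function names, the row-as-dictionary data layout, and the use of plain conditional entropy are assumptions for illustration; the exact gain measure used by MBS-IGain may differ.

```python
import math
from collections import Counter


def conditional_entropy(rows, predictors, target):
    """H(target | predictors) in bits, from observed frequencies."""
    n = len(rows)
    joint = Counter((tuple(r[p] for p in predictors), r[target]) for r in rows)
    marginal = Counter(tuple(r[p] for p in predictors) for r in rows)
    h = 0.0
    for (key, _), count in joint.items():
        h -= (count / n) * math.log2(count / marginal[key])
    return h


def gain_from_adding(rows, current, candidate, target):
    """Entropy reduction obtained by adding `candidate` to `current`."""
    before = conditional_entropy(rows, current, target)
    after = conditional_entropy(rows, current + [candidate], target)
    return before - after


# Toy XOR data: neither SNP is informative alone, but the pair is.
rows = [{"a": a, "b": b, "y": a ^ b} for a in (0, 1) for b in (0, 1)]
gain_alone = gain_from_adding(rows, [], "a", "y")
gain_joint = gain_from_adding(rows, ["a"], "b", "y")
```

In the toy data, adding `a` to an empty beam yields no gain, while adding `b` to a beam that already holds `a` removes all remaining uncertainty, which is the kind of interaction a score based on single predictors would miss.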

umMDR / Unified Model based Multifactor Dimensionality Reduction

Obtains the significance of a multi-locus model, even a high-order one, through a regression framework with a semi-parametric correction procedure for controlling Type I error rates. UM-MDR avoids the heavy computation otherwise needed to assess the significance of a multi-locus model. The approach can incorporate different types of traits and evaluate the significance of existing MDR extensions. The tool supplements existing MDR methods through its efficiency in assessing significance for every multi-locus model, its power and its flexibility in handling different types of traits.

SIPI / SNP Interaction Pattern Identifier

Takes non-hierarchical models, inheritance modes and mode coding direction into consideration. SIPI can intensively and effectively search pairwise SNP–SNP interactions. It detects 45 interaction models, which take inheritance mode (both original and reverse coding) and risk category grouping (model structure) into consideration. Benchmarks show that SIPI is a more comprehensive and flexible tool for detecting two-way SNP–SNP interactions than the three full-model approaches: AA_Full in PLINK, Geno_Full and SNPassoc. All these methods are based on hierarchical models, and the difference is how the inheritance modes are dealt with.

JBASE / Joint Bayesian Analysis of Subphenotypes and Epistasis

An integrative mixture model. JBASE explores two major reasons for missing heritability: interactions between genetic variants, a phenomenon known as epistasis, and phenotypic heterogeneity, addressed via subphenotyping. Our extensive simulations in a wide range of scenarios repeatedly demonstrate that JBASE can identify true underlying subphenotypes, including their associated variants and their interactions, with high precision. JBASE is the first algorithm to tackle modeling of epistasis and subphenotyping simultaneously. We show that taking both of these causes of missing heritability into account increases the power and reduces the Type I error in detecting associations.


epiACO

Identifies epistatic interactions using an ant colony optimization (ACO) algorithm. Highlights of epiACO are its fitness function Svalue, its path selection strategies, and a memory-based strategy. The Svalue leverages the advantages of both mutual information and Bayesian networks to effectively and efficiently measure associations between single nucleotide polymorphism (SNP) combinations and the phenotype. It performs well on both simulated data sets and a real age-related macular degeneration (AMD) data set.

PEPIS / Pipeline for estimating EPIStatic effect

A web server-based tool for analysing polygenic epistatic effects. PEPIS is based on a linear mixed model that has been used to predict the performance of hybrid rice. It includes two main sub-pipelines: the first for kinship matrix calculation, and the second for polygenic component analyses and genome scanning for main and epistatic effects. PEPIS was developed specifically for estimating epistatic genetic effects. It will help overcome the bottleneck in genetic epistasis analysis.

Source code from Determination of Nonlinear Genetic Architecture using Compressed Sensing

A compressed sensing method that can reconstruct nonlinear genetic models (i.e., including epistasis, or gene-gene interactions) from phenotype-genotype (GWAS) data. Our method uses L1-penalized regression applied to nonlinear functions of the sensing matrix. Our results indicate that predictive models for many complex traits, including a variety of human disease susceptibilities (e.g., with additive heritability h² ∼ 0.5), can be extracted from data sets comprised of n* ∼ 100 × s individuals, where s is the number of distinct causal variants influencing the trait. For example, given a trait controlled by ∼10,000 loci, roughly a million individuals would be sufficient for application of the method.

BNPP / Bayesian Network Posterior Probability

Handles multi-locus hypotheses by computing the posterior probability of a hypothesis. BNPP allows users to compute the posterior probability of multi-locus models. It represents models where a single locus by itself is associated with a phenotype such as a disease by using particular types of Bayesian network (BN) structures. This tool computes the posterior probability of a model based on the likelihoods of these structures and their prior probabilities.
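
The last step, combining model likelihoods with prior probabilities, is an application of Bayes' rule. The sketch below shows that normalization in Python under stated assumptions: the function name, the dictionary inputs, and the use of log-likelihoods for numerical stability are illustrative choices, not BNPP's actual interface, and how BNPP scores each Bayesian network structure is not shown.

```python
import math


def posterior_from_log_likelihoods(log_likelihoods, priors):
    """Posterior probability of each model via Bayes' rule.

    log_likelihoods: {model_name: log P(data | model)}
    priors:          {model_name: P(model)}, all strictly positive
    """
    log_weights = {
        m: ll + math.log(priors[m]) for m, ll in log_likelihoods.items()
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_weights.values())
    unnormalized = {m: math.exp(v - peak) for m, v in log_weights.items()}
    total = sum(unnormalized.values())
    return {m: u / total for m, u in unnormalized.items()}


# Two hypothetical models with equal likelihood but unequal priors.
posterior = posterior_from_log_likelihoods(
    {"single_locus": -10.0, "two_locus": -10.0},
    {"single_locus": 0.25, "two_locus": 0.75},
)
```

With equal likelihoods the posterior simply reproduces the priors; in practice the likelihood term would shift the balance toward the model that explains the data better.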


BoltzmannMachines.jl

Provides algorithms for training and evaluating several types of Boltzmann Machines (BMs). BoltzmannMachines.jl is a Julia package that supports multiple cores and offers: (i) learning of Restricted Boltzmann Machines (RBMs) using Contrastive Divergence, (ii) greedy layerwise pre-training of Deep Boltzmann Machines, (iii) a learning procedure for general Boltzmann Machines using mean-field inference and stochastic approximation, (iv) exact calculation of the likelihood of BMs, (v) Annealed Importance Sampling (AIS) for estimating the likelihood of larger BMs.


DECMDR

Combines the differential evolution (DE) algorithm with classification-based multifactor dimensionality reduction (CMDR) to identify potential epistasis in genome-wide association studies (GWAS). DECMDR is a fast and accurate method for epistatic interaction detection. It uses a metaheuristic to find significant epistasis in genome-wide data sets with a shorter execution time. DECMDR is a powerful method for handling large-scale GWAS data both in terms of speed and detection of the more significant, previously unidentified interactions.

MAPIT / MArginal ePIstasis Test

Estimates and tests the marginal epistatic effect of a variant, that is, the combined epistatic effect between the examined variant and all other variants. By modeling and inferring the marginal epistatic effects, MAPIT can identify variants that exhibit non-zero epistatic interactions with any other variant without the need to identify the specific marker combinations that drive the epistatic association. Therefore, MAPIT represents an attractive alternative to standard methods for mapping epistasis. MAPIT is implemented as a set of R and C++ routines, which can be run within an R environment.

MDR / Multifactor Dimensionality Reduction

Detects epistatic relationships between genes. MDR is a nonparametric and genetic model-free data mining alternative to logistic regression for detecting and characterizing nonlinear interactions among discrete genetic and environmental attributes. The MDR method combines attribute selection, attribute construction, and classification with cross-validation and permutation testing to provide a comprehensive and powerful approach to detecting nonlinear interactions. Using graphics processing units (GPUs) to run MDR on a genome-wide dataset allows for statistically rigorous testing of epistasis.
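
The attribute construction step can be sketched as follows: each multi-locus genotype combination is pooled into a "high" or "low" risk group according to its case-to-control ratio, collapsing several loci into one binary attribute. This Python sketch is a simplified illustration; the function name, the default threshold (the overall case-to-control ratio), and the input layout are assumptions, and the cross-validation and permutation testing stages are omitted.

```python
from collections import Counter


def mdr_risk_labels(genotypes, status, threshold=None):
    """Label each genotype combination as 'high' or 'low' risk.

    genotypes: list of tuples, one multi-locus genotype per individual
    status:    list of 1 (case) / 0 (control), same order as genotypes
    threshold: case/control ratio cutoff; defaults to the overall ratio
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    if threshold is None:
        threshold = n_cases / n_controls
    labels = {}
    for g in set(cases) | set(controls):
        # A cell with cases but no controls is treated as infinite ratio.
        ratio = cases[g] / controls[g] if controls[g] else float("inf")
        labels[g] = "high" if ratio > threshold else "low"
    return labels


# Toy two-locus example with an XOR-like pattern.
genos = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
stat = [0, 0, 1, 1, 1, 1, 0, 0]
labels = mdr_risk_labels(genos, stat)
```

The resulting one-dimensional attribute can then be evaluated as a classifier, which is where cross-validation and permutation testing would come in.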


MACOED

A multi-objective heuristic optimization methodology for detecting genetic interactions. In MACOED, we combine both logistic regression and Bayesian network methods, which are from opposing schools of statistics. The combination of these two evaluation objectives proved to be complementary, resulting in higher power with a lower false-positive rate than observed for optimizing either objective independently. To address the space and time complexity of high-dimensional problems, a memory-based multi-objective ant colony optimization algorithm is designed in MACOED that is able to retain non-dominated solutions found in past iterations.

CINOEDV / Co-Information based N-Order Epistasis Detector and Visualizer

An R package for the detection and visualization of epistatic interactions of orders from 1 to n (n ≥ 2). CINOEDV is composed of two stages, namely, a detecting stage and a visualizing stage. In the detecting stage, co-information based measures are employed to quantify association effects of n-order SNP combinations with the phenotype, and two types of search strategies are introduced to identify n-order epistatic interactions: an exhaustive search and a particle swarm optimization based search. In the visualizing stage, all detected n-order epistatic interactions are used to construct a hypergraph, where a real vertex represents the main effect of a SNP and a virtual vertex denotes the interaction effect of an n-order epistatic interaction. Careful analysis of the constructed hypergraph may reveal hidden clues to the underlying genetic architecture of complex diseases.

FITF / Focused Interaction Testing Framework

Identifies susceptibility genes involved in epistatic interactions for case-control studies of candidate genes. In the FITF approach, likelihood-ratio tests are performed in stages that increase in the order of interaction considered. Joint tests of main effects and interactions are performed conditional on significant lower-order effects. A reduction in the number of tests performed is achieved by prescreening gene combinations with a goodness-of-fit χ² statistic that depends on association among candidate genes in the pooled case-control group. Multiple testing is accounted for by controlling false-discovery rates.

BHIT / Bayesian High-order Interaction Toolkit

A Bayesian partition computational method for detecting SNP interactions (epistasis). The proposed approach builds a Bayesian model on both continuous data and discrete data to partition multiple-phenotype data. Compared with other methods on both simulated data and real data, the key strengths of BHIT are as follows: (i) with its Bayesian model and MCMC search, BHIT can efficiently explore high-order interactions; (ii) BHIT has a flexible Bayesian model for continuous and discrete data, so that both continuous and discrete phenotypes can be handled simultaneously, and interactions within or between phenotypes and genetic data can also be detected.

KNN-MDR / K-Nearest Neighbors - Multi Dimensional Reduction

Detects gene-gene interactions as a possible alternative to existing algorithms, especially in situations where the number of involved determinants is high. KNN-MDR can be seen as an interesting addition to the arsenal used in complex traits analyses. It is more computationally efficient than other exhaustive strategies, facilitating the analysis of large-scale data sets with potentially genome-wide Single Nucleotide Polymorphisms (SNPs).


KDSNP

An approach based on ridge regression with polynomial kernels and a model selection technique for determining the true degree of epistasis among single nucleotide polymorphisms (SNPs). KDSNP employs polynomial kernels to predict the maximum degree of interactions in given data. The performance of this method was evaluated on simulated data with artificially generated phenotypes, for which the true degree of epistasis is known. In comparison to the SNP–SNP interaction detection method PLINK, KDSNP provides richer information.

LIMEpi / Latent Interaction Modelling for Epistasis detection

Implements a two-stage epistasis detection procedure from genotype data. LIMEpi is a selection method for polymorphic loci and can be used as the first stage of two-stage epistasis detection. It allows users to reduce the computational complexity of searching for epistatic loci. This method ranks single nucleotide polymorphisms (SNPs) in linear computation time and identifies epistatic SNPs. It prioritizes interactions between genes involved in the G-protein responses to neurotransmitters (PDE1A), neurodevelopment (RBFOX1), synaptosome (SNAP-25) and cell adhesion (NRXN1).


semisup

Reduces the multiple testing burden to one test per single nucleotide polymorphism (SNP) and allows interactions with unobserved factors. semisup is an R package that, rather than testing interaction terms, tests whether an individual SNP is involved in any interaction. Analysing one SNP at a time, it splits the individuals into two groups, based on the number of minor alleles. If the quantitative trait differs in mean between the two groups, the SNP has a main effect.
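
The grouping step described above can be illustrated with a short Python sketch that splits individuals by minor-allele count and compares trait means. The function name and the choice of cutoff (non-carriers versus carriers of at least one minor allele) are assumptions for illustration; this shows only the grouping and mean comparison, not the semisup package's actual test for interaction.

```python
from statistics import mean


def mean_difference_by_carrier_status(minor_allele_counts, trait_values):
    """Difference in mean trait: carriers minus non-carriers.

    minor_allele_counts: 0, 1 or 2 per individual at one SNP
    trait_values:        quantitative trait, same order
    Assumed split: 0 minor alleles vs. at least 1.
    """
    pairs = list(zip(minor_allele_counts, trait_values))
    carriers = [y for g, y in pairs if g > 0]
    non_carriers = [y for g, y in pairs if g == 0]
    if not carriers or not non_carriers:
        raise ValueError("both groups must contain at least one individual")
    return mean(carriers) - mean(non_carriers)


# Hypothetical data: four individuals at a single SNP.
diff = mean_difference_by_carrier_status([0, 0, 1, 2], [1.0, 3.0, 5.0, 7.0])
```

A difference clearly away from zero, judged against an appropriate null distribution, would indicate a main effect at that SNP.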

PLATO / PLatform for the Analysis Translation and Organization of large-scale data

Filters a large genomic dataset down to a subset of genetic variants that may be useful for interaction analysis. PLATO is a filter-based method that brings together many analytical methods simultaneously. This platform was developed for the analysis of genome-wide association study (GWAS) data and incorporates numerous analytic approaches as filters. It also analyzes single nucleotide polymorphisms (SNPs) and other independent variables using a variety of filters in an effort to identify a subset of interesting SNPs from a much larger set.