A SNP-set (e.g., a gene or a region) level test for association between a set of rare (or common) variants and dichotomous or quantitative phenotypes. SKAT aggregates the individual score test statistics of the SNPs in a SNP set and efficiently computes SNP-set level p-values (e.g., gene- or region-level p-values) while adjusting for covariates, such as principal components to account for population stratification. SKAT also allows power and sample size calculations for designing sequence association studies.
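The aggregation step described above can be sketched as a weighted sum of squared per-SNP score statistics. This is a minimal illustration, not SKAT's implementation: the function name `skat_q` is hypothetical, the null model is assumed to be intercept-only (no covariates), the default weights assume the Beta(MAF; 1, 25) density, and the statistic is given only up to scaling constants. The p-value computation is omitted.

```python
def skat_q(genotypes, phenotype, weights=None):
    """Weighted sum of squared per-SNP score statistics (illustrative only).

    genotypes: list of rows, one per subject, each a list of 0/1/2
               minor-allele counts for the SNPs in the set.
    phenotype: list of quantitative trait values, one per subject.
    weights:   optional per-SNP weights; if omitted, the Beta(MAF; 1, 25)
               density is assumed, which equals 25 * (1 - MAF) ** 24.
    """
    n = len(phenotype)
    m = len(genotypes[0])

    # Assumption: intercept-only null model, so the fitted mean is the sample mean.
    mu = sum(phenotype) / n
    resid = [y - mu for y in phenotype]

    if weights is None:
        weights = []
        for j in range(m):
            maf = sum(row[j] for row in genotypes) / (2 * n)
            weights.append(25 * (1 - maf) ** 24)

    q = 0.0
    for j in range(m):
        # Score statistic for SNP j: genotype-weighted sum of null residuals.
        s_j = sum(genotypes[i][j] * resid[i] for i in range(n))
        q += (weights[j] * s_j) ** 2
    return q


if __name__ == "__main__":
    G = [[0, 1], [1, 0], [0, 0], [2, 0]]
    y = [1.0, 2.0, 0.0, 1.0]
    print(skat_q(G, y, weights=[1.0, 1.0]))
```

Because rare variants receive larger weights under the assumed Beta density, a set dominated by rare alleles contributes more to the statistic than its raw score statistics alone would suggest.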
Provides an assortment of methods to set up and fit a wide range of models. BhGLM is an R package developed to handle about six different types of models, including Bayesian hierarchical, negative binomial, and Cox survival models. The package includes features that compute measures for evaluating a given model, as well as utilities that summarize it numerically and graphically.
Identifies and investigates gene-gene effects. hapConstructor employs a haplotype-mining method that considers multi-locus data at two genes and tests for association and interaction. It is implemented in a Monte Carlo (MC) testing framework and provides empirical construction-wide significance assessment for hypothesis testing. This tool can be useful for hypothesis generation.
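As a rough illustration of how a Monte Carlo testing framework yields an empirical significance level, the sketch below permutes case/control labels and compares the observed statistic with its permutation distribution. It is a generic permutation test, not hapConstructor's construction-wide procedure; the names `mc_p_value` and `carrier_diff` and the 0/1 carrier coding are assumptions made for the example.

```python
import random


def carrier_diff(cases, controls):
    """Absolute difference in haplotype carrier frequency (hypothetical statistic)."""
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def mc_p_value(cases, controls, statistic=carrier_diff, n_perm=999, seed=0):
    """Empirical p-value from random relabelling of cases and controls."""
    rng = random.Random(seed)
    observed = statistic(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)

    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:n_cases], pooled[n_cases:]) >= observed:
            exceed += 1

    # Adding 1 to numerator and denominator counts the observed data as one
    # draw from the null, so the estimate is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    cases = [1] * 10      # 1 = carries the haplotype of interest
    controls = [0] * 10
    print(mc_p_value(cases, controls))
```

A construction-wide assessment would repeat the entire haplotype-mining process on each permuted data set and record the best statistic found, rather than re-testing a single fixed haplotype as done here.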
A suite of R routines for the analysis of indirectly measured haplotypes. The statistical methods assume that all subjects are unrelated and that haplotypes are ambiguous (due to unknown linkage phase of the genetic markers). The main functions are: haplo.em, haplo.glm, haplo.score, haplo.power, and seqhap.
Analyzes multiple tightly linked markers. HS-TDT can be used for testing linkage or association between the disease-susceptibility locus and a chromosome region in which several tightly linked markers have been typed. It is applicable to both qualitative traits and quantitative traits, to any size of nuclear families with or without ambiguous phase information, and to any number of alleles at each of the markers.
Inference of trait associations with SNP haplotypes and other attributes using the EM algorithm. The R functions are used for inference of trait associations with haplotypes and other covariates in generalized linear models. The functions are developed primarily for data collected in cohort or cross-sectional studies. They can accommodate uncertain haplotype phase and handle missing genotypes at some SNPs.
Offers a platform for performing genome-wide association studies (GWAS) based on haplotypes. ParaHaplo relies on data parallelism to speed up the assessment of both haplotypes and P values. The application can be used in conjunction with other software for running: (i) genotype imputation and haplotype reconstruction; (ii) haplotype estimation; and (iii) haplotype-based GWAS.
A weighted haplotype-based approach and an imputation-based approach to test for the effect of rare variants with GWAS data. Both methods can incorporate external sequencing data when available. The methods were evaluated and compared with methods proposed in the sequencing setting through extensive simulations. They show enhanced statistical power over existing methods for a wide range of population-attributable risk, percentage of disease-contributing rare variants, and proportion of rare alleles working in different directions.
Models haplotype association with disease in population studies. GENEBPM is a reversible-jump Markov chain–Monte Carlo (MCMC) algorithm that assesses the evidence in favor of disease association with polymorphisms in a candidate gene or a small candidate region. This method was developed to obtain maximum-likelihood estimates of the relative frequencies of haplotypes consistent with a sample of observed single-nucleotide–polymorphism (SNP) genotypes.
An R package designed to call haplotypes from phased marker data. GHap identifies the different haplotype alleles (HapAllele) present in the data and scores sample haplotype allele genotypes based on HapAllele dose (i.e., 0, 1 or 2 copies). The output is not only useful for analyses that can handle multi-allelic markers, but is also conveniently formatted for existing pipelines intended for bi-allelic markers.
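The dose scoring described above can be illustrated with a short sketch. It is written in Python for illustration and does not use or reproduce the package's own functions; the name `hap_allele_doses` and the input layout (one pair of phased haplotype strings per sample for a single block) are assumptions.

```python
def hap_allele_doses(phased):
    """Score each sample's dose (0, 1 or 2 copies) of every haplotype allele.

    phased: dict mapping sample ID -> (haplotype_1, haplotype_2), where each
            haplotype is a string of marker alleles over one block.
    Returns the sorted list of distinct haplotype alleles and a nested dict
    doses[allele][sample].
    """
    # Distinct haplotype alleles observed anywhere in the block.
    alleles = sorted({hap for pair in phased.values() for hap in pair})

    # Each haplotype allele becomes a pseudo-marker with a 0/1/2 dose per
    # sample, which is the layout expected by bi-allelic marker pipelines.
    doses = {
        allele: {sample: pair.count(allele) for sample, pair in phased.items()}
        for allele in alleles
    }
    return alleles, doses


if __name__ == "__main__":
    phased = {
        "s1": ("0101", "0101"),
        "s2": ("0101", "1100"),
        "s3": ("1100", "0011"),
    }
    alleles, doses = hap_allele_doses(phased)
    for allele in alleles:
        print(allele, doses[allele])
```

Treating every haplotype allele as its own 0/1/2-coded column is what lets a multi-allelic haplotype block be passed to tools that only understand bi-allelic markers.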
Allows users to handle and solve the single individual haplotyping (SIH) problem. PEATH can identify reliable haplotypes, with low error rates and longer haplotype lengths. It shows the best phased length and N50 values: the length of the haplotype is initialized to the total number of mutation sites, and the phasing blocks are divided only where sites are not connected by overlapping sequence reads. Moreover, this algorithm can be useful for long-read sequencing technologies.
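To make the block-division rule and the N50 metric concrete, the sketch below groups variant sites into phasing blocks whenever reads connect them, then computes N50 over the block lengths. It is an illustration under stated assumptions, not PEATH's code: the function names are hypothetical, each read is represented only by the indices of the sites it covers, and block length is counted in sites rather than base pairs.

```python
def phasing_blocks(n_sites, reads):
    """Split sites 0..n_sites-1 into blocks connected by overlapping reads.

    reads: iterable of lists of site indices covered by each read.
    Sites that share no chain of reads end up in different blocks.
    """
    parent = list(range(n_sites))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for sites in reads:
        sites = list(sites)
        for s in sites[1:]:
            parent[find(s)] = find(sites[0])

    blocks = {}
    for s in range(n_sites):
        blocks.setdefault(find(s), []).append(s)
    return sorted(blocks.values())


def n50(lengths):
    """Largest length L such that blocks of length >= L cover half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    blocks = phasing_blocks(6, [[0, 1], [1, 2], [4, 5]])
    print(blocks)
    print(n50([len(b) for b in blocks]))
```

With no reads at all, every site forms its own block; each additional read that spans two blocks merges them, which is why longer reads tend to raise N50.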
Enables quality control and imputation of genome-wide association studies (GWAS) data. Gimpute is a genotyping data processing and imputation pipeline that includes processing steps for genotype liftOver, quality control, population outlier detection, haplotype pre-phasing, imputation, post-imputation, and data management. The software can be combined with existing pipelines by means of its modular structure. It is applicable to any study design.
Performs genetic association analysis. UNPHASED is an application that permits users to analyze nuclear families and unrelated subjects with discrete or quantitative traits. It also provides global association tests, tests of individual haplotypes, and permutation tests that allow for multiple testing. This method supports non-genetic covariates, including parent-of-origin.
Combines an algorithm designed to cluster haplotypes of interest from a given set of haplotypes with two existing tools: Haploview, for analyses of linkage disequilibrium blocks and haplotypes, and PLINK, to generate all possible diplotypes from given genotypes of samples and calculate linear or logistic regression. In addition, procedures for generating all possible diplotypes from the haplotype clusters and transforming these diplotypes into PLINK formats were implemented. Diplotyper is a fully automated tool for performing association analysis based on diplotypes in a population. Diplotyper is useful for identifying more precise and distinct signals over single-locus tests.
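The step of generating all possible diplotypes from haplotype clusters amounts to enumerating unordered pairs of haplotypes, including homozygous pairs. The sketch below shows that enumeration only; it is not Diplotyper's code, the names `all_diplotypes` and `carries` are hypothetical, and the conversion to PLINK formats is not reproduced.

```python
from itertools import combinations_with_replacement


def all_diplotypes(haplotypes):
    """Return every unordered haplotype pair, homozygous pairs included."""
    return list(combinations_with_replacement(sorted(haplotypes), 2))


def carries(sample_pair, diplotype):
    """1 if a sample's haplotype pair equals the given diplotype, else 0.

    Order within a pair is ignored, so ("H2", "H1") matches ("H1", "H2").
    """
    return int(tuple(sorted(sample_pair)) == tuple(sorted(diplotype)))


if __name__ == "__main__":
    diplotypes = all_diplotypes(["H1", "H2", "H3"])
    print(diplotypes)
    print([carries(("H2", "H1"), d) for d in diplotypes])
```

For k haplotype clusters this yields k(k+1)/2 diplotypes, each of which can then be coded as a binary indicator per sample for regression.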
Provides a tree-based ensemble method. T-Trees is an extension of the random forest method that takes into account the correlation structure among the genetic markers implied by linkage disequilibrium (LD) in genome-wide association studies (GWAS) data. This method can be useful in terms of predictive power. It also suggests the existence of multivariate and/or non-linear effects due to the combination of several single nucleotide polymorphisms (SNPs).
A fast predictor for the inference of blood groups from single nucleotide variant (SNV) databases. BOOGIE correctly predicted the blood group with 94% accuracy for the Personal Genome Project whole genome profiles where good quality SNV annotation was available. Additionally, BOOGIE produces a high quality haplotype phase, which is of interest in the context of ethnicity-specific polymorphisms or traits. The versatility and simplicity of the analysis make it easily interpretable and allow easy extension of the protocol towards other phenotypes.
Detects disease-haplotype associations. rGLM is a method applicable to both the common disease/common variant (CD/CV) and common disease/rare variant (CD/RV) scenarios based on the generalized linear model (GLM) framework. It uses unphased single nucleotide polymorphism data. This method permits users to identify a number of significant results that have haplotype frequencies as low as 0.0014.
An R package that performs Logistic Bayesian Lasso for finding association of SNP haplotypes and environmental factors with a trait in a case-control setting. Bayesian lasso is used to find the posterior distributions of logistic regression coefficients, which are then used to calculate a Bayes factor to test for association.
Converts genotyping input into various outputs. SNPTransformer accepts linkage and chip formats as input and transforms them into input formats for packages that handle association, transmission disequilibrium tests (TDT), linkage disequilibrium (LD) measures, haplotype inference, haplotype block partition, tagSNPs, and multilocus interaction. This tool can be used to perform data analysis for genome-wide association studies (GWAS).
Estimates the power to detect disease association by a set of markers, at any user-specified polymorphic site(s), under arbitrary disease models and sample sizes. HaploPowerCalc uses an approach based on haplotype sampling. The software is designed for users who wish to estimate the power (or the sample sizes required to obtain adequate power) of their association study.
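The idea of estimating power by sampling haplotypes can be sketched as a small simulation: draw chromosomes for cases and controls, test for a frequency difference, and report the fraction of replicates that reach significance. This is a simplified stand-in, not HaploPowerCalc's algorithm; the function names, the single risk haplotype, the assumed case and control frequencies, and the 1-df chi-square test at the 0.05 level (critical value 3.841) are all assumptions for illustration.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def estimate_power(freq_cases, freq_controls, n_cases, n_controls,
                   n_rep=500, seed=0, crit=3.841):
    """Fraction of simulated studies in which the test exceeds `crit`.

    freq_cases / freq_controls: assumed frequency of the risk haplotype
    among case and control chromosomes (hypothetical inputs).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        # Each subject contributes two sampled haplotypes.
        a = sum(rng.random() < freq_cases for _ in range(2 * n_cases))
        c = sum(rng.random() < freq_controls for _ in range(2 * n_controls))
        b = 2 * n_cases - a
        d = 2 * n_controls - c
        if chi2_2x2(a, b, c, d) > crit:
            hits += 1
    return hits / n_rep


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, estimate_power(0.25, 0.15, n, n))
```

Running the estimate over a grid of sample sizes, as in the usage lines, gives a rough picture of how many subjects would be needed to reach a target power under the assumed frequencies.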