Complex trait prediction software tools | Genome-wide association study data analysis
The success of genome-wide association studies (GWASs) has led to increasing interest in making predictions of complex trait phenotypes, including disease, from genotype data. Rigorous assessment of the value of predictors is crucial before implementation.
Supplies a method to compute exact values of standard test statistics in linear mixed models. GEMMA is a program built on EMMA software. The application fits three types of models: univariate and multivariate linear mixed models as well as a Bayesian sparse linear mixed model. In addition, it estimates variance components and chip heritability. This tool provides a means of making exact calculations for large genome-wide association studies (GWAS).
Identifies associations between gene expression and complex traits using summary data from genome-wide association studies (GWAS) and expression quantitative trait locus (eQTL) studies. A heterogeneity test can then be applied to distinguish pleiotropy from linkage. The SMR tool allows users to search for the most functionally relevant genes at the loci identified in GWAS data for complex traits. It provides a useful way to prioritize genes underlying GWAS hits for follow-up functional studies.
Employs a conjugate gradient-based iterative framework for mixed model computations. BOLT-REML estimates variance parameters for models involving multiple variance components and multiple traits. This algorithm demonstrates that it is possible to perform previously intractable large-sample heritability analyses. BOLT-REML uses a Monte Carlo average information restricted maximum likelihood algorithm with respect to the variance parameters being estimated.
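The conjugate gradient step mentioned for BOLT-REML can be illustrated with a minimal sketch. It is not BOLT-REML's code: it only shows how a system of the form (sigma_g2 * K + sigma_e2 * I) x = y can be solved iteratively using matrix-vector products alone. The function names, the small matrix K, and the variance values are assumptions made for illustration, and the Monte Carlo average information REML updates are not shown.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, starting from x = 0
    p = list(r)            # current search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def make_v_matvec(K, sigma_g2, sigma_e2):
    """Return the map x -> (sigma_g2 * K + sigma_e2 * I) x for a kinship-like matrix K."""
    def matvec(x):
        return [
            sigma_g2 * sum(kij * xj for kij, xj in zip(row, x)) + sigma_e2 * xi
            for row, xi in zip(K, x)
        ]
    return matvec


# Hypothetical two-individual example.
K = [[1.0, 0.5], [0.5, 1.0]]
matvec = make_v_matvec(K, sigma_g2=0.6, sigma_e2=0.4)
solution = conjugate_gradient(matvec, [1.0, 2.0])
```

Because only the product with the covariance matrix is needed, the matrix never has to be inverted or even stored explicitly, which is the property that makes this kind of iteration attractive for large samples.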
Allows the detection of significant signatures of natural selection. GCTB is an application that assists users in deducing the action of natural selection on the genetic variants underlying a complex trait. This software implements a method called BayesS: a Bayesian mixed linear model (MLM) approach that can simultaneously estimate features such as single nucleotide polymorphism (SNP)-based heritability and the joint distribution of effect size and minor allele frequency (MAF) in conventionally unrelated individuals.
Allows quantification of the genome-wide directional effect of a signed functional annotation on polygenic disease risk. SLDP detects directional effects by assessing whether the vector of marginal single nucleotide polymorphism (SNP) effects and the signed linkage disequilibrium (LD) profile are systematically correlated genome-wide. This method can be used to link disease to biological processes beyond transcription factor (TF) binding.
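The idea behind SLDP can be sketched in a few lines, under simplifying assumptions. Here the signed LD profile is taken to be the product of an LD matrix R with a signed annotation vector v, and the association is summarized with a plain Pearson correlation. The function names and the toy data are hypothetical, and any weighting or significance testing used by the actual tool is not shown.

```python
def signed_ld_profile(R, v):
    """Multiply an LD matrix R (list of rows) by a signed annotation vector v."""
    return [sum(rij * vj for rij, vj in zip(row, v)) for row in R]


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Toy example with three SNPs; the first two are in LD (r = 0.5).
R = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
v = [1.0, 0.0, -1.0]                  # signed annotation (hypothetical values)
marginal_effects = [0.2, 0.1, -0.2]   # marginal SNP effect estimates (hypothetical)

profile = signed_ld_profile(R, v)
correlation = pearson(marginal_effects, profile)
```

In this toy case the marginal effects are an exact multiple of the profile, so the correlation is at its maximum; real data would give a much weaker signal.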
Allows users to perform residual (restricted) maximum likelihood (REML) estimation. MTG is standalone software that exploits a multivariate linear mixed model and uses a direct average information algorithm. In addition, it supplies best linear unbiased prediction (BLUP) to estimate additive genetic effects. Moreover, the algorithm can analyze both univariate and multivariate data.
Estimates the variance explained by all the SNPs on a chromosome or on the whole genome for a complex trait rather than testing the association of any particular SNP with the trait. GCTA has five main functions: data management, estimation of the genetic relationships from SNPs, mixed linear model analysis of variance explained by the SNPs, estimation of the linkage disequilibrium structure, and GWAS simulation.
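Estimating genetic relationships from SNPs, one of the GCTA functions listed above, can be illustrated as follows. This is a sketch under stated assumptions and not GCTA's implementation: genotypes are coded as 0, 1 or 2 copies of a reference allele, allele frequencies are estimated from the sample itself, there are no missing genotypes, and each SNP is standardized by its expected variance 2p(1 - p). The function name `genetic_relationship_matrix` is hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """Build a relationship matrix from allele counts.

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2),
    one entry per SNP. Monomorphic SNPs are skipped.
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    standardized = [[] for _ in range(n_ind)]
    for i in range(n_snp):
        column = [g[i] for g in genotypes]
        p = sum(column) / (2 * n_ind)      # sample allele frequency
        if p == 0.0 or p == 1.0:
            continue                       # no variation, nothing to standardize
        sd = (2 * p * (1 - p)) ** 0.5
        for j in range(n_ind):
            standardized[j].append((column[j] - 2 * p) / sd)
    n_used = len(standardized[0])
    if n_used == 0:
        raise ValueError("no polymorphic SNPs available")
    # Average the products of standardized genotypes over the SNPs used.
    return [
        [sum(a * b for a, b in zip(standardized[j], standardized[k])) / n_used
         for k in range(n_ind)]
        for j in range(n_ind)
    ]


# Three individuals, three SNPs (the last SNP is monomorphic and is ignored).
grm = genetic_relationship_matrix([[0, 2, 0], [2, 0, 0], [1, 1, 0]])
```

The resulting matrix is symmetric, and its off-diagonal entries are the pairwise relationship estimates that a mixed linear model can then use as the covariance structure of the genetic effect.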
Provides an approach dedicated to the genetic prediction of complex traits. DPR is a statistical method, based on a non-parametric model, that delivers predictions for various complex traits and can be applied to a wide range of genetic architectures. The application uses a flexible non-parametric prior on single nucleotide polymorphism (SNP) effect sizes. It was tested with simulations and applications to four real datasets.
Extends the BLUP (best linear unbiased prediction) model to include multiple random effects, allowing greatly improved prediction when the random effects correspond to classes of SNPs with distinct effect-size variances. MultiBLUP improves on BLUP when the kinship matrices correspond to subsets of predictors with distinct effect size variances.
Proposes a single multiplex genotyping system based on the six single nucleotide polymorphisms (SNPs) currently most informative for eye color. IrisPlex is a web application that (i) allows users to predict blue and brown eye color for single and multiple individuals; (ii) provides a sensitive tool for the analysis of picogram amounts of DNA; (iii) is designed to cater for degraded DNA; and (iv) is based on a genotyping technology.
Allows for examination of complex phenotypes and the development of nonparametric sampling techniques. MEGHA is a statistical method for large-scale heritability analysis using genome-wide single-nucleotide polymorphism (SNP) data from unrelated individuals. It could be used to prioritize brain structural magnetic resonance imaging (MRI) phenotypes based on heritability. This method provides both magnitude estimates and significance measures of heritability with orders of magnitude less computational effort relative to GCTA.
Provides a massively parallel and user-friendly implementation of the PBAT analysis tools for family-based association tests (FBATs) in large-scale studies, including genome-wide association studies with several thousand subjects. P2BAT is software that integrates all PBAT analysis tools for binary and complex traits into R and makes them accessible through a user-friendly GUI. It also allows users to run genome-wide association analyses in a massively parallel fashion on a cluster, reducing the analysis time for 100 000 SNPs or more to a couple of minutes.
Permits users to control for local genomic structure. GoShifter is an enrichment test employing an intuitive method that locally shifts sites of tested features within each locus, to generate a null distribution of annotations overlapping associated variants by chance. Using the local-shifting approach, this method allows prioritization of loci by determining informative functional variants.
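The local-shifting idea described for GoShifter can be sketched as follows. This is an illustration under assumptions, not the tool's code: each locus is treated as a circular window so that shifted annotations wrap around its boundaries, the statistic is the number of loci in which any variant overlaps an annotation, and the data layout (dictionaries with `start`, `end`, `variants`, `annotations`) and function names are hypothetical.

```python
import random


def overlaps(variants, intervals):
    """True if any variant position lies inside any half-open [start, end) interval."""
    return any(s <= v < e for v in variants for s, e in intervals)


def shift_intervals(intervals, offset, locus_start, locus_end):
    """Shift intervals by offset inside the locus, wrapping around its boundaries."""
    length = locus_end - locus_start
    shifted = []
    for s, e in intervals:
        s2 = locus_start + (s - locus_start + offset) % length
        e2 = s2 + (e - s)
        if e2 <= locus_end:
            shifted.append((s2, e2))
        else:
            # The interval runs past the locus end, so split it in two.
            shifted.append((s2, locus_end))
            shifted.append((locus_start, locus_start + (e2 - locus_end)))
    return shifted


def local_shift_pvalue(loci, n_perm=1000, seed=0):
    """Return the observed overlap count and an empirical P-value from local shifts."""
    rng = random.Random(seed)
    observed = sum(overlaps(l["variants"], l["annotations"]) for l in loci)
    at_least_as_large = 0
    for _ in range(n_perm):
        null_count = 0
        for l in loci:
            offset = rng.randrange(l["end"] - l["start"])
            moved = shift_intervals(l["annotations"], offset, l["start"], l["end"])
            null_count += overlaps(l["variants"], moved)
        if null_count >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# One hypothetical locus with a single variant and a single annotated interval.
loci = [{"start": 0, "end": 100, "variants": [10], "annotations": [(5, 15)]}]
observed, p_value = local_shift_pvalue(loci)
```

Because annotations are only moved within their own locus, the null distribution keeps the local density of annotations and variants intact, which is what controls for local genomic structure.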
Attributes single nucleotide polymorphisms (SNPs) to genes. LDsnpR can be used to measure the added value of linkage disequilibrium (LD)-based binning. It can be utilized for scoring the genes for direct entry into pathway-analysis tools, using the combined p-values of all the markers assigned to the gene bins and numerous summary statistics for computation of joint p-values. This tool is based on the physical position of SNPs and on their pairwise LD with other SNPs.
Supports genomic prediction in species displaying strong heterotic or specific combining ability effects. Sommer relies on several algorithms based on maximum likelihood (ML) and restricted maximum likelihood (REML): efficient mixed model association (EMMA), direct average information (AI), and expectation maximization (EM). It allows users to specify more than one random effect and their variance-covariance structure.
Allows users to capture current standard practices in polygenic risk score (PRS) studies and the different applications of PRS. PRSice performs a simulation study to estimate a P-value significance threshold for high-resolution PRS studies and produces plots for inspection of results. One of the functions of this software is to automate PRS analyses. It is able to calculate PRS at any number of P-value thresholds (PT) and can thus identify the most predictive threshold.
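Scoring at several P-value thresholds and keeping the most predictive one can be sketched as follows. This is a simplified illustration and not PRSice's implementation: the score is a plain sum of effect size times allele dosage over SNPs passing the threshold, predictive value is measured by the squared correlation between score and phenotype, and steps such as clumping or covariate adjustment are left out. All function names and data are hypothetical.

```python
def polygenic_score(dosages, betas, pvalues, threshold):
    """Sum beta * dosage over the SNPs whose GWAS P-value is at or below the threshold."""
    return sum(b * d for d, b, p in zip(dosages, betas, pvalues) if p <= threshold)


def r_squared(x, y):
    """Squared Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov * cov / (var_x * var_y)


def best_threshold(genotypes, phenotype, betas, pvalues, thresholds):
    """Score every individual at each threshold and return the most predictive one."""
    results = {}
    for pt in thresholds:
        scores = [polygenic_score(g, betas, pvalues, pt) for g in genotypes]
        results[pt] = r_squared(scores, phenotype)
    best = max(results, key=results.get)
    return best, results


# Hypothetical data: four individuals, two SNPs; only the first SNP is informative.
genotypes = [[0, 2], [1, 1], [2, 0], [2, 2]]
phenotype = [0.0, 1.0, 2.0, 2.0]
betas = [1.0, -5.0]
pvalues = [1e-8, 0.5]

best, results = best_threshold(genotypes, phenotype, betas, pvalues, [1e-5, 1.0])
```

In this toy example the stricter threshold wins because the second SNP only adds noise to the score; with real data the best threshold depends on the trait's genetic architecture and the discovery sample size.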
Performs family-based genetic and genomic analyses. ONETOOL provides four main analysis modules: informatics and quality control (InfoQC), trait analysis, linkage analysis and association analysis. This software offers several options for variant-wise InfoQC and filtering. It combines different types of association methods to allow seemingly harmonized family-based association analyses.
Finds complex prediction models including non-linear interactions and different types of high-throughput data. ATHENA is based on various statistical methods combined with a filtering-modeling pipeline. It helps users perform a powerful meta-dimensional study. This tool can recognize biological pathways or sets of genes that are part of the genetic etiology of various complex phenotypes.
A method for association mapping that considers dynamic phenotypes measured at a sequence of time points. TV-GroupSpAM relies on the use of time-varying group sparse additive models for high-dimensional, functional regression. This model detects a sparse set of genomic loci that are associated with trait dynamics, and demonstrates increased statistical power over existing methods.
Searches for multi-locus haplotype markers in near-perfect linkage disequilibrium (LD) with a genome-wide association study (GWAS) tag variant. Such haplotype markers fulfil the formal criteria of a synthetic association. GetSynth can be applied in a case-control setting as well as to public reference genotype data. Filter criteria, set size, function classes and the number of functional variants to be involved in a synthesis can be pre-specified by the user.