Multivariate analysis software tools | Genome-wide association study
Joint association analysis of multiple traits in a genome-wide association study (GWAS) offers several advantages over analyzing each trait in a separate GWAS. Several methods have been developed to perform multiple-trait analysis.
A free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.
A program for the analysis of single SNP association in genome-wide studies. Its features include 1) support for binary (case-control) phenotypes and for single and multiple quantitative phenotypes, 2) Bayesian and frequentist tests, 3) the ability to condition on an arbitrary set of covariates and/or SNPs and 4) several methods for dealing with imputed SNPs.
Provides a suite of statistical methods for genetic association analysis that includes genomic annotations. DAP-G performs multi-single nucleotide polymorphism (SNP) genetic association analysis, quantitative trait loci (QTL) discovery and enrichment analysis. The software is built on a statistical model and a key algorithm named deterministic approximation of posteriors (DAP). It is suited to both genome-wide association studies (GWAS) and genome-wide molecular QTL mapping studies.
Infers past demography and recombination. ABLE is a simulation-based composite likelihood method that uses the blockwise site frequency spectrum. It is designed for a wide variety of data, from unphased diploid genomes to genome-wide multi-locus data. The tool was tested on whole genomes from the two species of orangutan (Pongo pygmaeus and P. abelii).
Implements a hierarchical multiple testing procedure. In the context of eQTL studies, TreeQTL provides methods that control the false discovery rate or family-wise error rate for the discovery of eSNPs or eGenes, as well as the expected average proportion of false discoveries for eAssociations involving the identified eSNPs or eGenes. In the context of multi-trait association studies, TreeQTL can be used to control the error rate for the discovery of variants associated with any phenotype and the average false discovery rate of phenotypes influenced by such variants.
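To make "control of the false discovery rate" concrete, the sketch below shows the standard Benjamini-Hochberg step-up procedure on a flat list of p-values. It is not TreeQTL's hierarchical procedure, and the function name `benjamini_hochberg` and the default `alpha` are illustrative assumptions; it only illustrates the kind of error-rate control that such a hierarchy applies at a single level.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: True if the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up procedure (illustrative sketch,
    not TreeQTL's hierarchical method).
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, e.g. one per candidate SNP.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Because the procedure is step-up, a p-value that fails its own threshold is still rejected whenever a larger p-value further down the ranking passes its threshold.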
Enables structural equation modeling (SEM) with continuous data. lavaan is an R package providing a collection of tools that can be used to explore, estimate, and understand a wide family of latent variable models, including factor analysis, structural equation, longitudinal, multilevel, latent class, item response, and missing data models. The software can be used to estimate a range of multivariate statistical models, such as path analysis, confirmatory factor analysis, structural equation modeling and growth curve models.
A program for association analysis that searches for genetic variations influencing a group of correlated traits. This approach represents the dependency structure among the quantitative traits explicitly as a network, and leverages this trait network to encode structured regularizations in a multivariate regression model over the genotypes and traits, so that the genetic markers that jointly influence subgroups of highly correlated traits can be detected with high sensitivity and specificity.
A method that allows gene-based testing of multivariate phenotypes in unrelated individuals. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype model. MGAS is freely available in KGG v3.0.
Colocalizes genetic risk variants through the analysis of summary statistics, specifically Z-scores. LLR is a statistical approach to prioritizing risk variants using the pleiotropy across multiple related studies. The software can efficiently handle the analysis of large-scale genomic data. Its advantages were demonstrated through simulation studies and a joint analysis of 18 genome-wide association study (GWAS) data sets. It is useful for the integrative analysis of multiple GWAS data sets.
Tests the association between complex objects. The theoretical properties of GSU were studied in a general setting, and the work then focused on applying the test to sequencing association studies. Based on the theoretical analysis, a Laplacian kernel-based similarity was proposed for GSU to boost power and enhance robustness. In simulations, GSU showed advantages over existing methods in terms of power and robustness. A whole genome sequencing (WGS) scan was also performed on Alzheimer’s Disease Neuroimaging Initiative (ADNI) data, identifying three genes, APOE, APOC1 and TOMM40, associated with an imaging phenotype.
Permits users to assign statistical significance in genome-wide association studies (GWAS). hierGWAS is based on a multivariable approach that includes all the single nucleotide polymorphisms (SNPs) and controls the family-wise error rate. The method assigns p-values in a hierarchical manner: first for chromosomes, and then in a top-down fashion from larger to smaller groups of SNPs.
Provides statistical methods for testing multi-trait variant-set association. MSKAT is an R package that uses statistical models for genome-wide association study (GWAS) summary statistics to motivate novel multi-trait single nucleotide polymorphism (SNP)-set association tests, including a variance component test, a burden test and an adaptive version of these tests, as well as numerical algorithms to compute their analytical p-values.
Assists users in incorporating expert feedback about the impact of genomic measurements. This program proposes a targeted sequential expert knowledge elicitation approach, in which feedback is collected on the basis of Bayesian experimental design. This methodology can assist researchers in clinical applications, for instance, by determining drug sensitivity of ex vivo blood cancer cells.
Enables the simultaneous analysis of various phenotypes in genome-wide association studies (GWAS). CLC offers an association testing method that groups individual statistics into clusters of positively correlated statistics. This application depends only on summary statistics. It allows users to evaluate the association between several phenotypes and a genetic variant of interest.
Allows users to quantify relationships between two complex traits. UNITY is a statistical method, based on a merging of genetic correlation and colocalization, that is able to assess the proportion of causal variants shared between a pair of complex traits. Users can concomitantly model a set of local parameters, including the effect of a single nucleotide polymorphism (SNP) on each of the traits, and global parameters such as the trait heritability.
Assists users in testing genotype-phenotype relations. TATES is a multivariate method for combining p-values across different, correlated phenotypes. This method allows researchers to test their genetic associations using standard genome-wide association study (GWAS) software. It deals with the high phenotypic dimensionality by combining the univariate analyses while correcting for the relatedness between phenomic dimensions.
Enables joint analysis of multiple phenotypes in genetic association studies. AFC uses the optimal number of p-values, which is determined by the data, to test the association. It includes features for testing the association between multiple phenotypes and the genetic variant. Furthermore, it can be used in genome-wide association studies (GWAS) on complex diseases with multiple phenotypes such as chronic obstructive pulmonary disease (COPD).
A multi-SNP association test that uses a variable p-value threshold algorithm to select the SNPs with the strongest association signal at a particular p-value threshold. OPTPDT is helpful for gene-based or pathway association analysis. The method is well suited to the secondary analysis of existing GWAS datasets, where it may identify a set of SNPs with joint effects on the disease.
Allows association studies between multiple traits and a single nucleotide polymorphism (SNP). GATE performs principal component analysis (PCA) on all traits and calculates the p-values of the association analysis. It uses the Fisher combination method to combine p-values within and between groups. The tool uses all obtained principal components (PCs) rather than a selected subset. It is directly applicable to multiple-trait association studies with covariates.
A method that relaxes the unrealistic independence assumption of the classical Fisher combination test and is computationally efficient. To demonstrate applications of the proposed method, a statistical analysis was conducted on the database of the Study of Addiction: Genetics and Environment (SAGE). The proposed method outperforms existing methods in most settings and is applicable to GWAS on complex diseases with multiple phenotypes, such as substance abuse disorders.
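For reference, the classical Fisher combination test that this entry builds on can be sketched in a few lines. The sketch below is the textbook version, which assumes independent p-values; it is not the relaxed method described above, and the function name `fisher_combined_pvalue` is an illustrative assumption. The statistic is -2 times the sum of the log p-values, which under the null hypothesis follows a chi-square distribution with 2k degrees of freedom for k p-values.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Classical Fisher combination of independent p-values.

    Returns (statistic, combined_pvalue). Textbook version that assumes
    independence; an illustrative sketch, not the method of the entry above.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")

    # Fisher statistic: X = -2 * sum(ln p_i), chi-square with 2k df under H0.
    statistic = -2.0 * sum(math.log(p) for p in pvalues)

    # For an even number of degrees of freedom (2k), the chi-square survival
    # function has a closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical p-values for one SNP tested against three phenotypes.
    stat, p = fisher_combined_pvalue([0.04, 0.10, 0.20])
    print(stat, p)
```

When the phenotypes are correlated, the independence assumption fails and this combined p-value is no longer calibrated, which is the limitation that methods relaxing the assumption are meant to address.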