Gene co-expression detection software tools | Transcription data analysis
Ever since the publication of the first gene expression arrays, the correlated expression of genes involved in a related molecular process has been used to predict functional relations between gene pairs. Large amounts of microarray and RNA-seq transcript expression data, measured under a plethora of conditions, enable mining for concordantly expressed genes.
Predicts the function of genes and gene sets. GeneMANIA is used for probing gene function and revealing pairwise connections linking genes in yeast, fly, worm, human and other species. It allows users to construct networks from gene lists for custom organisms and network data. Its prediction approach provides a method for leveraging functionally informative associations to explore bacterial gene function.
A comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. WGCNA includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings.
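To illustrate the network construction step of a weighted correlation network, here is a minimal Python sketch, not WGCNA's own R code. It assumes the commonly described unsigned soft-thresholding rule, where the adjacency between two genes is the absolute Pearson correlation raised to a power `beta`. The function names, the dictionary input format, and the default `beta = 6` are illustrative assumptions.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency |cor|^beta for every gene pair.

    `expression` maps a gene name to its expression values across samples
    (hypothetical input format chosen for this sketch).
    """
    adjacency = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        adjacency[(g1, g2)] = abs(r) ** beta
    return adjacency


if __name__ == "__main__":
    toy = {
        "g1": [1, 2, 3, 4],
        "g2": [2, 4, 6, 8],
        "g3": [4, 3, 2, 1],
        "g4": [1, -1, -1, 1],
    }
    for pair, weight in soft_threshold_adjacency(toy).items():
        print(pair, round(weight, 3))
```

Raising the correlation to a power keeps every gene pair connected but pushes weak correlations toward zero, which is what distinguishes a weighted network from one built with a hard correlation cutoff.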
An R/C++ package to identify patterns and biological process activity in transcriptomic data. CoGAPS provides an integrated package for isolating gene expression driven by a biological process, enhancing inference of biological processes from transcriptomic data. It improves on other enrichment measurement methods by combining a Markov chain Monte Carlo (MCMC) matrix factorization algorithm (GAPS) with a threshold-independent statistic inferring activity on gene sets. CoGAPS infers biological activity by identifying overlapping, coregulated sets of genes and applying Z-score based statistics. It can be used to isolate transcription factor (TF) or biological process (BP) activity in datasets of thousands of genes and tens to thousands of samples. The software is provided as open source C++ code built on top of JAGS software with an R interface.
Scores the evolutionary conservation of gene neighborhoods using syntenic blocks. G-NEST combines genomic location, gene expression, and evolutionary sequence conservation data to score putative gene neighborhoods across all possible window sizes in terms of gene number or base pair length. This algorithm utilizes quantitative gene expression data, such as that derived from microarray or RNA-sequencing technologies. It also enables the identification of neighborhoods containing paralogous, divergent, or unannotated genes.
Provides a sequence-independent comparative framework for two or more genomic datasets, where the variables and operations represent biological reality. In a comparison of cell-cycle mRNA expression datasets, the approximately common subspace of the HO GSVD (higher-order generalized singular value decomposition) represents the expression oscillations that are similar among the datasets. Applications of HO GSVD in biotechnology include comparison of multiple genomic datasets, each corresponding to (i) the same experiment repeated multiple times using different experimental protocols; (ii) one of multiple types of genomic information, such as DNA copy number, DNA methylation and mRNA expression, collected from the same set of samples; (iii) one of multiple chromosomes of the same organism, to illustrate their relation; and (iv) one of multiple interacting organisms, e.g., in an ecosystem, to illuminate the exchange of biological information in these interactions.
Provides a multivariate differential coexpression test that accounts for the complete correlation structure between genes. GSNCA characterizes differences in coexpression networks, without requiring the network inference step. GSNCA should be a valuable addition to gene set analysis approaches because (i) it identifies differentially coexpressed pathways that are overlooked otherwise, (ii) eigenvectors are computed efficiently and (iii) it provides information about the importance of genes in pathways that may result in new biological hypotheses.
Uses the estimated pseudotime of the cells to find gene co-expression that involves a time delay. LEAP sorts cells according to their estimated pseudotime and then computes the maximum correlation over all possible time lags. In addition, LEAP can apply a time-series inspired lag-based correlation analysis to reveal linearly dependent genetic associations.
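The lag-based idea described for LEAP can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not LEAP's actual implementation: cells are ordered by pseudotime, one gene's profile is shifted against the other's by each candidate lag, and the lag with the largest absolute Pearson correlation is reported. The function names and the `max_lag` parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant window carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def max_lagged_correlation(pseudotime, gene_a, gene_b, max_lag):
    """Return (correlation, lag) with the largest |correlation|.

    gene_b is tested as a delayed copy of gene_a: at lag k, gene_a in cell t
    is compared with gene_b in cell t + k of the pseudotime-sorted order.
    """
    # Sort cells by their estimated pseudotime.
    order = sorted(range(len(pseudotime)), key=lambda i: pseudotime[i])
    a = [gene_a[i] for i in order]
    b = [gene_b[i] for i in order]

    best_r, best_lag = 0.0, 0
    for lag in range(max_lag + 1):
        n = len(a) - lag
        if n < 3:  # too few overlapping cells for a meaningful correlation
            break
        r = pearson(a[:n], b[lag:lag + n])
        if abs(r) > abs(best_r):
            best_r, best_lag = r, lag
    return best_r, best_lag
```

Called as `max_lagged_correlation(pseudotime, gene_a, gene_b, max_lag=3)`, the function returns the strongest correlation together with the lag at which it occurs. A lag greater than zero suggests that changes in `gene_b` follow changes in `gene_a` along pseudotime.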