Gene expression clustering software tools | Transcription data analysis
Microarray technology has been widely applied in biological and clinical studies to monitor the expression of thousands of genes simultaneously. Gene clustering analysis is useful for discovering groups of correlated genes that are potentially co-regulated or associated with the disease or conditions under investigation.
This Python package implements the clustering algorithm proposed by Alex Rodriguez and Alessandro Laio. It generates the initial rho and delta values for each observation, then uses these values to assign observations to clusters. Dcluster supports interactive clustering based on the decision graph.
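The rho and delta values mentioned above can be illustrated with a minimal sketch of density-peaks clustering. This is not Dcluster's own API: the function name `density_peaks`, the cutoff parameter `d_c`, and the rule of picking the `n_clusters` points with the largest rho × delta as centers are assumptions made for illustration (interactive tools typically let the user pick centers from the decision graph instead).

```python
import math

def density_peaks(points, d_c, n_clusters):
    """Toy density-peaks clustering; all names here are illustrative."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # rho: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest (ties broken by index), so
    # "denser" points are exactly those visited earlier.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest denser point; the densest point
    # gets its largest distance to any other point instead.
    delta = [0.0] * n
    nearest_denser = [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:pos], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_denser[i] = j

    # Assumed center rule: the n_clusters points with the largest
    # rho * delta. The densest point always has the largest product.
    centers = sorted(range(n), key=lambda i: -(rho[i] * delta[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c

    # Every other point inherits the label of its nearest denser
    # neighbour, which has already been labelled at this stage.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return rho, delta, labels
```

Plotting delta against rho for each observation gives the decision graph: cluster centers appear as points that are both dense and far from any denser point.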
Supports the functional analysis of gene expression and genomic data. Babelomics offers the possibility to explore the effects of alterations in gene expression levels or changes in gene sequences within a functional context. It provides user-friendly access to a full range of methods that cover: (1) primary data analysis; (2) a variety of tests for different experimental designs; and (3) different enrichment and network analysis algorithms for the interpretation of the results of such tests in the proper functional context.
A widely used program for clustering biological sequences to reduce sequence redundancy and improve the performance of other sequence analyses. In response to the rapid increase in the amount of sequencing data produced by the next-generation sequencing technologies, a new CD-HIT program accelerated with a novel parallelization strategy and some other techniques has been developed to allow efficient clustering of such datasets.
A clustering analysis platform to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.
Analyses microarray gene expression data, particularly large data sets. QUBIC is a biclustering algorithm that can identify statistically significant biclusters. It can work on gene expression data for discovering complex relationships among genes and conditions. Furthermore, it can be a useful tool in transcriptional regulation network prediction.
Allows the analysis of gene expression data. EXPANDER gives the user access to a range of microarray analysis algorithms covering the complete analysis process: (1) preprocessing, (2) visualizing, (3) clustering, (4) biclustering and (5) performing downstream analysis of clusters and biclusters, such as functional enrichment and promoter analysis. The software incorporates several conventional gene expression analysis algorithms.
An easy-to-use application for microarray, RNA-Seq and metabolomics analysis. For splicing sensitive platforms (RNA-Seq or Affymetrix Exon, Gene and Junction arrays), AltAnalyze will assess alternative exon (known and novel) expression along protein isoforms, domain composition and microRNA targeting. In addition to splicing-sensitive platforms, AltAnalyze provides comprehensive methods for the analysis of other data (RMA summarization, batch-effect removal, QC, statistics, annotation, clustering, network creation, lineage characterization, alternative exon visualization, gene-set enrichment and more).
Discovers approximate gene clusters. Gecko enables ranking and filtering of the gene clusters. It performs a statistical evaluation of all computed clusters, based on the null model of random gene order. This tool does not require gene clusters to be collinear or monophyletic. It assigns P-values to all gene clusters based on the number of genomes in which a cluster is detected, the number of genes and the degree of conservation.
Allows visualization of complex gene expression analysis results coming from biclustering algorithms. BicOverlapper visualizes the most relevant aspects of the analysis, including expression data, profiling analysis results and functional annotation. It integrates several state-of-the-art numerical methods, such as differential expression analysis, gene set enrichment and biclustering. The tool provides an overall view of several expression aspects, from raw data to analysis results and functional annotations.
Permits exploratory analysis of multiple data sources. GFAsparse is able to detect the correct number of biclusters exactly in 90.4% of the runs. It can be used in a multi-view drug sensitivity prediction task. The tool shows good prediction performance, and infers meaningful structure present in subsets of the data. It gives condensed and interpretable information with respect to the data collection.
Provides a way to judge the quality of clustering performed elsewhere on some entities. ClusterJudge is an R package and its judgement is based on some additional entity-attribute information. The software provides several functions, for instance to calculate the mutual information between each attribute of the entity.attribute pairs, to judge clustering using an entity.attribute table, or to convert the Saccharomyces Genome Database (SGD) Gene Id into the systematic name of the gene.
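To make the idea of judging a clustering by mutual information concrete, here is a minimal Python sketch of the general quantity. It is not ClusterJudge's interface (ClusterJudge is an R package); the function name `mutual_information` and the input layout, one cluster label and one attribute value per entity, are assumptions for illustration only.

```python
import math
from collections import Counter

def mutual_information(clusters, attributes):
    """Mutual information, in bits, between two equally long label lists.

    clusters[i] is the cluster of entity i and attributes[i] is one
    attribute value of the same entity (illustrative layout).
    """
    if len(clusters) != len(attributes):
        raise ValueError("both sequences must have the same length")
    n = len(clusters)
    count_c = Counter(clusters)
    count_a = Counter(attributes)
    count_ca = Counter(zip(clusters, attributes))

    mi = 0.0
    for (c, a), joint in count_ca.items():
        # p(c, a) * log2( p(c, a) / (p(c) * p(a)) ), written with counts
        mi += (joint / n) * math.log2(joint * n / (count_c[c] * count_a[a]))
    return mi
```

A clustering that perfectly separates two equally frequent attribute values scores 1 bit, while a clustering that is independent of the attribute scores 0, so higher values suggest the clusters carry more information about the attribute.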
A program for clustering and differential expression analysis of expression data generated by next-generation sequencing assays, such as RNA-seq, CAGE and others. DGEclust takes as input a table of count data and it estimates the number and parameters of the clusters supported by the data. The estimated cluster configurations can be post-processed in order to identify differentially expressed genes and for generating gene- and sample-wise dendrograms and heatmaps.
Allows automatic extraction of co-expressed gene clusters from gene expression data. Clust assists users in production of co-expressed clusters of genes that satisfy the biological expectations of a co-expressed gene cluster. This tool utilizes a number of base clustering methods (e.g. k-means clustering, hierarchical clustering, and self-organizing maps) to produce initial sets of clusters.
Enables discovery of the most biologically meaningful bi-clusters in gene expression datasets. UniBic is an application that takes an essential step towards the identification of the most general and meaningful bi-clusters hidden in a noisy and complex data matrix. It assists users to locate a seed of each to-be-identified bi-cluster hidden in a background matrix by finding a longest common subsequence between two rows of the index matrix derived from the input matrix.
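The seed-finding step described above rests on the longest common subsequence (LCS) of two rows. The sketch below is not UniBic's code: it assumes that each row of the index matrix lists column indices in order of increasing expression value, and the helper names `column_order` and `longest_common_subsequence` are hypothetical.

```python
def column_order(row):
    """Column indices sorted by increasing value (ties broken by index)."""
    return sorted(range(len(row)), key=lambda j: (row[j], j))

def longest_common_subsequence(a, b):
    """Return one longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover the subsequence itself.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]

# Two made-up expression rows over the same five conditions.
gene_a = [1.0, 5.0, 2.0, 9.0, 3.0]
gene_b = [0.5, 0.1, 1.5, 8.0, 2.5]
shared = longest_common_subsequence(column_order(gene_a), column_order(gene_b))
```

Under these assumptions, `shared` holds the conditions across which both genes increase in the same order, which is the kind of order-preserving pattern a seed is meant to capture.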
Allows users to upload their own data and easily create Principal Component Analysis (PCA) plots and heatmaps. Data can be uploaded as a file or by copy-pasting it into the text box. The data format is shown under the "Help" tab. Several R packages are used internally, including shiny, ggplot2, pheatmap, RColorBrewer, FactoMineR, pcaMethods, shinyBS and others.
Provides several unique features in a modular and flexible system for the analysis of microarray data. The design and modular conception of CARMAweb allows the use of the different analysis modules either individually or combined into an analytical pipeline. CARMAweb performs (i) data preprocessing (background correction, quality control and normalization), (ii) detection of differentially expressed genes, (iii) cluster analysis, (iv) dimension reduction and (v) visualization, classification, and Gene Ontology-term analysis.
Solves the biclustering problem in a more general form than existing algorithms by combining qualitative (or semi-quantitative) measures of gene expression data with a combinatorial optimization technique. The QUBIC algorithm can identify all statistically significant biclusters, including biclusters with the so-called ‘scaling patterns’, a problem considered to be rather challenging. Furthermore, the algorithm solves such general biclustering problems very efficiently: it can handle tens of thousands of genes under up to thousands of conditions in a few minutes of CPU time on a desktop computer. rqubic can be a useful tool in transcriptional regulation network prediction.
Offers tools for exploring microarray data and an interface for deploying specific methods. Expression Profiler provides a set of tools allowing users to try out several different approaches, compare the results and select the methods that make the most sense. It contains hierarchical and partitioning-based clustering methods. This tool is useful for discovering the modular structure of expression data matrices.
An online tool for the analysis of gene expression profiles for diagnostic and classification purposes. SCUDO is based on a method for the clustering of profiles based on a subject-specific, as opposed to disease-specific, signature. This approach relies on construction of a reference map of transcriptional signatures, from both healthy and affected subjects, derived from their respective mRNA or miRNA profiles. A diagnosis for a new individual can then be performed by determining the position of the individual's transcriptional signature on the map.
A software tool implementing a novel heuristic algorithm that efficiently solves the weighted bicluster editing problem. Based on the weighted bicluster editing model, it performs biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. Bi-Force is implemented in Java and integrated into the open source software package BiCluE.