

ClusterJudge

Provides a way to judge the quality of clustering performed elsewhere on some entities. ClusterJudge is an R package and its judgement is based on some additional entity-attribute information. The software provides several functions, for instance to calculate the mutual information between each attribute of the entity.attribute pairs, to judge clustering using an entity.attribute table, or to convert the Saccharomyces Genome Database (SGD) Gene Id into the systematic name of the gene.
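
As a rough illustration of the mutual-information idea mentioned above, the following Python sketch scores how much an attribute tells us about a cluster assignment. It is not ClusterJudge's R API: the function name `mutual_information` and the toy labels are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def mutual_information(labels_a, labels_b):
    """Mutual information (in bits) between two equal-length label sequences."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        return 0.0
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log2(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy data: four entities, one cluster label and two attributes.
clusters = [0, 0, 1, 1]
matching_attribute = ["x", "x", "y", "y"]   # follows the clusters exactly
unrelated_attribute = ["x", "y", "x", "y"]  # independent of the clusters

mi_matching = mutual_information(clusters, matching_attribute)
mi_unrelated = mutual_information(clusters, unrelated_attribute)
```

An attribute that tracks the clustering closely yields a high score, while an attribute that is independent of it yields a score of zero.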


ClustEval

A clustering analysis platform to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.


AltAnalyze

An easy-to-use application for microarray, RNA-Seq and metabolomics analysis. For splicing sensitive platforms (RNA-Seq or Affymetrix Exon, Gene and Junction arrays), AltAnalyze will assess alternative exon (known and novel) expression along protein isoforms, domain composition and microRNA targeting. In addition to splicing-sensitive platforms, AltAnalyze provides comprehensive methods for the analysis of other data (RMA summarization, batch-effect removal, QC, statistics, annotation, clustering, network creation, lineage characterization, alternative exon visualization, gene-set enrichment and more).

BicAT / Biclustering Analysis Toolbox

Finds the hidden order-preserving submatrices in the random matrix. BicAT recovers the hidden order-preserving submatrices with a very high success rate. It offers a graphical user interface (GUI) for several existing biclustering and clustering algorithms. The tool provides a number of algorithms to find biclusters (or clusters) within expression data, as well as a number of postprocessing utilities useful for a further analysis of the results. Its main purpose is to help biologists with the analysis and exploration of the gene expression data, e.g. microarrays.


Babelomics

Supports the functional analysis of gene expression and genomic data. Babelomics offers the possibility to explore the effects of alterations in gene expression levels or changes in gene sequences within a functional context. It provides user-friendly access to a full range of methods that cover: (1) primary data analysis; (2) a variety of tests for different experimental designs; and (3) different enrichment and network analysis algorithms for the interpretation of the results of such tests in the proper functional context.


rqubic

Solves the biclustering problem in a more general form than existing algorithms by combining qualitative (or semi-quantitative) measures of gene expression data with a combinatorial optimization technique. The QUBIC algorithm can identify all statistically significant biclusters, including biclusters with the so-called ‘scaling patterns’, a problem considered to be rather challenging. Furthermore, the algorithm solves such general biclustering problems very efficiently: it can handle tens of thousands of genes under up to thousands of conditions in a few minutes of CPU time on a desktop computer. rqubic can be a useful tool in transcriptional regulation network prediction.


BicOverlapper

Visualizes complex gene expression analysis results produced by biclustering algorithms. BicOverlapper visualizes the most relevant aspects of the analysis, including expression data, profiling analysis results and functional annotation. It integrates several state-of-the-art numerical methods, such as differential expression analysis, gene set enrichment and biclustering. The tool gives an overall view of several expression aspects, from raw data to analysis results and functional annotations.

QUBIC / QUery BICluster

Implements a well-cited biclustering algorithm, QUBIC, for the interpretation of gene expression profile data. The unique features of QUBIC include: (i) biclustering is integrated with analysis functions (i.e. data discretization, query-based biclustering, bicluster expanding, bicluster comparison, heatmap visualization and co-expression network elucidation); (ii) the QUBIC source code is optimized and converted to C++, and thus has better memory control and is more efficient than the original QUBIC; (iii) on five large-scale datasets, QUBIC has the best running time among four popular tools. Biclustering algorithms help researchers identify co-expressed gene subsets in their gene expression datasets, and have become a useful approach for the interpretation of gene expression profile data.

BicPAMS / Biclustering based on PAttern Mining Software

Discovers exhaustive and flexible structures of biclusters, with parameterizable coherency and robustness to noise and missing values. BicPAMS combines state-of-the-art pattern-based biclustering algorithms and makes them available within usable interfaces. It provides the unprecedented possibility to parameterize the properties of the biclustering solutions. The tool is applicable to dense or sparse, symbolic or real-valued data, and is able to incorporate domain knowledge.

isa2 / Iterative Signature Algorithm

Finds modules in an input matrix. isa2 is a biclustering algorithm that iteratively refines sets of genes and conditions until they match the definition of a module. The algorithm was applied to real expression data from the yeast Saccharomyces cerevisiae. Applications could include the analysis of biological data on protein-protein interactions (PPIs) or cell growth assays, as well as other large-scale data, where a meaningful reduction of complexity is needed.

iBBiG / Iterative Binary Biclustering of Genesets

A package based on a bi-clustering algorithm to perform meta-GSA that addresses the shortcomings of ‘ranked list’ meta-GSA approaches. iBBiG scales well when applied to hundreds of datasets, is tolerant to noise characteristic of genomics data and when applied on simulated data, outperforms clustering and bi-clustering methods including hierarchical and k-means clustering, FABIA, COALESCE and Bimax. iBBiG is optimized for meta-analysis of large numbers of diverse genomics data that may have unmatched samples. It does not require prior knowledge of the number or size of clusters. When applied to simulated data, it outperforms commonly used clustering methods, discovers overlapping clusters of diverse sizes and is robust in the presence of noise. In summary, iBBiG provides a simple, robust, rapid and scalable method for meta-GSA.


clusterSeq

Detects co-expression patterns and allows clustering of high-throughput sequencing data. clusterSeq is composed of two methods to identify co-expression: (i) the first extends empirical Bayesian analysis and is useful for experimental designs incorporating replication; (ii) the second is based on comparisons between k-means clusterings of the expression of individual genes, which is applicable even without a known replicate structure and can be useful to detect co-expression in large numbers of samples.


TimesVector

Identifies gene clusters that exhibit distinctly similar or different gene expression patterns among the sample conditions being compared. TimesVector is a triclustering algorithm designed for clustering three-dimensional time series data to capture distinctively similar or different gene expression patterns between two or more sample conditions. This tool identifies clusters with distinctive expression patterns in three steps: (i) dimension reduction and clustering of time-condition concatenated vectors, (ii) post-processing clusters for detecting similar and distinct expression patterns and (iii) rescuing genes from unclassified clusters.


Cluster

Allows users to analyze pixelated data in digital imagery. Cluster could be used to facilitate the study of questions concerning the dominant inoculum sources and impact of cluster size on crop loss on a field scale. The application computes the percent area occupied by targeted pixels, identifies the centroids of targeted clusters, and calculates the relative compass angle of orientation for each cluster. Users can deselect anomalous clusters by specifying a size threshold value to exclude smaller targets from the analysis.
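
The measurements described above (percent area, cluster centroids, size threshold) can be sketched in a few lines of Python. This is only an illustration under assumptions, not the application's own code: the function `analyze_targets`, the 4-neighbour connectivity rule and the toy mask are hypothetical, the percent area is taken over all targeted pixels, and the compass-angle calculation is left out because the source does not define it.

```python
def analyze_targets(mask, min_size=1):
    """Return (percent_area, clusters) for a binary pixel mask.

    mask is a list of rows of 0/1 values; pixels equal to 1 are "targeted".
    Clusters are 4-connected groups of targeted pixels (an assumption).
    Clusters smaller than min_size are excluded from the returned list.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    targeted = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            targeted += 1
            if seen[r][c]:
                continue
            # Flood fill to collect every pixel of this cluster.
            seen[r][c] = True
            stack = [(r, c)]
            pixels = []
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(pixels) >= min_size:
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                clusters.append({"size": len(pixels), "centroid": (cy, cx)})
    total = rows * cols
    percent_area = 100.0 * targeted / total if total else 0.0
    return percent_area, clusters


# Hypothetical 4x4 image: one 2x2 cluster and one isolated pixel.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
percent_area, kept = analyze_targets(mask, min_size=2)
```

With `min_size=2`, the isolated pixel still counts toward the targeted area but is dropped from the list of clusters.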

CLIFF / Clustering via Iterative Feature Filtering

Allows users to cluster biological samples using gene expression microarray data. CLIFF combines a clustering process and a feature selection process in a bootstrap-like iterative way. The algorithm can capture the partition that characterizes the samples but is masked in the original high-dimensional feature space. It is generalizable to arbitrary multi-way clustering, either through recursive 2-way cuts or simultaneous use of several eigenvectors.

SCUDO / Signature-based ClUstering for DiagnOstic purposes

An online tool for the analysis of gene expression profiles for diagnostic and classification purposes. SCUDO is based on a method for the clustering of profiles based on a subject-specific, as opposed to disease-specific, signature. This approach relies on construction of a reference map of transcriptional signatures, from both healthy and affected subjects, derived from their respective mRNA or miRNA profiles. A diagnosis for a new individual can then be performed by determining the position of the individual's transcriptional signature on the map.

RSAT matrix-clustering

Supports dynamic browsing of motif trees with custom collapse/expansion of branches. RSAT matrix-clustering also provides multiple ways to inspect the results: motif forest with branch motifs at each level of each tree, similarity heatmap, searchable table of motifs and clusters, comparison between multiple collections with contingency tables summarizing relationships between clusters and collections, as well as cross-coverage between collections. This method relies on hierarchical clustering with a bottom-up partitioning.

EXPANDER / EXpression Analyzer and DisplayER

An integrated software platform for the analysis of microarray gene expression data. EXPANDER is designed to support all the stages of microarray data analysis, from raw data normalization to inference of transcriptional regulatory networks. The microarray analysis starts with importing the data into EXPANDER, followed by normalization and filtering. Then, clustering and network-based analyses are performed. The gene groups identified are tested for enrichment in function, co-regulation (using transcription factor and microRNA target predictions) or co-location.


Ckmeans.1d.dp

Uses dynamic programming to guarantee clustering optimality in O(n²k) time. Ckmeans.1d.dp can help users to quantize biological data for qualitative dynamic modeling. Its advantage over heuristic clustering algorithms in efficiency and accuracy is increasingly pronounced as the number of clusters k increases. The 1-D dynamic programming strategy can be extended to multidimensional spaces so that clustering can be done in polynomial time. Another function generates histograms that are adaptive to patterns in data.
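
To make the dynamic-programming idea concrete, here is a minimal Python sketch of optimal one-dimensional k-means with the quadratic-in-n recurrence (roughly O(n²k) work). It is not the Ckmeans.1d.dp package's code or interface; the function name `optimal_kmeans_1d` and the sample values are assumptions for illustration.

```python
def optimal_kmeans_1d(values, k):
    """Optimal 1-D k-means: returns (total within-cluster SSE, list of clusters)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums let us compute the SSE of any contiguous run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i, j):
        """Sum of squared deviations from the mean for xs[i:j], with j > i."""
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # cost[c][j]: minimum SSE when the first j sorted points form c clusters.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    # start[c][j]: index where the last of those c clusters begins.
    start = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + sse(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    start[c][j] = i

    # Walk the start table backwards to recover the clusters.
    clusters = []
    j = n
    for c in range(k, 0, -1):
        i = start[c][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return cost[k][n], clusters


# Hypothetical measurements to quantize into three levels.
total_sse, groups = optimal_kmeans_1d([1.0, 1.2, 0.9, 5.0, 5.1, 9.0], 3)
```

The sketch relies on the fact that, in one dimension, an optimal clustering always consists of contiguous runs of the sorted values, so only the cluster boundaries need to be searched.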

EDISA / Extended Dimension Iterative Signature Algorithm

Extracts biclusters from multiple time-series of gene expression profiles. EDISA is a probabilistic clustering approach for 3D gene-condition-time datasets and an extension of the ISA approach. This package is capable of mining gene modules in the three-dimensional datasets. It is capable of capturing such complex response patterns with manifold trajectories. It also allows for a flexible, integrative analysis resulting in informative and dense modules, which can be subject to further downstream functional analysis.


Thresher

Selects the optimal number of clusters and removes outlier objects. Thresher is a method that assists in selecting and filtering out noise, estimating the optimal number of clusters for the remaining good objects, and performing grouping based on the von Mises-Fisher mixture model. The software can detect whether the objects of interest are “good” or “bad” (outliers). It was applied to a wide variety of breast cancer data sets for estimating the number of subtypes.

RFcluE / Random Forest cluster Ensemble

Allows discovery of the underlying structure of genetic data. RFcluE is a cluster ensemble approach based on a Random Forest (RF) algorithm that addresses the problem of population structure analysis. The software is composed of two stages: (1) ensemble construction, in which an RF-based clustering method is applied to generate a set of clusterings for the same dataset; and (2) a consensus function, which integrates all the clusterings to produce a final data clustering.
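
The second stage, combining several clusterings of the same dataset into one, can be illustrated with a simple co-association consensus in Python. This is a generic technique swapped in for illustration and is not claimed to be RFcluE's consensus function; the function names, the 0.5 threshold and the toy partitions are assumptions.

```python
def coassociation(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same items")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[count / m for count in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Group items linked by co-association above threshold (connected components)."""
    co = coassociation(partitions)
    n = len(co)
    labels = [-1] * n
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical ensemble: three clusterings of five individuals.
ensemble = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
final_labels = consensus_clusters(ensemble)
```

Label values differ between the input clusterings, which is why the sketch compares pairs of items rather than the labels themselves.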