

Seurat

Enables the study of spatial patterning of gene expression at the single-cell level. Seurat is an R package for quality control (QC), analysis, and exploration of single-cell RNA-seq data. The software includes three computational methods: (1) unsupervised clustering and discovery of cell types and states, (2) spatial reconstruction of single-cell data, and (3) integrated analysis of single-cell RNA-seq data across conditions, technologies, and species. It can also localize rare subpopulations and map both spatially restricted and scattered groups.

RaceID / Rare cell type IDentification

An algorithm for the identification of rare and abundant cell types from single-cell transcriptome data. RaceID is based on transcript counts obtained with unique molecular identifiers. We demonstrate that this algorithm can resolve cell types represented by only a single cell in a population of randomly sampled organoid cells. We use this algorithm to identify Reg4 as a novel marker for enteroendocrine cells, a rare population of hormone-producing intestinal cells.

RCA / Reference Component Analysis

Projects single-cell transcriptomes into a space defined by variability in a reference data set. RCA is an R package for robust clustering analysis of single-cell RNA sequencing (scRNA-seq) data. This method outperforms existing algorithms for clustering single-cell transcriptomes and generates tight cell clusters consisting almost entirely of cells of the same type. It also identifies multiple cell types in colorectal cancer (CRC) tumors and normal mucosa, despite the strong batch effects in clinical samples.

SPADE / Spanning tree Progression of Density normalized Events

Facilitates the analysis of cellular heterogeneity, the identification of cell types, and comparison of functional markers in response to perturbations, based on a versatile method. SPADE helps to organize high-dimensional cytometry data in an unsupervised manner, and to investigate natural and pathogenic cellular heterogeneity for biological insight. The SPADE algorithm consists of four components: (i) density-dependent downsampling, (ii) clustering, (iii) linking clusters with a minimum spanning tree, and (iv) upsampling to restore all cells in the final result. This modularized process allows more efficient sub-algorithms to replace the current components. In this sense, SPADE can be viewed as a framework for cytometric data analysis and visualization that has the capacity to be evolved and adapted.
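The downsampling and upsampling components described above can be illustrated with a minimal sketch. It is not SPADE's implementation: the density estimate (neighbor count within a fixed radius), the thinning rule, and all function and variable names are assumptions made for illustration, and the clustering and minimum spanning tree steps are replaced by a simple stand-in label.

```python
import math
import random

def local_density(points, radius):
    """Number of points within `radius` of each point (the point itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]

def density_dependent_downsample(points, radius, target_density, seed=0):
    """Keep each point with probability min(1, target_density / local density).

    Points in sparse regions are always kept; dense regions are thinned,
    so rare populations are not swamped by abundant ones.
    """
    rng = random.Random(seed)
    kept = []
    for idx, density in enumerate(local_density(points, radius)):
        if rng.random() < min(1.0, target_density / density):
            kept.append(idx)
    return kept

def upsample(points, kept, kept_labels):
    """Give every original point the label of its nearest retained point."""
    labels = []
    for p in points:
        nearest = min(range(len(kept)), key=lambda k: math.dist(p, points[kept[k]]))
        labels.append(kept_labels[nearest])
    return labels

# Toy data (hypothetical): 50 abundant cells near the origin, 3 rare cells far away.
data_rng = random.Random(1)
abundant = [(data_rng.uniform(-0.3, 0.3), data_rng.uniform(-0.3, 0.3)) for _ in range(50)]
rare = [(10.0, 10.0), (10.2, 10.1), (9.9, 10.3)]
cells = abundant + rare

kept = density_dependent_downsample(cells, radius=1.0, target_density=3)
# Stand-in for the clustering step: label retained cells with a simple threshold.
kept_labels = ["rare" if cells[i][0] > 5 else "abundant" for i in kept]
all_labels = upsample(cells, kept, kept_labels)
print(len(cells), "cells reduced to", len(kept), "before clustering")
```

In this toy setting the three rare cells have a local density equal to the target, so they always survive downsampling, while the abundant population is thinned.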


AltAnalyze

An easy-to-use application for microarray, RNA-Seq and metabolomics analysis. For splicing sensitive platforms (RNA-Seq or Affymetrix Exon, Gene and Junction arrays), AltAnalyze will assess alternative exon (known and novel) expression along protein isoforms, domain composition and microRNA targeting. In addition to splicing-sensitive platforms, AltAnalyze provides comprehensive methods for the analysis of other data (RMA summarization, batch-effect removal, QC, statistics, annotation, clustering, network creation, lineage characterization, alternative exon visualization, gene-set enrichment and more).


Linnorm

Provides a linear model and normality based transformation method. Linnorm is an R package for the analysis of RNA-seq, scRNA-seq, ChIP-seq count data or any large-scale count data. It transforms such datasets for parametric tests. Several pipelines are implemented: (i) library size/batch effect normalization, (ii) cell sub-population analysis and visualization, (iii) differential expression analysis or differential peak detection, (iv) highly variable gene discovery and visualization, (v) gene correlation network analysis and visualization, (vi) stable gene selection for scRNA-seq data and (vii) data imputation.


dropClust

Preserves distinct structural properties of the data. dropClust uses Locality Sensitive Hashing (LSH), a logarithmic-time algorithm, to determine the approximate neighborhood of individual transcriptomes. It employs an exponential decay function to select a higher number of expression profiles from relatively small clusters. The tool detects principal components (PCs) with a multi-modal distribution of the projected transcriptomes by using mixtures of Gaussians.
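To show what locality sensitive hashing does in general, here is a minimal sketch using random-hyperplane signatures, a standard LSH family. Which LSH variant dropClust actually uses is not stated here, so this is a swapped-in illustration; the function names and toy vectors are hypothetical.

```python
import random
from collections import defaultdict

def make_hyperplanes(n_planes, dim, seed=0):
    """Draw random hyperplane normals; each one contributes one bit of the signature."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        int(sum(v * w for v, w in zip(vector, plane)) >= 0.0)
        for plane in hyperplanes
    )

def build_buckets(vectors, hyperplanes):
    """Group vector indices by signature; bucket mates are candidate neighbors."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    return buckets

# Toy expression profiles (hypothetical): the second is a scaled copy of the
# first, the third points in the opposite direction.
profiles = [[1.0, 2.0, 0.5], [2.0, 4.0, 1.0], [-1.0, -2.0, -0.5]]
planes = make_hyperplanes(n_planes=8, dim=3)
buckets = build_buckets(profiles, planes)
candidates = buckets[signature(profiles[0], planes)]
print(candidates)
```

Profiles that point in similar directions tend to share a bucket, so a neighbor search only has to compare a query against its bucket mates rather than against every transcriptome.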

TSCAN / Tools for Single Cell ANalysis

A software tool developed to better support in silico pseudo-time reconstruction in single-cell RNA-seq analysis. TSCAN uses a cluster-based minimum spanning tree (MST) approach to order cells. Cells are first grouped into clusters and an MST is then constructed to connect cluster centers. Pseudo-time is obtained by projecting each cell onto the tree, and the ordered sequence of cells can be used to study dynamic changes of gene expression along the pseudo-time. Clustering cells before MST construction reduces the complexity of the tree space. This often leads to improved cell ordering. It also allows users to conveniently adjust the ordering based on prior knowledge. TSCAN has a graphical user interface (GUI) to support data visualization and user interaction. Furthermore, quantitative measures are developed to objectively evaluate and compare different pseudo-time reconstruction methods.
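The cluster-based MST ordering described above can be sketched as follows. This is an illustration under simplifying assumptions, not TSCAN's code: cluster centers are taken as given, the tree is assumed to be unbranched, the starting cluster is chosen by the user, and all names are hypothetical.

```python
import math

def minimum_spanning_tree(centers):
    """Prim's algorithm on the complete graph of cluster centers; returns (i, j) edges."""
    in_tree = {0}
    edges = []
    while len(in_tree) < len(centers):
        _, i, j = min(
            (math.dist(centers[i], centers[j]), i, j)
            for i in in_tree
            for j in range(len(centers))
            if j not in in_tree
        )
        edges.append((i, j))
        in_tree.add(j)
    return edges

def path_from(start, edges, n_clusters):
    """Walk an unbranched tree from `start` and return the cluster order."""
    neighbors = {k: [] for k in range(n_clusters)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    order, prev, cur = [start], None, start
    while True:
        nxt = [k for k in neighbors[cur] if k != prev]
        if not nxt:
            return order
        prev, cur = cur, nxt[0]
        order.append(cur)

def pseudotime(cell, centers, order):
    """Project a cell onto the closest tree segment; return the distance along the path."""
    best = None
    offset = 0.0
    for a, b in zip(order, order[1:]):
        p, q = centers[a], centers[b]
        seg = [qc - pc for pc, qc in zip(p, q)]
        seg_len = math.sqrt(sum(s * s for s in seg))
        t = sum((c - pc) * s for c, pc, s in zip(cell, p, seg)) / (seg_len ** 2)
        t = max(0.0, min(1.0, t))  # clamp the projection to the segment
        projection = [pc + t * s for pc, s in zip(p, seg)]
        gap = math.dist(cell, projection)
        if best is None or gap < best[0]:
            best = (gap, offset + t * seg_len)
        offset += seg_len
    return best[1]

# Toy cluster centers (hypothetical), listed out of order on purpose.
centers = [(0.0, 0.0), (4.0, 0.0), (2.0, 0.0)]
edges = minimum_spanning_tree(centers)
order = path_from(0, edges, len(centers))
times = [pseudotime(c, centers, order) for c in [(1.0, 0.5), (3.0, -0.2)]]
print(order, times)
```

Because the tree is built over a handful of cluster centers rather than over every cell, the space of possible trees stays small, which is the complexity reduction the description refers to.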

reCAT / recover Cycle Along Time

Reconstructs cell cycle time-series using single-cell transcriptome data. reCAT is a computational method that consists of four steps: (i) data processing, including quality control, normalization, and clustering of single cells, (ii) recovery of the cluster order by finding a traveling salesman cycle, (iii) discrimination among cycle stages with two scoring methods, Bayes-scores and mean-scores, and (iv) estimation of the underlying gene expression levels of the single-cell time-series with a hidden Markov model (HMM) and a Kalman smoother.
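Step (ii) can be illustrated with a small sketch that finds the shortest closed tour through cluster centers by exhaustive search. How reCAT itself solves the traveling salesman problem is not described here, so the brute-force search, the function names, and the toy coordinates are assumptions; exhaustive search is only practical for a handful of clusters.

```python
import math
from itertools import permutations

def cycle_length(order, points):
    """Total length of the closed tour that visits `points` in `order`."""
    return sum(
        math.dist(points[order[k]], points[order[(k + 1) % len(order)]])
        for k in range(len(order))
    )

def shortest_cycle(points):
    """Exhaustive search: fix cluster 0 as the start and try every order of the rest."""
    rest = range(1, len(points))
    best = min(
        ((0,) + perm for perm in permutations(rest)),
        key=lambda order: cycle_length(order, points),
    )
    return list(best)

# Toy cluster centers (hypothetical) lying on a circle, listed out of order.
cluster_centers = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
cycle = shortest_cycle(cluster_centers)
print(cycle, cycle_length(cycle, cluster_centers))
```

The tour fixes an order but not a direction or a starting phase, which is why later steps are needed to assign cycle stages.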

Clustering on Transcript Compatibility Counts

Offers a universal, efficient and accurate solution for extracting information from single-cell RNA-seq experiments. In the same way that single-cell analysis can be viewed as the ultimate resolution for transcriptomics, transcript-compatibility counts are the most direct way to “count” reads. Our method departs from standard analysis pipelines, comparing and clustering cells based not on their transcript or gene quantifications but on their transcript-compatibility read counts.
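As a rough sketch of the idea, assume each read has already been assigned the set of transcripts it is compatible with. A cell's profile is then the number of reads per distinct set, and cells are compared on those profiles directly. The function names, the L1 distance, and the toy reads are illustrative assumptions rather than the authors' implementation, and each cell is assumed to have at least one read.

```python
from collections import Counter

def transcript_compatibility_counts(read_compat_sets):
    """Count reads per equivalence class, i.e. per set of compatible transcripts."""
    return Counter(frozenset(s) for s in read_compat_sets)

def tcc_distance(counts_a, counts_b):
    """L1 distance between two normalized TCC profiles (an illustrative metric)."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    classes = set(counts_a) | set(counts_b)
    # Counter returns 0 for classes that a cell never observed.
    return sum(abs(counts_a[c] / total_a - counts_b[c] / total_b) for c in classes)

# Toy reads (hypothetical): each read lists the transcripts it is compatible with.
cell_1 = [{"t1"}, {"t1", "t2"}, {"t1", "t2"}, {"t3"}]
cell_2 = [{"t1"}, {"t1", "t2"}, {"t3"}, {"t3"}]
tcc_1 = transcript_compatibility_counts(cell_1)
tcc_2 = transcript_compatibility_counts(cell_2)
print(tcc_distance(tcc_1, tcc_2))
```

No transcript or gene abundance is estimated at any point; an ambiguous read simply counts toward the class of all transcripts it could have come from.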

ESAT / End Sequence Analysis ToolKit

A toolkit designed for the analysis of short reads obtained from end-sequence RNA-seq. ESAT addresses mis-annotated or sample-specific transcript boundaries by providing a search step in which it identifies possible unannotated ends de novo. It provides robust handling of multi-mapped reads, which is critical in 3’ DGE analysis. ESAT provides a module specifically designed for alternative start or 3’ UTR (untranslated region) differential isoform expression. It also includes a set of features specifically designed for the analysis of single-cell RNA-seq data.

Eclair / Ensemble Cell Lineage Analysis with Improved Robustness

A computational method for the statistical inference of cell lineage relationships from single-cell gene expression data. ECLAIR uses an ensemble approach to improve the robustness of lineage predictions, and provides a quantitative estimate of the uncertainty of lineage branchings. We show that the application of ECLAIR to published datasets successfully reconstructs known lineage relationships and significantly improves the robustness of predictions. ECLAIR can be used for robust lineage reconstruction with a quantitative estimate of prediction accuracy.


Granatum

Makes analysis more broadly accessible to researchers. Granatum is a web browser based scRNA-seq analysis pipeline that walks users through the steps of scRNA-seq analysis. It has a comprehensive list of modules, including plate merging and batch effect removal, outlier sample removal, gene filtering, gene expression normalization, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein network interaction visualization, and pseudo-time cell series construction.


Dr.seq

Provides quality control (QC) and analysis components for parallel single-cell transcriptome and epigenome data. Dr.seq is a QC and analysis pipeline that produces both multifaceted QC reports and cell clustering results. Parallel single-cell transcriptome data generated by different technologies can be converted to the standard input format with built-in functions. The software can also report quality measurements on four aspects and generate detailed analysis results for scATAC-seq and Drop-ChIP datasets.

SCDIFF / Cell Differentiation Analysis Using Time-series Single Cell RNA-seq Data

Reconstructs dynamic regulatory networks from single-cell time-series data. SCDIFF models the cell differentiation process using time-series single-cell RNA-seq data and is suited to predicting the transcription factors (TFs) that regulate that process. It uses static information about TF targets, which improves both the learning of a branching model and the identification of the TFs that regulate different stages of the process.


densityCut

A density-based clustering algorithm that is both time- and space-efficient and proceeds as follows: densityCut first roughly estimates the densities of data points from a K-nearest neighbour graph and then refines the densities via a random walk. A cluster consists of points falling into the basin of attraction of an estimated mode of the underlying density function. A post-processing step merges clusters and generates a hierarchical cluster tree. The number of clusters is selected from the most stable clustering in the hierarchical cluster tree. densityCut effectively clustered irregularly shaped synthetic benchmark datasets. We have successfully used densityCut to cluster variant allele frequencies of somatic mutations, single-cell gene expression data, and single-cell CyTOF data. densityCut is based on density estimation on graphs. It could be considered a variation of the spectral clustering algorithms but is much more time- and space-efficient. Moreover, it automatically selects the number of clusters and works for datasets with a large number of clusters. In summary, densityCut does not make assumptions about the shape, size, or number of clusters, and can be broadly applicable for exploratory data analysis.
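The first two steps, a rough K-nearest neighbour density estimate followed by a random-walk refinement, can be sketched as below. The exact formulas densityCut uses are not given here, so the inverse mean neighbour distance, the restart weight `alpha`, and every name in the code are assumptions chosen for illustration; the points are assumed to be distinct.

```python
import math

def knn_graph(points, k):
    """For each point, the k nearest other points as (distance, index) pairs."""
    graph = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph.append(ranked[:k])
    return graph

def initial_density(graph):
    """Rough density: inverse of the mean distance to the k nearest neighbours."""
    return [len(nbrs) / sum(d for d, _ in nbrs) for nbrs in graph]

def refine_density(graph, density, alpha=0.85, n_iter=20):
    """Random walk with restart on the directed kNN graph.

    Each point passes its current density in equal shares to its neighbours;
    the result is mixed with the initial estimate, which acts as the restart term.
    """
    current = list(density)
    for _ in range(n_iter):
        incoming = [0.0] * len(current)
        for i, nbrs in enumerate(graph):
            share = current[i] / len(nbrs)
            for _, j in nbrs:
                incoming[j] += share
        current = [alpha * inc + (1 - alpha) * d0 for inc, d0 in zip(incoming, density)]
    return current

# Toy data (hypothetical): four tightly packed points and one distant outlier.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
graph = knn_graph(points, k=2)
rough = initial_density(graph)
refined = refine_density(graph, rough)
print(refined)
```

In this toy example no point lists the outlier among its nearest neighbours, so the outlier receives no density from the walk and its refined value stays well below that of the packed points.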


pcaReduce

An unsupervised hierarchical clustering approach for the identification of putative cell sub-populations from single-cell transcriptomics profiles. Clustering occurs in a linearly transformed subspace obtained from principal component directions and, at each level of our hierarchical clustering structure, the similarity between clusters is measured in subspaces of decreasing dimensionality by discarding principal directions as the number of clusters decreases. Using two real single cell datasets, we compared our approach to other commonly used statistical techniques, such as K-means and hierarchical clustering. We found that pcaReduce was able to give more consistent clustering structures when compared to broad and detailed cell type labels.

SAUCIE / Sparse Autoencoder for Unsupervised Clustering, Imputation, and Embedding

Offers a method for handling and extracting structure from single-cell RNA-sequencing and CyTOF data. SAUCIE is standalone software that provides a deep learning approach developed for the analysis of single-cell data from a cohort of patients. The application uses different layers to perform several tasks, such as data imputation, clustering, batch correction, and visualization. The approach is based on the autoencoder neural network framework for unsupervised learning.

PIVOT / Platform for Interactive analysis and Visualization Of Transcriptomics data

Allows users to analyze and visualize RNA-Seq data. PIVOT provides four main functionalities: (i) a graphical interface that wraps existing open source packages in a single user interface, (ii) multiple tools for manipulating datasets, such as derivation or normalization, (iii) compatibility between the inputs and outputs of different analysis modules, and (iv) functions for automatically generating reports, publication-quality figures, and reproducible computations.

ascend / Analysis of Single Cell Expression, Normalisation and Differential expression

Supports the creation of workflows for the analysis of single-cell RNA sequencing (scRNA-seq) experiments. ascend can handle data generated from any single-cell library preparation platform. It includes functions to leverage multiple CPUs, allowing most analyses to be performed on a standard desktop or laptop. In summary, this tool implements a state-of-the-art unsupervised clustering method and integrates established analysis techniques for normalization and differential gene expression.

SCENIC / Single Cell rEgulatory Network Inference and Clustering

Reconstructs gene regulatory networks (GRNs). SCENIC uses single-cell RNA-seq data to identify stable cell states. It analyzes all the co-expression modules using cis-regulatory motif analyses. The tool reduces data dimensionality by using transcription factor (TF) regulons rather than principal components. It accounts for noise, removes technical biases, and uncovers master regulators and gene regulatory networks for each cell type.

UNCURL / UNified CompUtational framework for scRNA-seq data processing and Learning

Allows unsupervised and semi-supervised learning using single-cell RNA-Seq data. To support both modes of learning, UNCURL provides a method for standardizing any prior biological information, including bulk RNA-seq data, microarray data, or even information about individual marker gene expression, into a form compatible with scRNA-Seq data. Integrating this prior information leads to large improvements in accuracy.

Sake / Single-cell RNA-Seq Analysis and Klustering Evaluation

Assists in navigating through the expression profile. SAKE is an R package that uses the non-negative matrix factorization (NMF) method for unsupervised clustering. It offers (i) quality control modules to compare total sequenced reads to total gene transcripts detected, (ii) a sample correlation heatmap plot, (iii) a heatmap of sample assignments from the NMF run, with dark red indicating high confidence in cluster assignments, and (iv) a t-distributed stochastic neighbor embedding (t-SNE) plot to compare NMF-assigned groups with t-SNE projections.

Neural network based cell type retrieval

Allows analysis and retrieval of single-cell RNA-Seq data. Neural network based cell type retrieval uses neural networks (NN) to obtain a reduced-dimension representation of single-cell expression data. It learns the importance of different combinations of gene expression levels for defining cell types, and such combinations are usually more robust than values for individual genes or markers. The tool achieves very good classification performance on training data and improves upon prior methods when used to cluster datasets from experiments that were not used in the training.

SINCERA / SINgle CEll RNA-seq profiling Analysis

A generally applicable analytic pipeline for processing single-cell RNA-seq data from a whole organ or sorted cells. SINCERA provides a panel of analytic tools for data filtering, normalization, clustering, cell type identification, gene signature prediction, transcriptional regulatory network construction, and identification of important regulatory nodes. The pipeline enables RNA-seq analysis from heterogeneous single-cell preparations after the nucleotide sequence reads are aligned to the genome of interest.

ASAP / Automated Single-cell Analysis Pipeline

Aims at the complete analysis of scRNA-seq data post genome alignment: from the parsing, filtering, and normalization of the input count data files, to the visual representation of the data, identification of cell clusters, differentially expressed genes (including cluster-specific marker genes), and functional gene set enrichment. ASAP combines a wide range of commonly used algorithms with sophisticated visualization tools. It allows researchers to interact with the data in a straightforward fashion and in real time.


SSrGE

A linear modeling framework that correlates genotype and phenotype information in scRNA-seq data. SSrGE uses an accumulative ranking approach to select expressed nucleotide variations linked to the expression of a particular gene. SSrGE infers a sparse linear model for each gene and keeps the non-null inferred coefficients. SSrGE can be used as a dimension reduction/feature selection procedure or as a feature ranking. In all the cancer datasets tested, effective and expressed nucleotide variations (eeSNVs) achieve better accuracies and visualization than gene expression for identifying subpopulations.

CIDR / Clustering through Imputation and Dimensionality Reduction

An ultrafast algorithm which uses a novel yet very simple "implicit imputation" approach to alleviate the impact of dropouts in single-cell RNA-seq (scRNA-seq) data in a principled manner. Using a range of simulated and real data, we have shown that CIDR outperforms the state-of-the-art methods, namely t-SNE, ZIFA and RaceID, by at least 50% in terms of clustering accuracy, and typically completes within seconds for processing a dataset of hundreds of cells.

CALISTA / Clustering And Lineage Inference in Single-Cell Transcriptional Analysis

Permits analysis of single-cell gene transcriptional profiles. CALISTA provides three types of study: cell clustering, lineage progression inference, and pseudotemporal cell ordering. It constructs the cell lineage progression graph using cluster distances, a likelihood-based measure of dissimilarity between cell clusters. This tool can assign cells to state transition edges and assess the cell pseudotimes.