Dimension reduction software tools | Single-cell RNA sequencing data analysis
Single-cell RNA sequencing (scRNA-seq) is a powerful technique for analyzing transcriptomic heterogeneity at the single-cell level. Finding an effective low-dimensional representation and visualization of the original data is an important step in studying cell sub-populations and lineages from scRNA-seq data. scRNA-seq data are much noisier than traditional bulk RNA-Seq data: at the single-cell level, transcriptional fluctuations are much larger than in the average of a cell population, and the low amount of RNA transcripts increases the rate of technical dropout events.
Allows the study of spatial patterning of gene expression at the single-cell level. Seurat is an R package that enables quality control (QC), analysis, and exploration of single-cell RNA-seq data. The software includes three computational methods: (1) unsupervised clustering and discovery of cell types and states, (2) spatial reconstruction of single-cell data, and (3) integrated analysis of single-cell RNA-seq data across conditions, technologies, and species. It can also localize rare subpopulations, and map both spatially restricted and scattered groups.
Facilitates the analysis of cellular heterogeneity, the identification of cell types, and comparison of functional markers in response to perturbations, based on a versatile method. SPADE helps to organize high-dimensional cytometry data in an unsupervised manner, and to investigate natural and pathogenic cellular heterogeneity for biological insight. The SPADE algorithm consists of four components: (i) density-dependent downsampling, (ii) clustering, (iii) linking clusters with a minimum spanning tree, and (iv) upsampling to restore all cells in the final result. This modularized process allows more efficient sub-algorithms to replace the current components. In this sense, SPADE can be viewed as a framework for cytometric data analysis and visualization that has the capacity to be evolved and adapted.
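Component (iii) of the SPADE description above, linking clusters with a minimum spanning tree, can be illustrated with a minimal sketch. This is not SPADE's own code: the function name, the use of Prim's algorithm, and the centroid coordinates are all assumptions made for illustration.

```python
import math

def minimum_spanning_tree(centroids):
    """Return MST edges (i, j, distance) over cluster centroids (Prim's algorithm)."""
    n = len(centroids)
    if n == 0:
        return []
    in_tree = {0}          # start growing the tree from the first cluster
    edges = []
    while len(in_tree) < n:
        best = None
        # Find the shortest edge that connects the tree to a new cluster.
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                d = math.dist(centroids[i], centroids[j])
                if best is None or d < best[2]:
                    best = (i, j, d)
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Hypothetical 2-D cluster centroids (illustrative values only).
centroids = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 1.0)]
tree = minimum_spanning_tree(centroids)
for i, j, d in tree:
    print(f"cluster {i} -- cluster {j} (distance {d:.2f})")
```

The quadratic search is adequate for the small number of clusters that typically remain after downsampling and clustering; a heap-based variant would be preferable for many clusters.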
Allows users to analyze single-cell gene expression experiments. Monocle performs differential expression analysis, clustering, visualization, and other useful tasks on single-cell expression data. The software orders individual cells according to their progress through a biological process, without knowing ahead of time which genes define progress through that process. It is designed to work with RNA-Seq and quantitative polymerase chain reaction (qPCR) data, and implements the Census and BEAM tools.
Leads to low-dimensional representations of the data that account for zero inflation (dropouts), over-dispersion, and the count nature of the data. ZINB-WaVE is a general and flexible zero-inflated negative binomial model which is able to give a more stable and accurate low dimensional representation of the data than principal component analysis (PCA) and zero-inflated factor analysis (ZIFA), without the need for a preliminary normalization step.
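To make the zero-inflated negative binomial distribution mentioned above concrete, the following minimal sketch evaluates its probability mass function. It only illustrates the distribution itself, under the common mean/dispersion parametrization; the way ZINB-WaVE links the parameters to covariates and latent factors is not reproduced, and the names `mu`, `theta`, and `pi` are assumptions chosen here.

```python
import math

def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu > 0 and dispersion parameter theta > 0."""
    log_p = (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
             + theta * math.log(theta / (theta + mu))
             + y * math.log(mu / (theta + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, theta, pi):
    """Zero-inflated NB: with probability pi the count is a structural zero (dropout)."""
    point_mass = pi if y == 0 else 0.0
    return point_mass + (1.0 - pi) * nb_pmf(y, mu, theta)

# Illustrative parameters: zero inflation raises the probability of observing a zero.
print(nb_pmf(0, mu=2.0, theta=1.5))
print(zinb_pmf(0, mu=2.0, theta=1.5, pi=0.3))
```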
An easy-to-use application for microarray, RNA-Seq and metabolomics analysis. For splicing sensitive platforms (RNA-Seq or Affymetrix Exon, Gene and Junction arrays), AltAnalyze will assess alternative exon (known and novel) expression along protein isoforms, domain composition and microRNA targeting. In addition to splicing-sensitive platforms, AltAnalyze provides comprehensive methods for the analysis of other data (RMA summarization, batch-effect removal, QC, statistics, annotation, clustering, network creation, lineage characterization, alternative exon visualization, gene-set enrichment and more).
Consists of a graph learning algorithm for building the trajectory tree according to the similarities among data points from highly scattered data. DDRTree (i) reduces high-dimensional data into a low-dimensional space, (ii) recovers an explicit smooth graph structure with local geometry captured only by distances between data points in the low-dimensional space, and (iii) obtains clustering structures of data points in the reduced dimension. It can be used to outline the common disease progression trajectories of a population.
Processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices, and perform clustering and gene expression analysis. Cell Ranger combines Chromium-specific algorithms with the widely used RNA-seq aligner STAR. It is delivered as a single, self-contained tar file that can be unpacked anywhere on the system. The tool includes four pipelines: cellranger mkfastq, cellranger count, cellranger aggr, and cellranger reanalyze.
Serves for single-cell data analysis. Granatum is a program that provides biologists with access to single-cell bioinformatics methods, and software developers with the opportunity to promote and combine their tools with various others in customizable pipelines. Its architecture simplifies the incorporation of cutting-edge tools and enables handling of large datasets. Moreover, it can eliminate inter-module incompatibilities by isolating the dependencies of each module.
Allows users to capture and visualize the low-dimensional structures in single-cell gene expression data. scvis is a robust latent variable model that identifies underlying low-dimensional structures in scRNA-seq data. It learns a parametric mapping from the high-dimensional space to a low-dimensional embedding. The tool also estimates the uncertainty of mapping a high-dimensional point to the low-dimensional space, which helps in interpreting the results.
Single cell RNA-seq data allows insight into normal cellular function and diseases including cancer through the molecular characterisation of cellular state at the single-cell level. Dimensionality reduction of such high-dimensional datasets is essential for visualization and analysis, but single-cell RNA-seq data is challenging for classical dimensionality reduction methods because of the prevalence of dropout events leading to zero-inflated data. ZIFA is a dimensionality reduction method which explicitly models the dropout characteristics.
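The idea of modelling dropout explicitly can be illustrated with a small simulation. The sketch below assumes a dropout probability that decays with the squared expression level, `exp(-decay * x**2)`; this functional form, the parameter name `decay`, and the example values are assumptions made for illustration and should be checked against the ZIFA publication before being relied on.

```python
import math
import random

def dropout_probability(x, decay):
    """Assumed dropout model: lowly expressed values are more likely to be zeroed."""
    return math.exp(-decay * x * x)

def apply_dropout(values, decay, rng):
    """Replace each value by 0.0 with its expression-dependent dropout probability."""
    return [0.0 if rng.random() < dropout_probability(x, decay) else x
            for x in values]

rng = random.Random(0)                      # fixed seed for reproducibility
expression = [0.2, 0.5, 1.0, 2.0, 4.0]      # hypothetical log-expression values
observed = apply_dropout(expression, decay=0.5, rng=rng)
print(observed)
```

Under this assumption, a value of 0.2 is dropped with probability about 0.98, while a value of 4.0 is almost never dropped, which is the zero-inflation pattern that classical methods such as PCA do not account for.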
Allows analysis of single-cell gene expression data. Scanpy integrates preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing and simulation of gene regulatory networks. It enables interfacing of advanced machine learning packages. This tool provides pseudotemporal-ordering and the reconstruction of branching trajectories. It allows simulating single cells governed by gene regulatory networks.
Assists in navigating through the expression profile. SAKE is an R package that uses non-negative matrix factorization (NMF) for unsupervised clustering. It offers (i) quality control modules to compare total sequenced reads to total gene transcripts detected, (ii) a sample correlation heatmap plot, (iii) a heatmap of sample assignments from the NMF run, with dark red indicating high confidence in cluster assignments, and (iv) a t-distributed stochastic neighbor embedding (t-SNE) plot to compare NMF-assigned groups with t-SNE projections.
Offers a method for rare cell type identification in single-cell RNA-seq data. GiniClust can perform its detection in both normal tissues and disease samples. The program is based on a modification of the Gini index, which is normalized and defined as bidirectional to allow the identification of genes specifically unexpressed in a rare cell type and the removal of a systematic bias toward lowly expressed genes.
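As background for the description above, the following minimal sketch computes the standard Gini index of a gene's expression across cells. It is the plain, unmodified index; the normalization and bidirectional extension used by GiniClust are not reproduced, and the example values are hypothetical.

```python
def gini_index(values):
    """Standard Gini index of non-negative values (0 = evenly spread, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, with ranks starting at 1.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)

# A gene expressed evenly in all cells versus one expressed in a single cell.
print(gini_index([1, 1, 1, 1]))    # 0.0
print(gini_index([0, 0, 0, 10]))   # 0.75
```

A gene expressed in only a few cells yields a high Gini index, which is why such genes are informative for rare cell types.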
A toolkit designed for the analysis of short reads obtained from end-sequence RNA-seq. ESAT addresses mis-annotated or sample-specific transcript boundaries by providing a search step in which it identifies possible unannotated ends de novo. It provides a robust handling of multi mapped reads, which is critical in 3’ DGE analysis. ESAT provides a module specifically designed for alternative start or 3’ UTR (untranslated region) differential isoform expression. It also includes a set of features specifically designed for the analysis of single-cell RNA-seq data.
Performs simultaneous detection of common and rare cell types from single-cell gene expression data. GiniClust2 is a cluster-aware, weighted ensemble clustering method that combines Gini index-based and Fano factor-based clustering methods. The software first clusters the cells using Gini index-based features, then performs a second clustering using Fano factor-based features, and finally combines the two results via a cluster-aware, weighted ensemble approach.
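The Fano factor referred to above is the variance of a gene's expression divided by its mean. A minimal sketch, assuming the population variance and hypothetical count values (GiniClust2's own feature selection and ensemble weighting are not reproduced here):

```python
import statistics

def fano_factor(values):
    """Variance-to-mean ratio; returns 0.0 when the mean is zero."""
    mean = statistics.fmean(values)
    if mean == 0:
        return 0.0
    return statistics.pvariance(values) / mean

# A constant gene versus a gene with the same mean but bursty expression.
print(fano_factor([2, 2, 2, 2]))   # 0.0
print(fano_factor([0, 0, 0, 8]))   # 6.0
```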
Offers a method for dimensionality reduction based on parametrization. In its parametric form, t-SNE parametrizes the non-linear mapping between the data space and the latent space by means of a feed-forward neural network. The software is available in seven different languages and, additionally, as Barnes-Hut and parametric implementations. The tool is suited to the visualization of high-dimensional datasets.
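One ingredient of t-SNE that is easy to show in isolation is the heavy-tailed Student-t similarity it uses between points in the low-dimensional map. The sketch below computes these normalized similarities for a given embedding; it covers only this one step, not the optimization or the neural-network parametrization, and the function name and coordinates are illustrative assumptions.

```python
import math

def student_t_similarities(embedding):
    """Pairwise low-dimensional similarities q_ij, normalized to sum to 1 over i != j."""
    n = len(embedding)
    if n < 2:
        raise ValueError("need at least two points")
    weights = [[0.0 if i == j
                else 1.0 / (1.0 + math.dist(embedding[i], embedding[j]) ** 2)
                for j in range(n)]
               for i in range(n)]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

# Hypothetical 2-D map positions: two close points and one distant point.
q = student_t_similarities([(0.0, 0.0), (0.5, 0.0), (4.0, 0.0)])
print(q[0][1] > q[0][2])   # the close pair is more similar than the distant pair
```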
Preserves distinct structural properties of the data. dropClust uses Locality Sensitive Hashing (LSH), a logarithmic-time algorithm, to determine approximate neighborhoods for individual transcriptomes. It employs an exponential decay function to select a higher number of expression profiles from clusters of relatively smaller sizes. The tool is able to detect principal components (PCs) with a multi-modal distribution of the projected transcriptomes by using mixtures of Gaussians.
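The principle behind Locality Sensitive Hashing can be sketched with random hyperplanes: vectors pointing in similar directions tend to receive the same bit signature and so land in the same bucket. This is one common LSH family, chosen here purely for illustration; it is an assumption and not necessarily the scheme dropClust uses, and all names and values are hypothetical.

```python
import random

def make_hyperplanes(n_planes, dim, seed=0):
    """Draw random hyperplane normals from a standard Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
                 for plane in hyperplanes)

def bucket_by_signature(vectors, hyperplanes):
    """Group vector indices by signature; bucket mates are candidate neighbours."""
    buckets = {}
    for idx, vec in enumerate(vectors):
        buckets.setdefault(signature(vec, hyperplanes), []).append(idx)
    return buckets

planes = make_hyperplanes(n_planes=8, dim=3, seed=1)
profiles = [[0.5, -1.2, 2.0], [1.0, -2.4, 4.0], [-0.5, 1.2, -2.0]]  # toy profiles
print(bucket_by_signature(profiles, planes))
```

In the toy data the second profile is a scaled copy of the first, so both share a signature, while the third points in the opposite direction and falls into a different bucket.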
Learns representations for scRNA-seq data by considering prior gene–gene associations. SCRL is a data-driven, nonlinear dimension reduction method based on a network-based embedding technique. It provides two advantages: (i) it can integrate both scRNA-seq data and prior biological knowledge for more insightful low-dimensional representations, and (ii) it can simultaneously learn a shared low-dimensional representation for both cells and genes.
Provides quality control (QC) and analysis components for parallel single-cell transcriptome and epigenome data. Dr.seq is a QC and analysis pipeline that produces both multifaceted QC reports and cell clustering results. Parallel single-cell transcriptome data generated by different technologies can be transformed to the standard input format with the included functions. Using the relevant commands, the software can also report quality measurements based on four aspects and generate detailed analysis results for scATAC-seq and Drop-ChIP datasets.
An easy-to-use R package for creating and plotting diffusion maps. Diffusion maps are a spectral method for non-linear dimension reduction and have recently been adapted for the visualization of single-cell expression data. They allow high-dimensional relations between data points to be visualized in a low-dimensional plot. destiny includes a single-cell-specific noise model that allows for missing and censored values.
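The first step of a diffusion map, building a Markov transition matrix from a Gaussian kernel, can be sketched as follows. The map coordinates themselves come from the leading eigenvectors of this matrix, which are omitted here. This is a simplified illustration: the function name, the kernel width `sigma`, and the points are assumptions, and destiny's noise model for missing and censored values is not reproduced.

```python
import math

def transition_matrix(points, sigma):
    """Row-normalized Gaussian kernel: entry [i][j] is the probability of stepping i -> j."""
    kernel = [[math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma ** 2)) for q in points]
              for p in points]
    return [[k / sum(row) for k in row] for row in kernel]

# Hypothetical cells in a 2-D expression space: two neighbours and one distant cell.
cells = [(0.0, 0.0), (0.1, 0.0), (3.0, 0.0)]
T = transition_matrix(cells, sigma=1.0)
print(T[0][1] > T[0][2])   # a random walk is more likely to step to the nearby cell
```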