Best bioinformatics software for single-cell RNA sequencing

RNA sequencing is often performed on well-identified groups of cells thought to be homogeneous. However, molecular changes are then quantified by averaging the signal from millions of individual cells, which ignores cell-to-cell heterogeneity. Single-cell RNA sequencing (scRNA-seq) makes it possible to unravel the heterogeneity of cell genotype, phenotype, and function within a given subpopulation.

scRNA-seq now has a wide variety of applications, and numerous tools have been developed to analyze this new kind of sequencing data. To help you perform your experiments in the best conditions, we asked OMICtools members to choose their favorite scRNA-seq analysis tools.

Main applications of scRNA-seq

Single-cell RNA sequencing finds its main applications in immunology, oncology, and developmental biology. This technology has already refined our understanding of the differentiation decisions made by histologically identical cells and has identified new cell types and states within organs. It also has promising applications in personalized cancer medicine and biomarker identification.

Sequencing at the single-cell level has encouraged the development of new analysis methods and computational approaches, such as pseudotime cell ordering. To help you choose among the available tools, we asked the OMICtools community to vote for the best scRNA-seq analysis tools. Here are the top three from this survey.

1. Seurat and Monocle
2. TSCAN and RCA
3. Wishbone

First position for Seurat and Monocle

47% of you chose Seurat and Monocle as your favorite scRNA-seq analysis tools.

Seurat is an R package that enables quality control (QC), analysis, and exploration of single-cell RNA-seq data. The software includes three computational methods: (1) unsupervised clustering and discovery of cell types and states, (2) spatial reconstruction of single-cell data, and (3) integrated analysis of single-cell RNA-seq data across conditions, technologies, and species. It can also localize rare subpopulations and map both spatially restricted and scattered groups.

Seurat's main asset is its ability to take data from different sequencing technologies, species, or conditions and integrate them by identifying shared sources of variation. This makes it possible to identify populations shared across data sets and to perform downstream comparative analysis.

Overview of Seurat alignment of single-cell RNA-seq data sets.

Monocle is a comprehensive package that provides tools for analyzing single-cell expression experiments. Monocle introduced the strategy of ordering single cells in pseudotime: it places them along a trajectory corresponding to a biological process, such as cell differentiation, by taking advantage of individual cells' asynchronous progression through that process.

Monocle orders cells by learning an explicit principal graph from the single-cell genomics data with a machine learning technique called Reversed Graph Embedding, which robustly and accurately resolves complicated biological processes. Monocle also performs clustering, using t-SNE and density peak clustering. Monocle then performs differential gene expression testing, which identifies genes that are differentially expressed between different states, along a biological process, and between alternative cell fates.

Second position for TSCAN and RCA

TSCAN and RCA were chosen by 43% of OMICtools voters.

TSCAN is another tool that performs pseudo-temporal ordering of cells based on the gradual transition of their transcriptomes. TSCAN uses a cluster-based minimum spanning tree (MST) approach to order cells. Cells are first grouped into clusters, and an MST is then constructed to connect the cluster centers. Pseudo-time is obtained by projecting each cell onto the tree, and the ordered sequence of cells can be used to study dynamic changes of gene expression along the pseudo-time. Clustering cells before MST construction reduces the complexity of the tree space, which often improves cell ordering. It also allows users to conveniently adjust the ordering based on prior knowledge. TSCAN has a graphical user interface (GUI) to support data visualization and user interaction. Furthermore, its authors developed quantitative measures to objectively evaluate and compare different pseudo-time reconstruction methods.
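
To make the cluster-then-MST idea more concrete, here is a minimal Python sketch of the steps described above. It is not the TSCAN implementation (TSCAN is an R package): the function names, the k-means clustering step, the nearest-segment projection rule, and the toy 2-D coordinates are all assumptions chosen for brevity. The starting cluster is supplied by the user, which mirrors the idea of adjusting the ordering with prior knowledge.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def kmeans(cells, init_centers, n_iter=20):
    """Lloyd's k-means, used here only as a simple stand-in clustering step."""
    centers = list(init_centers)
    labels = []
    for _ in range(n_iter):
        labels = [min(range(len(centers)), key=lambda k: math.dist(cell, centers[k]))
                  for cell in cells]
        for k in range(len(centers)):
            members = [cell for cell, lab in zip(cells, labels) if lab == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = centroid(members)
    return labels, centers

def mst_edges(centers):
    """Prim's algorithm on the complete graph of cluster centers."""
    in_tree, edges = {0}, []
    while len(in_tree) < len(centers):
        i, j = min(((i, j) for i in in_tree
                    for j in range(len(centers)) if j not in in_tree),
                   key=lambda e: math.dist(centers[e[0]], centers[e[1]]))
        edges.append((i, j))
        in_tree.add(j)
    return edges

def longest_path(edges, start):
    """Longest simple path (counted in clusters) through the tree from `start`."""
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, []).append(j)
        neighbors.setdefault(j, []).append(i)
    best, stack = [start], [[start]]
    while stack:
        path = stack.pop()
        if len(path) > len(best):
            best = path
        stack.extend(path + [n] for n in neighbors.get(path[-1], []) if n not in path)
    return best

def pseudotime(cell, path, centers):
    """Position along the path of the cell's projection onto its nearest segment."""
    best_dist, best_time, travelled = math.inf, 0.0, 0.0
    last = len(path) - 2
    for idx, (a, b) in enumerate(zip(path, path[1:])):
        start = centers[a]
        seg = [q - p for p, q in zip(centers[a], centers[b])]
        seg_len = math.hypot(*seg)
        t = sum((c - p) * s for c, p, s in zip(cell, start, seg)) / seg_len ** 2
        # Clamp to the segment, but let the two terminal segments extend outward.
        low = -math.inf if idx == 0 else 0.0
        high = math.inf if idx == last else 1.0
        t = max(low, min(high, t))
        foot = [p + t * s for p, s in zip(start, seg)]
        dist = math.dist(cell, foot)
        if dist < best_dist:
            best_dist, best_time = dist, travelled + t * seg_len
        travelled += seg_len
    return best_time

def order_cells(cells, init_centers, start_cluster):
    """Return cell indices sorted by pseudo-time, plus the cluster path used."""
    _, centers = kmeans(cells, init_centers)
    path = longest_path(mst_edges(centers), start_cluster)
    times = [pseudotime(cell, path, centers) for cell in cells]
    return sorted(range(len(cells)), key=times.__getitem__), path

# Toy 2-D "reduced" coordinates (made-up values), deliberately listed out of order.
cells = [(4.5, 0.0), (0.0, 0.1), (9.0, -0.1), (1.0, 0.0), (8.0, 0.1),
         (5.0, -0.2), (0.5, -0.1), (4.0, 0.2), (8.5, 0.0)]
order, path = order_cells(cells, init_centers=[cells[1], cells[7], cells[4]],
                          start_cluster=0)
print("cluster path:", path)
print("cells in pseudo-time order:", [cells[i] for i in order])
```

Because the tree is built on a handful of cluster centers rather than on every cell, only a few candidate edges have to be considered, which illustrates why clustering first reduces the complexity of the tree space.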

 

TSCAN overview. From Zhicheng Ji and Hongkai Ji.

TSCAN is available as a Bioconductor package or through a web user interface. A short video demo of TSCAN is also available on YouTube.

Reference Component Analysis (RCA) is an R package for robust clustering analysis of scRNA-seq data. It projects single-cell transcriptomes into a space defined by variability in a reference data set. According to its authors, this method outperforms existing algorithms for clustering single-cell transcriptomes and generates tight cell clusters consisting almost entirely of cells of the same type. It also identified multiple cell types in colorectal cancer tumors and normal mucosa, despite the strong batch effects in clinical samples.
RCA includes the following features:

  • Clustering analysis of scRNA-seq data from human samples.
  • Three modes:
    • GlobalPanel: default option for clustering general single-cell data sets that include a wide spectrum of cell types.
    • ColonEpitheliumPanel: suitable for analyzing samples derived from human colon or intestine.
    • SelfProjection: suitable for analyzing data sets from tissue types that are not well studied (still under optimization).
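
The projection idea behind RCA can be illustrated with a short Python sketch. This is not the RCA package (which is written in R) and it leaves out everything beyond the projection itself. It assumes that a cell is described by its Pearson correlation with each profile of a reference panel. The function names, the two-profile panel, and all expression values are hypothetical and made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def project_onto_reference(cell, reference):
    """Describe one cell by its correlation with each reference profile.

    `cell` and every reference profile are dicts mapping gene -> expression.
    Only genes present in both the cell and the profile are compared.
    """
    scores = {}
    for name, profile in reference.items():
        genes = sorted(set(cell) & set(profile))
        scores[name] = pearson([cell[g] for g in genes],
                               [profile[g] for g in genes])
    return scores

# Hypothetical reference panel with made-up log-expression values.
reference = {
    "T_cell":     {"CD3E": 8.0, "PTPRC": 7.0, "EPCAM": 0.5, "KRT8": 0.2},
    "epithelial": {"CD3E": 0.3, "PTPRC": 0.5, "EPCAM": 7.5, "KRT8": 8.0},
}
# Hypothetical single cell, also with made-up values.
cell = {"CD3E": 6.0, "PTPRC": 5.5, "EPCAM": 0.0, "KRT8": 0.4}

scores = project_onto_reference(cell, reference)
print(scores)
print("closest reference profile:", max(scores, key=scores.get))
```

Each cell ends up as a short vector of correlations, one per reference profile, and it is these vectors, rather than the raw expression values, that would then be passed to a clustering step.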

 

Third position for Wishbone

Wishbone is an algorithm that aligns single cells from differentiation systems with bifurcating branches. Wishbone pinpoints bifurcation points and labels each cell as pre-bifurcation or as one of two post-bifurcation cell fates, ordering cells according to their developmental progression. It generalizes to additional lineages, as demonstrated by its application to mouse myeloid differentiation. Wishbone has been designed to work with multidimensional single-cell data from diverse technologies such as mass cytometry and single-cell RNA-seq.

Alignment of cells along bifurcating trajectories with Wishbone.

Wishbone is implemented in Python 3. Tutorials on Wishbone usage and result visualization for scRNA-seq or mass cytometry data can be found on GitHub.

References

Cole Trapnell et al. (2014). The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nature Biotechnology.

Andrew Butler et al. (2018). Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nature Biotechnology.

Zhicheng Ji and Hongkai Ji. (2016). TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis. Nucleic Acids Research.

Huipeng Li et al. (2017). Reference component analysis of single-cell transcriptomes elucidates cellular heterogeneity in human colorectal tumors. Nature Genetics.

Manu Setty et al. (2017). Wishbone identifies bifurcating developmental trajectories from single-cell data. Nature Biotechnology.