A computational method to reconstruct full-length, paired T cell receptor (TCR) sequences from T lymphocyte single-cell RNA sequence data. TraCeR links T cell specificity with functional response by revealing clonal relationships between cells alongside their transcriptional profiles. TraCeR extracts TCR-derived sequencing reads for each cell by alignment against ‘combinatorial recombinomes’ comprising all possible combinations of V and J segments. Reads are then assembled into contiguous sequences that are analyzed to find full-length, recombined TCR sequences. Importantly, the reconstructed recombinant sequences typically contain nearly the complete length of the TCR V(D)J region and so allow high-confidence discrimination between closely related gene segments. Our method is sensitive, accurate and easy to adapt to any species for which annotated TCR gene sequences are available.
Provides a way of removing amplification biases, although the assumed absolute quantification does not appear to hold perfectly. Umis is a flexible tool for counting unique molecular identifiers. There are four steps in this method: (i) formatting reads, (ii) filtering noisy cellular barcodes, (iii) pseudo-mapping to cDNAs, and (iv) counting molecular identifiers. The quantitation used in umis handles reads that could come from multiple transcripts by assigning a fractional count to each transcript and then filtering for a minimum count at the end.
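The fractional counting idea described for umis can be illustrated with a minimal sketch. This is not the tool's own code: the input layout (a mapping from a cell barcode and UMI to the set of candidate transcripts) and the names `fractional_counts` and `min_count` are assumptions made for illustration.

```python
from collections import defaultdict


def fractional_counts(umi_hits, min_count=1.0):
    """Split each unique molecule evenly across its candidate transcripts.

    umi_hits: hypothetical dict mapping (cell_barcode, umi) to the set of
    transcript ids the molecule's reads pseudo-mapped to.
    Returns {(cell_barcode, transcript): count}, keeping only entries whose
    summed count reaches min_count.
    """
    counts = defaultdict(float)
    for (cell, _umi), transcripts in umi_hits.items():
        weight = 1.0 / len(transcripts)  # one molecule shared by n transcripts
        for transcript in transcripts:
            counts[(cell, transcript)] += weight
    # Final filtering step: drop weakly supported transcripts.
    return {key: value for key, value in counts.items() if value >= min_count}


hits = {
    ("C1", "AAAA"): {"tx1"},
    ("C1", "AAAC"): {"tx1", "tx2"},
    ("C1", "AAGG"): {"tx2", "tx3"},
}
print(fractional_counts(hits))
```

In this toy input, `tx3` is supported by only half a molecule and is removed by the minimum-count filter.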
Rebuilds paired full-length B-cell receptor (BCR) sequences. BraCeR is a program which can be used for downstream analyses. This program is able to reconstruct multiple heavy and light chains detected within a target cell as well as to highlight non-productively rearranged chains. This program can also be used as a method for deducing clonal relationships and performing immunoglobulin lineage reconstruction.
Reconstructs continuous biological processes at single-cell resolution. Waterfall is a pipeline that uses k-means clustering to build a trajectory and assign an individual cell a pseudotime based on each cell’s proximity to the cluster-derived trajectory. Adult neurogenesis was used as a model and the software was applied to other stem cell datasets. It can be used for single-cell omics analyses of various continuous biological processes.
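The clustering-then-projection idea behind Waterfall can be sketched in a few lines. This is a simplified illustration, not the pipeline itself: the k-means initialisation, the ordering of cluster centres along the first axis (rather than a tree through the centres), and the function names are all assumptions. It expects at least `k` points and `k >= 2` distinct centres.

```python
import math


def kmeans(points, k, iterations=20):
    """Tiny k-means with a deterministic start (evenly spaced input points)."""
    step = max(1, len(points) // k)
    centers = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each centre as the mean of its members (keep it if empty).
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def pseudotime(points, centers):
    """Project each cell onto the polyline through the ordered centres.

    Pseudotime is the distance travelled along the path up to the projection.
    """
    path = sorted(centers)  # assumption: order centres along the first axis
    times = []
    for p in points:
        best = None
        offset = 0.0  # path length before the current segment
        for a, b in zip(path, path[1:]):
            seg = [bj - aj for aj, bj in zip(a, b)]
            seg_len2 = sum(s * s for s in seg)
            t = sum((pj - aj) * s for pj, aj, s in zip(p, a, seg)) / seg_len2
            t = min(1.0, max(0.0, t))  # clamp onto the segment
            proj = tuple(aj + t * s for aj, s in zip(a, seg))
            dist = math.dist(p, proj)
            seg_len = math.sqrt(seg_len2)
            if best is None or dist < best[0]:
                best = (dist, offset + t * seg_len)
            offset += seg_len
        times.append(best[1])
    return times


cells = [(0, 0), (0.2, 0.1), (5, 0), (5.2, -0.1), (10, 0), (10.2, 0.1)]
print(pseudotime(cells, kmeans(cells, k=3)))
```

Cells that lie further along the path receive larger pseudotime values, so sorting cells by this value gives an ordering along the reconstructed process.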
CEL-Seq provides its first single-cell, on-chip barcoding method, and we detected gene expression changes accompanying the progression through the cell cycle in mouse fibroblast cells. The pipeline consists of the following steps: (1) demultiplexing: using the barcode from R1, we split R2 reads into their original samples, creating a separate file for each sample. Since the unique molecular identifier (UMI) is also read in R1, we extract it and attach it to the R2 read metadata for downstream analysis; (2) mapping: using Bowtie2, we map the reads of the different samples in parallel, cutting the analysis time by roughly the number of available cores; (3) read counting: a modified version of the htseq-count script supports the identification and elimination of reads sharing the same UMI to generate an accurate molecule count for each feature. We use binomial statistics to convert the number of UMIs into transcript counts. The different steps in the pipeline are wrapped together in a single program with a simple configuration file that controls the different run modes.
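The demultiplexing step and the UMI-to-transcript conversion described for the CEL-Seq pipeline can be sketched as follows. This is not the pipeline's code. The R1 layout (a 6-base UMI followed by a 6-base barcode), the tuple-based read representation, and the specific correction formula, n = -K ln(1 - k/K) for k observed UMIs out of K = 4^L possible ones, are assumptions; the formula is one commonly used collision correction and may differ from what the pipeline applies.

```python
import math
from collections import defaultdict


def demultiplex(read_pairs, barcode_to_sample, umi_len=6, barcode_len=6):
    """Split R2 reads by the sample barcode found in R1.

    read_pairs: iterable of (r1_sequence, r2_name, r2_sequence) tuples.
    The UMI from R1 is appended to the R2 read name for later counting.
    Reads with an unknown barcode are dropped.
    """
    per_sample = defaultdict(list)
    for r1_seq, r2_name, r2_seq in read_pairs:
        umi = r1_seq[:umi_len]
        barcode = r1_seq[umi_len:umi_len + barcode_len]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue
        per_sample[sample].append((f"{r2_name}:UMI:{umi}", r2_seq))
    return dict(per_sample)


def umis_to_transcripts(observed_umis, umi_len=6):
    """Correct an observed UMI count for collisions (assumed formula)."""
    total = 4 ** umi_len  # number of possible UMI sequences
    if observed_umis >= total:
        raise ValueError("all UMIs observed; the estimate is unbounded")
    return -total * math.log(1 - observed_umis / total)


pairs = [
    ("AAAAAAACGTAC", "read1", "TTGACC"),
    ("CCCCCCACGTAC", "read2", "GGATCA"),
    ("GGGGGGTTTTTT", "read3", "CATCAT"),  # unknown barcode, dropped
]
print(demultiplex(pairs, {"ACGTAC": "sample_1"}))
print(umis_to_transcripts(100))
```

With few observed UMIs the corrected value stays close to the raw count; the correction grows as the UMI space saturates.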
An easy-to-use application for microarray, RNA-Seq and metabolomics analysis. For splicing sensitive platforms (RNA-Seq or Affymetrix Exon, Gene and Junction arrays), AltAnalyze will assess alternative exon (known and novel) expression along protein isoforms, domain composition and microRNA targeting. In addition to splicing-sensitive platforms, AltAnalyze provides comprehensive methods for the analysis of other data (RMA summarization, batch-effect removal, QC, statistics, annotation, clustering, network creation, lineage characterization, alternative exon visualization, gene-set enrichment and more).
Processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis. Cell Ranger combines Chromium-specific algorithms with the widely-used RNA-seq aligner STAR. It is delivered as a single, self-contained tar file that can be unpacked anywhere on the system. The tool includes four pipelines: cellranger mkfastq; cellranger count; cellranger aggr; cellranger reanalyze.
Quantifies splicing in individual single cells. BRIE is a flexible framework that detects differential splicing between individual cells from scRNA-seq data. Although sequence features are particularly appealing because they are easy to use and widely available, additional side information, such as DNA methylation and chromatin accessibility, could easily be incorporated into the model.
Demonstrates the value of properly accounting for errors in unique molecular identifiers (UMIs). UMI-tools removes PCR duplicates and implements a number of different UMI deduplication schemes. It can extract, remove and append UMI sequences from fastq reads. Compared with previous methods, it is superior at estimating the true number of unique molecules. The simulations provide an insight into the impact on quantification accuracy and indicate that application of an error-aware method is even more important with higher sequencing depth.
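An error-aware deduplication scheme of the kind UMI-tools implements can be sketched as below. This follows the general idea of a directional network, where a low-count UMI one mismatch away from a much more abundant UMI is treated as a sequencing error of it. It is a simplified illustration, not the tool's implementation; the function names and the threshold rule `count(parent) >= 2 * count(child) - 1` are stated here as assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def directional_groups(umi_counts):
    """Group UMIs at one mapping position into inferred original molecules.

    umi_counts: dict mapping UMI sequence to its read count.
    Returns a list of groups; the first UMI in each group is the seed.
    """
    # Visit UMIs from most to least abundant so that seeds are true molecules.
    ordered = sorted(umi_counts, key=lambda u: (-umi_counts[u], u))
    assigned = set()
    groups = []
    for seed in ordered:
        if seed in assigned:
            continue
        assigned.add(seed)
        group = [seed]
        frontier = [seed]
        while frontier:
            parent = frontier.pop()
            for child in ordered:
                if child in assigned:
                    continue
                close = hamming(parent, child) == 1
                dominated = umi_counts[parent] >= 2 * umi_counts[child] - 1
                if close and dominated:
                    assigned.add(child)
                    group.append(child)
                    frontier.append(child)
        groups.append(group)
    return groups


counts = {"ACGT": 456, "TCGT": 2, "CCGT": 2, "ACAT": 72, "AAAT": 90, "ACAG": 1}
groups = directional_groups(counts)
print(len(groups), groups)
```

Counting every distinct UMI would report six molecules here, while the error-aware grouping reports two, because `AAAT` is too abundant to be explained as an error of `ACAT`.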
A toolkit designed for the analysis of short reads obtained from end-sequence RNA-seq. ESAT addresses mis-annotated or sample-specific transcript boundaries by providing a search step in which it identifies possible unannotated ends de novo. It provides robust handling of multi-mapped reads, which is critical in 3’ DGE analysis. ESAT provides a module specifically designed for alternative start or 3’ UTR (untranslated region) differential isoform expression. It also includes a set of features specifically designed for the analysis of single-cell RNA-seq data.
Performs initial pre-processing and analysis of droplet-based scRNA-seq data. DropEst is composed of three steps: (1) identifier parsing phase; (2) read mapping phase; and (3) filtering and quality control phase. It can characterize the quality of a library using a wide range of diagnostic indicators and filter out artefactual cellular barcodes. This tool provides extensive configuration options to accommodate alternative scRNA-seq protocol designs.
Reconstructs T cell receptors (TCRs) from paired-end sequencing libraries of single cells, even at short (25 bp) read length. TRAPeS is software that works on the original reads, leading to increased sensitivity. The TRAPeS algorithm has four main steps: (i) identifying putative pairs of variable (V) and joining (J) segments, (ii) collecting putative CDR3-originating reads, (iii) reconstructing the CDR3 and (iv) separating similar TCRs and determining chain productivity.
Estimates the biological variance of a gene and detects differentially expressed genes. TASC is a statistical framework that models the cell-specific dropout rates and amplification bias by using external RNA spike-ins. It incorporates technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model. This tool is programmed to take advantage of multi-threaded parallelization.
Corrects the bias in expression quantification without specifying the source or format of the bias. BCseq is based on joint modeling of multiple cells that permits users to share information between cells. It provides a two-step weighting scheme that assigns a large weight to a cell and achieves an optimal estimator. This tool delivers a quality score for the expression measure of each gene in each cell.
Provides quality control (QC) and analysis components for parallel single cell transcriptome and epigenome data. Dr.seq is a QC and analysis pipeline that provides both multifaceted QC reports and cell clustering results. Parallel single cell transcriptome data generated by different technologies can be transformed to the standard input with the included functions. Using relevant commands, the software can also be used to report quality measurements based on four aspects and can generate detailed analysis results for scATAC-seq and Drop-ChIP datasets.
Integrates an effective bias removal with a weighted expectation maximization (EM) algorithm to distribute reads among isoforms efficiently. WemIQ improves the quantification of isoform and gene expression as well as the derived exon inclusion rates. It provides robust expression estimates across different laboratories and protocols, which is valuable for the integrative analysis of RNA-seq. This tool can distinguish bias heterogeneity from true biological heterogeneity.
A cloud-based framework designed for multi-sample analysis of transcriptomic data in an efficient and scalable manner. Falco utilises the big data technologies Apache Hadoop and Apache Spark to perform massively parallel alignment, quality control, and feature quantification of single-cell transcriptomic data in the Amazon Web Services (AWS) cloud-computing environment. We have evaluated the performance of Falco using two public scRNA-seq datasets and demonstrated Falco's scalability. The results show that Falco runs at least 2.6x faster than a highly optimized single-node analysis and that runtime decreases as the number of computing nodes increases. Falco also allows users to utilise low-cost AWS spot instances, providing a 65% reduction in cost of analysis.
Processes raw reads to count tables for RNA-seq data using Unique Molecular Identifiers (UMIs). zUMIs is a pipeline applicable to most experimental designs of RNA-seq data, such as single-nuclei sequencing techniques. This method allows for downsampling of reads before summarizing UMIs per feature, which is recommended when read numbers differ greatly between samples. zUMIs is flexible with respect to the length and sequences of the barcodes (BCs) and UMIs, making it compatible with a large number of protocols.
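Downsampling reads before collapsing UMIs, as described for zUMIs, can be sketched as follows. This is an illustration only and not the pipeline's code: the read representation (a list of `(gene, umi)` tuples per cell), the function name, and the choice to keep cells that are already below the target depth unchanged are assumptions.

```python
import random
from collections import Counter


def downsample_then_count(reads_by_cell, depth, seed=0):
    """Subsample each cell to at most `depth` reads, then count unique UMIs.

    reads_by_cell: hypothetical dict mapping cell barcode to a list of
    (gene, umi) tuples, one tuple per read.
    Returns {cell: {gene: number of distinct UMIs}}.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    result = {}
    for cell in sorted(reads_by_cell):
        reads = reads_by_cell[cell]
        kept = reads if len(reads) <= depth else rng.sample(reads, depth)
        unique_molecules = set(kept)  # collapse reads sharing gene and UMI
        result[cell] = dict(Counter(gene for gene, _umi in unique_molecules))
    return result


reads = {
    "c1": [("g1", "AAA"), ("g1", "AAA"), ("g1", "AAC"), ("g2", "AAA")],
    "c2": [("g1", "AAA"), ("g1", "AAC"), ("g1", "AAG"),
           ("g2", "AAA"), ("g2", "AAC"), ("g2", "AAG")],
}
print(downsample_then_count(reads, depth=4))
```

Sampling reads first and collapsing UMIs afterwards means that deeply sequenced cells lose molecules in proportion to how many reads are removed, which makes per-cell counts more comparable.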
Addresses the lack of a comprehensive workflow for processing sequencing data from 3’ end protocols. scPipe can deal with both unique molecular identifiers (UMIs) and sample barcodes, map reads to the genome and summarize the results into gene-level counts. It implements a simple outlier-based method for discovering low quality cells and possible doublets to remove from further analysis.
A software package for two-dimensional visualization of single cell data, which utilizes a plethora of projection methods and provides a way to systematically investigate the biological relevance of these low dimensional representations by incorporating domain knowledge. Annotated gene sets (referred to as gene 'signatures') are incorporated so that features in the projections can be understood in relation to the biological processes they might represent. FastProject provides a novel method of scoring each cell against a gene signature so as to minimize the effect of missed transcripts as well as a method to rank signature-projection pairings so that meaningful associations can be quickly identified. Additionally, FastProject is written with a modular architecture and designed to serve as a platform for incorporating and comparing new projection methods and gene selection algorithms.
Quantifies expression from unique k-mers detected in genes within RNA-Seq data of interest. Matataki is an application that checks fragments of reads at fixed skips to see whether each k-mer is in the index or not. This program works at the gene level to quantify expression directly. It aims to increase the speed of RNA-seq analysis and can also be used for large-scale reanalysis such as searching for similar gene expression profiles.
A package for the summation of counts across all cells on each plate for each gene. The count sums can then be used in a differentially expressed (DE) analysis, effectively treating plates as individual samples. This restores type I error control and avoids the detection of excessive false positives. Summation prior to DE analyses also affects the biological conclusions of a real scRNA-seq study, by decreasing the size of the DE lists and improving the ranking of relevant genes relative to a conventional single-cell analysis.
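The plate-level summation described above amounts to a grouped sum. A minimal sketch, assuming counts are held as nested dictionaries and that `cell_to_plate` (a hypothetical name) records which plate each cell came from:

```python
from collections import defaultdict


def sum_counts_by_plate(cell_counts, cell_to_plate):
    """Add up per-gene counts over all cells that share a plate.

    cell_counts: dict mapping cell id to {gene: count}.
    cell_to_plate: dict mapping cell id to plate id.
    Returns {plate: {gene: summed count}}, one pseudo-sample per plate.
    """
    plate_counts = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in cell_counts.items():
        plate = cell_to_plate[cell]
        for gene, count in gene_counts.items():
            plate_counts[plate][gene] += count
    return {plate: dict(genes) for plate, genes in plate_counts.items()}


cells = {
    "cell_a": {"Actb": 10, "Gapdh": 3},
    "cell_b": {"Actb": 7},
    "cell_c": {"Actb": 2, "Gapdh": 5},
}
plates = {"cell_a": "plate_1", "cell_b": "plate_1", "cell_c": "plate_2"}
print(sum_counts_by_plate(cells, plates))
```

The summed table has one column per plate, so it can be passed to a DE method that expects each sample to be an independent replicate.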
Identifies and error-corrects barcodes by traversing the de Bruijn graph of circularized barcode k-mers. Sircel counts k-mers in circularized barcodes extracted from the reads. It assigns reads to consensus fingerprints constructed from k-mers. The tool tolerates insertion, deletion, and mismatch errors. It requires a minimal number of user-supplied parameters. Sircel can identify several cyclic paths from the barcode de Bruijn graph.
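The k-mer counting on circularized barcodes that Sircel relies on can be illustrated with a short sketch. Only the circularization and counting are shown; the graph traversal and error correction are not reproduced, and the function names are assumptions.

```python
from collections import Counter


def circular_kmers(barcode, k):
    """Return the k-mers of a barcode treated as a circular sequence.

    A barcode of length L yields exactly L k-mers, because the end of the
    sequence wraps around to its start.
    """
    wrapped = barcode + barcode[:k - 1]
    return [wrapped[i:i + k] for i in range(len(barcode))]


def count_circular_kmers(barcodes, k):
    """Count circular k-mers over all observed barcodes."""
    counts = Counter()
    for barcode in barcodes:
        counts.update(circular_kmers(barcode, k))
    return counts


print(circular_kmers("ACGT", 3))
observed = ["ACGT", "ACGT", "ACGA"]  # the last barcode carries one mismatch
print(count_circular_kmers(observed, 3).most_common(4))
```

Because every barcode contributes a closed cycle of k-mers, k-mers from the true barcode accumulate high counts while k-mers introduced by an error stay rare, which is what makes high-weight cyclic paths in the de Bruijn graph informative.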
Allows users to transform raw data from a dropSeq/scrbSeq experiment into the final count matrix with QC plots. dropSeqPipe is an open source application that can perform five different tasks: (i) generate fastqc reports of the input data, (ii) obtain the final file for the aligned sorted data, (iii) produce plots based on pre-processing and alignment, (iv) create species plot, and (v) extract the expression data.
Contains useful tools for the analysis of single-cell gene expression data using the statistical software R. scater places an emphasis on tools for quality control, visualisation and pre-processing of data before further downstream analysis. scater enables the following: (i) automated computation of QC metrics; (ii) transcript quantification from read data with pseudo-alignment; (iii) data format standardisation; (iv) rich visualisations for exploratory analysis; (v) seamless integration into the Bioconductor universe; (vi) simple normalisation methods.