CEL-Seq provides a single-cell, on-chip barcoding method that was used to detect gene expression changes accompanying progression through the cell cycle in mouse fibroblast cells. The pipeline consists of the following steps: (1) demultiplexing: using the barcode from R1, the R2 reads are split into their original samples, creating a separate file for each sample. Since the unique molecular identifier (UMI) is also read in R1, it is extracted and attached to the R2 read metadata for downstream analysis; (2) mapping: using Bowtie2, the reads of the different samples are mapped in parallel, cutting the analysis time by a factor of roughly the number of available cores; (3) read counting: a modified version of the htseq-count script identifies and eliminates reads sharing the same UMI to generate an accurate molecule count for each feature. Binomial statistics are then used to convert the number of UMIs into transcript counts. The steps are wrapped together in a single program with a simple configuration file that controls the different run modes.
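The demultiplexing step and the UMI-to-transcript conversion described above can be sketched as follows. This is a minimal illustration, not the pipeline's own code: the R1 layout (UMI first, then cell barcode, 6 nt each), the function names `demultiplex` and `umi_to_transcripts`, and the read-name tag format are all assumptions made for the example. The conversion uses a commonly cited collision correction, n = -K ln(1 - k/K), where k is the number of distinct UMIs observed and K is the number of possible UMIs; the exact formula used by the pipeline may differ.

```python
import math
from collections import defaultdict


def demultiplex(r1_reads, r2_reads, barcode_to_sample, umi_len=6, barcode_len=6):
    """Split R2 reads by the cell barcode found in the paired R1 read.

    Reads are (name, sequence) tuples. Assumed (hypothetical) R1 layout:
    the UMI comes first, followed by the cell barcode.
    """
    samples = defaultdict(list)
    for (_, seq1), (name2, seq2) in zip(r1_reads, r2_reads):
        umi = seq1[:umi_len]
        barcode = seq1[umi_len:umi_len + barcode_len]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # barcode not in the sample sheet: drop the read pair
        # Attach the UMI to the R2 read name so downstream steps can use it.
        samples[sample].append((f"{name2}:UMI:{umi}", seq2))
    return dict(samples)


def umi_to_transcripts(observed_umis, umi_len=6):
    """Estimate the molecule count from the number of distinct UMIs observed.

    Corrects for two molecules receiving the same UMI by chance, assuming
    all K = 4**umi_len UMIs are equally likely.
    """
    total_umis = 4 ** umi_len
    if observed_umis >= total_umis:
        raise ValueError("UMI space is saturated; the count cannot be estimated")
    return -total_umis * math.log(1 - observed_umis / total_umis)


r1 = [("read1", "AAAAAACCCCCC"), ("read2", "GGGGGGTTTTTT")]
r2 = [("read1", "ACGTACGT"), ("read2", "TTGGCCAA")]
per_sample = demultiplex(r1, r2, {"CCCCCC": "cell_1"})
estimated = umi_to_transcripts(100)
```

With few observed UMIs the correction is small; it grows as the observed count approaches the size of the UMI space.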
Provides quality control (QC) and analysis components for parallel single-cell transcriptome and epigenome data. Dr.seq is a QC and analysis pipeline that provides both multifaceted QC reports and cell clustering results. Parallel single-cell transcriptome data generated by different technologies can be transformed to the standard input format with the included functions. Using the relevant commands, the software can also report quality measurements based on four aspects and generate detailed analysis results for scATAC-seq and Drop-ChIP datasets.
Provides a way of removing amplification biases, although the assumed absolute quantification does not appear to hold perfectly. Umis is a flexible tool for counting the number of unique molecular identifiers. The method has four steps: (i) formatting reads, (ii) filtering noisy cellular barcodes, (iii) pseudo-mapping to cDNAs, and (iv) counting molecular identifiers. The quantitation used in umis handles reads that could come from multiple transcripts by assigning a fractional count to each transcript and then filtering for a minimum count at the end.
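The fractional counting idea described above can be illustrated with a short sketch. It is not the umis implementation: the function name `fractional_umi_counts`, the input format, and the choice to split each molecule equally among its compatible transcripts are assumptions made for the example.

```python
from collections import defaultdict


def fractional_umi_counts(alignments, min_count=1.0):
    """Count molecules per (cell, transcript) with fractional assignment.

    `alignments` is an iterable of (cell, umi, transcript) tuples, one per
    read-to-transcript hit. Each distinct (cell, UMI) pair is treated as one
    molecule and split equally among the transcripts it is compatible with
    (an assumption of this sketch).
    """
    # Collect the set of transcripts compatible with each molecule.
    hits = defaultdict(set)
    for cell, umi, transcript in alignments:
        hits[(cell, umi)].add(transcript)

    counts = defaultdict(float)
    for (cell, _umi), transcripts in hits.items():
        share = 1.0 / len(transcripts)
        for transcript in transcripts:
            counts[(cell, transcript)] += share

    # Apply the minimum-count filter at the end.
    return {key: value for key, value in counts.items() if value >= min_count}


example = [
    ("cell_1", "AAA", "tx1"),
    ("cell_1", "AAA", "tx2"),  # same molecule, ambiguous between tx1 and tx2
    ("cell_1", "CCC", "tx1"),
    ("cell_1", "AAA", "tx1"),  # PCR duplicate of the first hit
]
filtered = fractional_umi_counts(example, min_count=1.0)
```

Duplicate reads of the same molecule collapse because hits are stored as sets keyed by cell and UMI, so only the distinct molecule contributes to the count.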
Processes Chromium single-cell 3’ RNA-seq output to align reads, generate gene-cell matrices, and perform clustering and gene expression analysis. Cell Ranger combines Chromium-specific algorithms with the widely used RNA-seq aligner STAR. It is delivered as a single, self-contained tar file that can be unpacked anywhere on the system. The tool includes four pipelines: cellranger mkfastq, cellranger count, cellranger aggr, and cellranger reanalyze.
Enables users to obtain high-fidelity mutation profiles and call ultra-rare variants by handling the caveats of unique molecular identifier (UMI)-based analysis. MAGERI accounts for polymerase chain reaction (PCR) errors by using a variant quality scoring model. It can handle reads with a high error load, indels, and random offsets. The tool was able to detect circulating tumor DNA (ctDNA) in the peripheral blood of cancer patients. It allows easy and efficient processing of high-throughput sequencing data.
Demonstrates the value of properly accounting for errors in unique molecular identifiers (UMIs). UMI-tools removes PCR duplicates and implements a number of different UMI deduplication schemes. It can extract, remove, and append UMI sequences from FASTQ reads. Compared with previous methods, it is superior at estimating the true number of unique molecules. The simulations provide insight into the impact on quantification accuracy and indicate that applying an error-aware method is even more important at higher sequencing depth.
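To make the idea of error-aware deduplication concrete, the sketch below groups UMIs observed at a single mapping position so that a low-abundance UMI one mismatch away from a much more abundant UMI is treated as a sequencing or PCR error of it. This is an illustration only, not the UMI-tools code: the function names, the one-mismatch rule, and the abundance threshold (parent count at least twice the child count minus one) are assumptions chosen for the example, and all UMIs are assumed to have the same length.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def group_umis(umis):
    """Group UMIs seen at one position into putative original molecules.

    A UMI joins the group of a more abundant UMI when the two differ by one
    base and count(parent) >= 2 * count(child) - 1 (assumed threshold).
    Returns a list of groups; the first member of each group is its seed.
    """
    counts = Counter(umis)
    # Visit UMIs from most to least abundant (ties broken alphabetically).
    ordered = sorted(counts, key=lambda u: (-counts[u], u))
    assigned = set()
    groups = []
    for seed in ordered:
        if seed in assigned:
            continue
        assigned.add(seed)
        group = [seed]
        stack = [seed]
        while stack:
            parent = stack.pop()
            for child in ordered:
                if child in assigned:
                    continue
                if (hamming(parent, child) == 1
                        and counts[parent] >= 2 * counts[child] - 1):
                    assigned.add(child)
                    group.append(child)
                    stack.append(child)
        groups.append(group)
    return groups


observed = ["ACGT"] * 10 + ["ACGA"] * 2 + ["TTTT"] * 3
molecules = len(group_umis(observed))  # number of estimated unique molecules
```

Counting every distinct UMI string would report three molecules here, whereas the grouping absorbs the rare one-mismatch variant into its abundant neighbor. Two equally abundant UMIs that differ by one base stay separate, since neither is plausibly an error copy of the other.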
Allows users to handle sequencing data with unique molecular identifiers (UMIs). Umitools can be used for both small RNA-seq and RNA-seq data. The tool facilitates the processing of data that incorporate a UMI, provided that the UMI is incorporated as part of the read.