CEL-Seq provides a single-cell, on-chip barcoding method, with which we detected gene expression changes accompanying progression through the cell cycle in mouse fibroblast cells. The pipeline consists of the following steps: (1) demultiplexing: using the barcode from R1, we split the R2 reads into their original samples, creating a separate file for each sample. Since the unique molecular identifier (UMI) is also read in R1, we extract it and attach it to the R2 read metadata for downstream analysis; (2) mapping: using Bowtie2, we map the reads of the different samples in parallel, cutting the analysis time by a factor of roughly the number of available cores; (3) read counting: a modified version of the htseq-count script identifies and eliminates reads sharing the same UMI, generating an accurate molecule count for each feature. We use binomial statistics to convert the number of UMIs into transcript counts. The different steps in the pipeline are wrapped together in a single program with a simple configuration file that controls the different run modes.
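The demultiplexing and read-counting steps above can be illustrated with a minimal Python sketch. It is not the pipeline's own code: the function names, the in-memory read representation, and the R1 layout (a 6-nt UMI followed by a 6-nt barcode) are assumptions made for illustration. The correction in umis_to_transcripts is one common form of the binomial UMI correction, -K ln(1 - k/K) with K = 4^(UMI length); the description does not state the exact formula the pipeline uses.

```python
import math
from collections import defaultdict


def demultiplex(read_pairs, barcode_to_sample, umi_len=6, barcode_len=6):
    """Split R2 reads by the barcode found in R1 (hypothetical helper).

    read_pairs: iterable of (r1_seq, r2_name, r2_seq) tuples.
    Assumed R1 layout: UMI first, then the sample barcode.
    Returns {sample: [(r2_name_with_umi, r2_seq), ...]}.
    """
    samples = defaultdict(list)
    for r1_seq, r2_name, r2_seq in read_pairs:
        umi = r1_seq[:umi_len]
        barcode = r1_seq[umi_len:umi_len + barcode_len]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            continue  # unknown barcode: read is discarded
        # Attach the UMI to the R2 read name for downstream counting.
        samples[sample].append((f"{r2_name}:UMI:{umi}", r2_seq))
    return dict(samples)


def count_unique_umis(assignments):
    """Count distinct UMIs per feature; reads sharing a UMI count once.

    assignments: iterable of (feature, umi) pairs for mapped reads.
    """
    umis_per_feature = defaultdict(set)
    for feature, umi in assignments:
        umis_per_feature[feature].add(umi)
    return {feature: len(umis) for feature, umis in umis_per_feature.items()}


def umis_to_transcripts(observed_umis, umi_len=6):
    """Estimate transcript count from the observed distinct-UMI count.

    Assumed correction: -K * ln(1 - k / K), where K is the number of
    possible UMIs. It accounts for different molecules drawing the same UMI.
    """
    total_umis = 4 ** umi_len
    if observed_umis >= total_umis:
        raise ValueError("UMI space is saturated; estimate is undefined")
    return -total_umis * math.log(1 - observed_umis / total_umis)
```

For example, two reads of the same feature that carry the same UMI contribute one molecule to that feature's count, and the corrected estimate is always at least as large as the observed UMI count because collisions can only hide molecules.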
Provides quality control (QC) and analysis components for parallel single-cell transcriptome and epigenome data. Dr.seq is a QC and analysis pipeline that provides both multifaceted QC reports and cell clustering results. Parallel single-cell transcriptome data generated by different technologies can be transformed to the standard input format with built-in functions. Using the relevant commands, the software can also report quality measurements based on four aspects and generate detailed analysis results for scATAC-seq and Drop-ChIP datasets.
Processes Chromium single-cell 3’ RNA-seq output to align reads, generate gene-cell matrices, and perform clustering and gene expression analysis. Cell Ranger combines Chromium-specific algorithms with the widely used RNA-seq aligner STAR. It is delivered as a single, self-contained tar file that can be unpacked anywhere on the system. The tool includes four pipelines: cellranger mkfastq, cellranger count, cellranger aggr, and cellranger reanalyze.
Determines thresholds in deep-sequencing datasets of short RNA transcripts. Threshold-seq addresses the critical question of how many reads need to support a short RNA molecule in a given dataset before it can be considered different from “background”. It can work with individual datasets, i.e., it does not require technical or biological replicates. The tool achieves a good balance between sensitivity and specificity by resampling the distinct sequences of the dataset at hand.
A cloud-based framework designed for multi-sample analysis of transcriptomic data in an efficient and scalable manner. Falco utilises the big data technologies Apache Hadoop and Apache Spark to perform massively parallel alignment, quality control, and feature quantification of single-cell transcriptomic data in the Amazon Web Services (AWS) cloud-computing environment. We have evaluated the performance of Falco using two public scRNA-seq datasets and demonstrated its scalability. The results show that Falco is at least 2.6x faster than a highly optimized single-node analysis and that runtime decreases as the number of computing nodes increases. Falco also allows users to utilise low-cost AWS spot instances, providing a 65% reduction in the cost of analysis.
Allows users to transform raw data from dropSeq/scrbSeq experiments into the final count matrix with QC plots. dropSeqPipe is an open source application that can perform five different tasks: (i) generate FastQC reports of the input data, (ii) obtain the final file for the aligned, sorted data, (iii) produce plots based on pre-processing and alignment, (iv) create a species plot, and (v) extract the expression data.