Best bioinformatics software for RNA-seq quality control
RNA sequencing (RNA-seq) is currently the leading technology for transcriptome analysis. Its applications range from the study of alternative splicing and post-transcriptional modifications to the comparison of relative gene expression between biological samples.
To help you prepare and analyse your RNA-seq experiments under the best conditions, we have launched a new series of surveys focused on the best tools for each fundamental step of an RNA-seq experiment.
Starting your analysis with quality control
The first step in the analysis of an RNA-seq experiment is quality control (QC). This crucial step ensures that your data are good enough for the subsequent steps of your analysis. Quality control usually covers sequence quality, sequencing depth, read duplication rates (clonal reads), alignment quality, nucleotide composition bias, etc. We therefore start this series by presenting the best QC tools, as chosen by the OMICtools community!
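To make one of these checks concrete, here is a minimal Python sketch that estimates the read duplication rate by counting exact duplicate sequences. It is only an illustration, not the method used by any of the tools ranked in this survey: the function name `duplication_rate` is hypothetical, and real tools may also account for sequencing errors or mapping positions.

```python
from collections import Counter


def duplication_rate(sequences):
    """Return the fraction of reads that are exact copies of an earlier read.

    Assumption: two reads are duplicates only if their sequences are identical.
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every distinct sequence contributes one "original" read; the rest are duplicates.
    return (total - len(counts)) / total


reads = ["ACGT", "ACGT", "TTGA", "CCAG"]
print(duplication_rate(reads))  # 0.25
```

A high value on real data is not necessarily a problem in RNA-seq, since highly expressed transcripts naturally produce many identical reads.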
First position for NGS QC Toolkit
NGS QC Toolkit was the favorite tool for 79% of OMICtools members.
This standalone, open-source application offers several tools to quality-check and filter your NGS data. The toolkit is divided into four major groups of tools:
- Quality control tools for Illumina or Roche 454 data
- Trimming tools
- Format conversion tools
- Statistics tools
All QC tools can generate graphs as outputs, as well as diverse statistics, such as average quality scores at each base position, GC content distribution, etc.
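The two statistics mentioned above can be sketched in a few lines of Python. This is a simplified illustration and not NGS QC Toolkit's own code: the helper names (`parse_fastq`, `mean_quality_per_position`, `gc_fraction`) are hypothetical, and the sketch assumes 4-line FASTQ records with Phred+33 quality encoding.

```python
from collections import defaultdict


def parse_fastq(lines):
    """Yield (sequence, quality_string) pairs from 4-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in lines]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i + 1], lines[i + 3]


def mean_quality_per_position(records, offset=33):
    """Average Phred score at each base position (assumes Phred+33 encoding)."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for _, qual in records:
        for pos, ch in enumerate(qual):
            totals[pos] += ord(ch) - offset
            counts[pos] += 1
    return [totals[p] / counts[p] for p in sorted(counts)]


def gc_fraction(seq):
    """Fraction of G and C bases in a read."""
    if not seq:
        return 0.0
    return sum(1 for base in seq.upper() if base in "GC") / len(seq)


fastq = ["@r1", "ACGT", "+", "IIII", "@r2", "GGCC", "+", "5555"]
records = list(parse_fastq(fastq))
print(mean_quality_per_position(records))      # [30.0, 30.0, 30.0, 30.0]
print([gc_fraction(seq) for seq, _ in records])  # [0.5, 1.0]
```

Plotting the first list against base position gives the familiar per-base quality curve, and a histogram of the second gives the GC content distribution.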
Second position for RseqFlow
RseqFlow is an RNA-seq analysis pipeline that covers pre- and post-mapping quality control, as well as other analysis steps. The pipeline is divided into four branches that can be run individually or in workflow mode.
- Branch 1: Quality Control and SNP calling based on the merging of alignments to the transcriptome and genome.
- Branch 2: Expression level quantification for Gene/Exon/Splice Junctions based on alignment to the transcriptome.
- Branch 3: Some file format conversions for easy storage, backup and visualization.
- Branch 4: Differentially expressed gene identification based on the output of the expression level quantification from Branch 2.
RseqFlow provides a downloadable virtual machine (VM) image, managed with Pegasus, that lets users run the pipeline easily on different computational resources.
RseqFlow can also be run in a Unix shell mode, which lets users execute each branch of the analysis with a single Unix command. The following software must be pre-installed: Python 2.7 or higher, R 2.11 or higher, and GCC.
Third position for Trim Galore!
Trim Galore! is a wrapper script that automates quality and adapter trimming as well as quality control, with added functionality to remove biased methylation positions from RRBS sequence files (for directional, non-directional, or paired-end sequencing).
Its main features include:
- For adapter trimming, Trim Galore! uses the first 13 bp of Illumina standard adapters (‘AGATCGGAAGAGC’) by default (suitable for both ends of paired-end libraries), but accepts other adapter sequences too
- For MspI-digested RRBS libraries, Trim Galore! performs quality and adapter trimming in two sequential steps. This allows it to remove 2 additional bases that contain a cytosine which was artificially introduced in the end-repair step during the library preparation
- For any kind of FastQ file other than MspI-digested RRBS, Trim Galore! can perform adapter and quality trimming in a single pass
- The Phred quality of basecalls and the stringency for adapter removal can be specified individually
- And more…
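The trimming steps listed above can be illustrated with a short Python sketch. This is a simplified stand-in under stated assumptions, not Trim Galore!'s or Cutadapt's actual implementation: the function names are hypothetical, adapter matching is exact (no mismatches allowed), the quality step uses a BWA-style running-sum rule, and the `min_overlap` parameter plays the role of the adapter stringency.

```python
ILLUMINA_ADAPTER = "AGATCGGAAGAGC"  # the 13 bp adapter prefix quoted in the text


def quality_trim_3prime(seq, quals, cutoff=20):
    """Trim low-quality bases from the 3' end with a running-sum rule."""
    running, best, cut = 0, 0, len(seq)
    for i in range(len(seq) - 1, -1, -1):
        running += cutoff - quals[i]
        if running < 0:
            break  # bases further left are good enough; stop scanning
        if running > best:
            best, cut = running, i
    return seq[:cut], quals[:cut]


def trim_adapter(seq, adapter=ILLUMINA_ADAPTER, min_overlap=1):
    """Remove a full adapter, or an adapter prefix found at the 3' end.

    Exact matching only (an assumption made to keep the sketch short).
    """
    idx = seq.find(adapter)
    if idx != -1:
        return seq[:idx]
    longest = min(len(adapter) - 1, len(seq))
    for k in range(longest, min_overlap - 1, -1):
        if seq.endswith(adapter[:k]):
            return seq[:len(seq) - k]
    return seq


def trim_read(seq, quals, cutoff=20, min_overlap=1):
    """Quality-trim first, then remove adapter sequence."""
    seq, quals = quality_trim_3prime(seq, quals, cutoff)
    trimmed = trim_adapter(seq, min_overlap=min_overlap)
    return trimmed, quals[:len(trimmed)]


print(quality_trim_3prime("ACGTACGT", [30, 30, 30, 30, 30, 30, 10, 5]))
# ('ACGTAC', [30, 30, 30, 30, 30, 30])
print(trim_adapter("ACGTTTAGATC", min_overlap=3))  # ACGTTT
```

Raising `min_overlap` makes adapter removal more conservative: with `min_overlap=6`, the 5-base partial match in the second example is left untouched.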
Trim Galore! is built around Cutadapt and FastQC, and thus requires both tools to be installed to function properly. The tool is freely downloadable and comes with a comprehensive, illustrated User Guide.
(Patel et al., 2012) NGS QC Toolkit: A Toolkit for Quality Control of Next Generation Sequencing Data. PLoS ONE.
(Wang et al., 2011) RseqFlow: workflows for RNA-Seq data analysis. Bioinformatics.
(Wu et al., 2011) Using non-uniform read distribution models to improve isoform expression inference in RNA-Seq. Bioinformatics.