A tool to create a single report visualizing output from multiple tools across many samples, enabling global trends and biases to be quickly identified. MultiQC supports accurate comparison between samples, so that subtle differences that are not noticeable when switching between separate files can be detected. Data visualization aids batch effect detection and minimizes the risk of confounding factors affecting the results of the study.
MICRA / Microbial Identification and Characterization through Reads Analysis
Identifies and characterizes microbes via reads analysis. MICRA employs read mapping methods to make use of the increasing number of sequenced microbial genomes. The workflow consists of four parts: (1) pre-processing, (2) sequence identification, (3) identification of the closest reference genome and plasmids (the core part) and (4) post-analysis. The pipeline is available as a downloadable version and through a web interface.
Provides quality control functions for high-throughput sequence data. FastQC aims to provide a simple way to run quality control checks on raw sequence data coming from high-throughput sequencing pipelines. It provides a modular set of analyses that give a quick impression of whether the data has any problems users should be aware of before doing any further analysis.
Examines epigenomic and transcriptomic next generation sequencing (NGS) data. Octopus-toolkit can be used for antibody- or enzyme-mediated experiments and studies for the quantification of gene expression. It can accelerate the data mining of public epigenomic and transcriptomic NGS data for basic biomedical research. This tool provides a private and a public mode: one to process the user’s own data, and the other to analyze public NGS data by retrieving raw files from the GEO database.
Verifies sample identities from FASTQ, BAM or VCF files. NGSCheckMate uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms (SNPs), considering depth-dependent behavior of similarity metrics for identical and unrelated samples. It is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNAseq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth. The tool can be used as a quality control step in next-generation sequencing (NGS) studies.
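The idea of comparing allele read fractions at known SNPs can be illustrated with a minimal sketch. It assumes per-SNP reference and alternate read counts are already available, and it uses a plain Pearson correlation as the similarity metric; that choice, the `min_depth` parameter and all function names are assumptions made for illustration, and the depth-dependent model the tool is described as using is not reproduced here.

```python
from math import sqrt


def allele_fractions(sample, min_depth=1):
    """Map SNP id -> alternate allele fraction, skipping low-depth sites.

    `sample` is a dict of SNP id -> (ref_reads, alt_reads); this input
    layout is hypothetical and chosen only for the example.
    """
    return {
        snp: alt / (ref + alt)
        for snp, (ref, alt) in sample.items()
        if ref + alt >= min_depth
    }


def vaf_correlation(sample_a, sample_b, min_depth=1):
    """Pearson correlation of allele fractions over SNPs covered in both samples.

    Returns None when fewer than two shared SNPs exist or when either
    sample has zero variance, since the correlation is undefined there.
    """
    fa = allele_fractions(sample_a, min_depth)
    fb = allele_fractions(sample_b, min_depth)
    shared = sorted(fa.keys() & fb.keys())
    if len(shared) < 2:
        return None
    xs = [fa[s] for s in shared]
    ys = [fb[s] for s in shared]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)


# Toy data: a and b share genotypes at different depths, c has the opposite genotypes.
sample_a = {"rs1": (10, 0), "rs2": (5, 5), "rs3": (0, 10)}
sample_b = {"rs1": (20, 0), "rs2": (10, 10), "rs3": (0, 20)}
sample_c = {"rs1": (0, 10), "rs2": (5, 5), "rs3": (10, 0)}

r_same = vaf_correlation(sample_a, sample_b)
r_diff = vaf_correlation(sample_a, sample_c)
```

In this toy case `r_same` is high and `r_diff` is negative. Deciding whether two real samples match would additionally require a depth-aware threshold, which this sketch leaves out.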
CLC bio / CLC Genomics Workbench
Allows users to analyze, compare, and visualize next generation sequencing (NGS) data. CLC Genomics Workbench offers a complete and customizable solution for genomics, transcriptomics, epigenomics, and metagenomics. The software enables users to build custom workflows, which can combine quality control steps, adapter trimming, read mapping, variant detection, and multiple filtering and annotation steps into a pipeline.
Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command line tools. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported. It works with next generation sequencing (NGS) data.
A package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead extends Bioconductor with tools useful in the initial stages of short-read DNA sequence analysis. Main functionalities include data input, quality assessment, data transformation and access to downstream analysis opportunities. It is an important gateway to use of Bioconductor for processing high-throughput DNA sequence data. ShortRead data structures allow convenient manipulation of data, such as filtering reads based on sequence characteristics.
Displays statistics of large sequence files from next-generation sequencing (NGS) projects. SAMStat is a program which plots nucleotide over-representation and other statistics in mapped and unmapped reads. The software can be used to verify individual processing steps in large analysis pipelines. Specific applications include the verification and quality control of processing pipelines, the tracking of data quality over time and the visualization of data properties derived from new protocols.
Allows users to perform mRNA and miRNA expression analysis simultaneously. wapRNA is a web application that covers the major steps of next-generation mRNA or miRNA data analysis, including preprocessing raw sequenced reads, mapping tags to reference sequences, gene expression annotation, and downstream functional analyses such as detecting differentially expressed genes, Gene Ontology and KEGG pathway analysis for RNA, novel miRNA prediction and miRNA target prediction. Executable packages are available for users to build their pipeline locally.
KAT / K-mer Analysis Toolkit
A user-friendly, extendible and scalable toolkit for rapidly counting, comparing and analysing k-mers from various data sources. The tools in KAT assist the user with a wide range of tasks, including error profiling, assessing sequencing bias, identifying contaminants, and de novo genome assembly QC and validation. KAT is a C++11 application containing multiple tools, each of which exploits multi-core machines via multi-threading where possible. Core functionality is contained in a library designed to promote rapid development of new tools.
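As a rough illustration of what counting k-mers and summarizing them involves, the following sketch counts k-mers across a few reads and derives a k-mer spectrum (how many distinct k-mers occur a given number of times). It is not KAT's implementation or interface: the function names are hypothetical, strands are not merged into canonical k-mers, and everything is held in memory in a single thread.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every substring of length k across an iterable of DNA strings."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Histogram mapping an occurrence count to the number of distinct k-mers with it."""
    return Counter(counts.values())


# Toy reads, chosen only for the example.
reads = ["ACGTACGT", "CGTACGTA", "ACGTNACG"]
counts = count_kmers(reads, 4)
spectrum = kmer_spectrum(counts)
```

In real data, k-mers that occur only once or twice are often sequencing errors, which is why a spectrum of this kind is a common starting point for error profiling.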
Allows users to characterize and quantify the set of all RNA molecules produced in cells. RseqFlow contains several modules that include: mapping reads to genome and transcriptome references, performing quality control (QC) of sequencing data, generating files for visualizing signal tracks based on the mapping results, calculating gene expression levels, identifying differentially expressed genes, calling coding single nucleotide polymorphisms (SNPs) and producing MRF and BAM files.
aRNApipe / automated RNA-seq pipeline
Analyzes single-end and stranded or unstranded paired-end RNA-seq data. aRNApipe focuses on high performance computing (HPC) environments; computational resources are designated independently at each stage, which allows HPC usage to be optimized. It is highly flexible because of its project configuration and management options. This tool can be adapted to changes in the current applications and the addition of new functionalities. It allows users to complete primary RNA-seq analysis.
An open-source platform for aggregating multiple sources of quality metrics as well as meta-data associated with DNA sequencing runs from Illumina NextSeq and HiSeq machines. AlmostSignificant is a graphical platform to streamline the quality control of DNA sequencing data, to collect and store these data for future reference and to collect extra meta-data associated with the sequencing runs to check for errors and monitor the volume of data produced by the associated machines.
ST Pipeline
Processes and analyzes the raw files generated with the Spatial Transcriptomics (ST) method. ST Pipeline enables demultiplexing of spatially-resolved RNA-seq data, robust quality filtering and identification of unique molecules. It is highly customizable with numerous parameter settings. The tool is designed to be robust and efficient, and to scale to arrays with higher density. It filters the data, aligns it to a genome, annotates it against a reference, demultiplexes it by array coordinates and then aggregates counts of non-duplicate reads using the Unique Molecular Identifiers.
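The final aggregation step, counting non-duplicate molecules per array coordinate and gene, can be sketched as follows. The record layout `(x, y, gene, umi)` and the function name are assumptions for illustration, and duplicates are detected by exact UMI match only; a real pipeline may also group UMIs that differ by a few mismatches.

```python
from collections import defaultdict


def count_unique_molecules(records):
    """Count distinct UMIs per (x, y, gene).

    `records` is an iterable of (x, y, gene, umi) tuples, one per
    annotated read that has already been assigned to an array spot.
    """
    umis = defaultdict(set)
    for x, y, gene, umi in records:
        umis[(x, y, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


# Toy records: the first two reads are duplicates of the same molecule.
records = [
    (1, 1, "GeneA", "AACG"),
    (1, 1, "GeneA", "AACG"),
    (1, 1, "GeneA", "TTGC"),
    (1, 2, "GeneA", "AACG"),
    (1, 2, "GeneB", "GGGA"),
]
spot_counts = count_unique_molecules(records)
```

Here spot (1, 1) receives a count of 2 for GeneA rather than 3, because two of its reads carry the same UMI.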
Facilitates quality control (QC) of FASTQ files. FQC combines a command line interface (CLI) that depends on FastQC for processing FASTQ files with a frontend website for plotting, styling and interactivity. The CLI wraps FastQC and builds the website with default QC metrics, which can be expanded without additional programming. The CLI and dashboard lower the threshold for identifying and following up on quality issues that may be apparent upon visual inspection, and promote evidence-based protocol changes in sequencing facilities to generate better quality data.
Processes raw reads to count tables for RNA-seq data using Unique Molecular Identifiers (UMIs). zUMIs is a pipeline applicable to most experimental designs of RNA-seq data, such as single-nuclei sequencing techniques. This method allows for downsampling of reads before summarizing UMIs per feature, which is recommended when read numbers differ greatly between samples. zUMIs is flexible with respect to the length and sequences of the barcodes (BCs) and UMIs, making it compatible with a large number of protocols.
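Downsampling before summarizing UMIs per feature can be sketched as below. This is only an illustration of the idea, not zUMIs code: the function names, the `depth` and `seed` parameters, the input layout and the choice to drop barcodes with fewer reads than the target depth are all assumptions.

```python
import random
from collections import defaultdict


def downsample_reads(reads_by_barcode, depth, seed=0):
    """Randomly keep exactly `depth` reads per barcode.

    `reads_by_barcode` maps a barcode to a list of (gene, umi) tuples.
    Barcodes with fewer than `depth` reads are dropped (an assumption
    made here so that every retained barcode has equal depth).
    """
    rng = random.Random(seed)
    return {
        barcode: rng.sample(reads, depth)
        for barcode, reads in reads_by_barcode.items()
        if len(reads) >= depth
    }


def umis_per_feature(reads):
    """Count distinct UMIs per gene for one barcode's reads."""
    seen = defaultdict(set)
    for gene, umi in reads:
        seen[gene].add(umi)
    return {gene: len(umis) for gene, umis in seen.items()}


# Toy input: cell1 is sequenced much more deeply than cell2; cell3 is too shallow.
reads_by_barcode = {
    "cell1": [("GeneA", f"U{i}") for i in range(50)] + [("GeneB", f"U{i}") for i in range(50)],
    "cell2": [("GeneA", f"U{i}") for i in range(10)],
    "cell3": [("GeneA", "U0")],
}
downsampled = downsample_reads(reads_by_barcode, depth=10)
umi_counts = {bc: umis_per_feature(reads) for bc, reads in downsampled.items()}
```

After downsampling, every retained barcode contributes the same number of reads, so UMI counts are no longer dominated by sequencing depth.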
A collection of command line tools for preprocessing short-read FASTA/FASTQ files. Next-generation sequencing machines usually produce FASTA or FASTQ files containing multiple short-read sequences (possibly with quality information). The main processing of such FASTA/FASTQ files is mapping (also known as aligning) the sequences to reference genomes or other databases using specialized programs. Examples of such mapping programs include Blat, SHRiMP, LastZ, MAQ and many others.
Analyzes the structure and functions of active microbial communities using the power of multi-threading computers. MetaTrans is designed to perform two types of RNA-Seq analyses: taxonomic and gene expression. It performs quality-control assessment and rRNA removal, maps reads against functional databases and handles differential gene expression analysis. Its efficacy was validated by analyzing data from synthetic mock communities, data from a previous study and data generated from twelve human fecal samples.
A-GAME / A GAlaxy suite for functional MEtagenomics
Incorporates tools and workflows for the analysis of environmental DNA (eDNA) sequence data. A-GAME is a general bioinformatics workflow management system implemented within Galaxy. The software contains pre-designed workflows that utilize standard tools for data pre-processing, sequence assembly and annotation; as well as custom utilities dedicated to the analysis of functional metagenomics data. It allows the incorporation of most widely used bioinformatics tools. A-GAME can be used to build and customize bioinformatics workflows.
A workflow system for laboratories with the need to analyze data from multiple NGS projects at a time. QuickNGS takes advantage of parallel computing resources, a comprehensive back-end database, and a careful selection of previously published algorithmic approaches to build fully automated data analysis workflows. QuickNGS considerably reduces the barriers that still limit the usability of the powerful NGS technology and finally decreases the time to be spent before proceeding to further downstream analysis and interpretation of the data.
CoVaCS / Consensus Variant Calling System
Enables genotyping and variant annotation of resequencing data produced by next generation sequencing (NGS) technologies. CoVaCS is an automated system that provides tools for variant calling and annotation along with a pipeline for the analysis of whole genome shotgun (WGS), whole exome sequencing (WES) and targeted resequencing (TGS) data. The software allows non-specialists to perform all steps from quality trimming to variant annotation.
RNA-seq portal
Includes three types of workflows for different tasks. RNA-seq portal permits users to perform computing and analysis, including sequence quality control, read mapping, transcriptome assembly, reconstruction and differential analysis. All these workflows support multiple samples and multiple groups of samples, and perform differential analysis between groups in a single workflow job submission. This web portal offers bioinformatics software, workflows, computation and reference data, and a platform to study complex RNA-seq data analysis for agricultural animal species.
Allows analyses of high-throughput small RNA (sRNA) sequence data in model and non-model plants, from raw data to identified and annotated conserved and novel sequences. miRPursuit is a pipeline performing a series of sRNA analyses. The software minimizes the need for manual repetitive tasks by running several libraries in parallel, so that differences in sRNA read accumulation among libraries can be compared. It can directly analyze the sRNA sequencing raw data from any sequencer.
A user-friendly software package designed to generate detailed statistics and at-a-glance graphics of sequence data quality both quickly and in an automated fashion. SolexaQA contains associated software to trim sequences dynamically using the quality scores of bases within individual reads. It produces standardized outputs within minutes, thus facilitating ready comparison between flow cell lanes and machine runs, as well as providing immediate diagnostic information to guide the manipulation of sequence data for downstream analyses.
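Trimming a read dynamically from its base quality scores can be sketched as keeping the longest contiguous stretch of bases whose quality meets a threshold. The sketch below assumes Phred+33 encoded quality strings and a default threshold of 20; the function names and parameters are hypothetical and do not reflect SolexaQA's actual interface.

```python
def phred_scores(quality, offset=33):
    """Decode an ASCII quality string into integer Phred scores."""
    return [ord(char) - offset for char in quality]


def dynamic_trim(sequence, quality, min_q=20, offset=33):
    """Return the longest contiguous segment whose bases all have quality >= min_q.

    Ties are resolved in favour of the leftmost segment. If no base
    passes the threshold, two empty strings are returned.
    """
    scores = phred_scores(quality, offset)
    best_start, best_len = 0, 0
    run_start = None
    # A sentinel score below any threshold closes a run that reaches the read end.
    for i, q in enumerate(scores + [float("-inf")]):
        if q >= min_q:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    end = best_start + best_len
    return sequence[best_start:end], quality[best_start:end]


# '#' encodes Phred 2 and 'I' encodes Phred 40 under the Phred+33 convention.
trimmed_seq, trimmed_qual = dynamic_trim("ACGTACGTAC", "##IIII#III")
```

With this input the four-base high-quality run in the middle of the read is kept and the shorter run at the end is discarded.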
OncoRep / Oncogenomics Report
A fully automated RNA-Seq based report for patients with (breast) cancer, which includes molecular classification, detection of altered genes, detection of altered pathways, identification of gene fusion events, identification of clinically actionable mutations (in coding regions) and identification of treatable target structures. Furthermore, OncoRep reports suitable drugs based on identified actionable targets, which can be considered in the treatment decision making process.
FaQCs / FastQ Quality Control Software
A software package that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
TRAPLINE / Transparent Reproducible and Automated PipeLINE
Serves for RNAseq data processing, evaluation and prediction. TRAPLINE guides researchers through the NGS data analysis process in a transparent and automated state-of-the-art pipeline. It can detect protein-protein interactions (PPIs), miRNA targets and alternative splicing variants or promoter enriched sites. This tool includes modules for several functions: (1) scanning the list of differentially expressed genes; (2) predicting miRNA targets; and (3) identifying verified interactions between proteins of significantly upregulated and downregulated mRNAs.
