
CLC Genomics Workbench

Allows users to analyze, compare, and visualize next generation sequencing (NGS) data. CLC Genomics Workbench offers a complete and customizable solution for genomics, transcriptomics, epigenomics, and metagenomics. The software enables users to build custom workflows that combine quality control steps, adapter trimming, read mapping, variant detection, and multiple filtering and annotation steps into a pipeline.


Assists users in manipulating high-throughput sequencing (HTS) data and formats. Picard is a Java toolkit that provides a set of command line tools. It comprises Java-based utilities that manipulate SAM files, and a Java API for creating new programs that read and write SAM files. Both the SAM text format and the SAM binary (BAM) format are supported. It also works with next generation sequencing (NGS) data.


Provides quality control functions for high throughput sequence data. FastQC aims to provide a simple way to run quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses that give a quick impression of whether the data has any problems users should be aware of before doing any further analysis.


Examines epigenomic and transcriptomic next generation sequencing (NGS) data. Octopus-toolkit can be used for antibody- or enzyme-mediated experiments and studies for the quantification of gene expression. It can accelerate the data mining of public epigenomic and transcriptomic NGS data for basic biomedical research. This tool provides a private and a public mode: one to process the user’s own data, and the other to analyze public NGS data by retrieving raw files from the GEO database.


A package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead extends Bioconductor with tools useful in the initial stages of short-read DNA sequence analysis. Main functionalities include data input, quality assessment, data transformation and access to downstream analysis opportunities. It is an important gateway to use of Bioconductor for processing high-throughput DNA sequence data. ShortRead data structures allow convenient manipulation of data, such as filtering reads based on sequence characteristics.


Facilitates analysis of microarrays and miRNA/RNA-seq data on laptops. oneChannelGUI can be used for quality control, normalization, filtering, statistical validation and data mining for single channel microarrays. It offers a comprehensive microarray analysis for Affymetrix 3′ (IVT) expression arrays as well as for the new generation of whole transcript arrays: human/mouse/rat exon 1.0 ST and human gene 1.0 ST arrays. oneChannelGUI inherits the core affylmGUI functionalities and permits a wider range of analysis allowing biologists to choose among different criteria and algorithms in order to analyze their data. It is a didactical tool since it could be used to introduce young life scientists to the use and interpretation of microarray data. For this purpose various data sets and exercises are available at the oneChannelGUI web site.

aRNApipe / automated RNA-seq pipeline

Analyzes single-end and stranded or unstranded paired-end RNA-seq data. aRNApipe focuses on high performance computing (HPC) environments and allows computational resources to be assigned independently at each stage, which helps optimize HPC usage. It is highly flexible because of its project configuration and management options. This tool can be adapted to changes in the current applications and to the addition of new functionalities. It allows users to complete primary RNA-seq analysis.


A pipeline for an RNA-seq method used to study polyA. SAPAS performs a systematic search and evaluation of protocols for typical steps to investigate to what extent these can indeed facilitate RNA-seq data analysis. 29 open-source interfaces and 6 of the more widely used interfaces were evaluated in detail. SAPAS processes the sequencing results using the SAPAS method, including quality control, mapping to the genome using Bowtie, generating cleavage sites, internal priming, and clustering cleavage sites.


Allows users to characterize and quantify the set of all RNA molecules produced in cells. RseqFlow contains several modules that include: mapping reads to genome and transcriptome references, performing quality control (QC) of sequencing data, generating files for visualizing signal tracks based on the mapping results, calculating gene expression levels, identifying differentially expressed genes, calling coding single nucleotide polymorphisms (SNPs) and producing MRF and BAM files.


A user-friendly software package designed to generate detailed statistics and at-a-glance graphics of sequence data quality both quickly and in an automated fashion. SolexaQA contains associated software to trim sequences dynamically using the quality scores of bases within individual reads. It produces standardized outputs within minutes, thus facilitating ready comparison between flow cell lanes and machine runs, as well as providing immediate diagnostic information to guide the manipulation of sequence data for downstream analyses.


Displays statistics of large sequence files from next-generation sequencing (NGS) projects. SAMStat is a program which plots nucleotide over-representation and other statistics in mapped and unmapped reads. The software can be used to verify individual processing steps in large analysis pipelines. Specific applications include the verification and quality control of processing pipelines, the tracking of data quality over time and the visualization of data properties derived from new protocols.


Facilitates quality control (QC) of FASTQ files. FQC combines a command line interface (CLI) that depends on FastQC for processing FASTQ files with a frontend website for plotting, styling and interactivity. The CLI wraps FastQC and builds the website with default QC metrics, which one can expand without additional programming. The CLI and dashboard lower the threshold for detecting and following up on quality issues that may be apparent upon visual inspection, and they promote evidence-based protocol changes in sequencing facilities to generate better quality data.

ST Pipeline

Processes and analyzes the raw files generated with the Spatial Transcriptomics (ST) method. ST Pipeline enables demultiplexing of spatially-resolved RNA-seq data, robust quality filtering, and identification of unique molecules. It is highly customizable with numerous parameter settings. The tool is more robust, efficient and scales better to arrays with higher density. It filters the data, aligns it to a genome, annotates it against a reference, demultiplexes it by array coordinates, and then aggregates counts with duplicates removed using the Unique Molecular Identifiers.


Verifies sample identities from FASTQ, BAM or VCF files. NGSCheckMate uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms (SNPs), considering depth-dependent behavior of similarity metrics for identical and unrelated samples. It is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNAseq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth. The tool can be used as a quality control step in next-generation sequencing (NGS) studies.
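The idea of comparing allele read fractions at known SNPs can be illustrated with a minimal Python sketch. It is not NGSCheckMate's model: the depth-dependent behavior described above is omitted, and the function names and the input layout (a list of reference/alternate read counts per SNP) are assumptions made for illustration. It simply correlates the allele fractions of two samples.

```python
import math


def allele_fractions(counts):
    """Turn (ref_reads, alt_reads) pairs into alternate allele fractions.

    SNPs with zero depth yield None so they can be skipped later.
    """
    return [alt / (ref + alt) if (ref + alt) > 0 else None
            for ref, alt in counts]


def pearson(xs, ys):
    """Pearson correlation over SNPs covered in both samples.

    Returns None when fewer than two SNPs are shared or a sample has
    no variation in allele fraction.
    """
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


# Hypothetical read counts at four SNPs for three samples.
sample_a = [(10, 0), (5, 5), (0, 10), (8, 2)]
sample_b = [(20, 0), (11, 9), (1, 19), (15, 5)]   # resembles sample_a
sample_c = [(0, 10), (10, 0), (5, 5), (2, 8)]     # does not

fa, fb, fc = (allele_fractions(s) for s in (sample_a, sample_b, sample_c))
print(pearson(fa, fb), pearson(fa, fc))
```

A high correlation suggests the two files may come from the same individual. Where to place the cut-off, and how it should vary with sequencing depth, is exactly what a model-based method has to address and is left out here.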


A FastQ/Fasta/SAM information extractor implemented in HTML5 capable of offering insights into next-generation sequencing (NGS) data. MuffinInfo can run on any software or hardware environment, in command line or graphically, and in browser or standalone. It presents information such as average length, base distribution, quality scores distribution, k-mer histogram, and homopolymers analysis. MuffinInfo improves upon the existing extractors by adding the ability to save and then reload the results obtained after a run as a navigable file, by supporting custom statistics implemented by the user, and by offering user-adjustable parameters involved in the processing, all in one software.
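Some of the statistics listed above (average length, base distribution, homopolymers) can be sketched in a few lines of Python. This is an illustration only, not MuffinInfo's code (which is implemented in HTML5); the function names are hypothetical, and the parser assumes simple four-line FASTQ records.

```python
from collections import Counter
from itertools import groupby


def fastq_records(lines):
    """Yield (name, sequence, quality) from four-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:], lines[i + 1], lines[i + 3]


def summarize(records):
    """Compute read count, average length, base counts and the longest
    homopolymer run over all reads."""
    lengths = []
    bases = Counter()
    longest_homopolymer = 0
    for _name, seq, _qual in records:
        lengths.append(len(seq))
        bases.update(seq)
        for _base, run in groupby(seq):
            longest_homopolymer = max(longest_homopolymer,
                                      sum(1 for _ in run))
    return {
        "reads": len(lengths),
        "average_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "base_counts": dict(bases),
        "longest_homopolymer": longest_homopolymer,
    }


# Two made-up reads for demonstration.
example = "@r1\nACGT\n+\nIIII\n@r2\nAAAACC\n+\nIIIIII\n"
stats = summarize(fastq_records(example.splitlines()))
print(stats)
```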


Provides an open source RNA-seq processing pipeline that can be used to extract knowledge from any study that profiled gene expression using RNA-seq applied to mammalian cells, comparing two conditions. Zika-RNAseq-Pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression.


Analyzes the structure and functions of active microbial communities using the power of multi-threading computers. MetaTrans is designed to perform two types of RNA-Seq analyses: taxonomic and gene expression. It performs quality-control assessment and rRNA removal, maps reads against functional databases, and handles differential gene expression analysis. Its efficacy was validated by analyzing data from synthetic mock communities, data from a previous study and data generated from twelve human fecal samples.

KAT / K-mer Analysis Toolkit

A user-friendly, extendible and scalable toolkit for rapidly counting, comparing and analysing k-mers from various data sources. The tools in KAT assist the user with a wide range of tasks, including error profiling, assessing sequencing bias, identifying contaminants, and de novo genome assembly QC and validation. KAT is a C++11 application containing multiple tools, each of which exploits multi-core machines via multi-threading where possible. Core functionality is contained in a library designed to promote rapid development of new tools.
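For readers unfamiliar with k-mer counting, the basic operation can be sketched as follows. This is a naive single-threaded Python illustration, not KAT's C++ implementation; the function name and the choice to skip k-mers containing ambiguous bases are assumptions.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping substring of length k in a sequence.

    K-mers containing characters other than A, C, G, T are skipped.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


print(count_kmers("ACGTACGT", 4))
```

Comparing such count tables between a read set and an assembly, or between two read sets, is the kind of analysis the description above refers to.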


A tool to create a single report visualizing output from multiple tools across many samples, enabling global trends and biases to be quickly identified. MultiQC allows accurate comparison between samples, allowing detection of subtle differences not noticeable when switching between different files. Data visualization aids batch effect detection and minimizes the risk of confounding factors affecting the results of the study.


An open-source platform for aggregating multiple sources of quality metrics as well as meta-data associated with DNA sequencing runs from Illumina NextSeq and HiSeq machines. AlmostSignificant is a graphical platform to streamline the quality control of DNA sequencing data, to collect and store these data for future reference and to collect extra meta-data associated with the sequencing runs to check for errors and monitor the volume of data produced by the associated machines.


An integrated, automated, flexible and user-friendly tool for quality control in clinical research. It supports three major NGS sequencing technologies (Illumina, 454 and Ion Torrent) along with Sanger sequencing. ClinQC offers full flexibility, accuracy and reproducibility. All input parameters can be customized in the “ClinQCOptions” configuration file. It is a one-stop solution that runs from raw sequence reads and trace files to high quality FASTQ files with Sanger quality encoding. This tool can be easily integrated in any downstream analysis pipeline for, e.g., mutation screening. In summary, with ClinQC 1) Sanger and NGS data can be analyzed together, 2) all quality control parameters can be customized for different sequencing data, 3) thousands of datasets / patients / samples can be analyzed in a single run, and 4) paired-end, single-end and mixed reads generated from Illumina, 454 and Ion Torrent can be analyzed simultaneously in a single run. ClinQC excels over existing tools and software in usability, multiple data handling, Sanger sequencing data analysis and a common input/output model for Sanger and NGS data analysis.

TRAPLINE / Transparent Reproducible and Automated PipeLINE

Serves for RNAseq data processing, evaluation and prediction. TRAPLINE guides researchers through the NGS data analysis process in a transparent and automated state-of-the-art pipeline. It can detect protein-protein interactions (PPIs), miRNA targets, and alternative splicing variants or promoter-enriched sites. This tool includes different modules for several functions: (1) it scans the list of differentially expressed genes; (2) it includes modules for miRNA target prediction; and (3) it implements a module to identify verified interactions between proteins of significantly upregulated and downregulated mRNAs.

RNA-seq portal

Includes three types of workflows for different tasks. RNA-seq portal permits users to perform computing and analysis, including sequence quality control, read-mapping, transcriptome assembly, reconstruction and differential analysis. All these workflows support multiple samples and multiple groups of samples and perform differential analysis between groups in a single workflow job submission. This web portal offers bioinformatics software, workflows, computation and reference data, and a platform to study complex RNA-seq data analysis for agricultural animal species.


Detects possible sources of sequence-specific bias in short read data. Hercules is based on analyzing sequence motif correlations and employs the MapReduce formalism of Apache Spark to quantify bias in next-generation sequencing (NGS). It provides two phases: the first one uses the annotation data and returns a key for the exon to which the read is mapped; the second one takes a read and returns an exon key and a vector of the positions at which the motif occurs for that exon.


Processes raw reads to count tables for RNA-seq data using Unique Molecular Identifiers (UMIs). zUMIs is a pipeline applicable for most experimental designs of RNA-seq data, such as single-nuclei sequencing techniques. This method allows for down sampling of reads before summarizing UMIs per feature, which is recommended for cases of highly different read numbers per sample. zUMIs is flexible with respect to the length and sequences of the barcodes (BCs) and UMIs, making it compatible with a large number of protocols.
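The two steps mentioned above, downsampling reads and then summarizing UMIs per feature, can be sketched in Python. This is not zUMIs code: the read layout (cell barcode, gene, UMI), the function names, and the choice to downsample each cell barcode to a fixed depth are assumptions made for illustration.

```python
import random
from collections import defaultdict


def downsample_per_cell(reads, depth, seed=0):
    """Keep at most `depth` randomly chosen reads per cell barcode."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_cell = defaultdict(list)
    for read in reads:
        by_cell[read[0]].append(read)
    kept = []
    for cell in sorted(by_cell):
        cell_reads = by_cell[cell]
        if len(cell_reads) > depth:
            cell_reads = rng.sample(cell_reads, depth)
        kept.extend(cell_reads)
    return kept


def count_umis(reads):
    """Count distinct UMIs for each (cell barcode, gene) pair."""
    umis = defaultdict(set)
    for cell, gene, umi in reads:
        umis[(cell, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}


# Made-up reads: (cell barcode, gene, UMI).
reads = [
    ("c1", "g1", "AAA"), ("c1", "g1", "AAA"), ("c1", "g1", "CCC"),
    ("c1", "g2", "GGG"), ("c2", "g1", "TTT"),
]
print(count_umis(reads))
print(count_umis(downsample_per_cell(reads, depth=2)))
```

Downsampling before counting puts samples with very different read numbers on a comparable footing, which is the situation the description recommends it for.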

QASDRA / Quality Assessment of Sequencing Data via Range Analysis

Detects ranges and introduces new metrics computed from their lengths. QASDRA creates the quality assessment report of an input FASTQ file according to the user specified k and v parameters. The software analyzes the maximal ranges, which are defined as the longest segments in which no more than k scores are less than or equal to v. It also has the capabilities to filter out the reads according to the metrics introduced. This tool gives general and overall insight about the file you have in hand.
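The range definition above (the longest segment in which no more than k scores are less than or equal to v) can be computed with a sliding window. The sketch below is one reading of that definition in Python, not QASDRA's implementation; the function names are hypothetical, and the Phred+33 quality encoding is an assumption.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - offset for ch in quality_string]


def longest_range(scores, k, v):
    """Return (start, length) of the longest segment that contains
    at most k scores less than or equal to v."""
    best_start, best_len = 0, 0
    left = 0
    low = 0  # number of scores <= v inside the current window
    for right, score in enumerate(scores):
        if score <= v:
            low += 1
        while low > k:  # shrink from the left until the window is valid
            if scores[left] <= v:
                low -= 1
            left += 1
        length = right - left + 1
        if length > best_len:
            best_start, best_len = left, length
    return best_start, best_len


scores = [30, 30, 10, 30, 30, 30, 5, 8, 30]
print(longest_range(scores, k=1, v=10))
print(longest_range(scores, k=0, v=10))
```

With k = 0 the function reduces to finding the longest run of scores strictly above v. Larger values of k tolerate a few isolated low-quality positions inside the range.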


A free service that provides access to RNA-Seq and ChIP-Seq analysis tools for studying infectious diseases. The site makes available thousands of pre-indexed genomes, their annotations, and the ability to stream results to the bioinformatics resources VectorBase, EuPathDB, and PATRIC. The site also provides a combination of experimental data and metadata, examples of pre-computed analysis, step-by-step guides, and a user interface designed to enable both novice and experienced users of RNA-Seq data.


A Genotyping-by-sequencing (GBS) bioinformatics pipeline designed to provide highly accurate genotyping. Fast-GBS is capable of handling data from different sequencing platforms and can detect different kinds of variants (Single Nucleotide Polymorphisms (SNPs), Multiple Nucleotide Polymorphisms (MNPs), and Indels). This pipeline was benchmarked based upon a large-scale, species-wide analysis of soybean, barley and potato. It is easy to use with various species, in different contexts, and provides an analysis platform that can be run with different types of sequencing data and modest computational resources.


A workflow system for laboratories with the need to analyze data from multiple NGS projects at a time. QuickNGS takes advantage of parallel computing resources, a comprehensive back-end database, and a careful selection of previously published algorithmic approaches to build fully automated data analysis workflows. QuickNGS considerably reduces the barriers that still limit the usability of the powerful NGS technology and finally decreases the time to be spent before proceeding to further downstream analysis and interpretation of the data.

Pheniqs / PHilology ENcoder wIth Quality Statistics

Demultiplexes sequences and analyzes quality. Pheniqs introduces a Phred-adjusted maximum likelihood decoder that consults base calling quality scores and estimates the probability of a barcode decoding error. It was evaluated on real and semi-synthetic data, and it achieves greater accuracy by correctly reflecting quality measurements. This application can report estimates of demultiplexing error probabilities in standard output formats based on read quality scores emitted by all major sequencing platforms.
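The general idea of a quality-aware maximum likelihood barcode decoder can be sketched in Python as below. This is a simplified illustration and should not be read as Pheniqs's actual algorithm: it assumes a uniform prior over barcodes, independent base errors, and that a miscalled base is equally likely to be any of the three other nucleotides. All names are hypothetical.

```python
def phred_to_error(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def decode_barcode(observed, quals, barcodes):
    """Pick the most likely barcode and estimate the decoding error.

    observed: barcode sequence read from the instrument
    quals:    Phred scores, one per observed base
    barcodes: dict mapping sample name -> expected barcode sequence
    Returns (best sample name, estimated probability the call is wrong).
    """
    likelihoods = {}
    for name, expected in barcodes.items():
        likelihood = 1.0
        for obs, exp, q in zip(observed, expected, quals):
            p_err = phred_to_error(q)
            # A match means no error; a mismatch means one specific
            # wrong base out of three was called.
            likelihood *= (1 - p_err) if obs == exp else p_err / 3
        likelihoods[name] = likelihood
    total = sum(likelihoods.values())
    best = max(likelihoods, key=likelihoods.get)
    posterior = likelihoods[best] / total
    return best, 1 - posterior


barcodes = {"sample_A": "ACGT", "sample_B": "TGCA"}
best, error = decode_barcode("ACGA", [30, 30, 30, 10], barcodes)
print(best, error)
```

Because the mismatch in this example sits on a low-quality base (Q10), it costs the candidate little likelihood, which is how quality scores shift the decision compared with plain mismatch counting.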


Improves basepair accuracy of long reads. Hercules is an alignment-based hybrid error correction algorithm, using profile hidden Markov models (dubbed profile HMM or pHMM), that corrects erroneous long reads using short but accurate Illumina data. The software models each long and erroneous read as a template profile HMM. It was tested on the following datasets: (i) two BAC clones of complex regions of human chromosome 17, namely, CH17-157L1 and CH17-227A2, and (ii) human brain cerebellum polyA RNA-seq data.