A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. Source text: Kleftogiannis et al., 2013.
A5
A pipeline for assembling DNA sequence data generated on the Illumina sequencing platform.
A5-miseq
Produces high quality microbial genome assemblies on a laptop computer without any parameter tuning. A5-miseq does this by automating the process of adapter trimming, quality filtering, error correction, contig and scaffold generation, and detection…
ABruijn assembler
Assembles long error-prone reads using de Bruijn graphs. While the running time of overlap-layout-consensus (OLC) assemblers is dominated by the overlap detection step, the running time of the ABruijn assembler is dominated by the polishing step,…
ABySS
A de novo, parallel, paired-end sequence assembler that is designed for short reads.
Allora
An Overlap-Layout-Consensus-based algorithm designed to assemble PacBio long reads. It is best suited to microbial and other small genomes (<10 Mb).
ALLPATHS-LG
A whole‐genome shotgun assembler that can generate high‐quality genome assemblies using short reads (~100bp) such as those produced by the new generation of sequencers.
AMOS
A collection of tools and class interfaces for the assembly of DNA reads.
Anchored Assembly
A novel analysis pipeline that accurately detects and maps variations that are often missed by standard analysis algorithms.
AutoAssemblyD
A graphical tool for genome assembly submission and remote management by multiple assemblers through XML templates. AutoAssemblyD also facilitates assembly on remote devices through distributed programming. It comprises three interfaces: template…
Celera assembler
A de novo whole-genome shotgun (WGS) DNA sequence assembler.
CG-Pipeline
A tool for assembling genome sequence data and running feature prediction and annotation tools on the assembly.
ChromasPro
Assembles data from 454 and Illumina next-generation sequencers, handling up to 100,000 sequences if 2 GB of RAM is available.
CloudBrush
A de novo next-generation genomic sequence assembler based on a string graph and the MapReduce cloud computing framework.
CongrPE
A de novo assembly algorithm for Next-Generation Sequencing technology.
Connecting Overlapped Pair-End
COPE
An efficient tool to connect overlapping paired-end reads using k-mer frequencies.
Contrail
A Hadoop-based genome assembler for assembling large genomes in the cloud.
Denovo Solid Pipeline
DSP
Pipeline for small genome assembly using SOLiD sequencing technology.
dipSPAdes
A genome assembler, based on SPAdes, designed specifically for highly polymorphic diploid genomes. dipSPAdes takes advantage of divergence between haplomes in repetitive genome regions to resolve them and construct longer contigs. It produces…
DISCOVAR
A variant caller and small genome assembler. The heart of DISCOVAR is a de novo genome assembler, one that is accurate enough to produce assemblies that can be used for variant calling given a reference sequence. DISCOVAR can also generate de novo…
DISCOVAR de novo
A large (and small) de novo genome assembler. DISCOVAR de novo quickly generates highly accurate and complete assemblies using the same single library data. It requires reads from only a single PCR-free library, and has tested well on relatively…
DNA Baser Sequence Assembler
Bioinformatics software for manual and automatic DNA sequence assembly, DNA sequence analysis, contig editing, file format conversion and mutation detection.
Edena
A method that automatically determines suitable overlap cutoffs according to the contextual coverage, thus reducing the need for manual parameterization.
EPGA
An algorithm that extracts paths from a de Bruijn graph for genome assembly. EPGA uses a score function to evaluate extension candidates based on the distributions of reads and insert size. The distribution of reads can solve problems caused by…
EULER-SR
The assembly package contains a suite of programs for correcting errors in short reads and assembling them. EULER-SR may take as input classical Sanger reads, 454 sequences, and Illumina reads.
Geneious
A powerful and comprehensive suite of molecular biology tools that provides visual sequence alignment and editing, sequence assembly with a clear graphical interface, comprehensive molecular cloning and phylogenetic analysis. First released in 2005,…
Genome Assembly & Analysis Tool Box
GATB
An open-source library dedicated to genome assembly and analysis, intended to speed up the development of efficient software.
Genome Assembly by Maximum Likelihood
GAML
Allows systematic combination of diverse sequencing datasets into a single assembly. We achieve this by searching for an assembly with the maximum likelihood in a probabilistic model capturing error rate, insert lengths, and other characteristics of…
GenomeABC
A web server for evaluating the performance of genome assemblers. GenomeABC can be used to evaluate the performance of assemblers on real, hypothetical and mutated genomes. (a) The whole genome is shattered into short reads by sequencers, (b) assemblers…
Gossamer
An application for the de novo assembly of genomes from fragments of DNA that specifically attacks the question of scalability.
gsAssembler
A software package for de novo DNA sequence assembly.
Hecate
A series of distributed algorithms on the MapReduce framework for short sequence assembly.
Hierarchical Genome Assembly
HGA
A methodology designed to take advantage of high coverage by independently assembling disjoint subsets of reads, combining the assemblies of the subsets, and finally re-assembling the combined contigs along with the original reads. By using HGA…
Hierarchical Genome-Assembly Process
HGAP
A program for high-quality de novo microbial genome assemblies using only a single, long-insert shotgun DNA library in conjunction with Single Molecule, Real-Time (SMRT) DNA sequencing. The process itself relies on a succession of steps to generate…
hybridSPAdes
An algorithm for assembling short and long reads. hybridSPAdes generates accurate assemblies (even in projects with relatively low coverage by long reads), thus reducing the overall cost of genome sequencing. Moreover, hybridSPAdes opens a…
HyDA-Vista
A genome assembler that uses homology information to choose a value of k for each read prior to the de Bruijn graph construction. The chosen k is optimal if there are no sequencing errors and the coverage is sufficient. HyDA-Vista achieves superior…
iMetAMOS
An automated ensemble assembly pipeline.
in silico Whole Genome Sequencer and Analyzer
iWGS
An automated pipeline for guiding the choice of an appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control,…
IonGAP
A publicly available web platform designed for the analysis of whole bacterial genomes by using Ion Torrent sequence data. Besides assembly, IonGAP integrates a variety of comparative genomics, annotation and bacterial classification routines, based…
Iterative Virus Assembler
IVA
A de novo assembler designed to assemble virus genomes that have no repeat sequences, using Illumina read pairs sequenced from mixed populations at extremely high and variable depth. IVA produces significantly higher quality assemblies than existing…
JR-Assembler
An assembler for the de novo assembly of large genomes using short sequence reads via jumping extension and read remapping.
Kmer Range EstimATION
KREATION
An automatic method for limiting the number of kmer values without a significant loss in assembly quality but with savings in assembly time. This is a step forward to making multi-kmer methods more reliable and easier to use. KREATION is based on…
Konnector
A scalable de novo assembler for paired-end reads. Konnector fills in the nucleotides of the sequence gap between read pairs by navigating a de Bruijn graph (DBG) represented by a Bloom filter.
LOCAS
Software for assembling short reads from next-generation sequencing technologies at low coverage.
MaSuRCA
Whole-genome assembly software that combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches.
Meraculous
An algorithm for whole genome assembly of deep paired-end short reads. Meraculous relies on an efficient and conservative traversal of a subgraph of the k-mer (de Bruijn) graph of oligonucleotides with unique high quality extensions in the dataset.…
Mimicking Intelligent Read Assembly
MIRA
Sequence assembler and mapper for whole genome shotgun and EST/RNASeq sequencing data.
Minia
A short-read assembler based on a de Bruijn graph, capable of assembling a human genome on a desktop computer in a day.
Minimap/miniasm
A mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling single molecule real-time (SMRT) and Oxford Nanopore Technologies (ONT) reads without an error correction stage. Miniasm implements the ‘O’ and ‘L’…
Nanopore Synthetic-long
NaS
A hybrid approach developed to take advantage of data generated using the MinION device. We combine Illumina and Oxford Nanopore technologies to produce NaS reads of up to 60 kb that aligned with no error to the reference genome and spanned repetitive…
NeatFreq
A software tool that reduces a data set to more uniform coverage by clustering and selecting from reads binned by their median kmer frequency (RMKF) and uniqueness. The normalization of deep coverage spikes, which would otherwise inhibit consensus…
Orione
A Galaxy-based framework consisting of publicly available research software and specifically designed pipelines to build complex, reproducible workflows for next-generation sequencing microbiology data analysis. Enabling microbiology researchers to…
PacBio Corrected Reads pipeline
PBcR pipeline
Enables the use of the long-read sequences produced by the PacBio RS instrument.
Paired-End Reads Guided Assembler
PERGA
A de novo paired-end read assembler. PERGA can generate large and accurate assemblies using a greedy-like prediction strategy to handle branches and errors to give much better extensions. By using a look-ahead approach, PERGA distinguishes…
Paired-Read Iterative Contig Extension
PRICE
A de novo genome assembler implemented in C++.
PASHA
A parallel short read assembler for large genomes using de Bruijn graphs.
PE-Assembler
A method that eschews the traditional graph-based approach in favor of a simple 3′ extension approach that has the potential to be massively parallelized.
Platanus
A de novo sequence assembler that can reconstruct genomic sequences of highly heterozygous diploids from massively parallel shotgun sequencing data.
PoreSeq
An open source program and Python library for de novo sequencing, consensus and variant calling on data from Oxford Nanopore Technologies’ MinION platform. Features include: de novo error correction without reference using overlap alignment;…
QSRA
A quality-value guided de novo short read assembler.
RAMPART
A pipeline for creating multiple assemblies and a framework for analysing and comparing them. RAMPART supports a variety of third-party tools for assembling, scaffolding and read error correction. After assembling contigs using different tools and…
Ray
Assembles reads obtained with new sequencing technologies (Illumina, 454, SOLiD) using MPI 2.2.
Redundans
A pipeline that specifically deals with the assembly of heterozygous genomes by introducing a step to recognise and selectively remove alternative heterozygous contigs. Redundans consists of three main steps: (i) detection and selective removal of…
SHARCGS
A DNA assembly program designed for de novo assembly of 25-40mer input fragments and deep sequence coverage.
SHORTY
Targeted at de novo assembly of microreads with mate pair information and sequencing errors.
Small World Asynchronous Parallel model-Assembler
SWAP-Assembler
A highly scalable assembler for processing massive sequencing data using thousands of cores, where SWAP is an acronym for Small World Asynchronous Parallel model. In SWAP-Assembler, two fundamental improvements are crucial for its scalability.…
SMRT-Analysis
Open-source bioinformatics software suite for sequence alignment, assembly, variant detection, and base modification discovery.
SOAPdenovo
A short-read assembly method that can build a de novo draft assembly for human-sized genomes. SOAPdenovo is specially designed to assemble Illumina GA short reads. It creates new opportunities for building reference sequences and carrying out…
SOLiD Assembler TRAnslation Program
SATRAP
A computer program designed to efficiently translate de novo assembled color-space sequences into a base-space format. The program was tested and validated using simulated and real transcriptomic data; its modularity allows an easy integration into…
SparseAssembler
Sparse k-mer Graph for Memory Efficient de novo Genome Assembly.
SSAKE
Designed to help leverage the information from short sequence reads by stringently clustering them into contigs that can be used to characterize novel sequencing targets.
StriDe
An assembler that has advantages of both string and de Bruijn graphs. First, the reads are decomposed adaptively only in error-prone regions. Second, each paired-end read is extended into a long read directly using an FM-index. The decomposed and…
String Graph Assembler
SGA
A de novo genome assembler based on the concept of string graphs. The major goal of SGA is to be very memory efficient, which is achieved by using a compressed representation of DNA sequence reads. It calculates per-base error rates, paired-end…
SUTTA
A de novo DNA sequence assembler based on global search methods in order to contain the complexity of the assembly problem. SUTTA’s binaries are freely available to non-profit institutions for research and educational purposes.
Taipan
A fast hybrid short-read assembly tool.
Targeted Assembler of [short] Sequence Reads
TASR
A genomics application that allows hypothesis-based interrogation of genomic regions (sequence targets) of interest. It only considers NGS reads for assembly that have overlap potential to input sequence targets.
Telescoper
An algorithm that iteratively extends long paths through a series of read-overlap graphs and evaluates them based on a statistical framework.
VCAKE
A genetic sequence assembler capable of assembling millions of small nucleotide reads even in the presence of sequencing error.
Velvet
A de novo genomic assembler specially designed for short read sequencing technologies.
Velvet Assembler Graphical User Environment
VAGUE
A multi-platform graphical front-end for Velvet. VAGUE aims to make sequence assembly accessible to a wider audience and to facilitate better usage amongst existing users of Velvet.
VelvetOptimiser
A multi-threaded Perl script for automatically optimising the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet de novo sequence assembler.
VICUNA
A de novo assembly program targeting populations with high mutation rates.
VirAmp
A web-based semi-de novo fast virus genome assembly pipeline designed for extremely high coverage NGS data. VirAmp is a collection of existing tools, combined into a single Galaxy interface. Users without further computational knowledge can easily…
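Several tools in this list (for example EPGA, Konnector, Minia and PASHA) are described as working on de Bruijn graphs. The toy sketch below shows the general idea: reads are split into k-mers, each k-mer becomes an edge between its (k-1)-mer prefix and suffix, and a contig is read off by following unambiguous edges. It is not the implementation of any tool listed here. The function names `build_de_bruijn` and `walk_unambiguous` and the example reads are hypothetical, and the walk ignores sequencing errors, reverse complements and in-degree checks, all of which real assemblers must handle.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the distinct (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            prefix, suffix = kmer[:-1], kmer[1:]
            if suffix not in graph[prefix]:
                graph[prefix].append(suffix)
    return dict(graph)


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in seen:  # stop instead of looping forever on a cycle
            break
        seen.add(node)
        contig += node[-1]
    return contig


# Three overlapping toy reads (hypothetical data, error-free).
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
print(walk_unambiguous(graph, "ATG"))  # prints ATGGCGTGCA
```

Memory use in this representation grows with the number of distinct k-mers, which is one reason several entries above emphasise compact graph encodings such as Bloom filters or sparse k-mer graphs.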