Provides tools and class interfaces for the assembly of DNA reads. AMOS includes modular assembly pipelines, as well as tools for overlapping, consensus generation, contigging, and assembly manipulation. Users can modify the AMOS pipeline configuration file to add processing steps. The software also includes a number of conversion utilities that read data from a variety of input sources and write it in commonly used assembly formats.
Allows automated improvement of gene structures in Arabidopsis thaliana. PASA has been used in eukaryotic genome annotation projects, including rice, Aspergillus species, Plasmodium falciparum, Schistosoma mansoni, Aedes aegypti, mouse, and human, among others. This tool can recognize and organize splicing variations supported by the transcript alignments. It can clean the transcripts, validate perfect alignments, or proceed to automatic genome annotation.
Merges multiple assemblies of a genome into a single superior sequence. We apply it to the four genomes from the Assemblathon competitions and show it consistently and substantially improves the contiguity and quality of each assembly. We also develop guidelines for meta-assembly by systematically evaluating 120 permutations of merging the top 5 assemblies of the first Assemblathon competition.
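The figure of 120 is consistent with the number of ways to order five assemblies (5! = 120), assuming each permutation of the full top-five set was evaluated as a distinct merge order. A minimal sketch of enumerating such orders, with hypothetical assembly names:

```python
from itertools import permutations

# Hypothetical labels standing in for the top five assemblies.
assemblies = ["asm1", "asm2", "asm3", "asm4", "asm5"]

# Each permutation is one candidate merge order.
merge_orders = list(permutations(assemblies))

print(len(merge_orders))  # 120
```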
A meta-assembler designed for ultra-deep sequencing data. Slicembler takes advantage of the whole dataset and significantly improves the final quality of the assembly. Slicembler partitions the input data into optimal-sized “slices” and uses a standard assembly tool (e.g., Velvet, SPAdes, IDBA, Ray) to assemble each slice individually. Slicembler uses majority voting among the individual assemblies to identify long contigs that can be merged into the consensus assembly. It extracts high-quality contigs from the slice assemblies and prevents contigs containing mis-joins and calling errors from being included in the final assembly.
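The slice-and-vote idea described for Slicembler can be illustrated with a toy sketch. This is not Slicembler's implementation: the fixed slice size, the exact-substring test for "support", and the function names are all assumptions made for illustration, and the per-slice assembly step is left out.

```python
from typing import List, Sequence


def partition_into_slices(reads: Sequence[str], slice_size: int) -> List[List[str]]:
    """Split reads into consecutive slices of at most slice_size reads."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [list(reads[i:i + slice_size]) for i in range(0, len(reads), slice_size)]


def majority_contigs(slice_assemblies: List[List[str]]) -> List[str]:
    """Keep contigs that occur (as a substring of some contig) in more
    than half of the slice assemblies. Exact matching is a simplification."""
    n = len(slice_assemblies)
    candidates = {contig for asm in slice_assemblies for contig in asm}
    kept = []
    for contig in candidates:
        support = sum(
            any(contig in other for other in asm) for asm in slice_assemblies
        )
        if 2 * support > n:
            kept.append(contig)
    # Longest contigs first, ties broken alphabetically for determinism.
    return sorted(kept, key=lambda c: (-len(c), c))


slices = partition_into_slices(["r1", "r2", "r3", "r4", "r5"], slice_size=2)
voted = majority_contigs([
    ["ACGTACGT", "TTTT"],
    ["ACGTACGT"],
    ["GGACGTACGTCC", "CCCC"],
])
```

Here only `ACGTACGT` is supported by all three toy assemblies, so it is the only contig kept.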
Improves contiguity of genome assemblies based on long molecule sequences. quickmerge merges assemblies to produce a more contiguous assembly. This software uses information from assemblies built with Pacific Biosciences or Oxford Nanopore long reads or Illumina short reads to enhance the contiguity of an assembly generated with long reads alone. It splices and merges contigs without introducing any new assembly errors. It does not include spurious sequences or large-scale misassemblies.
Constructs an accordance graph to capture the mapping information between the target and query assemblies. Based on the accordance graph, the contigs or scaffolds of the target assembly can be extended, merged or bridged together. Extra constraints, including gap sizes, mate pairs, scaffold order and orientation, are explored to enforce those accordance operations in the correct context.
Performs assembly reconciliation using optical maps. Novo&Stitch aims to enhance the contiguity of de novo genome assemblies. It can discover overlaps between contigs and drive the stitching process. To detect these overlaps, the tool exploits the alignments of contigs from multiple input assemblies to an optical map. It uses a conflict graph built on an undirected hypergraph to reduce false alignments.
Integrates the assemblies into a hybrid set of contigs, resulting in assemblies of superior contiguity and accuracy, compared with the assemblies generated by the state-of-the-art assemblers and the hybrid assemblies merged by existing tools. This tool is implemented in Python and requires MUMmer and BLAST+ to be installed on the local machine. A user supplies a set of contigs from at least three assemblers in FASTA format to CISA to obtain integrated contigs.
Merges two or more assemblies to enhance contiguity and correctness. GAM-NGS does not rely on global alignment: regions of the two assemblies representing the same genomic locus (called blocks) are identified through read alignments and stored in a weighted graph. The merging phase is carried out with the help of this weighted graph, which allows an optimal resolution of local problematic regions.
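As a rough illustration of linking two assemblies through read alignments rather than global alignment, the sketch below counts reads that map to a contig in each assembly and uses the counts as edge weights. It is a simplification, not the GAM-NGS algorithm: it works at whole-contig rather than block level, and the input dictionaries and function name are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple


def shared_read_weights(
    aln_a: Dict[str, str], aln_b: Dict[str, str]
) -> Counter:
    """Map (contig in A, contig in B) to the number of reads aligned to both.

    aln_a and aln_b map a read name to the contig it aligns to in
    assembly A and assembly B respectively (one alignment per read).
    """
    weights: Counter = Counter()
    for read, contig_a in aln_a.items():
        contig_b = aln_b.get(read)
        if contig_b is not None:
            weights[(contig_a, contig_b)] += 1
    return weights


weights = shared_read_weights(
    {"r1": "A1", "r2": "A1", "r3": "A2"},
    {"r1": "B1", "r2": "B1", "r3": "B1", "r4": "B2"},
)
```

In this toy input, `A1` and `B1` share two reads and `A2` and `B1` share one, so the heavier edge marks the more strongly supported correspondence.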
Mixes two or more draft assemblies without relying on a reference genome, with the goal of reducing contig fragmentation and thus speeding up genome finishing. Mix builds an extension graph where vertices represent extremities of contigs and edges represent existing alignments between these extremities. These alignment edges are used for contig extension. The resulting output assembly corresponds to a set of paths in the extension graph that maximizes the cumulative contig length.
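The path-selection idea can be sketched on a much simpler graph than the one Mix builds. The example below assumes one vertex per contig (not per contig extremity), an acyclic graph, and no correction for overlap length, and it returns a single best path rather than a set of paths; the function name and inputs are hypothetical.

```python
from functools import lru_cache
from typing import Dict, List, Tuple


def best_extension_path(
    lengths: Dict[str, int], edges: Dict[str, List[str]]
) -> Tuple[int, List[str]]:
    """Return the path with the largest summed contig length in a DAG.

    lengths maps contig name to its length; edges maps a contig to the
    contigs that can extend it. Assumes at least one contig and no cycles.
    """

    @lru_cache(maxsize=None)
    def best_from(node: str) -> Tuple[int, Tuple[str, ...]]:
        best_len, best_path = 0, ()
        for nxt in edges.get(node, []):
            cand_len, cand_path = best_from(nxt)
            if cand_len > best_len:
                best_len, best_path = cand_len, cand_path
        return lengths[node] + best_len, (node,) + best_path

    total, path = max((best_from(n) for n in lengths), key=lambda t: t[0])
    return total, list(path)


total, path = best_extension_path(
    {"a": 100, "b": 50, "c": 200, "d": 30},
    {"a": ["b", "c"], "b": ["d"], "c": ["d"]},
)
print(total, path)  # 330 ['a', 'c', 'd']
```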
Reduces the number of misassemblies in a genome sequence assembly. Tigmint uses linked reads from 10x Genomics Chromium to perform this analysis. It works in several steps: (i) it aligns reads to the assembly and infers the extents of the large DNA molecules from these alignments, and (ii) it searches for atypical drops in physical molecule coverage.
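Step (ii) can be pictured with a small sketch that computes per-base molecule coverage from inferred molecule extents and reports regions where it falls below a threshold. This is an illustration only, not Tigmint's code: the half-open `(start, end)` intervals, the fixed `min_cov` threshold, and the function names are assumptions.

```python
from typing import List, Sequence, Tuple


def molecule_coverage(length: int, molecules: Sequence[Tuple[int, int]]) -> List[int]:
    """Per-base count of molecules spanning each position of a contig.

    Each molecule is a half-open interval (start, end) in contig coordinates.
    """
    diff = [0] * (length + 1)
    for start, end in molecules:
        if not 0 <= start <= end <= length:
            raise ValueError(f"molecule {(start, end)} outside contig")
        diff[start] += 1
        diff[end] -= 1
    coverage, running = [], 0
    for i in range(length):
        running += diff[i]
        coverage.append(running)
    return coverage


def low_coverage_regions(coverage: Sequence[int], min_cov: int) -> List[Tuple[int, int]]:
    """Half-open regions where coverage is below min_cov."""
    regions, start = [], None
    for i, cov in enumerate(coverage):
        if cov < min_cov and start is None:
            start = i
        elif cov >= min_cov and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(coverage)))
    return regions


cov = molecule_coverage(10, [(0, 5), (1, 5), (6, 10), (6, 9)])
drops = low_coverage_regions(cov, min_cov=2)
print(drops)  # [(0, 1), (5, 6), (9, 10)]
```

In the toy data, position 5 is spanned by no molecule at all, which is the kind of interior drop that would flag a candidate misassembly; the low-coverage regions at the contig ends are expected and would need to be ignored in practice.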
A software tool for comparative analysis and merging of two or more given scaffold assemblies. CAMSA takes as input two or more assemblies of the same set of scaffolds and generates a comprehensive comparative report for them. The report not only contains multiple numerical characteristics for the input assemblies, but also provides an interactive framework for their visual comparison and analysis. CAMSA also computes a merged assembly, combining the input assemblies into a more comprehensive one, which resolves conflicts and determines the orientation of unoriented scaffolds in the most confident way.
Aims to produce an improved assembly in the presence of a sequence belonging to a closely related organism. The idea is to combine de novo and reference-guided assembly in order to obtain enhanced results.
A hybrid sequencing technology assembler. Zorro merges two sets of pre-assembled contigs into a more contiguous and consistent assembly. The main feature of Zorro is the treatment before and after assembly to avoid errors. Zorro takes as input two contig FASTA files (representing assembled contigs from a whole genome assembly) and one FASTA file containing some of the reads used for assembly.
An assembly integrator that makes use of all available data, i.e. multiple de novo assemblies and mappings against multiple related genomes, by optimizing a weighted combination of criteria. The MAIA approach has two main advantages. First, multiple known related genomes can be used simultaneously in the assembly process. Second, different NGS sources can be assembled with specific de novo assemblers, to be integrated afterwards with MAIA.