SWAP-Assembler is a highly scalable assembler for processing massive sequencing data using thousands of cores; SWAP is an acronym for Small World Asynchronous Parallel model. Two fundamental improvements are crucial for its scalability. First, the multi-step bi-directed graph (MSG) is presented as a comprehensive mathematical abstraction for genome assembly; with MSG, the computational interdependence is resolved. Second, the SWAP computational framework triggers the parallel computation of all operations without interference. Two additional steps improve the quality of contigs: graph cleaning, which adopts the traditional methods of removing k-molecules and edges with low frequency, and contig extension, which resolves special edges and some cross nodes with a heuristic method. Compared with several other assemblers, it showed very good performance in terms of scalability and contig quality.
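The graph-cleaning step described above (dropping low-frequency k-molecules) can be illustrated with a minimal Python sketch. This is not SWAP-Assembler's code: it runs on a single core, the function names `canonical`, `count_kmers`, and `remove_low_frequency` are hypothetical, and it assumes a k-molecule can be represented by the lexicographically smaller of a k-mer and its reverse complement. The threshold `min_count` is likewise an illustrative parameter, not a documented option of the tool.

```python
from collections import Counter

# Translation table for complementing DNA bases (assumes only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return one representative for a k-mer and its reverse complement.

    Assumption: the pair is identified by its lexicographically smaller member.
    """
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers over all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[canonical(read[i:i + k])] += 1
    return counts


def remove_low_frequency(counts, min_count):
    """Keep only k-mers seen at least min_count times.

    Rare k-mers are treated as likely sequencing errors (illustrative rule).
    """
    return {kmer: c for kmer, c in counts.items() if c >= min_count}


# Toy input: the third read ends differently from the first two.
reads = ["ACGTAC", "ACGTAC", "ACGTTT"]
counts = count_kmers(reads, k=4)
kept = remove_low_frequency(counts, min_count=2)
print(sorted(kept.items()))
```

In this toy input, the k-mers that occur only in the divergent tail of the third read fall below the threshold and are dropped, while those shared by several reads are kept.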

Software type: Standalone
Interface: Command line interface
Restrictions to use: None
Operating system: Unix/Linux
Computer skills: Advanced
Version: SWAP-Assembler version 0.4

Institution(s)

Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 518055 Shenzhen, P.R. China; Institute of Computing Technology, Chinese Academy of Sciences, 100190 Beijing, P.R. China; University of Chinese Academy of Sciences, 100049 Beijing, P.R. China; Beijing Genomics Institute, 518083 Shenzhen, P.R. China; Mathematics and Computer Science Division, Argonne National Laboratory, 60439-4844 USA

Funding source(s)

National Science Foundation of China under grant No. 11204342, the Science Technology and Innovation Committee of Shenzhen Municipality under grant No. JCYJ20120615140912201, and Shenzhen Peacock Plan under grant No. KQCX20130628112914299

  • (Meng et al., 2014) SWAP-Assembler: scalable and efficient genome assembly towards thousands of cores. BMC Bioinformatics.
    PMID: 25253533
  • (Magoc et al., 2013) GAGE-B: an evaluation of genome assemblers for bacterial organisms. Bioinformatics.
    PMID: 23665771
  • (Miller et al., 2010) Assembly algorithms for next-generation sequencing data. Genomics.
    PMID: 20211242
  • (Narzisi and Mishra, 2011) Comparing de novo genome assembly: the long and short of it. PLoS ONE.
    PMID: 21559467
  • (Henson et al., 2012) Next-generation sequencing and large genome assemblies. Pharmacogenomics.
    PMID: 22676195
  • (Kleftogiannis et al., 2013) Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures. PLoS ONE.
    PMID: 24086547
  • (Nagarajan and Pop, 2013) Sequence assembly demystified. Nature Reviews Genetics.
    PMID: 23358380
  • (Bradnam et al., 2013) Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. GigaScience.
    PMID: 23870653
  • (Salzberg et al., 2012) GAGE: A critical evaluation of genome assemblies and assembly algorithms. Genome Research.
    PMID: 22147368
  • (Utturkar et al., 2014) Evaluation and validation of de novo and hybrid assembly techniques to derive high-quality genome sequences. Bioinformatics.
    PMID: 24930142
  • (Alkan et al., 2011) Limitations of next-generation genome sequence assembly. Nature Methods.
    PMID: 21102452
  • (Love et al., 2016) Evaluation of DISCOVAR de novo using a mosquito sample for cost-effective short-read genome assembly. BMC Genomics.
    PMID: 26944054
