EPGA is an algorithm that extracts paths from a De Bruijn graph for genome assembly. It uses a score function to evaluate extension candidates based on the distributions of reads and insert size. The read distribution helps resolve problems caused by sequencing errors and short repetitive regions, while assessing the variation in the insert-size distribution helps resolve problems introduced by some complex repetitive regions. EPGA2 updates several modules of EPGA to improve memory efficiency in genome assembly.
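
To make the idea of path extraction concrete, here is a minimal Python sketch of building a De Bruijn graph from reads and greedily extending a path through it. It is an illustration only and not EPGA's implementation: the function names `build_de_bruijn` and `extend_path`, the `min_ratio` parameter, and the toy scoring rule (k-mer support of the best candidate versus the runner-up) are all assumptions made for this example. EPGA's actual score function also uses the insert-size distribution, which this sketch omits.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to its successor (k-1)-mers, counting supporting k-mers."""
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def extend_path(graph, start, min_ratio=2.0):
    """Greedily extend a contig from `start` until a dead end, loop, or ambiguity."""
    contig, node, seen = start, start, {start}
    while True:
        # Rank extension candidates by how many k-mers support each edge.
        candidates = sorted(graph.get(node, {}).items(),
                            key=lambda kv: kv[1], reverse=True)
        if not candidates:
            break  # dead end: no outgoing edge
        best, best_count = candidates[0]
        # Hypothetical toy score: the best candidate must clearly dominate
        # the runner-up, otherwise the branch is treated as unresolved.
        if len(candidates) > 1 and best_count < min_ratio * candidates[1][1]:
            break
        if best in seen:
            break  # avoid walking around a cycle forever
        contig += best[-1]
        seen.add(best)
        node = best
    return contig


# Three overlapping reads plus one read carrying a sequencing error.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT", "GGCGAGC"]
graph = build_de_bruijn(reads, k=4)
contig = extend_path(graph, "ATG")
```

In this toy input the erroneous read creates a second branch at node `GCG`, but the correct successor has twice the k-mer support, so the walk continues along it. Repeats longer than the k-mer size would still stop this sketch, which is the kind of case where paired-end insert-size information becomes useful.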

Software type:
Command line interface
Restrictions to use:
Operating system:
Programming languages:
Computer skills:
EPGA version 2

  • Jianxin Wang <jxwang at mail.csu.edu.cn>


School of Information Science and Engineering, Central South University, ChangSha 410083, China; College of Computer Science and Technology, Henan Polytechnic University, JiaoZuo, 454000, China; Division of Biomedical Engineering, University of Saskatchewan, Saskatchewan S7N 5A9, Canada; Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA

Funding source(s)

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61232001, 61420106009, and 61379108, and by the Program for New Century Excellent Talents in University under Grant NCET-12-0547.

  • (Luo et al., 2015) EPGA2: memory-efficient de novo assembler. Bioinformatics.
    PMID: 26315905
  • (Luo et al., 2014) EPGA: de novo assembly using the distributions of reads and insert size. Bioinformatics.
    PMID: 25406329
  • (Magoc et al., 2013) GAGE-B: an evaluation of genome assemblers for bacterial organisms. Bioinformatics.
    PMID: 23665771
  • (Miller et al., 2010) Assembly algorithms for next-generation sequencing data. Genomics.
    PMID: 20211242
  • (Narzisi and Mishra, 2011) Comparing de novo genome assembly: the long and short of it. PloS one.
    PMID: 21559467
  • (Henson et al., 2012) Next-generation sequencing and large genome assemblies. Pharmacogenomics.
    PMID: 22676195
  • (Kleftogiannis et al., 2013) Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures. PloS one.
    PMID: 24086547
  • (Nagarajan and Pop, 2013) Sequence assembly demystified. Nature reviews Genetics.
    PMID: 23358380
  • (Bradnam et al., 2013) Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. GigaScience.
    PMID: 23870653
  • (Salzberg et al., 2012) GAGE: A critical evaluation of genome assemblies and assembly algorithms. Genome research.
    PMID: 22147368
  • (Utturkar et al., 2014) Evaluation and validation of de novo and hybrid assembly techniques to derive high-quality genome sequences. Bioinformatics.
    PMID: 24930142
  • (Alkan et al., 2011) Limitations of next-generation genome sequence assembly. Nature methods.
    PMID: 21102452
  • (Love et al., 2016) Evaluation of DISCOVAR de novo using a mosquito sample for cost-effective short-read genome assembly. BMC genomics.
    PMID: 26944054
