PGAP specifications


Unique identifier: OMICS_10860
Alternative names: Prokaryotic Genome Annotation Pipeline, Prokaryotic Genome Automatic Annotation Pipeline, PGAAP
Interface: Web user interface
Restrictions to use: None
Input data: PGAP accepts an assembly (either draft or complete) together with a predefined NCBI Taxonomy ID that defines the genetic code of the organism. PGAP also accepts a predetermined clade identifier that matches the genome in question to a species-specific clade.
Computer skills: Basic
Stability: Stable
Maintained: Yes
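
The input requirements listed under "Input data" can be pictured as a small record with a completeness check. The sketch below is only an illustration under stated assumptions: the class name, field names, and the validate helper are hypothetical and are not part of PGAP or any NCBI interface, and the taxonomy ID value is an arbitrary example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical names throughout: this is not a PGAP data type or API.
ASSEMBLY_LEVELS = ("draft", "complete")


@dataclass(frozen=True)
class AnnotationInput:
    assembly_path: str               # file holding the assembled sequences
    assembly_level: str              # "draft" or "complete"
    taxonomy_id: int                 # NCBI Taxonomy ID; determines the genetic code
    clade_id: Optional[str] = None   # optional predetermined clade identifier


def validate(inp: AnnotationInput) -> List[str]:
    """Return a list of problems; an empty list means the record looks complete."""
    problems = []
    if not inp.assembly_path:
        problems.append("assembly path is empty")
    if inp.assembly_level not in ASSEMBLY_LEVELS:
        problems.append(f"assembly level must be one of {ASSEMBLY_LEVELS}")
    if inp.taxonomy_id <= 0:
        problems.append("taxonomy ID must be a positive integer")
    if inp.clade_id is not None and not inp.clade_id.strip():
        problems.append("clade identifier, if given, must not be blank")
    return problems


if __name__ == "__main__":
    # Arbitrary example values; the clade identifier is left out because it is optional.
    example = AnnotationInput("genome.fasta", "draft", taxonomy_id=1283)
    print(validate(example))
```

The clade identifier is modelled as optional because the description above presents it as an additional input rather than a required one.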


  • Mark Borodovsky

Publications for Prokaryotic Genome Annotation Pipeline

PGAP citations


Complete Genome Sequence of Staphylococcus haemolyticus Type Strain SGAir0252

Genome Announc
PMCID: 5946035
PMID: 29748397
DOI: 10.1128/genomeA.00229-18

[…] 99.56% identity with the available reference genome sequence of Staphylococcus haemolyticus JCSC1435 (NCBI reference sequence number NC_007168). Multilevel genome annotation was conducted by the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). The complete genome consists of 2,528 genes, of which 2,445 were protein-coding genes, 19 rRNA subunits (7 genes for 5S, 6 for 16S, and 6 for 23S), 60 […]


Complete Genome Sequences of 10 Yersinia pseudotuberculosis Isolates Recovered from Wild Boars in Germany

Genome Announc
PMCID: 5946050
PMID: 29748399
DOI: 10.1128/genomeA.00266-18

[…] e PATRIC database. SPAdes assembly calculations resulted in 10- to 20-fold sequence coverages per consensus sequence for all isolates. Genome annotation using the automated Prokaryotic Genome Annotation Pipeline (PGAP) of the NCBI-database revealed that the Y. pseudotuberculosis genomes exhibit only little variability […]


Complete Genome Sequences of Two Atypical Enteropathogenic Escherichia coli O145 Environmental Strains

Genome Announc
PMCID: 5946043
PMID: 29748413
DOI: 10.1128/genomeA.00418-18

[…] on mode. A FASTQ file was generated using SMRT Analysis (v 2.3.0), and assembly was done with RS_HGAP_Assembly.3. The complete genome sequences were submitted to GenBank for annotation using the NCBI Prokaryotic Genome Annotation Pipeline. The E. coli RM14715 genome is composed of a 4,825,089-bp chromosome, encoding 4,947 coding sequences (CDSs), 22 rRNAs, and 88 tRNAs. The E. coli RM14723 genome i […]


Draft Genome Sequence of Staphylococcus aureus Strain HD1410, Isolated from a Persistent Nasal Carrier

Genome Announc
PMCID: 5946038
PMID: 29748411
DOI: 10.1128/genomeA.00411-18

[…] for length (>500 bp) and coverage (>10×) to ensure no errors and contamination in the draft genome. The contigs were then annotated using Prokka 1.12 (based on Genetic Code Table 11) and the NCBI Prokaryotic Genome Annotation Pipeline. The multilocus sequence typing (MLST), VirulenceFinder, ResFinder, and PlasmidFinder databases were used to determine the sequen […]


Draft Genome Sequences of Plant Associated Bacillus Strains Isolated from the Qinghai Tibetan Plateau

Genome Announc
PMCID: 5946045
PMID: 29748403
DOI: 10.1128/genomeA.00375-18

[…] assembled de novo using the A5 pipeline. Genome coverage of the obtained scaffolds was 45× on average. Scaffolds were submitted to GenBank for gene annotation, which was implemented using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). The genome-to-genome-distance calculator (GGDC) version 2.1 provided by DSMZ was used for genome-based species delineation. Form […]


Draft Genome Sequences of Two Salmonella Strains Isolated from Wild Animals on the Eastern Shore of Virginia

Genome Announc
PMCID: 5946047
PMID: 29748400
DOI: 10.1128/genomeA.00329-18

[…] in silico Salmonella serotype prediction tool SeqSero version 1.0. De novo assemblies were generated using SPAdes version 3.9.0, and annotation of the draft genomes was performed using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP). Quality of the de novo assemblies was assessed using Quast version 4.5. The N50 values of the two assemblies were 278,931 bp and 357,417 bp for VA-W […]



PGAP institution(s)
National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, MD, USA; Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech, Atlanta, GA, USA; School of Computational Science and Engineering, Georgia Tech, Atlanta, GA, USA
PGAP funding source(s)
Intramural Research Program of the NIH National Library of Medicine (in part); NIH grant HG000783 (in part)
