[…] data which provides an average 26.8× coverage of the genome and 832.1 Mb of Illumina draft data which provides an average 124× coverage of the genome.

Genes were identified using Prodigal as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE, RNAmmer, Rfam, TMHMM, and SignalP. Additional gene prediction analyses and functional annotation were performed within the Integrated Microbial Genomes (IMG-ER) platform.

The genome is 6,690,028 bp long with a 62.56% GC content and comprises a single chromosome and a single plasmid. Of the 6,531 genes predicted, 6,470 were protein-coding and 61 encoded only RNA. Within the genome, 206 pseudogenes were also identified. The majority of genes (70.74%) were assigned a putative function, while the remaining genes were annotated as hypothetical. The distribution of genes into COGs functional categories is presented in the accompanying table and figures.

This work was performed under the auspices of the US Department of Energy’ […]

