Computational protocol: Complete Genome Sequence of the d-Amino Acid Catabolism Bacterium Phaeobacter sp. Strain JL2886, Isolated from Deep Seawater of the South China Sea

Protocol publication

[…] In the marine environment, the d-amino acids synthesized by microbes can be released into the seawater. d-Amino acids (d-AAs), the α-carbon enantiomers of l-amino acids (l-AAs), are commonly known as nonproteinogenic amino acids, and there are few reports of marine bacteria utilizing d-amino acids as carbon and nitrogen sources. A bacterial strain, JL2886, was isolated from deep seawater collected at a depth of 2,000 m in the South China Sea during a cruise organized by the National Natural Science Foundation of China in August 2012. Phylogenetic analysis based on 16S rRNA gene sequences revealed that strain JL2886 belongs to the genus Phaeobacter in the Roseobacter clade. JL2886 can utilize many d-AAs as a sole source of carbon or nitrogen for growth (our unpublished data).

The complete genome of strain JL2886 was sequenced using the PacBio RS platform (Pacific Biosciences). A 10-kb library was sequenced using P4-C2 chemistry on two single-molecule real-time (SMRT) cells. The average read length was 6,734 bp, with a sequencing depth of 289×. The continuous long reads (CLR) were assembled de novo using SMRT Analysis version 2.1 and the PacBio Hierarchical Genome Assembly Process (HGAP) protocol. The consensus polishing process resulted in a highly accurate self-overlapping contig, as observed using a Gepard dotplot, with a length of 4,061,725 bp, in addition to five self-overlapping 678,758-bp plasmids. The overall G+C content of strain JL2886 was 61.52%. DNA methylation was determined using the RS Modification and Motif Analysis protocol within the SMRT Portal version 1.3.3.

The genome was annotated using Prodigal version 2.6, RNAmmer version 1.2, and ARAGORN version 1.2, as implemented in the Prokka automatic annotation tool version 1.11.
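
The sequencing figures reported above (replicon lengths, mean read length, and depth) can be related by simple arithmetic. The following is a minimal sketch, not part of the published protocol. It assumes that 678,758 bp is the combined length of the five plasmids and that the 289× depth refers to the whole genome (chromosome plus plasmids); the text confirms neither point. The function name `estimate_yield` is hypothetical.

```python
# Figures taken from the text.
CHROMOSOME_BP = 4_061_725
PLASMID_BP = 678_758   # assumption: combined length of the five plasmids
MEAN_READ_BP = 6_734
DEPTH = 289            # assumption: depth relative to chromosome + plasmids


def estimate_yield(genome_bp, depth, mean_read_bp):
    """Return (total sequenced bases, approximate read count) implied by a depth."""
    total_bases = genome_bp * depth
    reads = round(total_bases / mean_read_bp)
    return total_bases, reads


genome_bp = CHROMOSOME_BP + PLASMID_BP
total_bases, reads = estimate_yield(genome_bp, DEPTH, MEAN_READ_BP)

print(f"genome size: {genome_bp:,} bp")
print(f"implied yield: {total_bases / 1e9:.2f} Gb in about {reads:,} reads")
```

Under these assumptions, the implied yield is only a rough back-calculation, because the reported depth may have been computed from filtered or mapped reads.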
The main chromosome contained 12 rRNA operons, 58 tRNAs, and a predicted 3,913 protein-coding genes; the plasmids carried a further 744 protein-coding genes.

The predicted open reading frames (ORFs) were annotated through comparisons with the NCBI-NR database and the KEGG protein database. Functional genes were then identified by association with the Clusters of Orthologous Groups (COGs) classification and the KEGG pathway collection. A total of 3,690 proteins in the genome matched known functions. There were 2,641 proteins assigned to COG categories and 2,120 proteins assigned to KEGG orthologs.

The genome sequence of strain JL2886 contained 361 predicted protein-coding sequences (CDSs) related to amino acid transport and metabolism, including five CDSs for putative d-AA transferases, four CDSs for putative d-AA racemases, and one CDS for a putative d-AA oxidase. Strain JL2886 has a robust capacity for d-AA catabolism. […]
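
The annotation counts above can be put on a common scale by expressing them as shares of all predicted protein-coding genes. The sketch below is illustrative only. It assumes that the 3,690, 2,641, 2,120, and 361 counts are measured against the chromosome and plasmid genes combined, which the text does not state explicitly; the helper `share` is a hypothetical name.

```python
# Counts taken from the text.
CHROMOSOME_CDS = 3_913
PLASMID_CDS = 744
ANNOTATION_COUNTS = {
    "matched known function": 3_690,
    "assigned to COG categories": 2_641,
    "assigned to KEGG orthologs": 2_120,
    "amino acid transport and metabolism": 361,
}
# Putative d-AA genes listed in the text.
D_AA_CDS = {"transferases": 5, "racemases": 4, "oxidase": 1}


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Assumption: the denominator is all predicted CDSs, chromosome plus plasmids.
total_cds = CHROMOSOME_CDS + PLASMID_CDS

for label, count in ANNOTATION_COUNTS.items():
    print(f"{label}: {count:,} ({share(count, total_cds):.1f}% of {total_cds:,} CDSs)")

print(f"putative d-AA catabolism CDSs: {sum(D_AA_CDS.values())}")
```

If the published counts were instead measured against the chromosome alone, the denominator would be 3,913 and the percentages would be correspondingly higher.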
