Computational protocol: Genome Sequence of Pseudomonas stutzeri 273 and Identification of the Exopolysaccharide EPS273 Biosynthesis Locus

Protocol publication

[…] The whole genome of P. stutzeri 273 was sequenced using single-molecule, real-time (SMRT) technology. Low-quality reads were filtered with SMRT Analysis 2.3.0, and the filtered reads were assembled into one contig without gaps. A total of 90,551 filtered paired-end reads were produced with an average read length of 12,368 bp, corresponding to approximately 222-fold coverage. tRNAscan-SE v.1.23 and RNAmmer v.1.2 were used to identify tRNA and rRNA genes, respectively. Gene prediction was performed with GeneMarkS, using an integrated model that combined the GeneMarkS-generated (native) and heuristic model parameters. A whole-genome BLAST search (E-value less than 1 × 10⁻⁵, minimal alignment length percentage larger than 40%) was performed against six databases, including KEGG (Kyoto Encyclopedia of Genes and Genomes), COG (Clusters of Orthologous Groups), GO (Gene Ontology), NR (Non-Redundant Protein Database), and Swiss-Prot. The Venn diagram was constructed with the Vennerable R package. The CAZy family of glycosyltransferases was determined based on the Carbohydrate-Active enZYmes database. […]
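
The numeric criteria in the protocol above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. The first helper rearranges the usual relation coverage = total sequenced bases / genome size to show the genome size implied by the reported read count, mean read length, and coverage. The second applies the two BLAST thresholds to a single hit. The `Hit` record, its field names, and the choice to measure alignment length as a percentage of the query length are assumptions made for illustration; the protocol text does not say which sequence the percentage refers to.

```python
from dataclasses import dataclass

# Values quoted in the protocol text.
N_READS = 90_551
MEAN_READ_LEN_BP = 12_368
COVERAGE_FOLD = 222

# Thresholds quoted in the protocol text.
MAX_EVALUE = 1e-5
MIN_ALIGNED_PCT = 40.0


def implied_genome_size_bp(n_reads: int, mean_read_len_bp: float,
                           coverage_fold: float) -> float:
    """Genome size implied by coverage = total bases / genome size."""
    total_bases = n_reads * mean_read_len_bp
    return total_bases / coverage_fold


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (field names are assumptions)."""
    query_id: str
    subject_id: str
    evalue: float
    aligned_len: int  # aligned positions on the query
    query_len: int    # full length of the query sequence


def passes_thresholds(hit: Hit) -> bool:
    """Keep a hit only if it meets both cut-offs from the protocol.

    Assumption: the alignment length percentage is taken relative to the
    query length.
    """
    aligned_pct = 100.0 * hit.aligned_len / hit.query_len
    return hit.evalue < MAX_EVALUE and aligned_pct > MIN_ALIGNED_PCT


if __name__ == "__main__":
    size = implied_genome_size_bp(N_READS, MEAN_READ_LEN_BP, COVERAGE_FOLD)
    print(f"Implied genome size: {size / 1e6:.2f} Mbp")

    example = Hit("gene_0001", "subject_A", evalue=1e-30,
                  aligned_len=250, query_len=300)
    print("Example hit kept:", passes_thresholds(example))
```

With the reported figures, the implied genome size comes out at roughly 5.04 Mbp. The hit filter uses strict inequalities, matching the wording "less than" and "larger than" in the text.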

Pipeline specifications

Software tools SMRT-Analysis, tRNAscan-SE, RNAmmer, GeneMarkS, Vennerable
Databases CAZy, UniProt, KEGG
Application Genome annotation
Organisms Pseudomonas stutzeri, uncultured marine bacterium, Pseudomonas aeruginosa PAO1
Diseases Basal Ganglia Diseases
Chemicals Tyrosine