Computational protocol: Comparative Genomics and Metabolic Analysis Reveals Peculiar Characteristics of Rhodococcus opacus Strain M213 Particularly for Naphthalene Degradation

Protocol publication

[…] The whole genome sequence of strain M213 was obtained and assembled as described in our previous study []. Gene prediction and annotation were performed using Integrated Microbial Genomes Expert Review (IMG/ER) []. The genome was also analyzed with RAST (http://rast.nmpdr.org/) and NCBI's Prokaryotic Genome Automatic Annotation Pipeline (PGAAP) server (http://www.ncbi.nlm.nih.gov/genome/annotation_prok/). The functions of the predicted protein-coding genes were annotated using NCBI's non-redundant (NR) and Clusters of Orthologous Groups of proteins (COGs) databases.
Additionally, the draft genome sequence of strain M213 was reordered using the complete genome sequences of the type strains R. opacus B4 and R. jostii RHA1 as references. For this purpose, r2cat (related reference contig arrangement tool) was used [], which facilitated the comparative genomics of strain M213, especially at the syntenic level.
[...] Resources used in this study for genome predictions and comparisons of strain M213 with other sequenced rhodococci included those offered by NCBI (http://www.ncbi.nlm.nih.gov/) and Integrated Microbial Genomes Expert Review (https://img.jgi.doe.gov/cgi-bin/er/main.cgi). COGs from M213 and other rhodococci were compared using the Function Category Comparison tool in IMG/ER; each COG represents a group of proteins that are orthologs, i.e., direct evolutionary counterparts, across genomes. As with Pfam, IMG computes top COG hits using RPS-BLAST against position-specific scoring matrices (PSSMs) provided by the Conserved Domain Database (CDD). Genes encoding phage integrases, transposases, insertion sequence (IS) elements, transporters, transcriptional regulators, and chaperones, as well as selected biodegradative gene classes and other functional genes, were identified manually in the annotated genome of strain M213 with the help of the IMG/ER and NCBI portals.
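The COG function category comparison described above can be illustrated with a minimal sketch. This is not the IMG/ER implementation. It assumes each gene has already been assigned a single one-letter COG category, and the gene identifiers, function names, and toy data are hypothetical.

```python
from collections import Counter


def cog_category_profile(gene_to_category):
    """Return the percentage of annotated genes in each COG category.

    gene_to_category maps a gene ID to a one-letter COG category
    (assumption: one category per gene; multi-category genes are not handled).
    """
    counts = Counter(gene_to_category.values())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def compare_profiles(profiles):
    """Tabulate category percentages across genomes.

    profiles maps a genome name to its category profile. Categories that are
    absent from a genome are reported as 0.0 for that genome.
    """
    categories = sorted({cat for p in profiles.values() for cat in p})
    return {
        cat: {genome: profiles[genome].get(cat, 0.0) for genome in profiles}
        for cat in categories
    }


# Hypothetical toy annotations, not data from the study.
m213_genes = {"g1": "Q", "g2": "Q", "g3": "K", "g4": "L"}
rha1_genes = {"h1": "Q", "h2": "K", "h3": "K", "h4": "C"}

table = compare_profiles({
    "M213": cog_category_profile(m213_genes),
    "RHA1": cog_category_profile(rha1_genes),
})
for category, row in table.items():
    print(category, row)
```

Reporting percentages rather than raw counts keeps genomes of different sizes comparable, which is the usual reason for normalizing before such a comparison.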
Additionally, gene sequences for the large subunits of naphthalene dioxygenase (narAa) and phthalate 3,4-dioxygenase (phtAa) were mined from the whole genome sequence of strain M213, and phylogenetic trees were constructed using AromaDeg []. AromaDeg is a web-based repository of catabolic protein families; when queried with a protein sequence, it builds a phylogenetic tree showing how the query clusters with a given catabolic protein family (http://aromadeg.siona.helmholtz-hzi.de).
Circular maps of the M213 genome and genomic islands (GEIs) were generated using the Web-based CGView program []. The average nucleotide identity (ANI) was calculated using the Web-based JSpecies program (http://imedea.uib-csic.es/jspecies/about.html). Clustered regularly interspaced short palindromic repeat (CRISPR) gene sequences were located in the genome of strain M213 using a publicly accessible CRISPR database and software (http://crispr.u-psud.fr/Server/CRISPRfinder.php). The Clusters of Karlin signature skew, cumulative GC skew, and GC content were depicted using Artemis tools (sact_v9.0.5) []. IslandViewer was used to identify GEIs, which appear as chromosomal regions with deviating GC content (http://www.pathogenomics.sfu.ca/islandviewer/) []. Additionally, the newly developed genomic island prediction software (GIPSy) [] was used to evaluate the presence of different classes of GEIs in strain M213.
Venn diagrams, phylogenetic comparisons, core genome, pan genome, synteny plots, and other comparative genomic features, such as orthologous genes and the distinction between core genes and singletons, were analyzed using EDGAR (https://edgar.computational.bio.uni-giessen.de/cgi-bin/edgar_login.cgi?cookie_test=1&open=1) [].
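For readers unfamiliar with the GC content and cumulative GC skew plots mentioned above, the following sketch shows the standard definitions: GC content is (G + C) / window length, and GC skew is (G - C) / (G + C). It is an illustration only and does not reproduce Artemis; the window and step sizes, function names, and toy sequence are assumptions.

```python
from itertools import accumulate


def gc_windows(seq, window=1000, step=1000):
    """Compute GC content and GC skew in sliding windows.

    Returns two lists (content, skew), one value per full window.
    Windows with no G or C are given a skew of 0.0 to avoid division by zero.
    """
    seq = seq.upper()
    content, skew = [], []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        content.append((g + c) / window)
        skew.append((g - c) / (g + c) if (g + c) else 0.0)
    return content, skew


def cumulative_skew(skew):
    """Running sum of per-window GC skew values."""
    return list(accumulate(skew))


# Toy sequence (hypothetical), four windows of four bases each.
toy = "GGGG" + "CCCC" + "GGCC" + "ATAT"
content, skew = gc_windows(toy, window=4, step=4)
print(content)
print(skew)
print(cumulative_skew(skew))
```

In bacterial chromosomes the cumulative curve typically reaches its extremes near the replication origin and terminus, while local departures in GC content are one of the signals used to flag candidate genomic islands.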
To further infer the evolutionary relatedness of M213 to its close phylogenetic relatives, the genome of M213 was aligned with other rhodococcal genome sequences using Mauve (http://darlinglab.org/mauve/mauve.html) [], which performs multiple genome alignments in which rearrangements and inversions arising from evolutionary events can be identified and visualized comparatively. Because genomic recombination events cause rearrangements, orthologous regions of one bacterial genome may be reordered or inverted relative to another. Mauve identifies these events and displays conserved segments that appear to be internally free of rearrangements as Locally Collinear Blocks (LCBs). Mauve was also used to produce dot plots showing chromosomal synteny between selected Rhodococcus strains.
In addition to the above approaches, the Cloud Virtual Resource (CloVR) was used to run comparative genomics between M213 and other closely related Rhodococcus species. The CloVR Virtual Machine (CloVR-VM) module was run via the Data Intensive Academic Grid (DIAG) with the CloVR-Microbe pipeline (http://clovr.org/methods/clovr-microbe/), which performed gene finding, homology searches, and automatic annotation (http://ae.igs.umaryland.edu/). […]
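The idea behind a synteny dot plot can be sketched with exact k-mer matches. This is not how Mauve computes its alignments or plots; it is a simplified illustration in which the function names, the value of k, and the toy sequences are assumptions. Matches on the forward strand fall along a diagonal, while matches to the reverse complement fall along an anti-diagonal, which is how an inversion appears in such a plot.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_index(seq, k):
    """Map each k-mer in seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dotplot_points(ref, query, k=12):
    """Return (forward, reverse) lists of (ref_pos, query_pos) k-mer matches.

    forward holds same-strand matches; reverse holds positions where the
    query k-mer matches the reverse complement of the reference.
    """
    ref, query = ref.upper(), query.upper()
    index = kmer_index(ref, k)
    forward, reverse = [], []
    for j in range(len(query) - k + 1):
        kmer = query[j:j + k]
        for i in index.get(kmer, ()):
            forward.append((i, j))
        for i in index.get(reverse_complement(kmer), ()):
            reverse.append((i, j))
    return forward, reverse


# Toy example (hypothetical sequences): the query is the inverted reference.
ref = "AACCG"
fwd, rev = dotplot_points(ref, reverse_complement(ref), k=3)
print(fwd)
print(rev)
```

The two point lists can be passed to any plotting library as scatter data, typically with different colors for forward and reverse matches.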

Pipeline specifications

Software tools: RAST, PGAP, r2cat, BLASTN, CGView, JSpeciesWS, CRISPRFinder, IslandViewer, GIPSy, Mauve, CloVR
Databases: Pfam, AromaDeg, NMPDR
Applications: Genome annotation, Phylogenetics, WGS analysis, Nucleotide sequence alignment, Genome data visualization
Organisms: Rhodococcus opacus, Rhodococcus jostii RHA1
Diseases: Oculocerebrorenal Syndrome