Computational protocol: Physiological and Comparative Genomic Analysis of Arthrobacter sp. SRS-W-1-2016 Provides Insights on Niche Adaptation for Survival in Uraniferous Soils

Protocol publication

[…] Initial comparative genomics of strain SRS-W-1-2016 with its closest taxonomic relatives was performed with EDGAR []. To further infer the evolutionary relatedness of SRS-W-1-2016 to these relatives, multiple genome alignments were performed using the progressiveMauve algorithm (http://darlinglab.org/mauve/mauve.html) []. Mauve aligns multiple genomes so that rearrangements and inversions arising from evolutionary events can be identified and visualized comparatively. Because genomic recombination events result in rearrangements, orthologous regions in one bacterial genome may be reordered or inverted relative to another genome. Mauve identifies these events and displays conserved genomic segments that appear to be internally free of rearrangements as Locally Collinear Blocks (LCBs).

Average nucleotide identity (ANI) was obtained as described previously [] (http://enve-omics.ce.gatech.edu/ani/), and digital DNA-DNA hybridization (dDDH) was estimated using the Genome-to-Genome Distance Calculator (GGDC) web service [] (http://ggdc.dsmz.de/home.php). The ANI calculation used both best hits (one-way ANI) and reciprocal best hits (two-way ANI) between the two genomic datasets, as described by Goris et al. []. Typically, ANI values between genomes of the same species are above 95%.

GGDC compares a query genome with a reference genome and reports intergenomic distances under three distinct formulas. The distances are inferred from the set of high-scoring segment pairs (HSPs) or maximally unique matches (MUMs) obtained by comparing each pair of genomes with the chosen software. These distances are transformed to values analogous to DDH using a generalized linear model (GLM) inferred from an empirical reference dataset comprising real DDH values and genome sequences. Model-based confidence intervals are given in square brackets but can also be obtained via bootstrapping.
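The three GGDC distance formulas can be illustrated with a minimal sketch. It assumes the commonly cited ratios (HSP length / total length, identities / HSP length, identities / total length) and takes each distance as one minus the ratio. The class name `HspSummary`, its fields, and the function name are hypothetical, and the way GGDC itself aggregates HSPs and computes total length may differ from this simplification.

```python
from dataclasses import dataclass


@dataclass
class HspSummary:
    """Aggregated HSP statistics for one genome pair (hypothetical container)."""
    hsp_length: int    # summed length of all high-scoring segment pairs
    identities: int    # identical positions within those HSPs
    total_length: int  # combined length of the genomes being compared


def intergenomic_distances(s: HspSummary) -> dict:
    """Return three illustrative distances, each computed as 1 - ratio.

    Assumption: this mirrors the ratios reported by GGDC but is not
    the service's actual implementation.
    """
    if not (0 <= s.identities <= s.hsp_length <= s.total_length):
        raise ValueError("expected 0 <= identities <= hsp_length <= total_length")
    if s.hsp_length == 0:
        raise ValueError("no HSPs found; formula 2 is undefined")
    return {
        "formula_1": 1 - s.hsp_length / s.total_length,  # HSP length / total length
        "formula_2": 1 - s.identities / s.hsp_length,    # identities / HSP length
        "formula_3": 1 - s.identities / s.total_length,  # identities / total length
    }
```

Formula 2 is normalized by HSP length rather than genome length, so it is the least sensitive of the three to incomplete draft assemblies.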
Logistic regression (a special type of GLM) is used to report the probabilities that DDH is ≥70% and ≥79%. Percent G+C content cannot differ by more than 1% within a single species, but it can differ by 1% or less between distinct species. […]
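The cutoffs described above (ANI around 95%, DDH of 70%, and a G+C difference of 1%) can be combined into a small check, sketched below. All function and constant names are hypothetical. The logistic function takes its coefficients as required arguments because the fitted GGDC model parameters are not given in the text, and treating the ANI cutoff as inclusive is an assumption.

```python
import math

ANI_SPECIES_THRESHOLD = 95.0  # percent; same-species genomes typically exceed this
DDH_SPECIES_THRESHOLD = 70.0  # percent; first DDH cutoff named in the text
GC_MAX_WITHIN_SPECIES = 1.0   # percentage points of G+C content


def logistic_probability(predictor: float, intercept: float, slope: float) -> float:
    """Generic logistic model: P = 1 / (1 + exp(-(intercept + slope * predictor))).

    The coefficients must be supplied by the caller; no GGDC values are assumed.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * predictor)))


def species_evidence(ani: float, ddh: float, gc_a: float, gc_b: float) -> dict:
    """Summarize which criteria are consistent with two genomes being one species."""
    gc_diff = abs(gc_a - gc_b)
    return {
        "ani_supports_same_species": ani >= ANI_SPECIES_THRESHOLD,
        "ddh_supports_same_species": ddh >= DDH_SPECIES_THRESHOLD,
        "gc_difference": gc_diff,
        # A difference above 1 rules out a single species; a difference of
        # 1 or less is uninformative, since distinct species can be that close.
        "gc_excludes_same_species": gc_diff > GC_MAX_WITHIN_SPECIES,
    }
```

The G+C criterion is deliberately reported as an exclusion test only: a small difference does not by itself indicate that two genomes belong to the same species.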

Pipeline specifications

Software tools: Mauve, GGDC
Applications: Phylogenetics, Nucleotide sequence alignment
Organisms: Arthrobacter sp., Bacteria
Diseases: Onchocerciasis, Ocular
Chemicals: Adenosine Triphosphate, Cadmium, Cobalt, Nickel, Uranium, Zinc