Computational protocol: Comparative genomics reveals new evolutionary and ecological patterns of selenium utilization in bacteria

[…] To analyze the occurrence of the different Se utilization traits, we used Escherichia coli protein sequences as queries: SelD, SelA and SelB for the Sec trait; SelD and YbbB for the SeU trait; and SelD, YqeB and YqeC for the Se-cofactor trait. TBLASTN was first used to identify genes encoding homologs, with an E-value cutoff of 0.01 and an alignment coverage of at least 30%. Orthologous proteins were then defined using conserved domain databases (Pfam or CDD) and bidirectional best hits. Where necessary, orthologs were also confirmed by genomic location or phylogenetic analysis. The distribution of SelD and the three known Se utilization traits across bacterial taxa was displayed with the online Interactive Tree Of Life (iTOL) tool, based on a recently developed phylogenetic tree of life. [...] Standard approaches were used to reconstruct phylogenetic trees for each component of the Se utilization traits. Multiple sequence alignments were performed using CLUSTALW with default parameters. Phylogenetic trees of protein families were reconstructed with PHYLIP programs using the neighbor-joining method and were further evaluated with MrBayes (Bayesian estimation of phylogeny). The vector graphics editor Inkscape (version 0.91) was used to refine the fonts and colors of the phylogenetic trees. […]
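
The homolog filtering and bidirectional-best-hit steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that the 30% coverage applies to the query sequence, that the E-value cutoff is inclusive, that the best hit is the one with the highest bit score, and that the same cutoffs apply to the reverse search. The `Hit` record, the function names, the locus identifiers, the query lengths and all scores are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 0.01  # cutoff stated in the protocol
MIN_COVERAGE = 0.30    # stated in the protocol; applied to the query here (assumption)


@dataclass(frozen=True)
class Hit:
    """One alignment record (hypothetical structure, not a BLAST API type)."""
    query: str       # query identifier, e.g. "SelD"
    subject: str     # subject identifier
    evalue: float
    bitscore: float
    qstart: int      # 1-based, inclusive alignment start on the query
    qend: int        # 1-based, inclusive alignment end on the query
    qlen: int        # full length of the query


def query_coverage(hit):
    """Fraction of the query covered by the alignment."""
    return (hit.qend - hit.qstart + 1) / hit.qlen


def passes_cutoffs(hit):
    """Apply the E-value and coverage thresholds."""
    return hit.evalue <= E_VALUE_CUTOFF and query_coverage(hit) >= MIN_COVERAGE


def best_hits(hits):
    """Map each query to its highest-scoring subject among hits passing the cutoffs."""
    best = {}
    for hit in hits:
        if not passes_cutoffs(hit):
            continue
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def bidirectional_best_hits(forward, reverse):
    """Return (query, subject) pairs that are each other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Illustrative data only: identifiers, lengths and scores are invented.
forward = [
    Hit("SelD", "locus_0001", 1e-50, 250.0, 1, 340, 350),
    Hit("SelD", "locus_0002", 1e-5, 60.0, 10, 200, 350),   # passes, lower score
    Hit("SelA", "locus_0003", 0.5, 30.0, 1, 400, 460),     # fails the E-value cutoff
    Hit("SelB", "locus_0004", 1e-20, 120.0, 1, 100, 610),  # fails the coverage cutoff
]
reverse = [
    Hit("locus_0001", "SelD", 1e-48, 240.0, 1, 330, 345),
    Hit("locus_0002", "OtherProtein", 1e-30, 150.0, 1, 300, 320),
]

print(bidirectional_best_hits(forward, reverse))
# prints [('SelD', 'locus_0001')]
```

In practice the records would be parsed from the search output rather than written by hand, and candidates retained here would still be subject to the domain, genomic-location and phylogenetic checks described in the protocol.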

Pipeline specifications

Software tools: TBLASTN, iTOL, Clustal W, PHYLIP, MrBayes, Inkscape
Databases: Pfam, CDD
Applications: Miscellaneous, Phylogenetics, Amino acid sequence alignment
Chemicals: Selenium, Selenocysteine