Computational protocol: Long-Term Persistence of Bi-functionality Contributes to the Robustness of Microbial Life through Exaptation

Protocol publication

[…] We next asked whether the bi-functionality of HisA is an ancient feature that has been conserved in certain extant enzymes. To this end, we computationally reconstructed three HisA precursors as follows. It has been shown that concatenating related sequences increases the strength of the phylogenetic signal available for tree construction []. Thus, for each species, we concatenated the HisA sequence with the HisH and HisF sequences. The respective genes were most likely part of the LUCA genome [] and have remained elements of the histidine operon since then. Bacterial and archaeal genomes were scanned for the occurrence of hisA genes, and species were selected in which hisA, hisF, and hisH were gene neighbors. We picked sequences from Euryarchaeota (5 species), Crenarchaeota (20), Bacteroidetes (8), Firmicutes (11), Spirochaetes (5), and the α-, β-, γ-, and δ-Proteobacteria (21, 5, 1, and 5, respectively). Moreover, we added 22 actinobacterial sequence sets by selecting genes whose products contain the above-mentioned PriA active site sequence motif. The resulting MSAHisFAH comprised 103 concatenations, that is, 81 non-actinobacterial and 22 actinobacterial sequence sets (species names listed in ).

After preprocessing this input, a phylogenetic tree was determined and assessed by means of PhyloBayes v3.3 []. Four independent MCMC samplings of length 50,000 were computed using pb and compared to ensure convergence. Several parameters confirmed the validity of our approach. Convergence and mixing were checked by means of the discrepancy index maxdiff; for the pairwise comparison of all chains, the maxdiff value was at most 0.06. The effective size was at least 100, as determined by means of tracecomp. A consensus tree was deduced from the concatenation of these four chains ().
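The convergence criteria reported above can be written as a simple acceptance check. The following Python sketch is an illustration and not part of the original protocol: the function names are hypothetical, the cutoffs (maxdiff at most 0.1, effective size at least 100) are assumptions chosen for this example, and the numbers passed in are only the bounds stated in the text, not the full per-chain output.

```python
from itertools import combinations


def count_chain_pairs(n_chains):
    """Number of pairwise chain comparisons for n_chains independent runs."""
    return len(list(combinations(range(n_chains), 2)))


def run_is_acceptable(pairwise_maxdiff, effective_sizes,
                      maxdiff_cutoff=0.1, min_effsize=100):
    """Return True if every chain pair and every traced parameter passes.

    pairwise_maxdiff: maxdiff values, one per pair of chains.
    effective_sizes:  effective sample sizes, one per traced parameter.
    Both cutoffs are assumptions of this sketch, not values from the protocol.
    """
    if not pairwise_maxdiff or not effective_sizes:
        raise ValueError("need at least one maxdiff and one effective size")
    worst_maxdiff = max(pairwise_maxdiff)   # largest discrepancy between chains
    worst_effsize = min(effective_sizes)    # least well sampled parameter
    return worst_maxdiff <= maxdiff_cutoff and worst_effsize >= min_effsize


# Four chains give six pairwise comparisons.
n_pairs = count_chain_pairs(4)

# Only the reported bounds are known: maxdiff at most 0.06, effective size at least 100.
accepted = run_is_acceptable([0.06], [100])
```

Checking the worst case in each list mirrors how the text reports its results, namely as an upper bound on maxdiff and a lower bound on the effective size.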
The posterior probability of edges interlinking ancestors of phyla or classes was at least 0.87, which indicates strong support for the deep branches of the tree.

This tree and the corresponding MSAHisFAH were used to deduce a predecessor of the actinobacterial enzymes (CA-Act-HisA) by means of FASTML []. In order to exclude any effect of the 22 actinobacterial sequences (and especially their active site motif) on the reconstruction of more ancient predecessors of HisA, these sequences were removed from MSAHisFAH. The resulting MSAHisFAH-Act, which contained the remaining 81 non-actinobacterial sequences, was used to calculate a second tree (). Applying FASTML to this second tree and alignment, the sequences of the common ancestors of Proteobacteria (CA-Prot-HisA) and of Bacteria (CA-Bact-HisA) were determined. A schematic representation of the two trees is given in . The archaeal sequences served as an outgroup in both reconstructions. […]
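The species-wise concatenation and the later removal of the actinobacterial sets can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy species labels, and the toy sequences are hypothetical, and the gene order HisF, HisA, HisH is only inferred from the name MSAHisFAH. The taxon counts are the ones given in the text.

```python
# Gene order is an assumption inferred from the alignment name "MSAHisFAH".
GENE_ORDER = ("HisF", "HisA", "HisH")

# Taxon counts as stated in the protocol text.
NON_ACTINO_COUNTS = {
    "Euryarchaeota": 5, "Crenarchaeota": 20, "Bacteroidetes": 8,
    "Firmicutes": 11, "Spirochaetes": 5, "alpha-Proteobacteria": 21,
    "beta-Proteobacteria": 5, "gamma-Proteobacteria": 1,
    "delta-Proteobacteria": 5,
}
ACTINO_COUNT = 22


def concatenate_by_species(alignments, gene_order=GENE_ORDER):
    """Join the per-gene sequences of each species present in all genes.

    alignments: {gene name: {species name: sequence}}
    """
    shared = set.intersection(*(set(alignments[g]) for g in gene_order))
    return {sp: "".join(alignments[g][sp] for g in gene_order)
            for sp in sorted(shared)}


def drop_species(msa, excluded):
    """Return a copy of the concatenated set without the excluded species."""
    return {sp: seq for sp, seq in msa.items() if sp not in excluded}


# Toy input with hypothetical species and made-up sequences.
toy = {
    "HisF": {"sp1": "MA", "sp2": "MK", "sp3": "ML"},
    "HisA": {"sp1": "GG", "sp2": "GA"},
    "HisH": {"sp1": "TT", "sp2": "TS"},
}
toy_msa = concatenate_by_species(toy)           # sp3 lacks HisA and HisH
toy_msa_reduced = drop_species(toy_msa, {"sp2"})

n_non_actino = sum(NON_ACTINO_COUNTS.values())  # size of MSAHisFAH-Act
n_total = n_non_actino + ACTINO_COUNT           # size of MSAHisFAH
```

Summing the stated counts gives 81 non-actinobacterial sets and 103 sets in total, which matches the sizes reported for MSAHisFAH-Act and MSAHisFAH.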

Pipeline specifications

Software tools: PhyloBayes, FastML
Application: Phylogenetics
Chemicals: Tryptophan