Computational protocol: Divergence of Borrelia burgdorferi sensu lato spirochetes could be driven by the host: diversity of Borrelia strains isolated from ticks feeding on a single bird

Protocol publication

[…] We sequenced 14 genomic loci for each Borrelia strain: 16S rRNA, the 5S-23S IGS, the 16S-23S ITS, flagellin, p66, ospC, and eight housekeeping genes (clpA, clpX, nifS, pepX, pyrG, recG, rplB, and uvrA). Twelve loci were ultimately included in the phylogenetic analysis; ospC and the 16S-23S ITS were excluded because of their high levels of polymorphism and recombination.

Sequences were aligned using Clustal X. Data were evaluated for fit to 24 evolutionary models using MrModeltest. The most-parameterized model that best fits the data at each locus was selected and evaluated by either the likelihood ratio test or the Akaike Information Criterion. Phylogenetic analyses were performed using Bayesian reconstruction in MrBayes 3.1, with the underlying model of evolution set to the model chosen for each locus. Selected models were: GTR + G for the clpA (579 bp) and nifS (564 bp) loci; GTR + I + G for clpX (624 bp), pepX (570 bp), pyrG (603 bp), recG (651 bp), rplB (624 bp) and uvrA (570 bp); HKY + G for the 5S-23S IGS (275 bp); HKY + I for the flagellin gene (487 bp); and GTR + I for p66 (315 bp) and 16S rRNA (1363 bp).

The Markov chain Monte Carlo (MCMC) analysis was run for 10 × 10^6 generations, sampling trees every 1000 generations, using 4 Markov chains (default heating values). Stationarity of the MCMC was evaluated using the “Are We There Yet” (AWTY) software, which plots the cumulative posterior probabilities for each tree. The two to three thousand burn-in trees generated before the point at which these values stabilized were discarded. The 50% majority-rule consensus tree for the estimated posterior distribution of trees (with burn-in trees truncated) was assembled for each locus using MrBayes.
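
The per-locus settings described above can be sketched as a small script that writes a MrBayes command block for each locus. This is only an illustration, not the authors' actual input files: the function names are hypothetical, and the burn-in of 2000 trees is a placeholder within the reported range of two to three thousand (the real value was chosen per locus from the AWTY plots). The `lset`, `mcmc`, and `sumt` options are the usual MrBayes settings for these models. With 10 × 10^6 generations sampled every 1000 generations, each run yields about 10,000 sampled trees, so discarding 2000 to 3000 leaves roughly 7000 to 8000 for the consensus.

```python
# Locus -> (substitution model, alignment length in bp), as listed in the protocol text.
LOCUS_MODELS = {
    "clpA": ("GTR+G", 579),
    "nifS": ("GTR+G", 564),
    "clpX": ("GTR+I+G", 624),
    "pepX": ("GTR+I+G", 570),
    "pyrG": ("GTR+I+G", 603),
    "recG": ("GTR+I+G", 651),
    "rplB": ("GTR+I+G", 624),
    "uvrA": ("GTR+I+G", 570),
    "5S-23S_IGS": ("HKY+G", 275),
    "flagellin": ("HKY+I", 487),
    "p66": ("GTR+I", 315),
    "16S_rRNA": ("GTR+I", 1363),
}

# Model name -> MrBayes `lset` options (nst = number of substitution rate classes).
LSET_OPTIONS = {
    "GTR+G": "nst=6 rates=gamma",
    "GTR+I+G": "nst=6 rates=invgamma",
    "GTR+I": "nst=6 rates=propinv",
    "HKY+G": "nst=2 rates=gamma",
    "HKY+I": "nst=2 rates=propinv",
}

NGEN = 10 * 10**6   # generations, from the protocol
SAMPLEFREQ = 1000   # sample a tree every 1000 generations
NCHAINS = 4         # Markov chains, default heating


def sampled_trees(ngen=NGEN, samplefreq=SAMPLEFREQ):
    """Number of trees sampled per run, not counting the starting tree."""
    return ngen // samplefreq


def mrbayes_block(locus, burnin_trees):
    """Return a MrBayes block for one locus (burn-in given in sampled trees)."""
    model, _length_bp = LOCUS_MODELS[locus]
    lines = [
        "begin mrbayes;",
        f"  lset {LSET_OPTIONS[model]};",
        f"  mcmc ngen={NGEN} samplefreq={SAMPLEFREQ} nchains={NCHAINS};",
        f"  sumt burnin={burnin_trees} contype=halfcompat;",
        "end;",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed burn-in of 2000 trees, for illustration only.
    print(mrbayes_block("clpX", burnin_trees=2000))
    print("trees kept:", sampled_trees() - 2000)
```

Keeping the model table separate from the `lset` mapping makes it easy to check that all twelve retained loci have a model assigned before any run is started.
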
The consensus trees for the twelve loci (excluding ospC and the 16S-23S ITS) were not congruent, and thus an overall pattern of relatedness could not be inferred using these gene trees alone. The most common approach to inferring relationships across multiple genetic loci is to combine the outcomes of individual gene trees into a multi-locus analysis. The Bayesian estimation of concordance among gene trees (BUCKy) approach, which makes no assumptions about the source of reticulation in gene tree histories, was used here. BUCKy takes as input the complete tree files generated by the Bayesian analysis of each individual locus, in the format produced by MrBayes. BUCKy generates a sample of gene trees from the joint distribution of gene trees, from which concordance factors (CFs) are estimated with credibility intervals. The CF ranges from 0.0 to 1.0. BUCKy implements a consensus method that is based on unrooted quartets and consistently identifies the species tree. We ran BUCKy at several levels of α to evaluate how much effect the choice of this parameter value would have on the results. The final analysis selected for use was run with an α of 1, a reasonable intermediate between 0 and infinity, using 4 heated chains in the MCMC analysis. […]
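
To make the idea of a concordance factor concrete, the sketch below computes a naive version from per-locus tree samples: the post-burn-in posterior frequency of a clade is computed for each locus and then averaged across loci. This is not BUCKy's algorithm. It corresponds only to treating the loci as independent, and it ignores the α prior and the credibility intervals that BUCKy estimates. The function names, the representation of a tree as a set of clades, and the toy taxa "A", "B", "C" are all assumptions made for illustration.

```python
from statistics import mean


def clade_posterior(tree_samples, clade, burnin):
    """Fraction of post-burn-in sampled trees that contain `clade`.

    Each sampled tree is represented as a set of clades, and each clade
    as a frozenset of taxon labels (a simplification for this sketch).
    """
    kept = tree_samples[burnin:]
    if not kept:
        raise ValueError("burn-in discards every sampled tree")
    return sum(clade in tree for tree in kept) / len(kept)


def naive_concordance(samples_by_locus, clade, burnin):
    """Average per-locus posterior support for `clade` across all loci.

    The result lies between 0.0 and 1.0, like a concordance factor, but it
    is only a rough stand-in for the estimate that BUCKy would report.
    """
    return mean(
        clade_posterior(samples, clade, burnin)
        for samples in samples_by_locus.values()
    )


if __name__ == "__main__":
    ab = frozenset({"A", "B"})
    ac = frozenset({"A", "C"})
    # Toy samples: the first tree of each locus is treated as burn-in.
    toy = {
        "locus1": [{ac}, {ab}, {ab}, {ab}],
        "locus2": [{ab}, {ac}, {ac}, {ab}],
    }
    print(naive_concordance(toy, ab, burnin=1))
```

In the toy data, the clade {A, B} is present in every retained tree for the first locus but in only one of three for the second, so its averaged support falls between the two. This is the kind of disagreement among gene trees that a concordance analysis is meant to summarize.
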

Pipeline specifications

Software tools Clustal X, MrModeltest, MrBayes, AWTY, BUCKy
Databases PepX
Application Phylogenetics
Organisms Borreliella burgdorferi
Diseases Borrelia Infections, Infection, Lyme Disease