Computational protocol: Contrasting Patterns in Mammal–Bacteria Coevolution: Bartonella and Leptospira in Bats and Rodents

Protocol publication

[…] Bacterial and host species sequences were imported from GenBank into Geneious Pro 5.0.4. Sequences for each bacterial genus and for their corresponding bat and rodent hosts were aligned separately using default parameters in MUSCLE as implemented in Geneious. Outgroup taxa, obtained from GenBank, were included in each alignment and were chosen based on previous species-level phylogenies. The outgroup for Bartonella was Brucella melitensis, for Leptospira was Leptonema illini, and for the bat and rodent hosts was the duck-billed platypus, Ornithorhynchus anatinus HQ379861. To analyze the difference in host specificity between Old and New World geographic regions, each alignment was further divided into Old World and New World subsets. Alignments were inspected visually; ends were trimmed, and gaps found in only one non-outgroup sequence were deleted because of the high likelihood of sequencing error. These edits resulted in alignments of 1,133 base pairs (bp) for cytb bat sequences, 338 bp for gltA Bartonella sequences, and 1,246 bp for 16S Leptospira sequences. Maximum likelihood (ML) phylogenetic trees were generated using RAxML 7.0.4 as implemented on the Cyberinfrastructure for Phylogenetic Research (CIPRES) Portal, using the substitution model GTRMIX, which determines an optimal tree by comparing likelihood scores under a GTR+G model. The number of bootstrap replicates was determined using the previously described stopping criteria. To corroborate the ML phylogenies, Bayesian inference (BI) host phylogenies were also generated using MrBayes 3.1.2. We used a GTR+I+G substitution model with 10,000,000 generations, sampling every 5,000th generation, with 4 heated chains and a burn-in length of 1,000,000. [...] To visualize host-bacteria associations, tanglegrams were generated from the best ML trees in TreeMap 3.0. For cophylogenetic analyses, we used both global-fit and event-based methods.
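As a rough check on the Bayesian settings described above, the following sketch computes how many posterior samples such a run would retain. It assumes the burn-in length is expressed in generations (the source does not say) and ignores the initial state that MrBayes may also record, so the real counts could differ by one. The function name `retained_samples` is hypothetical.

```python
def retained_samples(generations, sample_every, burn_in_generations):
    """Return (total, discarded, retained) sample counts for an MCMC run.

    Assumption: burn-in is given in generations, not in samples.
    """
    total = generations // sample_every
    discarded = burn_in_generations // sample_every
    return total, discarded, total - discarded


# Settings quoted in the protocol text.
total, discarded, kept = retained_samples(10_000_000, 5_000, 1_000_000)
print(f"sampled: {total}, burn-in: {discarded}, retained: {kept}")
```

Under these assumptions, 10% of the sampled trees are discarded as burn-in before the posterior is summarized.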
We selected programs capable of accounting for evolutionary patterns in which a parasite species is associated with multiple hosts, as well as for the presence of multiple parasites in a single host. Global-fit methods were used to quantify the degree of congruence between host and parasite topologies and to identify the individual associations contributing to the cophylogenetic structure. First, global fit was tested with the distance-based method ParaFit, using matrices of patristic distances calculated from the maximum likelihood host and parasite phylogenies in R 3.0.1. With an additional matrix of host-parasite links, ParaFit analyses were performed in R using the package ape with 999 permutations, testing both the global fit and the individual links. An individual host-bacteria link was considered significant if either its ParaFit 1 or ParaFit 2 p-value was ≤ 0.05; these significant links are shown as solid lines in the tanglegrams. As ParaFit tends to be liberal with its values, we also implemented the newly developed program Procrustean Approach to Cophylogeny (PACo) in R, using the packages ape and vegan, to obtain global goodness-of-fit statistics comparable to, and potentially corroborating, the ParaFit global values. PACo differs from ParaFit by using Procrustean superimposition, in which the parasite matrix is rotated and scaled to fit the host matrix. Thus, PACo explicitly tests the dependence of the parasite phylogeny on the host phylogeny. We then used the event-based program Jane 4 to determine the most probable coevolutionary history of the associated hosts and parasites, again using the ML host and bacteria trees as input. We assigned different relative costs to 5 possible evolutionary events, in a method similar to previous research efforts.
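The patristic distances that feed the ParaFit analysis described above are sums of branch lengths along the path joining two tips. The sketch below illustrates that definition on a made-up three-tip tree. It is not the R workflow used in the study, and the tree, its branch lengths, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy tree: child -> (parent, branch length). Not study data.
TOY_TREE = {
    "A": ("n1", 0.10),
    "B": ("n1", 0.20),
    "n1": ("root", 0.05),
    "C": ("root", 0.30),
}


def distances_to_ancestors(tree, node):
    """Map the node and each of its ancestors to the distance from the node."""
    dist = {node: 0.0}
    total = 0.0
    while node in tree:
        parent, length = tree[node]
        total += length
        dist[parent] = total
        node = parent
    return dist


def patristic(tree, a, b):
    """Path length between tips a and b through their most recent common ancestor."""
    up_a = distances_to_ancestors(tree, a)
    up_b = distances_to_ancestors(tree, b)
    # Among shared ancestors, the most recent one gives the shortest path.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def patristic_matrix(tree, tips):
    """Symmetric dict-of-dicts of pairwise patristic distances."""
    matrix = {t: {t: 0.0} for t in tips}
    for a, b in combinations(tips, 2):
        d = patristic(tree, a, b)
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


for tip, row in patristic_matrix(TOY_TREE, ["A", "B", "C"]).items():
    print(tip, {k: round(v, 2) for k, v in sorted(row.items())})
```

One such matrix is built for the host tree and one for the parasite tree; together with the host-parasite link matrix, they form the inputs to the distance-based test.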
We performed analyses with 100 generations, a population size of 100, and a default cost matrix of 0 for cospeciation, 1 for duplication of parasites, 2 for duplication and host switch, 1 for loss of parasite, and 1 for failure to diverge. In further runs, we set the cost of one event at a time to 10, rendering that event prohibitively expensive. By exploring the parameter space in this way, we determined how these changes affected the overall cost of the optimal evolutionary history. […]
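The cost schemes described above can be enumerated programmatically. The following sketch builds the default scheme plus one variant per event with that event's cost raised to 10, and scores a reconstruction as the cost-weighted sum of its event counts. It does not call Jane; the event keys, function names, and example event counts are hypothetical and serve only to illustrate the bookkeeping.

```python
# Default event costs quoted in the protocol text.
DEFAULT_COSTS = {
    "cospeciation": 0,
    "duplication": 1,
    "duplication_host_switch": 2,
    "loss": 1,
    "failure_to_diverge": 1,
}
PROHIBITIVE_COST = 10


def cost_schemes(default=DEFAULT_COSTS, prohibitive=PROHIBITIVE_COST):
    """Return the default scheme plus one variant per event made prohibitive."""
    schemes = [("default", dict(default))]
    for event in default:
        variant = dict(default)
        variant[event] = prohibitive
        schemes.append((f"{event}={prohibitive}", variant))
    return schemes


def total_cost(event_counts, costs):
    """Cost of a reconstruction: sum over events of count times unit cost."""
    return sum(event_counts.get(event, 0) * cost for event, cost in costs.items())


# Hypothetical event counts for one reconstruction (not results from the study).
example_counts = {
    "cospeciation": 3,
    "duplication": 1,
    "duplication_host_switch": 2,
    "loss": 4,
    "failure_to_diverge": 0,
}

for label, costs in cost_schemes():
    print(f"{label:30s} total cost = {total_cost(example_counts, costs)}")
```

In an actual analysis the event counts would change under each scheme, because the optimal reconstruction is found anew for every cost setting; the fixed counts here only show how the total is tallied.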

Pipeline specifications