Computational protocol: Adaptive Change Inferred from Genomic Population Analysis of the ST93 Epidemic Clone of Community-Associated Methicillin-Resistant Staphylococcus aureus

Protocol publication

[…] Genomic DNA was extracted from all isolates and subjected to whole-genome shotgun sequencing using an Illumina HiSeq-2000 with 100-bp, paired-end TruSeq chemistry. Sequence reads were submitted to the National Center for Biotechnology Information GenBank and are available under BioProject PRJNA232112. A read-mapping approach was used to align the reads from these isolates to the S. aureus ST93-IV MRSA JKD6159 reference genome using SHRiMP v2.0. Positions in the reference that were covered by at least three reads from every isolate defined the core ST93 genome. Single-nucleotide substitution polymorphisms (SNPs) and indels of up to 10 bp, together with their predicted consequences for gene function, were identified using Nesoni v0.109, a Python utility that uses the reads from each genome aligned to the core genome to construct a tally of putative differences at each nucleotide position (including substitutions, insertions, and deletions) (http://bioinformatics.net.au/, last accessed February 10, 2014) (supplementary table S1, Supplementary Material online).

To define the ST93 pan genome and investigate the total S. aureus ST93 gene content, sequence reads for each isolate were subjected to de novo assembly using Velvet v1.2.06, and the resulting contigs were aligned to the JKD6159 genome with MUMmer. Sequences of at least 100 bp that were not present in the reference genome were extracted from the contigs and appended to the JKD6159 genome to construct a pan genome sequence, which was annotated using Prokka (http://bioinformatics.net.au/, last accessed February 10, 2014). The proportion of the length of each annotated gene covered by reads was assessed for each isolate, and a map summarizing all variable genes and their distribution in each strain was produced (supplementary table S2, Supplementary Material online).

[…] Phylogenetic analyses were undertaken using several approaches.
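The core-genome rule stated earlier in this section (a reference position belongs to the core only if at least three reads cover it in every isolate) can be illustrated with a minimal Python sketch. This is not the Nesoni or SHRiMP implementation; the function name `core_positions`, the per-isolate depth lists, and the toy values are hypothetical and stand in for depth vectors that would be derived from the real alignments.

```python
from typing import Dict, List

MIN_DEPTH = 3  # threshold taken from the protocol text: at least three reads


def core_positions(depth_by_isolate: Dict[str, List[int]],
                   min_depth: int = MIN_DEPTH) -> List[int]:
    """Return 0-based reference positions covered by >= min_depth reads
    in every isolate. Input layout is an assumption for illustration."""
    if not depth_by_isolate:
        return []
    depths = list(depth_by_isolate.values())
    ref_len = len(depths[0])
    if any(len(d) != ref_len for d in depths):
        raise ValueError("all depth vectors must span the same reference length")
    # A position is kept only if no isolate falls below the threshold there.
    return [pos for pos in range(ref_len)
            if all(d[pos] >= min_depth for d in depths)]


# Hypothetical read depths over a six-position toy reference.
toy_depths = {
    "isolate_A": [5, 3, 0, 7, 9, 2],
    "isolate_B": [4, 8, 6, 3, 1, 5],
    "isolate_C": [3, 3, 3, 3, 3, 3],
}
print(core_positions(toy_depths))
```

Because a single poorly covered isolate removes a position from the core, the core can only shrink as more isolates are added, which is worth keeping in mind when comparing core sizes across studies.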
Split decomposition and neighbor-joining analyses were performed using uncorrected P distances as implemented in SplitsTree4 (v4.13.1). The inputs for each method were nucleotide sequence alignments of the concatenated variable core-genome positions among all isolates, prepared using Nesoni as described earlier and managed with SeaView v4.3.3. A phylogeny was also inferred by maximum likelihood (ML) as implemented in RAxML, using the general time reversible (GTR) model of nucleotide substitution. Path-O-Gen was used to investigate the linear association between ML root-to-tip branch lengths and year of isolation (http://tree.bio.ed.ac.uk/software/pathogen/, last accessed February 10, 2014).

BEAST v1.7.4 was used to infer the evolutionary dynamics of ST93 S. aureus with a GTR + Γ nucleotide substitution model and tip dates defined as the year of isolation. Multiple analyses were run with both constant population size and Bayesian skyline demographic models, each in combination with either a strict molecular clock or a relaxed clock with an uncorrelated lognormal distribution. Both demographic models, under either clock, yielded almost identical results. To sample the posterior probability distributions for every model combination (demographic and clock), ten Markov chain Monte Carlo (MCMC) chains of 100 million generations each were run to ensure convergence, with samples taken every 1,000 MCMC generations. Replicate analyses were combined and parameter estimates calculated with Tracer v1.5.

Fig. 1: For phylogeographic analysis, location (the state within Australia from which the isolate was obtained) was treated as a discrete character trait, and ML ancestral reconstruction was used to estimate the likely location of unsampled, ancestral forms of ST93.
The same results were obtained using alternative implementations of ML ancestral reconstruction in R: the ace function in the ape package (plotted as pie graphs in panel C) and the pml functions in the phangorn package.

Previously established mean exotoxin expression levels for Hla and PSMα3 (Chua KYL, Monk IR, Lin Y, Seemann T, Tuck KL, Porter JL, Stepnell J, Coombs GW, Davies JK, Stinear TP, Howden BP, submitted for publication), together with Hld expression levels and oxacillin MICs determined in this study, were visualized alongside the ML phylogeny. Mean expression levels, based on biological triplicate measurements, were also compared between pairs of strains using a two-sided, unpaired t-test on log10-transformed exotoxin expression values (GraphPad Prism V6). The null hypothesis (no difference between means) was rejected for P < 0.05. For each MRSA isolate, mean exotoxin expression and oxacillin MIC were compared using nonparametric Spearman correlation analysis (GraphPad Prism V6).

Fig. 2: For correlation of phenotypes with phylogenies, Felsenstein's phylogenetically independent contrasts (PIC) method was employed, and the PIC test was performed using the R package “ape” (http://ape-package.ird.fr/, last accessed February 10, 2014). Conventional correlation tests assume that data points (here, strains) are independent, which is not the case for phylogenetically related strains, because closely related strains may be expected to be more alike in phenotype than distantly related strains. The PIC method takes this into account by instead assuming that variation in an observed variable is due to Brownian motion along the branches of the phylogenetic tree. […]
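To make the PIC idea concrete, the following is a minimal Python sketch of the textbook independent-contrasts recursion on a fully bifurcating tree with positive branch lengths. It is not the ape `pic` code used in the study; the `Node` class, the helper names, the four-tip tree, and the trait values are all hypothetical. At each internal node the difference between the two daughter values is standardized by the square root of the summed branch lengths, and the node receives a weighted-average value plus a lengthened branch, as the Brownian motion assumption implies.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    length: float                    # branch length leading to this node
    name: Optional[str] = None       # set for tips only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def contrasts(root: Node, trait: Dict[str, float]) -> List[float]:
    """Standardized contrasts in post-order (assumes a binary tree,
    all branch lengths > 0)."""
    out: List[float] = []

    def visit(node: Node) -> Tuple[float, float]:
        # Returns (trait estimate, adjusted branch length) for this node.
        if node.name is not None:
            return trait[node.name], node.length
        if node.left is None or node.right is None:
            raise ValueError("internal nodes must have exactly two children")
        x_l, v_l = visit(node.left)
        x_r, v_r = visit(node.right)
        out.append((x_l - x_r) / sqrt(v_l + v_r))
        # Weighted average of daughters; branch lengthened for its uncertainty.
        x = (x_l / v_l + x_r / v_r) / (1 / v_l + 1 / v_r)
        return x, node.length + v_l * v_r / (v_l + v_r)

    visit(root)
    return out


def origin_correlation(a: List[float], b: List[float]) -> float:
    """Correlation of two contrast vectors computed through the origin."""
    return sum(x * y for x, y in zip(a, b)) / sqrt(
        sum(x * x for x in a) * sum(y * y for y in b))


# Hypothetical tree ((A:1,B:1):1,(C:1,D:1):1) and made-up phenotypes.
toy_tree = Node(0.0,
                left=Node(1.0, left=Node(1.0, "A"), right=Node(1.0, "B")),
                right=Node(1.0, left=Node(1.0, "C"), right=Node(1.0, "D")))
toy_x = {"A": 1.0, "B": 3.0, "C": 6.0, "D": 10.0}
toy_y = {"A": 2.0, "B": 2.5, "C": 5.0, "D": 7.5}
print(origin_correlation(contrasts(toy_tree, toy_x), contrasts(toy_tree, toy_y)))
```

The correlation is taken through the origin because the sign of each contrast depends only on the arbitrary left/right ordering of daughters, so the contrasts have an expected mean of zero.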

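The root-to-tip check attributed to Path-O-Gen in the methods above amounts to an ordinary least-squares regression of root-to-tip distance on year of isolation. The sketch below is a plain-Python illustration under that assumption, not Path-O-Gen's own code; the function name, the result fields, and the toy numbers are hypothetical. The slope approximates the substitution rate and the x-intercept gives a rough date for the root.

```python
from typing import List, NamedTuple


class RootToTipFit(NamedTuple):
    slope: float        # distance units per year (rate estimate)
    intercept: float    # fitted distance at year 0
    r_squared: float    # strength of the temporal signal
    root_year: float    # x-intercept: year at which fitted distance is zero


def root_to_tip_regression(years: List[float],
                           distances: List[float]) -> RootToTipFit:
    """Least-squares fit of root-to-tip distance against isolation year."""
    n = len(years)
    if n != len(distances) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    if sxx == 0 or sxy == 0:
        raise ValueError("no temporal signal: slope is undefined or zero")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return RootToTipFit(slope, intercept, sxy ** 2 / (sxx * syy),
                        -intercept / slope)


# Made-up isolation years and root-to-tip distances (arbitrary units).
toy_years = [2000.0, 2004.0, 2008.0]
toy_distances = [50.0, 58.0, 66.0]
fit = root_to_tip_regression(toy_years, toy_distances)
print(fit)
```

A weak or negative slope in such a fit would suggest too little temporal signal for tip-dated inference, which is why this kind of check is commonly run before a dated Bayesian analysis.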
Pipeline specifications

Software tools: Nesoni, Velvet, MUMmer, Prokka, SplitsTree, SeaView, RAxML, TempEst, BEAST, phangorn
Applications: Phylogenetics, Nucleotide sequence alignment
Organisms: Staphylococcus aureus
Diseases: Staphylococcal Infections
Chemicals: Erythromycin, Methicillin, Oxacillin, Tetracycline, Trimethoprim