Computational protocol: Phylogenetics and Differentiation of Salmonella Newport Lineages by Whole Genome Sequencing

[…] Bacterial cells were pelleted by centrifugation from 1 ml of a pure overnight culture in Tryptic-Soytone-Broth, and DNA was prepared using the DNeasy Blood & Tissue Kit (Qiagen, Valencia, CA) according to the manufacturer’s instructions. We sequenced 26 S. Newport strains on a Roche 454 GS-FLX Titanium sequencer (Roche, Branford, CT) to obtain draft genomes at 16–24× coverage (except the strain from canine_AZ_2003, at 9× coverage). This platform provides longer raw read lengths than other sequencing platforms. De novo assemblies were performed using the Roche Newbler software package (v2.3). The resulting contigs were annotated by NCBI using the Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP). Phylogenetically informative SNPs were identified via two independent alignment methods: 1) multiple genome alignment of whole-genome sequencing contigs using MAUVE, and 2) clustering of annotated open reading frames (ORFs) using reciprocal best Basic Local Alignment Search Tool (BLAST) hits with a 70% sequence identity setting, followed by alignment with Multiple Sequence Comparison by Log-Expectation (MUSCLE). [...] A parsimony phylogenetic tree was constructed from 147,780 concatenated informative SNPs using TNT, finding the minimum tree length 20 times with 100,000 iterations. We extracted seven housekeeping genes to perform MLST analysis; the concatenated housekeeping gene sequences were analyzed in TNT with the same settings (minimum tree length found 20 times, 100,000 iterations). Moreover, we performed multiple sequence alignment using MUSCLE in SeaView 4 and collected concatenated sequences of cas genes (cas1, cas2, cas5, cse1, cse2, cse3 and cse4) of around 6 kb. Strains from frog_Vietnam, fish_Hong_Kong, fish_Vietnam, canine_AZ_2003 and pig_ear_CA were not included in this analysis. We ran TNT, again finding the minimum tree length 20 times with 100,000 iterations, to display the evolutionary relatedness of the cas genes. [...]
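
As a minimal sketch of the SNP-concatenation step, the Python below filters a multiple sequence alignment down to parsimony-informative columns and joins them into one sequence per strain. The protocol does not state the exact criterion used, so this assumes the standard parsimony definition (at least two nucleotide states, each present in at least two strains). The function names and the toy alignment are hypothetical and are not part of the MAUVE, MUSCLE or TNT workflows.

```python
from collections import Counter


def informative_columns(alignment):
    """Return indices of parsimony-informative columns.

    `alignment` maps a strain name to its aligned sequence.
    Assumption: a column is informative when at least two of A/C/G/T
    each occur in at least two strains; gaps and ambiguity codes are ignored.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")

    keep = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in seqs)
        shared = [b for b, n in counts.items() if b in "ACGT" and n >= 2]
        if len(shared) >= 2:
            keep.append(i)
    return keep


def concatenate_sites(alignment, columns):
    """Build one concatenated SNP string per strain from the chosen columns."""
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical strains, not data from the study).
toy = {
    "strain_A": "ACGTA",
    "strain_B": "ACGTA",
    "strain_C": "ATGCA",
    "strain_D": "ATGAA",
}
cols = informative_columns(toy)
snp_matrix = concatenate_sites(toy, cols)
```

In the toy alignment only the second column qualifies: C and T each occur in two strains, whereas the fourth column has a single shared state (T) plus two singletons.
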
We used ClonalFrame to analyze the effects of recombination events on the evolutionary history of S. Newport Lineages II and III. S. Dublin CT_02021853, which showed close relatedness to both lineages, was used as an outgroup genome to display the recombination events and substitutions between S. Newport Lineages II and III. All 29 Salmonella genomes were aligned using progressiveMauve with the default settings. We used the stripSubsetLCBs script to extract core locally collinear blocks (LCBs), producing core alignments longer than 500 bp that included all 29 genomes. We obtained a total of 510 LCBs. Given the computational demands of analyzing all 510 blocks simultaneously, we created three separate datasets, each consisting of 50 randomly selected blocks. We ran ClonalFrame on each of these three datasets, estimating parameters from 200,000 generations, of which the first 100,000 served as burn-in. The thinning interval was set to 100. We then used the Gelman-Rubin statistic to determine whether the independent runs had converged on similar parameter estimates, which also provided evidence that random subsets of the genome did not bias our results. Furthermore, we used MAUVE to compare the genomic organizations. […]
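
The subsetting and convergence check described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' scripts: the protocol does not say whether the three 50-block datasets overlapped (the sketch draws disjoint ones), the seed and function names are hypothetical, and the Gelman-Rubin statistic is computed here in its basic form for a single scalar parameter sampled by several equal-length chains, which may differ from the variant implemented in ClonalFrame.

```python
import math
import random
import statistics


def draw_block_subsets(n_blocks=510, subset_size=50, n_subsets=3, seed=0):
    """Draw random sets of LCB indices.

    Assumption: the subsets are disjoint; the protocol does not specify this.
    """
    rng = random.Random(seed)
    picked = rng.sample(range(n_blocks), subset_size * n_subsets)
    return [sorted(picked[i * subset_size:(i + 1) * subset_size])
            for i in range(n_subsets)]


def retained_samples(generations=200_000, burn_in=100_000, thinning=100):
    """Number of posterior samples kept after burn-in and thinning."""
    return (generations - burn_in) // thinning


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    `chains` is a list of equal-length lists of post-burn-in samples,
    one list per independent run.
    """
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    # W: mean of the within-chain sample variances.
    within = statistics.mean([statistics.variance(c) for c in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero")
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


subsets = draw_block_subsets()
kept = retained_samples()
```

With the settings in the text, each run retains (200,000 - 100,000) / 100 = 1,000 samples. Values of the statistic close to 1 indicate that the runs agree; values well above 1 indicate that they have not converged on the same estimate.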

Pipeline specifications

Software tools: Newbler, PGAP, Mauve, BLASTN, SeaView, ClonalFrame
Applications: Phylogenetics, WGS analysis, Nucleotide sequence alignment
Organisms: Salmonella enterica subsp. enterica serovar Newport
Diseases: Salmonella Infections