Computational protocol: Complete Genome Sequence of Salmonella enterica subsp. enterica Serovar Thompson Strain RM6836

Protocol publication

[…] Salmonella enterica subsp. enterica is a major cause of food-borne illnesses associated with a wide variety of foods, including meat, eggs, fruits, vegetables, nuts, and spices. Based on the serologic identification of O (lipopolysaccharide) and H (flagellar) antigens, S. enterica subsp. enterica has been classified into a variety of groups and specific serovars. Among S. enterica subsp. enterica strains, the most common O-antigen serogroups are A, B, C1, C2, D, and E, and these serogroups cause approximately 99% of Salmonella infections in humans. S. enterica subsp. enterica serovar Thompson is in serogroup C1 and has been the cause of food-borne outbreaks associated with cilantro, arugula, chicken, beef, bread, and smoked salmon. S. Thompson strain RM6836 was isolated from lettuce in 2002 and serotyped by the FDA Center for Veterinary Medicine as part of the USDA, Agricultural Marketing Service, Microbiological Data Program.

Genome sequence data were generated from shotgun and paired-end (8 to 12 kb) libraries on a Roche 454 FLX+ sequencing system with Titanium chemistry. The Roche Newbler assembler (version 2.3) was used to assemble 187,876 shotgun and 103,498 paired-end reads into 64 contigs and a single scaffold. Genome closing combined several steps. The contigs were aligned to other S. enterica subsp. enterica genomes, including serovar Typhimurium LT2 and serovar Enteritidis strain P125109, using the software Mauve to find unexpected gaps. Scaffold gaps were filled by two complementary approaches: referenced assemblies of 1,907,370 Illumina MiSeq reads to the Newbler contigs using Geneious version 6.1.6, and the identification of repeated contigs using the Perl script contig_extender2. Certain gaps were validated using PCR amplification and Sanger sequencing. All base calls were validated using the Illumina MiSeq reads, which provided an additional 100× coverage.

The S. Thompson RM6836 genome size is 4,707,648 bp, with a G+C content of 52.2%. The genome sequence was annotated using the NCBI Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP) (http://www.ncbi.nlm.nih.gov/genomes/static/Pipeline.html) and was deposited with GenBank. The RM6836 genome is predicted to carry 4,621 genes, 7 ribosomal RNA operons, and 79 tRNAs. Bacteriophages were identified using PHAST, including one identified as Gifsy 1 and four remnant prophages. The S. Thompson RM6836 genome is highly syntenic to other S. enterica subsp. enterica serovars, with variable positions of prophage and bacteriophage remnants in the different serovars. RM6836 does not possess the virulence plasmid that is common to many other S. enterica subsp. enterica strains. To our knowledge, this is the first complete S. Thompson genome sequence to be released into the public domain. […]
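
The genome size and G+C content reported above are simple summary statistics of the closed sequence, and the quoted coverage follows from read count, read length, and genome size. The sketch below illustrates how such figures could be computed from a FASTA file; it is not the authors' procedure. The function names (read_fasta, genome_stats, mean_coverage) are hypothetical, and the mean read length is a caller-supplied parameter because the text does not state it.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def genome_stats(handle):
    """Return (total length in bp, G+C percentage) over all records."""
    total = 0
    gc = 0
    for _, seq in read_fasta(handle):
        total += len(seq)
        gc += seq.count("G") + seq.count("C")
    if total == 0:
        raise ValueError("no sequence data found")
    return total, 100.0 * gc / total


def mean_coverage(read_count, mean_read_length, genome_size):
    """Estimate average depth as total sequenced bases / genome size.

    mean_read_length is an assumption supplied by the caller.
    """
    return read_count * mean_read_length / genome_size


if __name__ == "__main__":
    # Toy record for illustration only; not data from the RM6836 genome.
    toy = StringIO(">toy\nACGT\nGGCC\n")
    size, gc_percent = genome_stats(toy)
    print(size, gc_percent)
```

For a real genome, the same functions could be applied to an open file handle for the downloaded FASTA record, with the MiSeq read count from the text passed to mean_coverage alongside a read length taken from the sequencing run.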

Pipeline specifications

Software tools: Newbler, Mauve, Geneious, PGAP, PHAST
Application: Nucleotide sequence alignment
Diseases: Salmonella Infections