Computational protocol: Draft Genome Sequence of Pectobacterium wasabiae Strain CFIA1002

Protocol publication

[…] Pectobacterium spp. cause soft rot disease in a wide range of crops and ornamental plants, both in the field and in storage, resulting in significant global economic losses in agricultural production. Species of the genus Pectobacterium were previously classified as Erwinia, a genus containing numerous species and subspecies of phytopathogenic bacteria that vary in molecular and biochemical characteristics and in host range. Pectobacterium spp. are characterized by their ability to produce large quantities of pectolytic enzymes involved in the maceration of parenchymal tissue of their plant hosts.

Pectobacterium wasabiae (formerly Erwinia carotovora subsp. wasabiae) was originally described as causing soft rot of Japanese horseradish and was later identified as the causal agent of potato tuber decay in New Zealand, the United States, and Iran. A recent study demonstrated that P. wasabiae also causes blackleg-like symptoms in potato plants. The pathogen possesses diverse regulatory systems with known virulence factors, including genes encoding pectolytic enzymes and the type III secretion system (T3SS), and it has many additional pathogenicity and virulence determinants acquired by horizontal gene transfer. Therefore, comparative genomics of P. wasabiae strains infecting potato and other hosts from different geographical locations would help identify the specific virulence factors involved in pathogenicity and host specificity.

P. wasabiae strain CFIA1002 was isolated from a blackleg-diseased potato stem sample in Canada. The draft genome sequences of P. wasabiae strains WPP163 and SCC3193, isolated from infected potato tubers in the United States and Europe, respectively, and of the type strain P. wasabiae CFBP3394, isolated from horseradish in Japan, are available at GenBank.

The draft genome sequence data for P. wasabiae strain CFIA1002 were generated using paired-end Illumina HiSeq sequencing technology with TruSeq version 3 chemistry at the National Research Council Canada (Saskatoon, Saskatchewan, Canada). Sequencing yielded 8,682,640 reads of 101 bp each (insert size, 300 bp), totaling 876,946,640 bp and providing approximately 175× genome coverage. After quality checking using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), initial de novo assembly using ABySS produced 78 contigs contained in 69 scaffolds; scaffolds shorter than 300 bp were removed. SSPACE and GapFiller were then applied to the remaining scaffolds to extend and merge them into larger scaffolds and to close the gaps between the short scaffolds. The final draft genome consists of 42 scaffolds totaling 5,008,535 bp, including 324 Ns. The G+C content of the draft genome is 50.59%.

Annotation conducted on the RAST server using the Glimmer 3 option predicted 4,615 protein-coding genes and 96 noncoding RNAs. A number of predicted virulence factors, phage loci, and motility and chemotaxis genes were identified, which may facilitate pathogenicity in specific environments. The variable genomic regions, especially pathogenicity-related loci, were highly correlated with different environmental factors, including the host species. Further comparison of the genome sequences of strains from different hosts and geographic regions will provide additional insights into virulence, functionality, and plant/pest interactions, and will contribute to the development of specific assays for accurate identification and detection of the pathogen. […]
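The reported read yield and coverage can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the published protocol. The function name `fold_coverage` is hypothetical, and dividing by the final draft assembly length (rather than an independent genome size estimate) is an assumption.

```python
def fold_coverage(total_bases: int, genome_len: int) -> float:
    """Return average fold coverage: sequenced bases divided by genome length."""
    return total_bases / genome_len


# Values reported in the protocol text.
n_reads = 8_682_640
read_len = 101            # bp per read
genome_len = 5_008_535    # bp, final draft assembly (assumed denominator)

total_bases = n_reads * read_len
coverage = fold_coverage(total_bases, genome_len)

print(f"{total_bases:,} bp sequenced, ~{coverage:.0f}x coverage")
```

Under these assumptions the product of read count and read length matches the stated total of 876,946,640 bp, and the ratio rounds to the stated 175×.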

Pipeline specifications

Software tools: FastQC, ABySS, SSPACE, GapFiller, RAST, Glimmer
Organisms: Pectobacterium wasabiae, Solanum tuberosum
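
The protocol reports removing scaffolds shorter than 300 bp and then summarizing the assembly by scaffold count, total length, number of Ns, and G+C content. A minimal Python sketch of that bookkeeping is given below, using only the standard library. It is not the authors' code: the function names `read_fasta` and `filter_and_summarize` are hypothetical, the toy input is invented for illustration, and computing G+C over non-N bases only is an assumption, since the source does not state how Ns were handled.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA file-like object."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def filter_and_summarize(records, min_len=300):
    """Drop scaffolds shorter than min_len and summarize what remains."""
    kept = [(h, s) for h, s in records if len(s) >= min_len]
    total = sum(len(s) for _, s in kept)
    n_bases = sum(s.count("N") for _, s in kept)
    gc = sum(s.count("G") + s.count("C") for _, s in kept)
    called = total - n_bases  # assumption: Ns are excluded from the G+C denominator
    return {
        "scaffolds": len(kept),
        "total_bp": total,
        "n_bases": n_bases,
        "gc_percent": 100.0 * gc / called if called else 0.0,
    }


# Toy input (invented): one 402 bp scaffold with two Ns and one 4 bp scaffold.
toy = StringIO(">s1\n" + "GC" * 100 + "NN" + "AT" * 100 + "\n>s2\nACGT\n")
stats = filter_and_summarize(read_fasta(toy))
print(stats)
```

For a real assembly, the `StringIO` object would be replaced by an open file handle on the scaffold FASTA file.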