Computational protocol: Draft Genome Sequence of 16SrIII-J Phytoplasma, a Plant Pathogenic Bacterium with a Broad Spectrum of Hosts

Protocol publication

[…] Phytoplasma diseases are distributed worldwide, and many phytoplasma species have been reported in South America. Among them, the 16SrIII-J ribosomal subgroup in particular has been detected in many crops, from herbaceous to woody plants, mainly in Argentina, Brazil, and Chile. Two efficient insect vectors of 16SrIII-J phytoplasma have been found in Chile, and several weeds have been reported as reservoirs of the same phytoplasma. Knowledge of the genome of this phytoplasma is therefore highly relevant to the phytosanitary status of agriculture in South America.
DNA was extracted from single infected periwinkle plants according to a previously reported protocol, modified by adding an RNase A digestion step before the phenol-chloroform step. Sequencing was done in two steps. First, DNA extracts from infected and healthy periwinkle were analyzed using Illumina paired-end sequencing. Phytoplasma reads were obtained as described previously. Sequences from infected periwinkle were assembled de novo using Velvet, and the resulting contigs were compared against reads from healthy (phytoplasma-negative) periwinkle. Contigs with coverage lower than 10× and contigs that matched one or more pairs of healthy-plant reads were discarded, leaving 475 contigs. The reads contained in these contigs were extracted and assembled using Celera Assembler, yielding 176 scaffolds with a total length of 720,327 bp.
In the second step, sequencing was continued using Ion Torrent PGM mate-pair sequencing of DNA extract from infected periwinkle only. Reads with coverage lower than 10× were removed, and the remaining reads were mapped against the phytoplasma reads obtained in step 1. The final 101,204 reads were used for scaffolding, yielding 29 scaffolds with a total length of 687,253 bp, an N50 of 45,393 bp, and minimum and maximum scaffold sizes of 2,702 and 113,168 bp, respectively. The G+C content was 27.72%.
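The contig filter and the summary statistics reported above (total length, N50, minimum and maximum size, G+C content) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `Contig` record, its `coverage` and `healthy_read_pairs` fields, and all function names are hypothetical, and the sketch assumes that per-contig coverage and the number of matching healthy-plant read pairs have already been computed by a mapping tool.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    # Hypothetical record; field names are illustrative, not from the source.
    name: str
    sequence: str
    coverage: float          # mean read depth over the contig
    healthy_read_pairs: int  # read pairs from healthy periwinkle matching the contig


def filter_contigs(contigs, min_coverage=10.0):
    """Keep contigs with coverage >= min_coverage and no healthy-plant read pairs."""
    return [
        c for c in contigs
        if c.coverage >= min_coverage and c.healthy_read_pairs == 0
    ]


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # only reached for an empty input


def gc_percent(sequences):
    """G+C percentage over all unambiguous (A, C, G, T) bases."""
    bases = "".join(sequences).upper()
    acgt = sum(bases.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (bases.count("G") + bases.count("C")) / acgt


def assembly_stats(sequences):
    """Summary statistics of the kind reported for the 29 scaffolds."""
    if not sequences:
        raise ValueError("no sequences given")
    lengths = [len(s) for s in sequences]
    return {
        "count": len(lengths),
        "total_length": sum(lengths),
        "n50": n50(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "gc_percent": gc_percent(sequences),
    }
```

In practice the sequences would be read from the assembler's FASTA output, and `assembly_stats` would be called on the scaffold sequences after filtering. The 10× threshold is the only parameter taken from the text.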
Annotation with RAST (http://rast.nmpdr.org), run according to its instructions, identified 696 coding sequences (CDSs), 303 of which code for hypothetical proteins, and 34 tRNAs. The IdpA gene, coding for an immunodominant membrane protein specific to the X-disease group, was identified with 85% nucleotide identity to Vaccinium witches’ broom phytoplasma, isolate VAC, and 85% positives (76% identity) at the amino acid level with Italian clover phyllody phytoplasma, isolate MA1. Using BLASTx against web and local databases, 11 CDSs containing the SEC translocase complex signal peptide were identified, but no homologs of SAP11, SAP54, PHYL, or TENGU were found. To our knowledge, this is the first draft sequence of phytoplasma 16SrIII-J. Further studies are in progress to obtain the complete genome sequence. […]
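The effector search described above can be illustrated with a short sketch that scans BLAST hits for named effector proteins. It assumes the local BLASTx results were exported in the standard 12-column BLAST+ tabular format (`-outfmt 6`), which the source does not state. The function names, the E-value cutoff of 1e-5, and the convention that subject IDs are plain labels such as `SAP11` are all assumptions made for illustration.

```python
import csv

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_blast6(handle):
    """Parse tab-separated BLAST hits from an open text handle into dicts."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_FIELDS, row))
        hit["pident"] = float(hit["pident"])
        hit["length"] = int(hit["length"])
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        hits.append(hit)
    return hits


def effectors_found(hits, effector_ids, max_evalue=1e-5):
    """Map each effector ID to the query CDSs that hit it below max_evalue.

    The cutoff is an illustrative assumption, not a value from the source.
    """
    found = {name: [] for name in effector_ids}
    for hit in hits:
        if hit["sseqid"] in found and hit["evalue"] <= max_evalue:
            found[hit["sseqid"]].append(hit["qseqid"])
    return found
```

Calling `effectors_found(hits, ["SAP11", "SAP54", "PHYL", "TENGU"])` returns an empty list for every effector when no hit passes the cutoff, which corresponds to the negative result reported in the text.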

Pipeline specifications

Software tools: Velvet, Celera Assembler, RAST, BLASTX
Databases: NMPDR
Application: Membrane protein analysis