Computational protocol: Campylobacter group II phage CP21 is the prototype of a new subgroup revealing a distinct modular genome organization and host specificity

Protocol publication

[…] Whole-genome sequencing was performed by LGC Genomics (Berlin, Germany) as previously described []. The library for 454 FLX sequencing was prepared according to the manufacturer’s standard protocols (Roche/454 Life Sciences, Branford, Connecticut, USA). Phage DNA was sheared by nebulization into fragments of 500 to 1000 bp. The fragments were end-polished, and the 454 A and B adaptors required for emulsion PCR and sequencing were ligated to their ends. The concentration of the resulting fragment library was measured by fluorometry (Qubit 2.0, Life Technologies, Darmstadt, Germany), and sequencing was performed on 1/16 of a PicoTiterPlate (PTP) on the GS FLX using Roche/454 Titanium chemistry. A total of 55,386 sequence reads were assembled using the Roche/454 Newbler software at default settings (454 Life Sciences Corporation, software release 2.3 (091027_1459)). Assembly yielded three independent contigs (CP21_C1, 94,267 bp; CP21_C2, 56,052 bp; and CP21_C3, 28,698 bp) with an average sequence coverage of more than 105-fold per consensus base. Gaps were closed by PCR and Sanger sequencing. To determine the sequences of the repeat regions (RR), PCR primers were designed from their flanking sequences. The repeat regions were amplified by PCR and used as targets for in vitro transposon mutagenesis using the Epicentre EZ-Tn5 <TET-1> Insertion Kit (Biozym, Hessisch Oldendorf, Germany) according to the manufacturer’s recommendations. Mutagenized PCR products were inserted into pLitmus38 (ampicillin resistance, Ap^r; New England Biolabs, Frankfurt am Main, Germany) and introduced into E. coli strain GeneHogs (Invitrogen, Karlsruhe, Germany). Transformants were selected on agar containing tetracycline (12.5 μg/ml). For each repeat region, the nucleotide sequences of eight to twelve transformants carrying transposon insertions at different positions were used to determine the complete sequence of the region.
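The assembly figures reported above can be cross-checked with simple arithmetic. The following Python sketch sums the three contig lengths and estimates the mean read length that the reported coverage would imply. It assumes that every one of the 55,386 reads was incorporated into the assembly, which the protocol does not state, so the result is only a rough lower bound and not a value taken from the source. The function and variable names are illustrative.

```python
# Contig lengths (bp) and read count as reported in the protocol text.
CONTIG_LENGTHS = {"CP21_C1": 94_267, "CP21_C2": 56_052, "CP21_C3": 28_698}
TOTAL_READS = 55_386
REPORTED_MIN_COVERAGE = 105  # "more than 105" per consensus base


def total_assembled_length(contigs):
    """Sum the lengths of all contigs (before gap closure)."""
    return sum(contigs.values())


def implied_mean_read_length(total_reads, assembled_bp, coverage):
    """Mean read length needed to reach `coverage`.

    Assumption (not from the source): all reads are assembled,
    so coverage = total_reads * mean_read_length / assembled_bp.
    """
    return coverage * assembled_bp / total_reads


assembled_bp = total_assembled_length(CONTIG_LENGTHS)
min_read_len = implied_mean_read_length(
    TOTAL_READS, assembled_bp, REPORTED_MIN_COVERAGE
)

print(f"Assembled length before gap closure: {assembled_bp} bp")
print(f"Implied mean read length: about {min_read_len:.0f} bp")
```

Under this assumption the three contigs total 179,017 bp, and a coverage of 105-fold corresponds to a mean read length of roughly 339 bp. This figure is an estimate for orientation only; the final genome length after gap closure and repeat-region sequencing is not derived here.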
Sequencing was performed using primers derived from the marker gene of the EZ-Tn5 <TET-1> Insertion Kit (Biozym). The initial nucleotide sequence of the CP21 draft genome was submitted to EMBL under the accession number HE815464 []. Because 454 sequencing is prone to frameshift errors at homopolymer stretches [], the genome sequence was verified at 70 critical regions. An updated CP21 genome sequence was submitted to GenBank in March 2015. Sequence analyses and alignments were carried out using the DS Gene software package (Accelrys Inc., USA). Putative open reading frames (ORFs) were predicted using the RAST server [–]. Similarity and identity values were calculated using different BLAST algorithms (http://www.ncbi.nlm.nih.gov/BLAST/) at the NCBI homepage []. Putative Rho-independent transcription terminators were identified using TransTerm [] and ARNold (http://rna.igmors.u-psud.fr/toolbox/arnold/). […]
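Because homopolymer stretches are the regions where 454 reads are most likely to introduce frameshifts, candidate sites for re-checking can be listed by scanning the consensus sequence for runs of a single base. The Python sketch below is one possible way to do this. The protocol does not say how the 70 critical regions were chosen, so the minimum run length of 5 and the function name `find_homopolymers` are assumptions made for illustration.

```python
import re


def find_homopolymers(sequence, min_length=5):
    """Locate homopolymer runs in a DNA sequence.

    Returns a list of (start, end, base, length) tuples with 0-based
    start and exclusive end. `min_length` is a hypothetical threshold,
    not a value taken from the protocol.
    """
    if min_length < 2:
        raise ValueError("min_length must be at least 2")
    seq = sequence.upper()
    # One base followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (m.start(), m.end(), m.group(1), m.end() - m.start())
        for m in pattern.finditer(seq)
    ]


# Toy sequence for demonstration; not taken from the CP21 genome.
example = "ATG" + "A" * 6 + "CCG" + "T" * 7 + "GCA"
for start, end, base, length in find_homopolymers(example):
    print(f"{base} x {length} at positions {start}-{end - 1}")
```

On the toy sequence this reports a run of six A residues and a run of seven T residues. In practice the reported positions would then be checked against Sanger reads or another independent sequencing method.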

Pipeline specifications

Software tools: Newbler, RAST
Databases: TransTerm
Application: WGS analysis
Diseases: HIV Infections