Computational protocol: Whole Genome Sequencing of Three Clonal Clinical Isolates of B. cenocepacia from a Patient with Cystic Fibrosis

[…] Illumina adapters were removed from the raw FASTQ files using cutadapt []. Reads were then mapped to the Burkholderia cenocepacia J2315 reference [], separately for each sample, using bowtie2 [] with the options --phred33 --local --dovetail --maxins 850. Variant positions were called using samtools [] and, to ensure high confidence, were filtered with a bespoke Python script on five criteria: (i) a minimum read depth of five, with at least one read in each of the forward and reverse directions; (ii) a maximum depth below the highest 2.5% of the depth distribution for the sample; (iii) a minimum root-mean-square read mapping quality of 30; (iv) a minimum of 75% of reads supporting the consensus call; and (v) a homozygous call under the diploid model assumed by samtools.

De novo assembly was performed on the reads left unmapped by bowtie2, as well as on the entire read set, using Velvet [] in conjunction with VelvetOptimiser. All contigs >500 bp (a mean of 200 and 1412 per sample for the unmapped and complete read sets, respectively) were used in downstream analyses. To identify large regions that varied between samples and were absent from J2315, contigs assembled from the unmapped reads were compared using Mauve [] and NUCmer [] to identify unique regions. Each candidate region was then confirmed by BLAST searches against the Velvet contigs assembled from the entire read set; a region that was not present in every sample was deemed a variant.

Larger variants between samples that were present in the reference were identified using a window-based approach on the reference-based assembly, in which any 1 kb region with >500 bp differing between two samples was flagged. BLAST Ring Image Generator (BRIG) was used to generate visual genome comparisons against the J2315 reference [], with low-confidence calls in the query assigned the corresponding base from the reference sequence. […]
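The five variant-filtering criteria can be illustrated with a minimal Python sketch. This is not the authors' bespoke script, which is not reproduced in the protocol. The SiteCall record, its field names, and both function names are hypothetical, and criterion (ii) is assumed here to mean that a site's depth must not exceed the 97.5th percentile of per-site depths for the sample.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class SiteCall:
    """Per-position summary of a called site (all field names are hypothetical)."""
    depth: int                  # total reads covering the site
    forward_reads: int          # reads mapped in the forward direction
    reverse_reads: int          # reads mapped in the reverse direction
    rms_mapping_quality: float  # root-mean-square mapping quality
    consensus_fraction: float   # share of reads supporting the consensus call
    homozygous: bool            # call is homozygous under the diploid model


def depth_ceiling(depths):
    """Return the 97.5th percentile of per-site depths.

    Assumed reading of criterion (ii); quantiles(n=40) yields cut points
    at multiples of 2.5%, so the last one marks the top 2.5%.
    """
    return quantiles(depths, n=40)[-1]


def passes_filters(site, max_depth):
    """Apply criteria (i) to (v) from the text to a single site."""
    return (
        site.depth >= 5                        # (i) minimum depth of five
        and site.forward_reads >= 1            # (i) at least one forward read
        and site.reverse_reads >= 1            # (i) at least one reverse read
        and site.depth <= max_depth            # (ii) not in the top 2.5% of depths
        and site.rms_mapping_quality >= 30     # (iii) RMS mapping quality
        and site.consensus_fraction >= 0.75    # (iv) consensus support
        and site.homozygous                    # (v) homozygous call
    )
```

In use, depth_ceiling would be computed once per sample from all per-site depths, and passes_filters would then be applied to every called position in that sample.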
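The window-based comparison of two samples might be sketched as follows. The protocol does not state whether the 1 kb windows overlapped or how the per-sample consensus sequences were stored, so this sketch assumes non-overlapping windows over two equal-length consensus strings in reference coordinates. The function name and parameters are hypothetical.

```python
def divergent_windows(seq_a, seq_b, window=1000, threshold=500):
    """Return (start, end, n_diff) for each window with more than `threshold` differing positions.

    Assumes seq_a and seq_b are consensus sequences for two samples,
    both in the coordinates of the same reference and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("consensus sequences must share reference coordinates")

    hits = []
    # Step through the sequences in non-overlapping windows (an assumption).
    for start in range(0, len(seq_a), window):
        end = min(start + window, len(seq_a))
        # Count positions at which the two samples carry different bases.
        n_diff = sum(a != b for a, b in zip(seq_a[start:end], seq_b[start:end]))
        if n_diff > threshold:
            hits.append((start, end, n_diff))
    return hits
```

Coordinates in the returned tuples are 0-based and half-open, so a hit of (1000, 2000, 600) covers reference positions 1001 to 2000 in 1-based terms.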

Pipeline specifications

Software tools: cutadapt, Bowtie2, SAMtools, Velvet, VelvetOptimiser, Mauve, MUMmer, BRIG
Organisms: Burkholderia cenocepacia, Burkholderia cepacia, Escherichia coli, Homo sapiens
Diseases: Cystic Fibrosis, Infection, Pulmonary Disease, Chronic Obstructive
Chemicals: Nucleotides