Computational protocol: Whole-genome sequencing to investigate a non-clonal melioidosis cluster on a remote Australian island

[…] The high-quality, finished genome of the water supply storage tank isolate MSHR0491 was used as the reference (GenBank accession numbers CP009485.1 and CP009484.1; SRA accession number ERR311054). Comparative genomic analysis was performed using BWA v0.6.2 for read alignment; SAMtools v1.2 and Picard for alignment processing; GATK v3.2.2 for identification of single-nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) and for comparative analysis of variants; and SnpEff v4.1 for variant annotation. These programs were executed as part of the SPANDx pipeline v3.1.2, which wraps these tools for ease of use and reproducibility. Regions of recombination were identified using Gubbins v1.4.1 and ClonalFrameML. Phylogenetic reconstruction was performed using maximum parsimony in PAUP* v4.0a142 or maximum likelihood in RAxML v8.1.17, with resultant trees visualized and manipulated using FigTree v1.4.0. […] were assembled to improved high-quality draft standard using the MGAP pipeline, which consists of Trimmomatic and Velvet, with parameters optimized using VelvetOptimiser, and subsequent draft improvement with SSPACE, GapFiller, IMAGE and iCORN2. MLST profiles were extracted from whole-genome assemblies using BIGSdb, which is available on the B. pseudomallei PubMLST website […]
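The comparative analysis of variants described above can be illustrated with a toy sketch. It is not the SPANDx or GATK implementation; it only shows the idea of taking reference-aligned sequences, keeping positions called in every isolate, and counting pairwise SNP differences, which is the kind of distance used to judge whether isolates are clonal. The isolate names other than MSHR0491, the sequences, and the function names `variable_sites` and `pairwise_snp_distances` are all hypothetical.

```python
from itertools import combinations

MISSING = {"N", "-"}  # assumed symbols for uncalled or deleted positions


def variable_sites(alignment):
    """Return column indices where isolates differ.

    Columns containing a missing call in any isolate are skipped, so only
    positions called in every genome contribute to the comparison.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same length")
    sites = []
    for i in range(length):
        column = {alignment[name][i].upper() for name in names}
        if column & MISSING:
            continue
        if len(column) > 1:
            sites.append(i)
    return sites


def pairwise_snp_distances(alignment):
    """Count SNP differences between every pair of isolates."""
    sites = variable_sites(alignment)
    distances = {}
    for a, b in combinations(sorted(alignment), 2):
        distances[(a, b)] = sum(
            alignment[a][i].upper() != alignment[b][i].upper() for i in sites
        )
    return distances


# Toy data: hypothetical sequences aligned to the reference isolate.
alignment = {
    "MSHR0491": "ACGTACGTAC",   # reference
    "isolate_A": "ACGTACGTAC",  # hypothetical, identical to the reference
    "isolate_B": "ACGAACGTTC",  # hypothetical, two SNPs
    "isolate_C": "ACNTACGTAC",  # hypothetical, one uncalled position
}

distances = pairwise_snp_distances(alignment)
for pair, count in distances.items():
    print(pair, count)
```

In this toy alignment, the column with the uncalled base is excluded for all isolates, so `isolate_C` is compared only at positions where every genome has a call. Real pipelines apply additional filters (coverage, quality, recombination masking) that this sketch omits.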

Pipeline specifications

Software tools: BWA, SAMtools, Picard, GATK, SnpEff, SPANDx, Gubbins, ClonalFrame, PAUP*, RAxML, FigTree, Trimmomatic, Velvet, VelvetOptimiser, SSPACE, GapFiller, iCORN, BIGSdb
Databases: PubMLST
Applications: Phylogenetics, De novo sequencing analysis, Nucleotide sequence alignment
Organisms: Burkholderia pseudomallei, Homo sapiens
Diseases: Infection, Leukemia, Lymphoid, Melioidosis