Computational protocol: Virulent Clones of Klebsiella pneumoniae: Identification and Evolutionary Scenario Based on Genomic and Phenotypic Characterization

[…] For each MLST locus, an allele number was assigned to each distinct sequence variant (confirmed by at least two chromatogram traces), and a distinct sequence type (ST) number was attributed to each distinct combination of alleles at the seven genes. Allele and profile numbers were incremented successively in the order in which they were discovered. To define the relationships among isolates at the microevolutionary level, we performed allelic profile-based comparisons using a minimum spanning tree (MStree) analysis with the BioNumerics v5.10 software (Applied Maths, Sint-Martens-Latem, Belgium). MStree analysis links profiles so that the sum of the distances (number of distinct alleles between two STs) is minimized. Isolates were grouped into clonal complexes (clonal families), defined as groups of profiles differing by no more than one gene from at least one other profile of the group. Accordingly, singletons were defined as STs having at least two allelic mismatches with all other STs.

Split decomposition analysis was performed using SplitsTree version 4.10. Neighbor-joining tree analysis was performed using MEGA v4. Nucleotide diversity indices were calculated using DnaSP v4. ClonalFrame analysis was performed with 50,000 burn-in iterations and 100,000 subsequent iterations.

The relative contributions of recombination and mutation in the short term were calculated using eBURST and the clonal diversification method. For each pair of allelic profiles that differ by a single allelic mismatch (single locus variants, or SLVs), the number of nucleotide changes between the alleles that differ is counted. A single nucleotide difference is considered likely to be caused by mutation, whereas more than one nucleotide difference in the same gene portion is considered to derive from recombination, as it is considered unlikely that two mutations would occur in the same gene while the other genes remain identical.
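
The clonal complex definition and the SLV-based counting rule described above can be illustrated with a minimal Python sketch. It is not the eBURST or BioNumerics implementation. The function names, the three-locus toy scheme (the real scheme has seven genes), the locus names and the allele sequences are all hypothetical and serve only to show the logic.

```python
from itertools import combinations

def allelic_mismatches(p, q):
    """Number of loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q))

def clonal_complexes(profiles):
    """Group STs so that each differs at no more than one locus from at
    least one other member of its group (single-linkage, via union-find)."""
    parent = {st: st for st in profiles}

    def find(st):
        while parent[st] != st:
            parent[st] = parent[parent[st]]  # path compression
            st = parent[st]
        return st

    for s, t in combinations(profiles, 2):
        if allelic_mismatches(profiles[s], profiles[t]) <= 1:
            parent[find(s)] = find(t)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return list(groups.values())

def classify_slv_pairs(profiles, allele_seqs, loci):
    """For every SLV pair, count nucleotide differences at the variant locus:
    exactly one difference -> mutation, more than one -> recombination."""
    counts = {"mutation": 0, "recombination": 0}
    for s, t in combinations(profiles, 2):
        diff = [i for i, (a, b) in enumerate(zip(profiles[s], profiles[t])) if a != b]
        if len(diff) != 1:
            continue  # not a single locus variant pair
        i = diff[0]
        seq_s = allele_seqs[loci[i]][profiles[s][i]]
        seq_t = allele_seqs[loci[i]][profiles[t][i]]
        n_diff = sum(x != y for x, y in zip(seq_s, seq_t))
        counts["mutation" if n_diff == 1 else "recombination"] += 1
    return counts

# Hypothetical toy data: three loci, four STs.
loci = ["locusA", "locusB", "locusC"]
profiles = {
    "ST1": (1, 1, 1),
    "ST2": (2, 1, 1),  # SLV of ST1 at locusA
    "ST3": (1, 3, 1),  # SLV of ST1 at locusB
    "ST4": (4, 4, 4),  # at least two mismatches with every other ST
}
allele_seqs = {
    "locusA": {1: "ACGTACGT", 2: "ACGTACGA", 4: "TTGTACGT"},
    "locusB": {1: "GGCCTTAA", 3: "GACCTTTA", 4: "GGCCAAAA"},
    "locusC": {1: "ATATATAT", 4: "ATATCTAT"},
}

complexes = clonal_complexes(profiles)
events = classify_slv_pairs(profiles, allele_seqs, loci)
singletons = [c for c in complexes if len(c) == 1]
```

In this toy data set, ST1, ST2 and ST3 form one clonal complex and ST4 is a singleton. The ST1/ST2 pair differs by one nucleotide and is attributed to mutation, while the ST1/ST3 pair differs by two and is attributed to recombination. The sketch assumes aligned allele sequences of equal length.
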
No correction was made for single nucleotide differences possibly introduced by recombination.

The population recombination rate was estimated by a composite-likelihood method with LDhat. LDhat employs a parametric approach, based on the neutral coalescent, to estimate the scaled parameter 2Ner, where Ne is the effective population size and r is the rate at which recombination events separate adjacent nucleotides. The crossing-over model L was used for the analysis of biallelic sites, with frequency of the less frequent allele >0.1. […]
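
The site selection step described above (biallelic sites with a less frequent allele above 0.1) can be sketched in Python as follows. This only illustrates the filtering criterion. It does not call LDhat or reproduce its input format, and the function name and toy alignment are hypothetical. The sketch assumes an alignment of equal-length sequences without gaps or ambiguous bases.

```python
from collections import Counter

def biallelic_sites(alignment, min_maf=0.1):
    """Return (column index, minor allele frequency) for every column that
    has exactly two alleles and a minor allele frequency strictly > min_maf."""
    n_seqs = len(alignment)
    kept = []
    for pos in range(len(alignment[0])):
        counts = Counter(seq[pos] for seq in alignment)
        if len(counts) != 2:
            continue  # monomorphic, or more than two alleles
        maf = min(counts.values()) / n_seqs
        if maf > min_maf:
            kept.append((pos, maf))
    return kept

# Hypothetical toy alignment of ten sequences, four columns:
#   column 0: monomorphic                    -> dropped
#   column 1: biallelic, minor allele 1/10   -> dropped (0.1 is not > 0.1)
#   column 2: biallelic, minor allele 3/10   -> kept
#   column 3: three alleles                  -> dropped
alignment = [
    "ACGT", "ACGT", "ACGT", "ACAT", "ACAT",
    "ACAT", "ACGC", "ACGG", "ACGT", "ATGT",
]

kept = biallelic_sites(alignment)
```

Using a strict inequality follows the ">0.1" wording of the protocol. Whether the original analysis treated a frequency of exactly 0.1 as included is not stated in the text.
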

Software tools BioNumerics, SplitsTree, MEGA, DnaSP, ClonalFrame, eBURST
Applications Phylogenetics, WGS analysis, GWAS
Organisms Klebsiella pneumoniae, Mus musculus, Homo sapiens, Klebsiella pneumoniae subsp. ozaenae
Diseases Hematologic Diseases, Infection, Liver Diseases, Pneumonia, Respiratory Tract Infections, Rhinoscleroma