Computational protocol: Molecular Strain Typing of Mycobacterium tuberculosis: a Review of Frequently Used Methods

Protocol publication

[…] Almost all higher eukaryote genomes contain tandemly repeated sequences dispersed in thousands of copies (). Because the repeat number at many of these loci is highly variable, they are called “variable number tandem repeat” (VNTR) loci (). Small repetitive DNA sequences with distinct characteristics have been found in M. tuberculosis and other mycobacterial genomes by several groups (). In 1997, Supply et al. () identified a novel minisatellite-like structure in the M. tuberculosis genome, composed of 40- to 100-bp repetitive sequences, and named these sequences “mycobacterial interspersed repetitive units” (MIRUs). They are scattered across 41 locations in the genome of M. tuberculosis H37Rv. Of those 41 locations, 12 show copy-number polymorphism among unrelated M. tuberculosis isolates (). MIRUs are located mainly in intergenic regions and are dispersed throughout the mycobacterial chromosome. They differ from other repetitive sequences in several respects: they have no obvious palindromic sequences but are instead direct tandem repeats, they are oriented in one direction relative to transcription of the adjacent gene, and they contain small open reading frames (ORFs) (). In 2001, Supply and colleagues demonstrated the usefulness of MIRUs for mycobacterial strain identification in epidemiologic studies by developing an automated PCR-based method with computerized genotyping. The typing system is based on PCR amplification of 12 variable tandem repeat loci with specific primers complementary to the flanking regions, followed by gel electrophoresis. The size (bp) of each amplicon reflects the number of tandem repeat units it contains and is converted into a numerical code, giving a digital result in which each digit represents the number of copies at a particular locus (). Comparing these numeric codes makes it easier to handle a large number of strains.
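The conversion from amplicon size to a numeric code can be illustrated with a minimal Python sketch. It assumes that an amplicon consists of a fixed flanking length plus a whole number of repeat units; the locus names, flanking lengths, and repeat lengths below are invented for illustration and are not the published MIRU-VNTR allele-calling tables, which should be used in practice.

```python
# Hypothetical locus definitions: (flanking length in bp, repeat unit length in bp).
# These names and sizes are invented for illustration only.
LOCI = {
    "locus_A": (150, 53),
    "locus_B": (200, 77),
    "locus_C": (120, 51),
}


def copies_from_amplicon(amplicon_bp, flank_bp, repeat_bp):
    """Estimate the repeat copy number from an amplicon size measured on a gel."""
    # Gel sizing is approximate, so round to the nearest whole number of repeats.
    copies = round((amplicon_bp - flank_bp) / repeat_bp)
    if copies < 0:
        raise ValueError("amplicon is shorter than the flanking regions")
    return copies


def profile_code(amplicon_sizes, loci=LOCI):
    """Build a numeric code with one digit per locus, in the order of `loci`."""
    digits = []
    for locus, (flank_bp, repeat_bp) in loci.items():
        copies = copies_from_amplicon(amplicon_sizes[locus], flank_bp, repeat_bp)
        if copies > 9:
            # Copy numbers above 9 need a separate convention; not handled here.
            raise ValueError(f"{locus}: copy number {copies} does not fit one digit")
        digits.append(str(copies))
    return "".join(digits)


# Measured amplicon sizes (bp) for one isolate; 260 bp is deliberately off-grid.
sizes = {"locus_A": 260, "locus_B": 431, "locus_C": 375}
print(profile_code(sizes))  # 235
```

Two isolates can then be compared simply by comparing their code strings, which is what makes the digital format convenient for large collections.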
Furthermore, the MIRU-VNTR technique is a reliable genotyping method: it is reported to be 100% reproducible (), sensitive, and specific for M. tuberculosis complex isolates. At the same time, a website was set up for the analysis of M. tuberculosis MIRU-VNTR genotypes via the Internet (). The method has since been used for real-time tracing of transmission, inter-laboratory data comparison, and construction of global databases (), opening the way to worldwide epidemiologic surveillance of tuberculosis.
MIRU-VNTRs are remarkably stable, and their evolution rate is slightly slower than that of IS6110-RFLP, so the method is appropriate for long-term epidemiologic analyses (). The discriminatory power of 12-locus MIRU-VNTR typing is slightly lower than that of IS6110-RFLP analysis when M. tuberculosis isolates carry high copy numbers of IS6110 (, ) but higher when isolates carry low copy numbers of IS6110 (). Another drawback of 12-locus MIRU-VNTR typing is its limited power to discriminate Beijing family strains of M. tuberculosis (, ). In 2006, Supply et al. () selected 15 loci (including 6 previously investigated) as the new standard for epidemiologic discrimination of M. tuberculosis, and 24 loci (including 12 previously investigated) as a high-resolution tool for phylogenetic studies. The proposed 15-locus set has been accepted as highly discriminatory in populations dominated by Beijing family strains (), and the 24-locus set as equal in discriminatory power to IS6110-RFLP (, ).
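Comparisons of discriminatory power such as those above are commonly expressed with the Hunter-Gaston discriminatory index, D = 1 - [1 / (N(N - 1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j is the number of isolates with the j-th pattern. The sketch below is a minimal illustration of that formula; the isolate profiles are invented, and truncating a profile to its first loci is only a stand-in for typing with a smaller locus set.

```python
from collections import Counter


def hunter_gaston_index(profiles):
    """Hunter-Gaston discriminatory index for a list of typing profiles.

    Each profile is a tuple of copy numbers; identical tuples count as one type.
    """
    n = len(profiles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    type_sizes = Counter(profiles).values()
    return 1 - sum(size * (size - 1) for size in type_sizes) / (n * (n - 1))


# Invented four-locus profiles for six isolates (illustration only).
isolates = [
    (2, 3, 5, 1),
    (2, 3, 5, 1),
    (2, 3, 5, 2),
    (2, 3, 4, 1),
    (2, 3, 4, 1),
    (1, 3, 5, 1),
]

full_set = hunter_gaston_index(isolates)
reduced_set = hunter_gaston_index([p[:3] for p in isolates])  # drop the last locus

print(f"4 loci: {full_set:.3f}")     # 0.867
print(f"3 loci: {reduced_set:.3f}")  # 0.733
```

In this toy data set, dropping one locus merges two patterns and lowers the index, which mirrors the general reason larger locus sets tend to discriminate better.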
The addition of hypervariable loci such as VNTRs 3232, 3820, and 4120 to the standardized locus set has been recommended for second-line typing when more detailed genotyping is needed to differentiate Beijing strains ().
During amplification of tandem repeats, PCR failure has been observed at some loci, and it recurred even when the tests were repeated under different PCR conditions and with different primer sets. Such failure can be assumed to be characteristic of certain strains and may be attributable to an unexpectedly large number of repeat units or to deletion of the region (). In some cases, double alleles are found at a single locus in a MIRU-VNTR profile; these are considered to indicate clonal variants of the same strain. When double alleles are found at two or more loci of the same isolate, however, the sample should be considered a mixed infection ().
Isolates with identical MIRU-VNTR patterns are considered to belong to a cluster. Dendrograms can be generated using the Dice coefficient and the unweighted pair group method with arithmetic averages (UPGMA). The discriminatory power of strain typing methods is calculated using the method described by Hunter and Gaston (). The free software MIRU-VNTRplus is suitable for analyzing multi-locus variable number tandem repeat analysis (MLVA), spoligotyping, large sequence polymorphism, and single nucleotide polymorphism data. A weighted combination of these markers has been available on the Web since 2010, and the tool is now widely used. With this Web tool, users can run strain similarity searches, generate phylogenetic trees and minimum spanning trees, and map geographic information (). A schematic diagram of the MIRU-VNTR genotyping method is shown in . […]
In recent years, whole-genome sequencing (WGS) has been used for genotyping of M. tuberculosis and is especially useful for investigating outbreaks, because it can identify transmission events among strains that are genetically indistinguishable by current methods ().
WGS-based genotyping also offers optimal resolution of M. tuberculosis complex isolates in molecular epidemiologic studies and can provide additional information (e.g., on drug resistance) (). Bryant et al. () applied WGS and MIRU typing as molecular tools in a randomized controlled trial of tuberculosis treatment and reported that WGS differentiates relapse from re-infection with better resolution than MIRU-VNTR. A recent population-based study using a long-term, large-scale WGS approach in a high-prevalence area provided strong evidence for differences in transmission patterns and virulence among M. tuberculosis lineages ().
One major obstacle to WGS-based genotyping is the difficulty of standardizing data and integrating them into a readily accessible and expandable database. A number of studies have been carried out to improve the standardization of WGS genotyping and to create Web-accessible databases for global TB surveillance (). Bioinformatics programs such as CLC Genomics Workbench (www.clcbio.com/products/clc-main-workbench/) and Unipro UGENE (http://ugene.unipro.ru/) can be used for WGS data analysis (). With its decreasing price, higher discriminatory power, and ability to deliver identification, drug resistance prediction, and epidemiologic typing in a single rapid analysis, WGS is expected to become a new standard for routine typing of M. tuberculosis in the near future. […]
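Returning to the MIRU-VNTR interpretation rules given earlier in this section (identical patterns form a cluster; a double allele at one locus suggests a clonal variant, and double alleles at two or more loci suggest a mixed infection), the following Python sketch shows one way these rules might be applied. The data layout, function names, and isolate data are assumptions made for illustration and do not reproduce the behavior of MIRU-VNTRplus or any other tool.

```python
from collections import defaultdict


def classify_isolate(calls):
    """Classify one isolate from its per-locus allele calls.

    `calls` maps a locus name to the list of copy numbers observed at that locus.
    """
    double_loci = sum(1 for alleles in calls.values() if len(set(alleles)) > 1)
    if double_loci == 0:
        return "single strain"
    if double_loci == 1:
        return "clonal variant"
    return "mixed infection"


def find_clusters(profiles):
    """Group isolates that share an identical pattern; singletons are not clusters."""
    groups = defaultdict(list)
    for name, profile in profiles.items():
        groups[tuple(profile)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Invented allele calls (illustration only).
print(classify_isolate({"locus_A": [2], "locus_B": [3], "locus_C": [5]}))
print(classify_isolate({"locus_A": [2, 3], "locus_B": [3], "locus_C": [5]}))
print(classify_isolate({"locus_A": [2, 3], "locus_B": [3, 4], "locus_C": [5]}))

# Invented single-allele profiles for four isolates.
profiles = {
    "iso1": (2, 3, 5),
    "iso2": (2, 3, 5),
    "iso3": (2, 3, 4),
    "iso4": (1, 3, 5),
}
print(find_clusters(profiles))  # [['iso1', 'iso2']]
```

The sketch clusters only on exact identity; building a dendrogram with the Dice coefficient and UPGMA would require a distance matrix and a hierarchical clustering routine, which are left out here.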

Pipeline specifications

Software tools: MIRU-VNTRplus, CLC Genomics Workbench, CLC Assembly Cell, CLC Main Workbench, Unipro UGENE
Applications: Phylogenetics, WGS analysis, Sanger sequencing
Organisms: Mycobacterium tuberculosis
Diseases: Tuberculosis