Pipeline publication

[…] nitig files hold the assemblies that were generated without taking the pair information into account. The contigs are the assemblies generated with the pair information taken into account. Scaffolds consist of contigs merged on the basis of read pairs, and differ from the contigs in that they may contain unresolved repeats and spacers. For the downstream analysis, the contigs from the assemblies with the largest contig N50 were used. These contigs are presented in Supplementary File .

Larger rearrangements, insertions and deletions were determined by comparing the assembled contigs with the M. pneumoniae M129 reference genome (NCBI accession number NC_000912.1). This alignment was performed with MUMmer (version 3.23) (Kurtz et al., ). These alignments are presented in Supplementary File . Analysis of the alignments was performed in R (version 3.2.2).

In addition to the de novo assembly, the reads were aligned to the M. pneumoniae M129 reference genome (NCBI accession number NC_000912.1) using BWA (Li and Durbin, ) (version 0.5.9) to detect smaller variants. Single-nucleotide variants (SNVs) and short insertions and deletions (InDels) were determined relative to the reference strain with SAMtools mpileup (Li et al., ) (version 0.1.16) (Supplementary Table and Supplementary File ). For each sample, the frequency of each SNV and InDel relative to the total number of reads at that position was determined as well.

The strains were clustered based on their SNV/InDel profiles. SNV/InDels that were not present in at least 20% of the reads at a position in a single strain were removed from the analysis. These removed SNV/InDels were considered low-abundance and may be caused by intrastrain variation or technical errors. The strains were then clustered based on their SNV/InDel profiles using hierarchical clustering. The distances between the SNV/InDel profiles were calculated with th […]
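Two of the selection steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input layout (plain lists and dictionaries) and the example values are assumptions made for illustration. The filter also assumes one particular reading of the 20% rule, namely that a variant is kept when it reaches the threshold in at least one strain. The distance measure used for the clustering is not shown because it is cut off in the text.

```python
from typing import Dict, List


def n50(contig_lengths: List[int]) -> int:
    """Return the N50 of an assembly: the contig length L such that contigs
    of length >= L together cover at least half of the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def filter_variants(
    freqs: Dict[str, Dict[str, float]], min_freq: float = 0.20
) -> Dict[str, Dict[str, float]]:
    """Keep variants supported by at least `min_freq` of the reads in at
    least one strain (assumed reading of the 20% rule).

    `freqs` maps a variant id to {strain name: fraction of reads}.
    """
    return {
        variant: per_strain
        for variant, per_strain in freqs.items()
        if any(f >= min_freq for f in per_strain.values())
    }


# Hypothetical assemblies: pick the one with the largest contig N50.
assemblies = {"asm_a": [100, 80, 50, 20], "asm_b": [150, 40, 30, 30]}
best = max(assemblies, key=lambda name: n50(assemblies[name]))

# Hypothetical variant frequencies per strain.
variants = {
    "v1": {"s1": 0.05, "s2": 0.95},
    "v2": {"s1": 0.10, "s2": 0.15},  # below 20% everywhere, removed
    "v3": {"s1": 0.20, "s2": 0.00},
}
kept = filter_variants(variants)
```

In this toy input, `best` is `"asm_b"` and `kept` contains `v1` and `v3`. A variant that passes the filter keeps its frequencies for every strain, so the retained profiles remain comparable across strains for the later clustering.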

Pipeline specifications

Software tools: MUMmer, BWA, SAMtools