Computational protocol: Genome-Wide Analysis of the First Sequenced Mycoplasma capricolum subsp. capripneumoniae Strain M1601

[…] The complete sequence was analyzed using Glimmer 3.0 for open reading frames containing >30 predicted amino acid residues. Transfer RNA (tRNA) genes were predicted using tRNAscan-SE and ARAGORN, and ribosomal RNA (rRNA) genes using RNAmmer. Insertion and deletion (InDel) detection was conducted using LASTZ to compare M1601 with Mycoplasma capricolum subsp. capricolum (Mcc) reference strain 27343. The best-match results (<10 bp) were then extracted using axtBest to obtain preliminary InDel results. The 150 bp (3 × SD) upstream and downstream of each InDel site in the reference sequence were aligned against the sample sequencing reads with BWA for validation. After filtering, the reliable InDel sites were obtained. Genomic islands and insertion sequences were identified using Path-DIOMB and ISfinder, respectively.

Functional annotation of the predicted protein-coding genes was performed by BLAST searches against the COG, KEGG, Swiss-Prot, TrEMBL, and NCBI-NR databases. Pseudogenes were detected by BLASTN analysis, and the annotation was then revised manually.

Putative virulence genes were identified from the gene annotation and from reference studies. BLASTP searches (E-value <1e−5) against the NCBI database were applied, and the results were filtered by selecting the highest-scoring alignment (identity >40% and alignment length percentage >40%). Core genes and specific genes were analyzed with CD-HIT, which clustered similar proteins using a threshold of 50% pairwise identity and a length difference cutoff of 0.7 in amino acids. [...] Genomic alignment of Mycoplasma capricolum subsp. capripneumoniae (Mccp) strains M1601 and F38 was conducted using MUMmer and LASTZ. Genomic synteny analysis was based on the alignment results. Multiple sequence alignments of single-copy core genes among 31 Mycoplasma strains were performed using MUSCLE.
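The selection of single-copy core genes ahead of the multiple sequence alignment can be illustrated with a minimal sketch. It assumes the protein clusters have already been parsed into a dictionary; the function name, the data layout, and the toy strain labels are hypothetical and are not taken from the protocol.

```python
from collections import Counter

def single_copy_core_clusters(clusters, strains):
    """Return IDs of clusters holding exactly one protein from every strain.

    clusters: dict mapping cluster ID -> list of (strain, protein_id) tuples
    strains:  iterable of all strain names in the comparison
    (Both structures are assumptions made for this illustration.)
    """
    required = set(strains)
    selected = []
    for cluster_id, members in clusters.items():
        counts = Counter(strain for strain, _ in members)
        # Core: every strain is represented. Single-copy: each exactly once.
        if set(counts) == required and all(n == 1 for n in counts.values()):
            selected.append(cluster_id)
    return selected

# Toy input with three strain labels (the protocol compared 31 strains).
strains = ["M1601", "F38", "Mcc27343"]
clusters = {
    # one member per strain: single-copy core
    "c1": [("M1601", "p1"), ("F38", "p9"), ("Mcc27343", "p4")],
    # two members from M1601: core, but not single-copy
    "c2": [("M1601", "p2"), ("M1601", "p3"), ("F38", "p8"), ("Mcc27343", "p5")],
    # only one strain represented: specific gene
    "c3": [("M1601", "p6")],
}
core_ids = single_copy_core_clusters(clusters, strains)
print(core_ids)
```

Only clusters that pass both checks would be carried forward to alignment; clusters with paralogs or missing strains are excluded.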
The phylogenetic tree was constructed with TreeBeST using the maximum likelihood method with 1000 bootstrap replicates. The genome sequences of the other Mycoplasma strains were downloaded from the NCBI database. […]
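The BLASTP filtering criteria given in the protocol (E-value <1e−5, identity >40%, alignment length percentage >40%, highest-scoring hit retained) can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it assumes tabular BLAST+ output with the column order shown in `COLUMNS`, and it assumes that the alignment length percentage is measured relative to the query length, which the protocol does not specify. The function name and the toy records are hypothetical.

```python
import csv
import io

# Assumed column order, e.g. produced with
#   -outfmt "6 qseqid sseqid pident length qlen evalue bitscore"
COLUMNS = ["qseqid", "sseqid", "pident", "length", "qlen", "evalue", "bitscore"]

def best_hits(handle, max_evalue=1e-5, min_identity=40.0, min_coverage=40.0):
    """Keep, per query, the highest-bitscore hit that passes all thresholds."""
    best = {}
    for row in csv.DictReader(handle, fieldnames=COLUMNS, delimiter="\t"):
        pident = float(row["pident"])
        evalue = float(row["evalue"])
        bitscore = float(row["bitscore"])
        # Assumption: coverage = alignment length as a percentage of query length.
        coverage = 100.0 * int(row["length"]) / int(row["qlen"])
        if evalue >= max_evalue or pident <= min_identity or coverage <= min_coverage:
            continue
        current = best.get(row["qseqid"])
        if current is None or bitscore > current["bitscore"]:
            best[row["qseqid"]] = {
                "sseqid": row["sseqid"],
                "pident": pident,
                "coverage": coverage,
                "bitscore": bitscore,
            }
    return best

# Toy tabular output with made-up identifiers.
toy = io.StringIO(
    "geneA\tvf1\t85.0\t200\t250\t1e-50\t300\n"
    "geneA\tvf2\t90.0\t240\t250\t1e-80\t420\n"
    "geneB\tvf3\t35.0\t180\t200\t1e-20\t110\n"   # fails the identity threshold
    "geneC\tvf4\t60.0\t50\t300\t1e-8\t70\n"      # fails the coverage threshold
)
hits = best_hits(toy)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["bitscore"])
```

If the percentage should instead be taken relative to the subject length, the same structure applies with `slen` requested in place of `qlen`.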

Pipeline specifications

Software tools: Glimmer, tRNAscan-SE, ARAGORN, RNAmmer, LASTZ, BWA, BLASTN, BLASTP, CD-HIT, MUMmer, MUSCLE, TreeBeST
Databases: UniProt, ISfinder, KEGG
Applications: Genome annotation, Phylogenetics, Nucleotide sequence alignment
Organisms: Capra hircus, Mycoplasma mycoides
Diseases: Pleuropneumonia
Chemicals: Adenosine Triphosphate, Pyruvic Acid