Computational protocol: Minimal genome encoding proteins with constrained amino acid repertoire

[…] The starting list of orthologous genes shared by the majority of Mollicutes (List 1 or ‘computational minimal genome’) was constructed using the EdgeSearch algorithm () and the proteins encoded by the following 25 mollicute genomes: Acholeplasma laidlawii PG 8A, Aster yellows witches broom phytoplasma, Mesoplasma florum L1, Mycoplasma agalactiae PG2, Mycoplasma arthritidis 158L3 1, Mycoplasma bovis PG45, Mycoplasma capricolum, Mycoplasma conjunctivae HRC 581, Mycoplasma crocodyli MP145, Mycoplasma fermentans M64, Mycoplasma gallisepticum F, Mycoplasma genitalium G37, Mycoplasma hyopneumoniae J, Mycoplasma hyorhinis HUB 1, Mycoplasma leachii PG50, Mycoplasma mobile 163 K, Mycoplasma mycoides capri, Mycoplasma penetrans HF 2, Mycoplasma pneumoniae M129, Mycoplasma pulmonis UAB CTIP, Mycoplasma suis str. Illinois, Mycoplasma synoviae 53, Onion yellows phytoplasma OY M, Ureaplasma parvum serovar 3 str. ATCC 27815 and Ureaplasma urealyticum serovar 10 ATCC 33699. List 2 (‘experimental minimal genome’) was taken from (). The union of List 1 and List 2, which consists of 439 genes, was the main subject of the study (List 3 or ‘the derived minimal genome’). The gene content of List 1, List 2 and List 3 is given in Supplementary Table S4. A gene may be covered by no orthologous group or by several (in the case of multiple domains), and several genes may belong to the same orthologous group, e.g. if they arose from mollicute-specific gene duplications. Multiple alignments of orthologs from List 3 were produced by the MUSCLE program (). The PSI-BLAST program () with an inclusion cutoff (-h parameter) of 0.01 and the HHPred server () were used to search for distant homologs of proteins from List 3 that are not part of any mollicute orthologous group (MOG) or cluster of orthologous groups (COG). Non-orthologous displacements by isofunctional proteins were identified by literature search, aided by the most recent catalog of the known displacements ().
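The list bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the gene and group identifiers are toy placeholders, the helper names (`derive_list3`, `genes_without_group`, `psiblast_command`) are hypothetical, and the database name and iteration count are arbitrary. The one tool-specific detail is that the `-h` inclusion cutoff of the legacy `blastpgp` program is exposed as `-inclusion_ethresh` in the BLAST+ `psiblast` program.

```python
from typing import Dict, List, Set


def derive_list3(list1: Set[str], list2: Set[str]) -> Set[str]:
    """Return the union of the computational and experimental gene lists."""
    return list1 | list2


def genes_without_group(genes: Set[str],
                        gene_to_groups: Dict[str, List[str]]) -> List[str]:
    """Return genes covered by no orthologous group (search candidates)."""
    return sorted(g for g in genes if not gene_to_groups.get(g))


def psiblast_command(query_fasta: str, database: str,
                     inclusion_evalue: float = 0.01,
                     iterations: int = 3) -> List[str]:
    """Build (but do not run) a BLAST+ psiblast command line.

    The iteration count is an assumption; the source states only the
    inclusion cutoff of 0.01.
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", database,
        "-inclusion_ethresh", str(inclusion_evalue),
        "-num_iterations", str(iterations),
    ]


# Toy placeholder data, not taken from Supplementary Table S4.
list1 = {"geneA", "geneB", "geneC"}
list2 = {"geneB", "geneC", "geneD"}
list3 = derive_list3(list1, list2)

# A gene may map to no group, one group, or several (multidomain proteins).
gene_to_groups = {
    "geneA": ["MOG1"],
    "geneB": ["MOG2", "MOG3"],
    "geneC": ["MOG4"],
    "geneD": [],
}
orphans = genes_without_group(list3, gene_to_groups)
```

Each entry of `orphans` would then be written to its own FASTA file and passed to `psiblast_command` (or submitted to HHPred) for the distant-homolog search.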
Assignment of COGs to functional categories was taken from the COG () and EggNOG () databases. […]
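As a rough illustration of how such category assignments might be tallied once retrieved, the sketch below counts one-letter functional category codes over a set of groups. The group identifiers and the mapping are toy placeholders rather than real database records, and `tally_categories` is a hypothetical helper. A group annotated with several letters is counted once per letter, which is one possible convention and not necessarily the one used in the study.

```python
from collections import Counter
from typing import Dict, Iterable


def tally_categories(groups: Iterable[str],
                     group_to_letters: Dict[str, str]) -> Counter:
    """Count functional category letters across orthologous groups."""
    counts: Counter = Counter()
    for group in groups:
        letters = group_to_letters.get(group, "")
        if not letters:
            # Group has no category in the mapping.
            counts["unassigned"] += 1
        for letter in letters:
            counts[letter] += 1
    return counts


# Toy mapping: J = translation, K = transcription, L = replication and repair.
toy_mapping = {"group1": "J", "group2": "KL", "group3": "J"}
toy_counts = tally_categories(
    ["group1", "group2", "group3", "group4"], toy_mapping
)
```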

Pipeline specifications

Software tools: MUSCLE, BLASTP, HHPred
Applications: Protein structure analysis, Nucleotide sequence alignment
Chemicals: Cysteine