Computational protocol: Recent dermatophyte divergence revealed by comparative and phylogenetic analysis of mitochondrial genomes

Protocol publication

[…] Three mtDNA sequences of T. rubrum strain IP1817.89 deposited in GenBank (accession numbers X65223, X88896 and Y98476) already covered >80% of the mitochondrial genome. Four pairs of primers for long-distance and accurate polymerase chain reaction (LA-PCR) were designed from the known T. rubrum sequences and their positions relative to the complete mitochondrial genome of E. floccosum, to create an mtDNA sequence scaffold for T. rubrum strain BMU016721 (data not shown). Each LA-PCR reaction contained 0.5 μl LA Taq polymerase, 5 μl 10× LA PCR buffer, 8 μl dNTP mixture, 1 μl template DNA, 1 μl primer 1, 1 μl primer 2 and 33.5 μl dH2O, giving a final volume of 50 μl. LA-PCR conditions were 94°C, 1 min; 98°C, 10 sec; 46.8°C, 15 min; 72°C, 10 min. Steps 2 and 3 were repeated for 30 cycles. All reagents were from Takara. LA-PCR products were cloned into the PCR-XL-TOPO vector (Invitrogen). Recombinant plasmids were checked by restriction analysis to confirm the presence of the insert DNA. Primer walking was then used to obtain the complete mtDNA sequence of T. rubrum.

The complete T. rubrum mitochondrial sequence revealed that the genomes were colinear and that the overall nucleotide sequence similarity between the mtDNAs of T. rubrum and E. floccosum was >94%, indicating that the mitochondrial genomes of these 2 species are highly conserved. Based on this observation, an optimized PCR strategy was devised to determine the mtDNA sequences of the other dermatophytes. A selected set of 41 primer pairs used for T. rubrum mtDNA sequencing was applied to the mtDNAs of the other dermatophyte species. In about half of the cases the expected specific amplicons were generated (see Additional file). Products shorter than 2 kb were sequenced after purification with the QIAquick Gel Extraction Kit (QIAGEN); larger amplicons were cloned into the PCR-XL-TOPO vector (Invitrogen) prior to sequencing.
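
As a quick arithmetic check on the LA-PCR setup described above, the following Python sketch sums the listed reaction components and adds up the programmed hold times. The names `REACTION_UL`, `master_mix` and `hold_time_seconds` are illustrative, the optional pipetting overage is an assumption rather than part of the protocol, and the time estimate ignores ramping between temperatures.

```python
# Per-reaction volumes in microlitres, copied from the protocol text.
REACTION_UL = {
    "LA Taq polymerase": 0.5,
    "10x LA PCR buffer": 5.0,
    "dNTP mixture": 8.0,
    "template DNA": 1.0,
    "primer 1": 1.0,
    "primer 2": 1.0,
    "dH2O": 33.5,
}

# Assumption: template and primers differ between reactions,
# so they are left out of any shared master mix.
PER_REACTION_ONLY = {"template DNA", "primer 1", "primer 2"}


def total_volume(reaction=REACTION_UL):
    """Sum of all component volumes for one reaction (ul)."""
    return sum(reaction.values())


def master_mix(n_reactions, overage=0.0, reaction=REACTION_UL):
    """Scale the shared components for n reactions.

    `overage` is a hypothetical fractional excess (e.g. 0.1 for 10%)
    to allow for pipetting loss; the protocol does not specify one.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in reaction.items()
        if name not in PER_REACTION_ONLY
    }


def hold_time_seconds(cycles=30):
    """Programmed hold time only; ramp times are not included."""
    initial = 60                # 94 C, 1 min
    per_cycle = 10 + 15 * 60    # 98 C, 10 sec + 46.8 C, 15 min (steps 2 and 3)
    final = 10 * 60             # 72 C, 10 min
    return initial + cycles * per_cycle + final


if __name__ == "__main__":
    print(total_volume())
    print(master_mix(10))
    print(hold_time_seconds())
```

With these values the components sum to 50 μl, and the programmed holds come to 27,960 s (about 7 h 46 min) before ramping time is added.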
Further species-specific primers were designed as required for each genome to cover the remaining sequence gaps, and the genomes were completed by primer walking.

To confirm the authenticity of the sequences obtained, all 5 genomic sequences were verified by overlap PCR covering the complete mtDNA genome, with direct sequencing in both directions. All sequencing was performed on an ABI3730 automated sequencer (Applied Biosystems). Sequences were assembled using the Phred/Phrap/Consed package, with a Phred score cutoff of >20, corresponding to an error rate of <1%. The overall sequence quality of each genome was further improved by applying the following 2 criteria to each nucleotide sequenced: coverage by at least 2 independent high-quality (Phred score >20) reads, and a final consensus quality score (Phrap) of >40.

Potential open reading frames (ORFs) were identified using the ORF Finder program with genetic code 4. Functional annotation employed BLASTP comparison of translations with the GenBank non-redundant protein database, followed by manual curation. Ribosomal RNA genes were identified by comparison with the published rRNA sequences of E. floccosum (GenBank accession: AY916130). Transfer RNA genes were identified using the tRNAscan-SE program. [...] Genomic comparisons of dermatophyte mtDNAs employed GenomeComp. Orthologs between the mitochondrial genomes of T. rubrum and C. albicans were identified by bidirectional BLASTP comparisons. Fourteen of the 15 conserved proteins (excluding Rps5) were used for whole mitochondrial genome-based phylogenetic analysis of 18 filamentous fungi (including 6 dermatophytes) and 17 yeasts. The sequences of the selected proteins were extracted from the fungal mitochondrial genomes in the GenBank database. Each protein was aligned individually using ClustalW.
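
The sequence quality thresholds quoted in this part of the protocol follow from the standard Phred definition, Q = -10·log10(P), where P is the estimated probability that a base call is wrong. The Python sketch below converts scores to error probabilities and applies the two per-base criteria to a single consensus position. The function names and the input layout (a list of per-read Phred scores plus one consensus score) are assumptions for illustration and do not reflect the Phred/Phrap/Consed interface.

```python
import math

MIN_READ_Q = 20        # a read counts as high quality when its score is > 20
MIN_READS = 2          # at least 2 high-quality reads must cover the base
MIN_CONSENSUS_Q = 40   # consensus (Phrap) score must be > 40


def error_probability(q):
    """Error probability implied by a Phred-scaled quality score."""
    return 10 ** (-q / 10)


def phred_score(p):
    """Phred-scaled quality score for an error probability p."""
    return -10 * math.log10(p)


def base_passes(read_quals, consensus_q):
    """Apply both criteria to one consensus position.

    `read_quals` holds the Phred score of each read covering the base.
    Whether the reads are truly independent cannot be checked here and
    is taken on trust.
    """
    high_quality = sum(1 for q in read_quals if q > MIN_READ_Q)
    return high_quality >= MIN_READS and consensus_q > MIN_CONSENSUS_Q


if __name__ == "__main__":
    for q in (20, 40):
        print(q, error_probability(q))
    print(base_passes([35, 28, 12], consensus_q=52))
```

A score of 20 corresponds to a 1% error probability and a score of 40 to 0.01%, so the consensus criterion is 100-fold stricter than the per-read cutoff.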
Multiple alignments were then manually checked and trimmed with BioEdit (version 6.0, by Tom Hall, Department of Microbiology, North Carolina State University, Raleigh). The Datamonkey server was used to calculate the mean dN/dS values of the protein-coding genes of the dermatophytes.

The dataset, a concatenation of 14 proteins comprising 4,298 amino acids, was analyzed with TREE-PUZZLE to construct the maximum likelihood (ML) tree. Before tree construction, ProtTest was used to determine the best-fitting substitution model for the sequence data; the WAG model was selected. Rate heterogeneity was modeled with a gamma distribution with 8 rate categories, and the α-parameter was estimated from the dataset. Reliability of the inferred tree was assessed by bootstrap. One hundred resampled datasets were generated using the SEQBOOT program from the PHYLIP package (version 3.68, by Joe Felsenstein, Department of Genome Sciences, University of Washington, Seattle). For each of the 100 datasets an ML tree was constructed using the same parameters as described above. TREE-PUZZLE was then used with the 'consensus of user-defined trees' option to generate a consensus tree. Using the 400 Ma Ascomycota fossil as a primary calibration point, the timing of dermatophyte divergence was estimated using MEGA 4.0. […]
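
The resampling step can be illustrated with a small Python sketch of nonparametric bootstrapping, in which alignment columns are drawn with replacement to build each pseudo-replicate. This assumes the standard bootstrap mode of SEQBOOT; it is not SEQBOOT's own code, and the toy alignment, taxon labels and function name are made up for illustration.

```python
import random


def bootstrap_alignments(alignment, n_replicates=100, seed=1):
    """Resample alignment columns with replacement.

    `alignment` maps a taxon label to its aligned sequence. Each
    replicate has the same number of columns as the input, and the
    same column draw is applied to every taxon so that columns
    stay intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_cols = lengths.pop()

    rng = random.Random(seed)  # fixed seed so the example is repeatable
    replicates = []
    for _ in range(n_replicates):
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        replicates.append(
            {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}
        )
    return replicates


# Hypothetical toy alignment; the real dataset has 4,298 columns.
toy = {
    "taxon_A": "MKVLAT",
    "taxon_B": "MKILST",
    "taxon_C": "MRVLSA",
}
reps = bootstrap_alignments(toy, n_replicates=100)

if __name__ == "__main__":
    print(len(reps), "replicates; first:", reps[0])
```

Each replicate would then be passed to the tree-building program with unchanged model settings, and the resulting trees summarized as a consensus.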

Pipeline specifications

Software tools: Phrap, Consed, Open Reading Frame Finder, BLASTP, tRNAscan-SE, GenomeComp, Clustal W, BioEdit, Datamonkey, TREE-PUZZLE, PHYLIP, MEGA
Applications: Genome annotation, Phylogenetics, Population genetic analysis
Organisms: Trichophyton mentagrophytes, Epidermophyton floccosum, Trichophyton rubrum, Arthroderma uncinatum, Microsporum canis, Nannizzia nana
Diseases: Mycoses