Computational protocol: Monophyly of clade III nematodes is not supported by phylogenetic analysis of complete mitochondrial genome sequences

Protocol publication

[…] Twelve mitochondrial protein-coding genes and two ribosomal RNA genes of each species were identified by determining gene boundaries through comparison with other nematode mitochondrial DNA sequences. Putative secondary structures of 22 tRNA genes from each of the three mtDNAs were identified using the tRNAscan-SE program [] or by manually searching for potential secondary structures and anticodon sequences.

Thirty-six complete nematode mitochondrial genomes, including the three newly sequenced in the present study, were used for phylogenetic analysis, with two arthropod species (Lithobius forficatus and Limulus polyphemus) as outgroups. A complete list of species, their taxonomy, and GenBank accession numbers is provided in Additional file .

For phylogenetic analysis, both nucleotide and amino acid sequence datasets from the 12 protein-coding genes were used. For multiple alignment of amino acid sequences, the nucleotide sequences of each of the 12 protein-coding genes were first translated into amino acids using the invertebrate mitochondrial genetic code. The resulting amino acid sequences were then aligned for each gene using Clustal X with default options []. The nucleotide sequences of the 12 protein-coding genes were aligned on the framework of their corresponding amino acid alignments using RevTrans, a web-based program for placing gaps in coding DNA based on amino acid alignments []. Alignments of individual genes were concatenated for phylogenetic analysis. A nucleotide dataset excluding 3rd positions of codons was also constructed.

Phylogenetic analyses for the concatenated datasets (full nucleotide, nucleotide excluding 3rd positions, amino acids) were performed using two different tree-building methods. Bayesian inference was used for the amino acid dataset and conducted using the codon model in MrBayes version 3.1.2 []. MrBayes was run using four MCMC chains for 10^6 generations, sampled every 1,000 generations.
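
The translate, align, and back-thread steps described above can be illustrated with a minimal Python sketch. It is not the RevTrans implementation: it assumes the amino acid alignment has already been produced elsewhere, that each coding sequence is in frame with its stop codon removed, and it uses NCBI translation table 5 for the invertebrate mitochondrial code. The function names (translate, thread_gaps, drop_third_positions) and the toy sequence are hypothetical.

```python
# Sketch only: codon-aware gap threading under the assumptions stated above.
BASES = "TCAG"
# NCBI translation table 5 (invertebrate mitochondrial); codons are ordered
# TTT, TTC, TTA, TTG, TCT, ... (first base varies slowest, third fastest).
TABLE5_AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
INVERT_MITO_CODE = dict(zip(CODONS, TABLE5_AA))


def translate(nt):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    nt = nt.upper().replace("U", "T")
    if len(nt) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(
        INVERT_MITO_CODE.get(nt[i:i + 3], "X") for i in range(0, len(nt), 3)
    )


def thread_gaps(aligned_aa, nt):
    """Place gaps in coding DNA so that it follows an amino acid alignment."""
    residues = sum(1 for aa in aligned_aa if aa != "-")
    if residues * 3 != len(nt):
        raise ValueError("nucleotide length does not match the protein row")
    out, pos = [], 0
    for aa in aligned_aa:
        if aa == "-":
            out.append("---")          # one protein gap = one codon-sized gap
        else:
            out.append(nt[pos:pos + 3])
            pos += 3
    return "".join(out)


def drop_third_positions(aligned_nt):
    """Keep only 1st and 2nd codon positions of a codon-aligned sequence."""
    return "".join(b for i, b in enumerate(aligned_nt) if i % 3 != 2)


if __name__ == "__main__":
    cds = "ATATGAAGATTT"                  # toy sequence, not real data
    print(translate(cds))                 # ATA, TGA and AGA differ from the standard code
    codon_aligned = thread_gaps("MW-SF", cds)
    print(codon_aligned)
    print(drop_third_positions(codon_aligned))
```

Because gaps are inserted only in whole-codon units, the reading frame is preserved, which is what makes removing every third column a valid way to build the dataset without 3rd codon positions.
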
Each of the 12 genes was treated as a separate unlinked data partition. For the amino acid dataset, Bayesian posterior probability (BPP) values were determined after discarding the initial 200 trees (the first 2 × 10^5 generations) as burn-in. With the nucleotide dataset excluding 3rd positions, each of the 12 genes was likewise treated as a separate unlinked data partition. For the nucleotide datasets, MrBayes was executed on the CIPRES Portal using four MCMC chains for 4 × 10^6 generations, sampled every 4,000 generations. BPP values were determined after discarding the initial one-third of trees as burn-in. Maximum likelihood analysis was used for the two nucleotide datasets and conducted using RAxML 7.0.3 [] on the CIPRES web portal. For RAxML, each of the 12 genes was treated as a separate partition (with gamma rate heterogeneity and all gamma model parameters estimated for each partition by the program). Bootstrap ML analysis was performed using the rapid bootstrapping method (RAxML) with 1,000 replicates. Statistical tests for comparing alternative phylogenetic hypotheses were performed using both the complete nucleotide dataset and the dataset excluding 3rd positions of codons; alternative trees were evaluated using the likelihood-based Shimodaira-Hasegawa test [] as implemented in RAxML 7.0.3.
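
Treating each gene as its own partition requires knowing which columns of the concatenated alignment belong to which gene. The sketch below shows one way to concatenate per-gene alignments while recording those ranges. It is an illustration rather than the authors' workflow: the function name, gene names, and sequences are hypothetical, and the output lines follow the "DNA, name = start-end" layout of RAxML partition files as commonly documented.

```python
# Sketch only: concatenate per-gene alignments and record partition ranges.
def concatenate_with_partitions(gene_alignments):
    """gene_alignments maps gene name -> {taxon: aligned sequence}.

    Returns the concatenated alignment and one partition line per gene,
    using 1-based inclusive column ranges.
    """
    genes = list(gene_alignments)
    taxa = sorted(gene_alignments[genes[0]])
    concatenated = {taxon: "" for taxon in taxa}
    partition_lines = []
    start = 1
    for gene in genes:
        alignment = gene_alignments[gene]
        if sorted(alignment) != taxa:
            raise ValueError(f"{gene}: taxon set differs from the first gene")
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: rows have unequal lengths")
        length = lengths.pop()
        for taxon in taxa:
            concatenated[taxon] += alignment[taxon]
        end = start + length - 1
        partition_lines.append(f"DNA, {gene} = {start}-{end}")
        start = end + 1
    return concatenated, partition_lines


if __name__ == "__main__":
    toy = {                                   # hypothetical toy data
        "cox1": {"taxonA": "ATGAAA", "taxonB": "ATG---"},
        "nad1": {"taxonA": "TTT", "taxonB": "TTC"},
    }
    alignment, partitions = concatenate_with_partitions(toy)
    print("\n".join(partitions))
```

The same ranges could be reused to define unlinked partitions for a Bayesian run, so that both analyses partition the data identically.
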
Alternative trees for comparison were found using RAxML searches (with gene partitions as detailed previously), but with the tree topology constrained to reflect the alternative hypothesis of choice (e.g., constrained for clade III monophyly).

The best ML tree recovered for the complete nt dataset constrained for clade III monophyly was:
[((Lithobius forficatus,((Strongyloides stercoralis,(Steinernema carpocapsae,((((Enterobius vermicularis, Wellcomia siamensis),(Heliconema longissimum,(Setaria digitata,(Brugia malayi,(Dirofilaria immitis, Onchocerca volvulus))))),(Cucullanus robustus,((Anisakis simplex, Toxocara malaysiensis), Ascaris suum))),(((((Trichostrongylus axei, Cooperia oncophora),((Mecistocirrus digitatus, Haemonchus contortus), Teladorsagia circumcincta)),(Necator americanus,((Ancylostoma duodenale, Ancylostoma caninum),((Syngamus trachea, Strongylus vulgaris),(Chabertia ovina, Oesophagostomum dentatum))))), Metastrongylus pudendotectus),(Heterorhabditis bacteriophora,(Caenorhabditis elegans, Caenorhabditis briggsae)))))),(((((Agamermis sp., Hexamermis agrotis),(Romanomermis culicivorax, Strelkovimermis spiculatus)), Thaumamermis cosgrovei), Xiphinema americanum), Trichinella spiralis))), Limulus polyphemus);].

The best ML tree recovered for the dataset excluding 3rd positions and constrained for clade III monophyly was:
[((Lithobius forficatus,((Trichinella spiralis,((Thaumamermis cosgrovei,((Agamermis sp., Hexamermis agrotis),(Romanomermis culicivorax, Strelkovimermis spiculatus))), Xiphinema americanum)),(Strongyloides stercoralis,(Steinernema carpocapsae,((((Enterobius vermicularis, Wellcomia siamensis),((((Onchocerca volvulus, Dirofilaria immitis), Brugia malayi), Setaria digitata), Heliconema longissimum)),(Cucullanus robustus,(Anisakis simplex,(Ascaris suum, Toxocara malaysiensis)))),(((((Ancylostoma caninum, Ancylostoma duodenale),((Chabertia ovina, Oesophagostomum dentatum),(Syngamus trachea, Strongylus vulgaris))), Necator americanus),(Metastrongylus pudendotectus,(Trichostrongylus axei,((Cooperia oncophora, Teladorsagia circumcincta),(Haemonchus contortus, Mecistocirrus digitatus))))),(Heterorhabditis bacteriophora,(Caenorhabditis briggsae, Caenorhabditis elegans)))))))), Limulus polyphemus);]. […]
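
Whether a Newick topology such as the constrained trees above actually places a set of taxa in a single clade can be checked programmatically. The following is a minimal sketch with a hand-written parser for Newick strings that carry no branch lengths or support values, as in the trees quoted here. It treats the tree as rooted exactly as written. The toy tree is a heavily pruned illustration, and the assignment of the three listed species to clade III is an assumption made for the example, not a statement taken from the protocol text.

```python
# Sketch only: parse a plain Newick topology and test a taxon set for monophyly.
def parse_newick(text):
    """Return nested tuples (internal nodes) and strings (leaf names)."""
    s = text.strip().rstrip(";").strip()
    pos = 0

    def peek():
        return s[pos] if pos < len(s) else ""

    def skip_ws():
        nonlocal pos
        while peek().isspace():
            pos += 1

    def parse_node():
        nonlocal pos
        skip_ws()
        if peek() == "(":
            pos += 1                      # consume "("
            children = [parse_node()]
            while peek() == ",":
                pos += 1                  # consume ","
                children.append(parse_node())
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                      # consume ")"
            skip_ws()
            return tuple(children)
        start = pos
        while peek() and peek() not in ",()":
            pos += 1
        name = s[start:pos].strip()       # leaf names may contain spaces
        if not name:
            raise ValueError(f"empty taxon name at position {start}")
        return name

    tree = parse_node()
    if pos != len(s):
        raise ValueError("trailing characters after the tree")
    return tree


def leaf_set(node):
    """All leaf names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def is_monophyletic(tree, taxa):
    """True if some node's leaves are exactly the given taxa."""
    target = frozenset(taxa)
    stack = [tree]
    while stack:
        node = stack.pop()
        if leaf_set(node) == target:
            return True
        if not isinstance(node, str):
            stack.extend(node)
    return False


if __name__ == "__main__":
    toy = ("((Lithobius forficatus,(Trichinella spiralis,"
           "((Enterobius vermicularis,(Brugia malayi, Anisakis simplex)),"
           "Caenorhabditis elegans))), Limulus polyphemus);")
    tree = parse_newick(toy)
    clade_iii = ["Enterobius vermicularis", "Brugia malayi", "Anisakis simplex"]
    print(is_monophyletic(tree, clade_iii))
    print(is_monophyletic(tree, ["Brugia malayi", "Caenorhabditis elegans"]))
```

A check of this kind is a quick sanity test that a constrained search returned a tree honoring the constraint before that tree is passed on to a likelihood-based comparison.
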

Pipeline specifications

Software tools: tRNAscan-SE, Clustal W, RevTrans, MrBayes, CIPRES Science Gateway, RAxML
Applications: Genome annotation, Phylogenetics, Nucleotide sequence alignment
Organisms: Caenorhabditis elegans, Cucullanus robustus, Wellcomia siamensis, Heliconema longissimum