Computational protocol: High genetic diversity and geographic subdivision of three lance nematode species (Hoplolaimus spp.) in the USA

Protocol publication

[…] Contigs were assembled in Sequencher 5.1 (Gene Codes Corp., Ann Arbor, MI). All sequences were checked and edited manually, and chromatograms were inspected to confirm base calling and to identify recombination sites. Consensus DNA sequences were then aligned using ClustalW (Thompson et al.) together with three out-group taxa: H. columbus sequences obtained in this study, and sequences of Rotylenchus robustus (JX015440) and Rotylenchus paravitis (JX015415) from GenBank. The original ITS alignment consisted of 1050 bp but contained several indels. Divergent and ambiguously aligned positions were therefore removed, and conserved blocks were selected, using Gblocks v0.91b (Castresana) with default values. The resulting dataset comprised 550 bp of the ITS1 portion of the gene. For the mitochondrial region, the alignment consisted of 347 bp. The newly generated haplotypes for both genes were deposited in GenBank (Table).
For the COI marker, the aligned sequences were well defined, and the chromatograms had no double peaks, ambiguous positions, or indels. Nonetheless, we tested for stop codons that could indicate nuclear copies of mitochondrial-derived genes (numts) or COI pseudogenes (Zhang and Hewitt; Song et al.; Moulton et al.). Numts are copies of mitochondrial genes that have been transferred to the nuclear genome, where they become nonfunctional and noncoding. Consequently, numts can confound phylogenetic analyses (Song et al.; Moulton et al.; Baeza and Fuentes). To check for numts, we followed Song et al. and ran a BLAST (Basic Local Alignment Search Tool) search of all COI sequences at NCBI (National Center for Biotechnology Information) against the nucleotide collection (nr/nt) database, optimized for highly similar sequences (megablast). Only haplotypes with E-values ≤ 1.0e-45 and similarity ≥ 90% to plant-parasitic nematodes were retained.
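The hit-filtering step can be illustrated with a minimal Python sketch. It assumes the BLAST results were exported in the standard 12-column tabular format (BLAST+ `-outfmt 6`), and it treats the E-value cutoff as an upper bound, since smaller E-values indicate stronger matches. The function name `filter_hits` and the example rows are hypothetical and are not data from the study. The taxonomic check (whether a hit belongs to a plant-parasitic nematode) is not covered here.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, max_evalue=1.0e-45, min_identity=90.0):
    """Yield hits with E-value <= max_evalue and percent identity >= min_identity."""
    reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        if float(row["evalue"]) <= max_evalue and float(row["pident"]) >= min_identity:
            yield row


# Hypothetical example rows (illustrative only, not results from the study).
example = (
    "hap01\tsubjA\t93.5\t340\t22\t0\t1\t340\t10\t349\t2e-120\t450\n"
    "hap01\tsubjB\t88.0\t340\t41\t0\t1\t340\t10\t349\t1e-90\t330\n"
    "hap02\tsubjC\t91.2\t120\t10\t0\t1\t120\t5\t124\t3e-30\t140\n"
)

kept = list(filter_hits(io.StringIO(example)))
print([(hit["qseqid"], hit["sseqid"]) for hit in kept])
```

In this example only the first row passes both thresholds: the second fails on identity and the third on E-value.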
All retrieved sequences were from plant-parasitic nematodes, most commonly of the genera Rotylenchus and Scutellonema, followed by Heterodera, Punctodera, and Meloidogyne. The COI haplotypes were then translated using the invertebrate mitochondrial code in MEGA v.5 (Tamura et al.), and each of the six putative reading frames was checked for frameshifts and nonsense codons in DnaSP (Librado and Rozas). [...]
For phylogenetic analysis, the most appropriate evolutionary model was selected for each gene dataset using the Akaike information criterion (AIC) in Modeltest v3.7 (Posada and Crandall). For both genes, the best-fit model was GTR with gamma-shaped rate variation (G) (0.7653 for COI and 2.1690 for ITS) and a proportion of invariable sites (I) (0.4489 for COI and 0.4330 for ITS), with nucleotide frequencies of A = 0.289, C = 0.064, G = 0.1788, T = 0.4686 for COI and A = 0.2798, C = 0.2886, G = 0.2163, T = 0.2153 for ITS. Phylogenetic relationships were reconstructed for each gene separately using maximum likelihood (ML) and Bayesian inference (BI). Maximum-likelihood analysis was performed in Treefinder (Gangolf et al.) using the default parameters. Branch support was based on 1000 bootstrap pseudoreplicates (Felsenstein), and clades were considered well supported when bootstrap values were > 70%. Bayesian inference was implemented in MrBayes 3.1.2 (Ronquist and Huelsenbeck). The analysis was run for 6 million generations, and trees were sampled every 100th generation from the Markov chain Monte Carlo (MCMC) analysis. The first 1250 trees were discarded as burn-in, nodal support was expressed as posterior probabilities, and clades were considered strongly supported when values were > 0.95 (Alfaro et al.).
Additionally, for the COI sequences, a neighbor-joining (NJ) tree was constructed in MEGA v.5.0 (Tamura et al.) using the Kimura two-parameter (K2P) model under the default settings.
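As an illustration of the K2P model, the distance between two aligned sequences is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites differing by a transition and by a transversion, respectively. The Python sketch below computes this distance and a pairwise matrix among haplotypes. The function names and the example haplotypes are hypothetical. Skipping sites with gaps or ambiguity codes pair by pair is an assumption of this sketch and is not necessarily the setting used in MEGA.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguity code are skipped
    (an assumption of this sketch). math.log raises ValueError when the
    sequences are too divergent for the formula to be defined.
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def distance_matrix(haplotypes):
    """Return a symmetric nested dict of K2P distances keyed by haplotype name."""
    names = list(haplotypes)
    matrix = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = k2p_distance(haplotypes[x], haplotypes[y])
        matrix[x][y] = matrix[y][x] = d
    return matrix


# Hypothetical 10-bp haplotypes (illustrative only).
haplotypes = {
    "h1": "ATGCATGCAT",
    "h2": "ATGCATGCAC",  # one transition relative to h1 (T -> C)
    "h3": "ATGCATGCAA",  # one transversion relative to h1 (T -> A)
}
matrix = distance_matrix(haplotypes)
for name, row in matrix.items():
    print(name, {other: round(d, 4) for other, d in sorted(row.items())})
```

Intra- and interspecific divergence can then be summarized by grouping the entries of such a matrix according to the species assignment of each haplotype pair.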
From this dataset, a pairwise distance matrix among haplotypes was generated to calculate intra- and interspecific genetic divergence, as is typically done in DNA barcoding studies (Hebert et al.). [...]
Genetic diversity analyses were based on mitochondrial DNA data because COI gene trees provided better resolution than ITS1. Mitochondrial haplotype networks were constructed for each species using the median-joining (MJ) method in Network 4.5.1.0 (http://www.fluxus-engineering.com/sharenet.htm). Analysis of molecular variance (AMOVA) was conducted in Arlequin v. 3.11 (Excoffier et al.) to infer genetic structure within species. Fst values were also calculated in Arlequin v. 3.11 to estimate genetic differentiation among clades, and their significance was tested by permuting haplotypes between species/populations (10,000 replicates). […]
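The logic of the permutation test can be sketched in Python as follows. This is not Arlequin's AMOVA-based computation. As a stand-in it uses a simplified sequence-based estimator, Fst = 1 - Hw/Hb, where Hw is the mean number of pairwise differences within populations and Hb is the mean between populations (in the style of Hudson et al. 1992). All function names and the example sequences are hypothetical.

```python
import random
from itertools import combinations, product


def n_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Mean number of pairwise differences within one population."""
    if len(pop) < 2:
        raise ValueError("each population needs at least two sequences")
    pairs = list(combinations(pop, 2))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def simple_fst(pop1, pop2):
    """Fst = 1 - Hw/Hb from mean pairwise differences (simplified estimator)."""
    h_within = (mean_within(pop1) + mean_within(pop2)) / 2
    between = [n_differences(a, b) for a, b in product(pop1, pop2)]
    h_between = sum(between) / len(between)
    if h_between == 0:
        return 0.0  # all sequences identical: no differentiation
    return 1 - h_within / h_between


def permutation_test(pop1, pop2, n_perm=10_000, seed=0):
    """Return (observed Fst, p-value) by shuffling sequences between populations."""
    rng = random.Random(seed)
    observed = simple_fst(pop1, pop2)
    pooled = list(pop1) + list(pop2)
    n1 = len(pop1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if simple_fst(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical 4-bp sequences for two populations (illustrative only).
pop_a = ["AAAA", "AAAA", "AAAT"]
pop_b = ["TTTT", "TTTT", "TTTA"]
fst, p_value = permutation_test(pop_a, pop_b, n_perm=999)
print(f"Fst = {fst:.4f}, p = {p_value:.4f}")
```

With samples this small, only a few distinct reshufflings exist, so the p-value cannot become very small. Real datasets with more sequences per population give the test more resolution.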

Pipeline specifications

Software tools: Sequencher, Clustal W, Gblocks, MEGA-V, DnaSP, ModelTest-NG, MrBayes, Arlequin
Databases: Meloidogyne
Organisms: Caenorhabditis elegans