Computational protocol: The Complete Mitochondrial Genome of Galba pervia (Gastropoda: Mollusca), an Intermediate Host Snail of Fasciola spp

Protocol publication

[…] Sequences were assembled manually and aligned against the complete mt genome sequences of R. balthica and B. tenagophila using Clustal X 1.83 to identify gene boundaries. Open reading frames and codon usage profiles of the protein-coding genes were analysed with the Open Reading Frame Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html) using the invertebrate mitochondrial code. Translation initiation and termination codons were identified by comparison with the mt genomes of R. balthica and B. tenagophila. The amino acid sequences inferred for the mt genes of G. pervia were aligned with those of R. balthica and B. tenagophila using Clustal X 1.83, and amino acid identity (%) was calculated for homologous genes from the pairwise comparisons. Codon usage was examined based on the relationships between the nucleotide composition of codon families and amino acid occurrence, with the codons partitioned into AT-rich, GC-rich and unbiased codons. Putative secondary structures of the 22 tRNA genes were analysed using tRNAscan-SE: 5 of the 22 tRNA genes were identified by tRNAscan-SE, and the other 17 were found by eye inspection. The rRNA genes were identified by comparison with the mt genomes of R. balthica and B. tenagophila. [...] Phylogenetic relationships among the 20 Pulmonata species plus G. pervia (mt genome sequence obtained in the present study) were reconstructed from the amino acid sequences of the 13 protein-coding genes, using 2 Opisthobranchia species (Aplysia californica, GenBank accession number NC_005827, and A. dactylomela, NC_015088) as the outgroup.
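The pairwise amino acid identity (%) reported for homologous genes can be illustrated with a minimal Python sketch. It is not the authors' script: the function name `percent_identity` and the example sequences are hypothetical, and the choice to skip columns containing a gap is an assumption, since the protocol does not state which denominator was used.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two aligned amino acid sequences.

    Assumption (not stated in the protocol): columns where either
    sequence has a gap are skipped, so the denominator is the number
    of ungapped aligned positions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # ungapped columns
    matches = 0   # ungapped columns with identical residues
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments of one homologous gene from two species.
print(percent_identity("MKT-LL", "MKTALV"))  # prints 80.0
```

In the example, five columns are ungapped and four of them match, which gives 80%. Counting gapped columns in the denominator instead would lower the value, so the convention should be stated alongside any reported identities.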
Each gene was translated into its amino acid sequence using the invertebrate mitochondrial genetic code in MEGA 4 and aligned on that amino acid sequence using default settings. Ambiguously aligned regions were excluded using the Gblocks online server (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) with the options for a less stringent selection. The final amino acid alignments of the 13 protein-coding genes were then concatenated into a single alignment for phylogenetic analyses. Three inference methods were used: maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference. MP analysis was performed using PAUP* 4.0b10, with indels treated as missing character states. A total of 1,000 random addition searches using tree bisection-reconnection (TBR) branch swapping were performed for each MP analysis. Bootstrap probability (BP) was calculated from 1,000 bootstrap replicates with 10 random additions per replicate in PAUP*. ML analyses were performed using PhyML 3.0; the MtArt+I+G+F model and its parameters for the concatenated dataset were selected with ProtTest 10.2 based on the Akaike information criterion (AIC). BP values for the ML tree were calculated from 1,000 bootstrap replicates. Bayesian analyses were conducted in MrBayes 3.1.1 with four independent Markov chains run for 1,000,000 Metropolis-coupled MCMC generations, sampling a tree every 100 generations. The first 2,500 trees were omitted as burn-in and the remaining trees were used to calculate Bayesian posterior probabilities (PP). Phylograms were drawn using TreeView version 1.65. […]
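The concatenation step, in which the per-gene amino acid alignments are joined into one alignment, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool used in the study: the function name `concatenate_alignments`, the taxon labels and the toy sequences are hypothetical, and padding a taxon that lacks a gene with `?` is an assumed convention.

```python
from typing import Dict, List


def concatenate_alignments(
    gene_alignments: List[Dict[str, str]], missing: str = "?"
) -> Dict[str, str]:
    """Join per-gene alignments (taxon -> aligned sequence) end to end.

    Assumption: a taxon absent from one gene alignment is padded with
    `missing` characters so that all concatenated rows keep equal length.
    """
    # Collect every taxon seen in any gene alignment, in a stable order.
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment differ in length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, missing * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy per-gene alignments (hypothetical sequences, two genes, two taxa).
cox1 = {"G_pervia": "MKT", "R_balthica": "MRT"}
nad1 = {"G_pervia": "LL-V", "R_balthica": "LLAV"}
supermatrix = concatenate_alignments([cox1, nad1])
print(supermatrix["G_pervia"])  # prints MKTLL-V
```

Genes are joined in the order given, so the same gene order must be used for every taxon. Checking that all rows of the result share one length is a quick sanity test before passing the alignment to a phylogenetic program.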

Pipeline specifications