Computational protocol: Intraspecific rearrangement of mitochondrial genome suggests the prevalence of the tandem duplication-random loss (TDLR) mechanism in Quasipaa boulengeri

Protocol publication

[…] The tRNA genes were identified using both tRNAscan-SE v.1.21 (http://lowelab.ucsc.edu/tRNAscan-SE) and MITOS (http://mitos.bioinf.uni-leipzig.de). To avoid misannotated tRNA genes, we predicted the secondary structure of each. We extracted and aligned the duplicated tRNA genes and their pseudogene residues. We aligned the sequences of each fragment using ClustalW in MEGA6. DnaSP v.5.10 was used to determine DNA polymorphisms and divergences.

To estimate the time-tree, we constructed phylogenies using cox1 and cob, and partitioned these genes by codon position. Six species of Quasipaa, Q. verrucospinosa (KF199147), Q. shini (KF199148), Q. yei (KJ842105), Q. spinosa (FJ432700), Q. jiulongensis (KF199149) and Q. exilispinosa (KF199151), were chosen as outgroup taxa. The best-fit substitution model for each partition was selected using the Akaike information criterion (AIC) as implemented in PartitionFinder v1.1.1. The best-fit model of each partition was used for the maximum likelihood (ML) and Bayesian inference (BI) analyses, which were performed with the RAxML BlackBox web server (http://phylobench.vital-it.ch/raxml-bb/index.php) and MrBayes v.3.1, respectively.

BI as implemented in BEAST2 v.2.1.2 was used to obtain an ultrametric time-tree for Q. boulengeri. Each locus was assigned its own partition with an unlinked substitution model but with linked clock and tree models. We assumed a substitution rate ranging from 0.65 to 1.00% per Ma for cox1 and cob, based on evolutionary rates commonly proposed for frogs. Lacking fossil evidence, we calibrated our phylogeny using the published time to the most recent common ancestor (TMRCA) of Q. jiulongensis and Q. exilispinosa, about 9 Ma. We ran BEAST for 20 million generations, logging trees every 1000 generations, for a total of 20,000 trees.
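
The sampling arithmetic of the BEAST run can be checked with a minimal sketch. The function names are hypothetical and not part of BEAST. The count follows the protocol's own convention of chain length divided by logging interval, which assumes the initial state is not counted. The 10% burn-in is the value reported for this analysis and is passed here as a whole-number percentage.

```python
def logged_samples(chain_length: int, log_every: int) -> int:
    """Number of trees logged when one tree is written every `log_every` generations.

    Assumption: the initial state (generation 0) is not counted.
    """
    if chain_length <= 0 or log_every <= 0:
        raise ValueError("chain_length and log_every must be positive")
    return chain_length // log_every


def split_burnin(n_samples: int, burnin_percent: int) -> tuple[int, int]:
    """Return (discarded, retained) sample counts for a whole-number burn-in percentage."""
    if not 0 <= burnin_percent <= 100:
        raise ValueError("burnin_percent must be between 0 and 100")
    discarded = n_samples * burnin_percent // 100  # integer arithmetic, no rounding error
    return discarded, n_samples - discarded


# Values taken from the protocol: 20 million generations, logged every 1000.
n_trees = logged_samples(20_000_000, 1000)
discarded, retained = split_burnin(n_trees, 10)
print(n_trees, discarded, retained)
```

With the protocol's settings this gives 20,000 logged trees, of which 2,000 fall in the burn-in and 18,000 remain for summarizing.
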
We determined a 10% burn-in length using Tracer v.1.5 and retained the maximum clade credibility tree using TreeAnnotator v.2.1.2.

A Perl script named mtGordV0.5.pl was written by YZ to obtain the gene orders of mitochondrial records deposited in GenBank, based on the annotation of each sequence. Records were downloaded together as a single file, which served as the input to the script. For each record with more than one gene, the accession number, sequence length, species name, gene names in their original order, and total number of genes were written, in that order, to a single line of the output file, with items separated by tabs. The script was applied to two major groups of vertebrates, amphibians and squamate reptiles. For amphibians, all 126,638 mitochondrial records were downloaded on 02 Nov 2015, and the output file contained 17,559 records. For squamates, all 110,064 records were downloaded on 25 Sep 2015, and the output file contained 21,045 records. The output files were opened in Microsoft Excel and the records were aligned according to species names. The records were then manually checked for intraspecific and intrageneric cases of random loss of genes after duplication. As the script did not cover all annotation variants for all mitochondrial genes, errors from missing genes were expected for a small number of records. However, when a potential case was detected, the related full GenBank records were carefully checked. More importantly, the script made such a scan possible: analyses could be completed within a reasonable amount of time (a few days for each group in our case), and the approach could be applied to other groups of taxa. As for the speed of the script itself, the data for squamates were processed within 3 min on a ThinkPad X200 laptop computer. […]
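
The gene-order extraction described above can be illustrated with a minimal Python sketch. This is not the authors' mtGordV0.5.pl, whose source is not shown here. It assumes a multi-record GenBank flat file, takes gene names only from the /gene qualifier of "gene" features, and keeps them in the order they appear in the record. The function names and the sample record (accession XX000001, species "Examplea hypothetica") are invented for illustration.

```python
import re

LOCUS_RE = re.compile(r"^LOCUS\s+\S+\s+(\d+)\s+bp", re.M)
ACCESSION_RE = re.compile(r"^ACCESSION\s+(\S+)", re.M)
ORGANISM_RE = re.compile(r"^\s+ORGANISM\s+(.+)$", re.M)
FEATURE_KEY_RE = re.compile(r"^ {5}(\S+)\s+\S")   # feature key starts in column 6
GENE_QUAL_RE = re.compile(r'^\s+/gene="([^"]+)"')


def gene_names(record: str) -> list[str]:
    """Collect /gene qualifiers of 'gene' features, in file order."""
    names, key = [], None
    for line in record.splitlines():
        feature = FEATURE_KEY_RE.match(line)
        if feature:
            key = feature.group(1)
            continue
        qualifier = GENE_QUAL_RE.match(line)
        if qualifier and key == "gene":
            names.append(qualifier.group(1))
    return names


def gene_order_rows(genbank_text: str):
    """Yield one tab-separated line per record that has more than one gene."""
    for record in genbank_text.split("\n//"):
        locus = LOCUS_RE.search(record)
        accession = ACCESSION_RE.search(record)
        organism = ORGANISM_RE.search(record)
        if not (locus and accession and organism):
            continue  # trailing text or an incomplete record
        genes = gene_names(record)
        if len(genes) > 1:
            yield "\t".join([accession.group(1), locus.group(1),
                             organism.group(1).strip(), *genes, str(len(genes))])


# Invented two-gene record, for illustration only.
SAMPLE = (
    "LOCUS       XX000001                 140 bp    DNA     circular VRT 01-JAN-2000\n"
    "ACCESSION   XX000001\n"
    "  ORGANISM  Examplea hypothetica\n"
    "FEATURES             Location/Qualifiers\n"
    "     gene            1..70\n"
    '                     /gene="trnF"\n'
    "     tRNA            1..70\n"
    '                     /product="tRNA-Phe"\n'
    "     gene            71..140\n"
    '                     /gene="rrnS"\n'
    "//\n"
)

for row in gene_order_rows(SAMPLE):
    print(row)
```

Records whose genes are annotated only through other features or qualifiers (for example a tRNA feature carrying just /product) would be undercounted by this sketch, which mirrors the kind of missing-gene error the protocol anticipates and resolves by checking the full GenBank records.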

Pipeline specifications