Computational protocol: Small Mammal Investigation in Spotted Fever Focus with DNA-Barcoding and Taxonomic Implications on Rodents Species from Hainan of China

Protocol publication

[…] All sequences were aligned using CLUSTALW and manually checked. The COI and Cyt-b gene sequences of the Rattus and Niviventer specimens were aligned separately and trimmed to a common length before concatenation. Neighbour-joining (NJ) trees based on COI sequences were generated from K2P distances calculated in PAUP*4b. Missing data were ignored in the distance calculations, and ties were broken at random. Phylogenies were generated from the complete dataset using maximum likelihood (ML) and Bayesian inference (BI) approaches. Micromys minutus (HM217360, HM217482) was selected as the outgroup in all analyses. For the model-based tree inferences, the best-fit substitution models were determined for the two partitions (COI and Cyt-b) using likelihood ratio tests implemented in jModelTest 0.1. The TPM1uf+G model was selected for the COI sequences of Rattus and Niviventer species, and the TIM2+I+G model for the Cyt-b sequences. ML trees were inferred using GARLI v2.0, software that allows partitioned evolutionary models. The best-fit model for each gene was supplied via the starting model option (the ‘streefname’ option in the configuration file), and these values were fixed. A partitioned search was then performed with otherwise default settings. Node support was obtained by bootstrapping, with the topology termination threshold (parameter: genthreshfortopoterm) reduced to 1000 to increase search speed. Bayesian trees were inferred using MrBayes v3.1.2, again with a partitioned model. The Bayesian search was run for 2 million generations, sampling every 500 generations, with two independent runs, each consisting of three heated chains and one cold chain.
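The K2P distances used for the NJ trees can be illustrated with a short sketch. This is not the PAUP* implementation; it is a minimal Python version of the standard Kimura two-parameter formula, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences. It assumes that "ignoring missing data" means skipping any site where either sequence has a character other than A, C, G or T (pairwise deletion). The function name `k2p_distance` is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence is not A, C, G or T (gaps, N,
    ambiguity codes) are skipped; this pairwise deletion is an
    assumption about how missing data are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # missing or ambiguous site: ignore it
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the K2P correction to be defined.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Applying the function to every pair of aligned COI sequences would give the distance matrix from which an NJ tree is built. The tree construction itself, including the random tie-breaking, is not shown here.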
Convergence was assessed using the standard deviation of split frequencies and the effective sample sizes (ESS) of the sampled parameters, as calculated in Tracer.

Molecular delineation was carried out on a dataset from which identical haplotypes had been removed (following a previously published algorithm). The dereplicated dataset consisted of 52 sequences with a total length of 1789 bp. An NJ tree was first generated using PAUP*4b. Genetic distances were calculated under the K2P model, with missing data ignored in the distance calculations and ties broken at random. ML and BI trees were also inferred for the dereplicated dataset, using the same methods as for the complete dataset. The phylogenies from the three methods were clock-constrained using r8s 1.71. The root node was fixed at an arbitrary age of 1.0, and ultrametric trees were then obtained by penalized likelihood (PL) and non-parametric rate smoothing (NPRS). For PL, smoothing parameters were compared by cross-validation (r8s command: divtime method = pl crossv = yes cvstart = -3 cvinc = 1 cvnum = 9), and the optimal value (10) was used in further analyses. Finally, the putative species units on the ultrametric trees were determined using the general mixed Yule coalescent (GMYC) method. This procedure detects the switch in the rate of lineage branching in a tree, from long interspecific branches to short intraspecific branches, and identifies clusters of specimens corresponding to putative species. A threshold (T) is optimized under the GMYC model so that nodes older than the threshold are treated as species diversification events, which allows the number of species to be estimated. Significance was assessed by a likelihood ratio test against a null model of a single coalescent population. This test was implemented using R code provided by T. G. Barraclough. […]
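The removal of identical haplotypes before delineation can be sketched as follows. The algorithm actually used by the authors is not reproduced in this excerpt, so this example substitutes the simplest possible rule: two aligned sequences are treated as the same haplotype only if their strings are exactly identical, ignoring case. Real dereplication tools may also handle missing data and ambiguity codes, which this sketch does not. The function name `dereplicate` and the specimen labels in the usage example are hypothetical.

```python
def dereplicate(sequences):
    """Collapse exactly identical aligned sequences.

    sequences: dict mapping specimen name -> aligned sequence.
    Returns (unique, members), where `unique` keeps the first
    specimen seen for each distinct sequence and `members` maps
    that representative to all specimens sharing the sequence.
    Exact string identity is an assumption, not the authors' method.
    """
    representatives = {}  # sequence string -> representative name
    members = {}          # representative name -> list of specimen names

    for name, seq in sequences.items():
        key = seq.upper()
        if key not in representatives:
            representatives[key] = name
            members[name] = [name]
        else:
            members[representatives[key]].append(name)

    unique = {name: sequences[name] for name in members}
    return unique, members


# Usage with made-up specimen labels and toy sequences.
example = {
    "specimen_1": "ACGTACGT",
    "specimen_2": "ACGTACGT",
    "specimen_3": "ACGTACGA",
}
unique, members = dereplicate(example)
```

Keeping the `members` mapping makes it possible to assign every original specimen to a putative species once the representatives have been clustered on the ultrametric tree.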

Pipeline specifications

Software tools: Clustal W, jModelTest, GARLI, MrBayes, r8s
Application: Phylogenetics
Organisms: Homo sapiens, Suncus murinus, Callosciurus erythraeus, Tupaia belangeri, Rattus rattus
Diseases: Cytochrome-c Oxidase Deficiency