Computational protocol: Genetic Structure and Demographic History Should Inform Conservation: Chinese Cobras Currently Treated as Homogenous Show Population Divergence

Protocol publication

[…] Sequences were translated to amino acids with the program SQUINT to verify that a functional mitochondrial DNA sequence had been obtained and that nuclear pseudogenes were not being amplified. We compiled and aligned sequences using MEGA 5.05. We used ARLEQUIN 3.5 to identify haplotypes and to estimate genetic diversity within populations as haplotype diversity (h) and nucleotide diversity (π). We tested for substitution saturation in cytochrome b (the whole gene and each codon position separately). Within N. atra, no signs of saturation were present at any codon position; saturation was therefore not considered a significant factor, and all nucleotide positions were used in subsequent analyses.

We reconstructed a phylogenetic tree using maximum likelihood (ML), with N. kaouthia (GenBank Accession No. AF217835) as the outgroup. The ML analysis was carried out in PAUP 4.0 beta as a heuristic search with 10 random-addition replicates and tree-bisection-reconnection (TBR) branch swapping. The GTR+I+G substitution model was selected by MODELTEST 3.7 based on the Akaike information criterion (AIC). The confidence level of the nodes in the ML tree was estimated using 1000 bootstrap pseudoreplicates. We also used a median-joining network (MJN) approach to depict relationships among haplotypes. This approach has been shown to yield the best-resolved genealogies relative to other rooting and network procedures. The MJN was estimated using NETWORK 4.5.0.0.

We used mismatch distributions to test for demographic signatures of population expansion within mtDNA lineages. To compare the observed distributions with those expected under the expansion model, we calculated the sum of squared deviations (SSD) and Harpending's raggedness index. Tajima's D test and Fu's Fs test, both implemented in ARLEQUIN 3.5, were used to test whether the populations were at equilibrium. Both statistics are expected to take large negative values under demographic expansion.
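As an illustration of the mismatch analysis described above, the following is a minimal sketch of how a mismatch distribution and Harpending's raggedness index could be computed from aligned haplotype sequences. It is not ARLEQUIN's implementation. The function names and the toy sequences are hypothetical, and the sketch assumes equal-length alignments with no special handling of gaps or ambiguity codes.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mismatch_distribution(sequences):
    """Relative frequencies x_0..x_d of pairwise difference counts."""
    diffs = [pairwise_differences(a, b) for a, b in combinations(sequences, 2)]
    if not diffs:
        raise ValueError("at least two sequences are required")
    counts = Counter(diffs)
    n_pairs = len(diffs)
    return [counts.get(i, 0) / n_pairs for i in range(max(diffs) + 1)]


def raggedness_index(distribution):
    """Harpending's r = sum over i = 1..d+1 of (x_i - x_{i-1})^2, with x_{d+1} = 0."""
    padded = list(distribution) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


if __name__ == "__main__":
    # Hypothetical toy alignment, one sequence per sampled individual.
    toy = ["AAAA", "AAAT", "AATT"]
    dist = mismatch_distribution(toy)
    print("mismatch distribution:", dist)
    print("raggedness index:", raggedness_index(dist))
```

A smooth, unimodal distribution gives a low raggedness value, which is the pattern expected after a population expansion; assessing significance would additionally require simulated distributions under the expansion model, which this sketch does not attempt.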
The equation τ = 2ut was used to estimate the approximate expansion time in generations (t), where τ is the date of the growth or decline measured in units of mutational time and u is the mutation rate per sequence per generation. The approximate time of expansion in years was calculated by multiplying t by the generation time of N. atra. The generation time for large snakes was estimated as four years, based on the approximate age at which animals mature. The substitution rate of mtDNA sequences had been calibrated in studies of lizards and other vertebrates at approximately 0.65% per million years. Based on geological events (the final emergence of the Isthmus of Panama), Wüster et al. (2002) suggest a substitution rate of 0.007 per site per million years (95% confidence interval: 0.005–0.009) for cytochrome b within the Viperidae. We used the lower and upper values (0.005–0.009) to estimate the overall range of potential dates. Although this dating must be taken with extreme caution, owing to the lack of a calibrated substitution rate for N. atra and to the appreciable overestimation of the timing of recent events induced by the time-dependency of molecular rates, it provides an approximate time frame. [...]

All microsatellite loci were screened for null alleles and large-allele dropout using MICRO-CHECKER 2.2.3. CONVERT was used to detect private alleles, i.e. alleles present in one population and not shared with any other. The mean number of alleles (NA) per locus and the observed (HO) and expected (HE) heterozygosities were calculated using ARLEQUIN 3.5. FSTAT 2.9.3.2 was used to test for linkage disequilibrium and to calculate allelic richness (AR) based on a minimum sample of 17 individuals. Deviations from Hardy-Weinberg equilibrium across all loci for each population were assessed using the exact probability test in GENEPOP 4.0.
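The conversion from τ to an expansion date in years, as described above with τ = 2ut, can be sketched as follows. This is an illustrative calculation only. The τ value and the sequence length in the usage example are hypothetical placeholders, not values from the study, and the sketch assumes that the published per-site rate can be used directly as a per-lineage rate when deriving u.

```python
def mutation_rate_per_sequence_per_generation(rate_per_site_per_myr,
                                              seq_length_bp,
                                              generation_time_years):
    """Convert a per-site, per-million-year rate into u (per sequence, per generation)."""
    rate_per_site_per_year = rate_per_site_per_myr / 1e6
    return rate_per_site_per_year * seq_length_bp * generation_time_years


def expansion_time(tau, rate_per_site_per_myr, seq_length_bp,
                   generation_time_years=4.0):
    """Solve tau = 2ut for t; return (t in generations, t in years)."""
    u = mutation_rate_per_sequence_per_generation(
        rate_per_site_per_myr, seq_length_bp, generation_time_years)
    t_generations = tau / (2.0 * u)
    return t_generations, t_generations * generation_time_years


if __name__ == "__main__":
    tau = 3.0             # hypothetical mismatch-distribution estimate
    seq_length_bp = 1000  # hypothetical alignment length
    # Lower and upper rate bounds quoted in the text (per site per Myr).
    for rate in (0.005, 0.009):
        gens, years = expansion_time(tau, rate, seq_length_bp)
        print(f"rate={rate}: {gens:.0f} generations, {years:.0f} years")
```

Because the generation time appears both in u and in the final conversion to years, it cancels out of the date in years; it only affects the estimate expressed in generations.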
Significance values for multiple comparisons were adjusted using the Bonferroni correction.

Genetic distances were computed as Slatkin's (1995) linearized genetic distance, FST/(1 − FST), derived from pairwise FST values, which were estimated for the 15 pairwise comparisons among the six populations. Geographical distances were measured as the shortest overwater distances between pairs of locations. The significance of each test was assessed using 30 000 data randomizations.

A Bayesian clustering method, STRUCTURE, was used to detect genetic clustering in the whole data set. In STRUCTURE 2.3.3, the number of clusters (K) tested ranged from 1 to 10, and 10 independent runs were carried out for each K using no prior population information and assuming admixture and correlated allele frequencies. The MCMC run length and burn-in were set to 300 000 and 50 000 iterations, respectively. The most likely K was selected using the maximal value of the log probability of the data, Ln Pr(X|K), for a given K. In addition, the ΔK statistic, the second-order rate of change in the log probability of the data between successive values of K, was estimated.

Demographic history based on microsatellites was assessed using two different and complementary methods. First, the Wilcoxon signed-rank test was used to examine whether populations exhibit a greater level of heterozygosity than predicted for a population at mutation-drift equilibrium. This test is most sensitive to bottlenecks occurring over approximately the last 2–4 Ne generations and, for most parameters, has more power to detect more recent bottlenecks (e.g. 0.2–1 Ne generations ago). Second, a mode-shift test was carried out to detect a distortion of the expected L-shaped distribution of allele frequencies. This test is most appropriate for detecting population declines that have occurred more recently, specifically over the last few dozen generations.
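The ΔK statistic mentioned above can be illustrated with a short sketch. It uses one common formulation, the absolute second difference of the mean Ln Pr(X|K) across replicate runs divided by the standard deviation of Ln Pr(X|K) at that K; whether the study computed it in exactly this way is an assumption. The function name and the log-probability values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute delta-K for each K that has neighbours K-1 and K+1.

    lnp_by_k maps K to a list of Ln Pr(X|K) values, one per replicate run.
    """
    means = {k: mean(runs) for k, runs in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # the second difference is undefined at the ends of the range
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            raise ValueError(f"zero standard deviation at K={k}")
        second_difference = means[k + 1] - 2 * means[k] + means[k - 1]
        result[k] = abs(second_difference) / sd
    return result


if __name__ == "__main__":
    # Hypothetical replicate log-probabilities, two runs per K.
    toy_runs = {
        1: [-1000.0, -1002.0],
        2: [-900.0, -902.0],
        3: [-890.0, -894.0],
        4: [-888.0, -896.0],
    }
    scores = delta_k(toy_runs)
    best_k = max(scores, key=scores.get)
    print("delta-K by K:", scores)
    print("K with the largest delta-K:", best_k)
```

Note that ΔK cannot be evaluated at the smallest or largest K tested, so with K ranging from 1 to 10 it is defined only for K = 2 to 9.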
Heterozygosity deficiency, heterozygosity excess and mode-shift tests were implemented in BOTTLENECK 1.2.02. We performed 10 000 simulations for each of the six populations and three genetic clusters under the stepwise mutation model (SMM) and the two-phase model (TPM); the TPM was set to 95% single-step mutations and 5% multi-step mutations with a variance of 12, as recommended by Piry et al. (1999). P-values from the Wilcoxon test were used as evidence for bottlenecks and were assessed for significance at the 0.05 level. […]
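The idea behind the mode-shift test can be sketched in a few lines: allele frequencies are grouped into frequency classes, and the distribution is treated as L-shaped when the lowest-frequency class contains the most alleles. This is a simplified illustration, not BOTTLENECK's implementation. The ten equal-width classes, the tie handling, the function names and the toy frequencies are all assumptions.

```python
def allele_frequency_classes(frequencies, n_classes=10):
    """Count alleles falling in each of n_classes equal-width frequency bins."""
    counts = [0] * n_classes
    for f in frequencies:
        if not 0.0 < f <= 1.0:
            raise ValueError("allele frequencies must lie in (0, 1]")
        # A frequency of exactly 1.0 is placed in the last bin.
        index = min(int(f * n_classes), n_classes - 1)
        counts[index] += 1
    return counts


def has_mode_shift(frequencies, n_classes=10):
    """Return True when some class holds more alleles than the rarest class."""
    counts = allele_frequency_classes(frequencies, n_classes)
    return counts[0] < max(counts[1:])


if __name__ == "__main__":
    # Hypothetical allele frequencies pooled across loci for one population.
    stable = [0.02, 0.05, 0.08, 0.03, 0.15, 0.30, 0.60]
    bottlenecked = [0.05, 0.15, 0.12, 0.18, 0.25, 0.50]
    print("stable population shows mode shift:", has_mode_shift(stable))
    print("bottlenecked population shows mode shift:", has_mode_shift(bottlenecked))
```

Rare alleles are lost first during a decline, which is why a recently bottlenecked population tends to lose the excess of alleles in the lowest class.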

Pipeline specifications