Computational protocol: Population Structure and Evolution after Speciation of the Hokkaido Salamander (Hynobius retardatus)

Protocol publication

[…] We genotyped 12 microsatellite loci of H. retardatus following the method described by Matsunami et al. []. From the resulting individual genotypes, we calculated the observed and expected heterozygosities (HO and HE), the number of alleles (NA), and the inbreeding coefficient (FIS) with GENALEX 6.5 software []. We tested for deviation from Hardy-Weinberg equilibrium (HWE) and for linkage disequilibrium (LD) with Genepop ver. 4.2 software [–]. Because the presence of null alleles often leads to overestimation of FST [–], we used MICRO-CHECKER V2.2.3 [] to check for null alleles. [...]

To infer the phylogenetic relationships among the five populations, we calculated pairwise genetic distances and constructed a phylogenetic tree based on the DA distance of Nei et al. [] with 1000 bootstrap iterations, using POPTREE2 software [].

Genetic population structure was analyzed with STRUCTURE version 2.3.2 [], which uses a Bayesian clustering method to assign individuals to genetic population units. To explore the genetic structure of each population, we ran analyses with the number of Bayesian clusters (K) varied from 2 to 9. For Markov Chain Monte Carlo (MCMC) inference, we used the admixture model with correlated allele frequencies and with the sampling locality as prior information (LOCPRIOR). Each run consisted of 1,000,000 MCMC repetitions after discarding the first 100,000 iterations as burn-in, and we performed 10 replicate runs for each value of K. To estimate the optimal K, we analyzed the results according to the method of Evanno et al. [], as implemented in the STRUCTURE HARVESTER web tool []. In this method, the log-likelihood values of the 10 replicate runs for each K and their variances are used to calculate ΔK. Cluster membership coefficients were averaged across the replicate runs with CLUMPP version 1.1.2 software [].
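The ΔK calculation described above can be illustrated with a minimal Python sketch. It is not the STRUCTURE HARVESTER code. It assumes one common reading of the Evanno statistic, ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], where L(K) is the mean log-likelihood over replicate runs and s is the sample standard deviation. The function name `evanno_delta_k` and the toy log-likelihood values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} from replicate log-likelihoods.

    lnp_by_k maps each K to a list of ln P(X|K) values, one per replicate run.
    Delta K is only defined for K values that have both K-1 and K+1 available,
    so the smallest and largest K tested never receive a value.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second-order difference needs both neighbours
        sd = stdev(lnp_by_k[k])  # sample standard deviation (assumption)
        if sd == 0:
            continue  # delta K is undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Hypothetical toy values (two replicates per K), for illustration only.
toy = {
    1: [-100.0, -102.0],
    2: [-80.0, -82.0],
    3: [-78.0, -80.0],
    4: [-77.0, -79.0],
}
delta_k = evanno_delta_k(toy)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

Because ΔK needs both neighbouring K values, a scan from K = 2 to K = 9 yields ΔK only for K = 3 to K = 8 under this sketch.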
The averaged cluster assignments were visualized with DISTRUCT version 1.1 software [].

To infer subgroups among the five local populations, we estimated the effective population size of each group and the migration rate among the groups. We used two programs to estimate gene flow: BAYESASS+ version 1.2 [] estimates gene flow by a genetic assignment method, and MIGRATE version 3.6.11 [] estimates gene flow and effective population sizes by a coalescent method. The two approaches are complementary: the genetic assignment method used by BAYESASS+ typically measures recent migration rates, whereas the coalescent method used by MIGRATE estimates long-term migration rates []. BAYESASS+ uses fully Bayesian MCMC resampling and treats the migration rate (m), allele frequencies (P), and inbreeding coefficient (F) as variables to be estimated. We ran a total of 3,000,000 MCMC iterations and sampled the chain every 2000 iterations, discarding the first 1,000,000 iterations as burn-in. MIGRATE uses coalescent simulations to estimate the parameter M, which is proportional to the pairwise migration rate (M = m/μ, where m = migration rate per generation and μ = mutation rate per generation), and the parameter Θ, which is proportional to the effective population size Ne of each population (Θ = 4Neμ). We ran 10 short chains with 50,000 generations, sampled every 20 generations, and 8 long chains with 5000 generations, sampled every 20 generations. […]
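As an illustration of how the MIGRATE parameters defined above relate to biological quantities, the following Python sketch rearranges Θ = 4Neμ and M = m/μ. The helper names and all numeric values, including the mutation rate, are hypothetical. The excerpt does not state which mutation rate, if any, the authors assumed.

```python
def effective_size(theta, mu):
    """Ne from theta = 4 * Ne * mu (mu: mutation rate per generation)."""
    return theta / (4.0 * mu)


def migration_rate(m_scaled, mu):
    """m (per generation) from the mutation-scaled rate M = m / mu."""
    return m_scaled * mu


def migrants_per_generation(theta, m_scaled):
    """Ne * m, which equals theta * M / 4 and does not depend on mu."""
    return theta * m_scaled / 4.0


# Hypothetical inputs for illustration only; not estimates from the study.
assumed_mu = 5e-4   # assumed microsatellite mutation rate per generation
theta_hat = 2.0     # example theta for a receiving population
m_hat = 10.0        # example M for migration into that population

print(effective_size(theta_hat, assumed_mu))
print(migration_rate(m_hat, assumed_mu))
print(migrants_per_generation(theta_hat, m_hat))
```

The product Θ·M/4 gives the number of immigrants per generation without assuming a value for μ, which is why it is often easier to compare across populations than Ne or m taken separately.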

Pipeline specifications