Computational protocol: Gene flow and demographic history of leopards (Panthera pardus) in the central Indian highlands

Similar protocols

Protocol publication

[…] We measured genetic diversity by estimating the number of alleles per locus (A), observed (Ho), and expected (He) heterozygosity in CERVUS 2.0 (Marshall et al. ). We conducted tests for deviations from Hardy–Weinberg equilibrium (HWE) and linkage disequilibrium (LD) using GENEPOP 4.0.10 (Rousset ) with a Bonferroni correction (Rice ) applied for multiple comparisons. We also used GENEPOP to calculate the effective number of migrants Barton and Slatkin () between each pair of populations at a contemporary timescale. We used ARLEQUIN with 10 000 permutations to test the statistical significance of pairwise FST values (Weir and Cockerham ) as a measure of genetic differentiation among the four TRs.We used BAYESASS1.3 (Wilson and Rannala ) and MIGRATE (Beerli and Felsenstein ; Beerli ), respectively, to estimate contemporary (past few generations) and historical gene flow (much longer period of time, approximately 4 Ne generations in the past; Beerli and Felsenstein ). We take contemporary timescale to be about five generations, or 20–25 years (assuming a generation time of 4–5 years for leopards) and the historical timescale to be several hundreds of years before present (YBP).BAYESASS uses a Bayesian method with Markov Chain Monte Carlo (MCMC) to measure gene flow by identifying population-specific inbreeding coefficients and genotypic disequilibrium. It assumes linkage equilibrium and that populations have not been subjected to genetic drift within the past 2–3 generations before sampling and allows deviations from Hardy–Weinberg expectations within populations. We used 3 × 106 iterations, with a burn-in of 106 iterations, and a sampling frequency of 2000 to ensure that the model's starting parameters were sufficiently randomized (confirmed by checking changes in likelihood values). Delta values were adjusted to optimize terminal proposed changes between chains (40%–60% of the total iterations) to ensure sufficient parameter space was searched (Wilson and Rannala ). We performed five runs using different starting-seed values to ensure consistency between runs. BAYESASS provides mean and 95% confidence intervals (CI) expected for uninformative data that can be used to assess the reliability of data and also identifies first- and second-generation migrants in a population. We used ML-Relate (Kalinowski et al. ) to establish the relatedness between second-generation migrants.MIGRATE uses coalescent theory to jointly estimate mutation-scaled migration rate [(M), M = mh/μ, where mh is historical migration rate, μ is mutation rate per generation [10−2, (Driscoll et al. )], and mutation-scaled effective population size [(Θ), Θ = xμNe, where x is 4 for nuclear data and Ne is historical effective population size]. MIGRATE allows for unequal population sizes and asymmetrical migration, thereby more closely reflecting biological reality than traditional F-statistics-based methods (Palstra et al. ). It also allows deviations from Hardy–Weinberg expectations, but assumes that populations are in migration–drift equilibrium. We ran three replicates of MIGRATE using a Brownian motion mutation model with constant mutation rates and starting parameters based on FST calculations. We used slice sampling and uniform prior distribution to estimate Θ (range = 0–20, mean = 10) and M (range = 0–500, mean = 250, and delta = 250). Following a burn-in of 50 000 iterations, each run visited a total of 1 000 000 parameter values and recorded 20 000 genealogies at a sampling increment of 50. We used a static heating scheme at four temperatures (1, 1.5, 3, and 6) to efficiently search the genealogy space. In the results, we report the mean and 95% credible intervals for Θ and M. We also calculated the effective number of migrants using the relation (Nm = Θ × M/4) and added bidirectional values to be comparable to contemporary estimates.We used the statistical approach developed by Ciofi et al. () in the program 2-MOD, to test the likelihood of two models of population history: pure drift versus immigration–drift equilibrium. In the pure drift model, allele frequencies in each population are solely the product of drift, and the effect of migration between populations is negligible. Conversely, in the immigration–drift equilibrium model, allele frequencies within populations are determined by a balance between drift and immigration. 2-MOD uses an MCMC procedure to compare likelihoods of the two scenarios and produce probabilities of the data fitting each model. The MCMC simulation was run for 100 000 iterations, and we discarded the initial 10% of data to avoid biases introduced by the starting conditions. This program was run thrice to confirm the results.We estimated contemporary effective population size using the programs LDNe1.31 (Waples and Do ) and ONESAMP1.1 (Tallmon et al. ), both of which require data from a single sampling session, unlike traditional Ne estimators that require temporally spaced genetic samples (Wang and Whitlock ). LDNe uses linkage disequilibrium (LD) information among alleles at different loci caused by genetic drift in populations. The linkage disequilibrium (LD) method is based on the expectation that LD will increase due to genetic drift generating nonrandom associations among unlinked loci more substantially in small compared with large populations (Hill ).This method does not assume random mating and corrects for biases associated with small sample sizes (England et al. ; Waples and Do ). We estimated Ne for varying levels of inclusion of rare alleles (Pcrit values 0.05, 0.02, and 0.01) to compare consistency across results, but report estimates from alleles with a frequency ≥0.02 because this criterion provides a good balance between maximizing precision and minimizing bias with highly polymorphic loci like microsatellites (Waples and Do ). ONeSAMP calculates eight summary statistics and uses approximate Bayesian computation (ABC) to estimate Ne from a single population sample. We used different priors for Ne (2–50, 2–100, and 2–200) to verify that the results were robust to these changes.In order to quantify habitat loss and fragmentation at a very coarse scale, we used the Anthrome 2.0 dataset that maps and characterizes global anthropogenic transformation of the terrestrial biosphere from 1700 to 2000 (Ellis et al. ). The global patterns of these anthropogenic transformations were classified into nineteen classes using a priori anthrome classification algorithm. We used ArcGIS 9.2 (Redlands, CA, USA) to delineate our study landscape from the global Anthrome 2.0 dataset and calculated the area under the consolidated three major land-cover/land-use classes (dense settlement, villages and croplands, and semi-natural wildlands) that are relevant to this study (1700, 1800, 1900, 2000 CE).To see if we could detect signatures of a population bottleneck, we used the program BOTTLENECK (Piry et al. ), which tests for heterozygote excess as compared with that expected under mutation–drift equilibrium. First, we used the qualitative approach of allele frequency distribution test. If any deviation is observed from the normal L-shaped allele frequency distribution that normally arises in a population, a bottleneck may be suspected. We then used a quantitative approach based on the principle that allelic diversity in a population reduces faster than heterozygosity after going through a bottleneck, resulting in a relative excess of heterozygotes (Cornuet and Luikart ; Spencer et al. ). Significance of observed deviations was determined by a two-tailed Wilcoxon signed-rank test (Luikart and Cornuet ), under the two-phase mutation (TPM) model (with 30% of SMM) because it has been suggested to be the most suitable for microsatellites. We also used M-ratios to detect any genetic bottleneck in our data. M-ratio is the ratio of k/r, with k representing the number of alleles and r representing the allelic size range. As rare alleles are lost, k is reduced faster than r, and therefore, a low M-ratio relative to a critical value indicates population declines. We calculated the population-specific M-ratio with the software M_P_val (Garza and Williamson ). We compared this empirical value of M-ratio to the commonly used bottleneck threshold (0.68; Garza and Williamson ) as well as a simulated equilibrium distribution based on the two-phase model of microsatellite mutation. This simulated critical value (Mc) was calculated by simulating 10 000 replicates in critical_M (Garza and Williamson ) using the mean size of nonstepwise mutations (Δg) = 3.5, and the proportion of stepwise mutations (ps) = 90% as recommended by the authors.Because both BOTTLENECK and M-ratio detect only recent and severe population declines (Girod et al. ; Peery et al. ), we also used MSVAR v 1.3 (Storz and Beaumont ), which detects long-term changes in population sizes using an MCMC-based simulation approach of the mutation-coalescent history to present-day genotypes by characterizing the posterior distribution of the parameters N0 (current population size), N1 (ancestral populations size), μ (mutation rate of all loci), and t (time since change in population size). We used different and wide priors for each locus for N0, N1, μ, and t. Each run was 2 × 109 steps, with a burn-in of 10 000 steps and output every 10 000 steps. We used Tracer v1.5 (Rambaut and Drummond ) to estimate posterior distributions and the highest posterior densities (HPDs) of current and ancestral population size. […]

Pipeline specifications

Software tools Cervus, Genepop, Arlequin
Application Population genetic analysis
Organisms Panthera pardus, Panthera tigris