Computational protocol: Genetic Evidence for Restricted Dispersal along Continuous Altitudinal Gradients in a Climate Change-Sensitive Mammal: The American Pika

Protocol publication

[…] Since our sampling strategy involved the use of a non-conventional DNA source, we applied a multi-tube approach in which each PCR was repeated at least twice to confirm heterozygous genotypes and at least three times to confirm homozygotes. Genotyping errors were quantified and consensus genotypes were obtained from repeated genotypes (up to five repeats per individual for 25% of samples) using Pedant 1.0. Additional tests for stuttering and allelic dropout were undertaken using Microchecker 2.2.3. Repeated genotypes were identified using GenAlEx 6.4.1, and duplicated multilocus genotypes were removed. Deviations from Hardy-Weinberg equilibrium (HWE) as well as linkage disequilibrium (LD) within sampled sites were assessed using Arlequin 3.5.1.2 with the following settings: 1,000,000 steps in MCMC and 100,000 dememorisation steps. The significance of HWE deviations and of LD was determined after sequential Bonferroni correction for multiple comparisons.
Within-population genetic variation was quantified using observed and expected heterozygosity (HO, HE) and the number of alleles (Na) calculated in Arlequin 3.5.1.2. In addition, we used a measure of allelic richness (AR) based on a rarefaction index that accounts for differences in sample sizes, as implemented in Fstat 2.94. This index scales the allelic richness estimates to the size of the sample with the lowest number of individuals. The inbreeding coefficient (FIS) was also calculated for each site, and its deviation from zero was assessed using 100,000 permutations between loci in Fstat 2.94. [...]
Genotypic differentiation between pikas in different study sites was tested using a log-likelihood G-test not assuming HWE within samples, run in Fstat 2.94 and based on 100,000 permutations. In this case, the software tests for the significance of population differentiation by permuting genotypes among sites and comparing the statistic from each randomized data set with the observed value.
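The rarefaction index for allelic richness mentioned above can be illustrated with a minimal sketch. It uses the usual hypergeometric rarefaction formula, in which the expected number of alleles in a subsample of n gene copies is the sum, over alleles, of the probability that each allele appears at least once. The function name, its arguments, and the example counts are hypothetical, and whether this matches the internal computation of Fstat in every detail is an assumption.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, n_genes):
    """Expected number of distinct alleles in a random subsample of gene copies.

    allele_counts: copies of each allele observed at one locus in one site.
    n_genes: subsample size in gene copies (assumed here to be twice the
             number of individuals in the smallest sample, for diploid data).
    """
    total = sum(allele_counts)
    if n_genes > total:
        raise ValueError("subsample cannot exceed the observed gene copies")
    richness = 0.0
    for count in allele_counts:
        # Probability that this allele is missing from the subsample
        # (draws without replacement, i.e. a hypergeometric probability).
        p_absent = comb(total - count, n_genes) / comb(total, n_genes)
        richness += 1.0 - p_absent
    return richness


# Hypothetical locus with three alleles observed 10, 6 and 4 times.
example_counts = [10, 6, 4]
full_sample = rarefied_allelic_richness(example_counts, 20)
single_copy = rarefied_allelic_richness(example_counts, 1)
```

Rarefying to the full sample returns the observed number of alleles, and rarefying to a single gene copy returns one allele, which gives two quick sanity checks on the formula.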
In order to shed additional light on underlying population genetic structure, we used discriminant analysis of principal components (DAPC). This model-free approach extracts information from genetic data by transforming the genotypes into uncorrelated components using principal components analysis (PCA). A discriminant analysis is then applied to a number of principal components retained by the user in order to maximize the among-population variation and minimize the variation within predefined groups. Because these methods make no assumptions about HWE or LD, they are applicable to a wide range of situations in which such assumptions are not met and which may preclude the use of more conventional approaches such as Structure. Recent studies that used both DAPC and Structure to shed light on population genetic structure reported similar outcomes from the two methods. We ran the DAPC analyses using the R package Adegenet 1.3-1. Specifically, we applied the a.score and optim.a.score functions to identify the optimal number of principal components to be retained. We retained 16 principal components, representing 88% of the total genetic information, and used sites as a priori populations for the DAPC.
To estimate rates and direction of recent migration events between each sample site, we used the Bayesian method implemented in BayesAss+ 1.3 with the following parameters: 2 independent chains of 3,000,000 iterations with sampling every 2000 iterations, a burn-in of 999,999 iterations and delta values of 0.15 for p, m and F (corresponding to allele frequency, migration rate and inbreeding coefficient, respectively). This method makes use of gametic disequilibrium information in a Bayesian inferential framework to estimate recent (last two generations) migration rates from one population into another.
This approach has one major assumption, namely that the loci used are in linkage equilibrium, but it does not require that populations are in HWE.
To test for associations between local environmental conditions and observed population genetic structure, we conducted Mantel tests to characterize the association between pairwise Fst and the environmental variables using the mantel.rtest function of the R package ade4 with 9,999 permutations. Our framework included the comparison of genetic differentiation (Fst) with population geographical isolation, elevation and five other environmental variables (mean annual temperature, mean annual precipitation, precipitation as snow, mean maximum summer temperature and heat-to-moisture ratio) calculated for each site using climateBC 3.1. This software downscales and interpolates PRISM 1961–1990 monthly normal data (2.5×2.5 arcmin) to a resolution of 100 m×100 m and outputs a number of measured and derived variables. Initially, we targeted the 39 annual and seasonal environmental variables available through climateBC. In order to remove redundant information from this large number of variables, we performed a principal component analysis (PCA) and calculated correlation coefficients between each pair of variables using the R packages ade4TkGUI and Rcmdr, respectively. Variables were considered redundant if they produced a correlation coefficient higher than 0.8, as suggested by Manel et al., in which case the variables that were least biologically relevant (e.g. derived variables or variables that a priori do not affect the species) were removed from further analyses.
Additionally, we applied a hierarchical Bayesian method that estimates local Fst values and relates them to environmental variables of interest, as implemented in Geste 2. Generalized linear models were run using all seven factors, resulting in a total of 2^7 (128) models.
The method evaluates the posterior probabilities of each factor and their combinations in shaping the observed population genetic structure using reversible jump Markov chain Monte Carlo (MCMC). For example, the model that compares only genetic differentiation and geographic isolation can be viewed as a null model for isolation by distance, while other models can incorporate more complex scenarios. We followed the approach of Gaggiotti et al. in that we first ran Geste using all seven factors and then performed a second round of analyses using the three factors with the highest cumulative posterior probabilities from the first round of analyses. […]
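The model space and the factor ranking described above can be sketched as follows. The sketch enumerates every subset of the seven factors (2^7 = 128 models, counting the constant-only model) and sums model posterior probabilities over all models containing a given factor, which is how a cumulative posterior probability per factor is commonly obtained. The factor labels, function names, and the uniform placeholder posterior are assumptions for illustration only; they are not Geste output and not the authors' code.

```python
from itertools import combinations

# Factor names follow the text; the short labels themselves are illustrative.
FACTORS = [
    "geographic_isolation",
    "elevation",
    "mean_annual_temperature",
    "mean_annual_precipitation",
    "precipitation_as_snow",
    "mean_max_summer_temperature",
    "heat_moisture_ratio",
]


def enumerate_models(factors):
    """Return every subset of factors, including the empty (constant-only) model."""
    return [
        frozenset(subset)
        for size in range(len(factors) + 1)
        for subset in combinations(factors, size)
    ]


def cumulative_factor_posteriors(model_posteriors):
    """Sum model posterior probabilities over all models containing each factor."""
    totals = {}
    for model, prob in model_posteriors.items():
        for factor in model:
            totals[factor] = totals.get(factor, 0.0) + prob
    return totals


def top_factors(model_posteriors, k=3):
    """Return the k factors with the highest cumulative posterior probability."""
    totals = cumulative_factor_posteriors(model_posteriors)
    return sorted(totals, key=totals.get, reverse=True)[:k]


models = enumerate_models(FACTORS)
# Placeholder only: a uniform posterior over models stands in for real
# posterior probabilities, which would come from the MCMC run.
uniform_posterior = {model: 1.0 / len(models) for model in models}
factor_totals = cumulative_factor_posteriors(uniform_posterior)
second_round = top_factors(uniform_posterior, k=3)
```

Under the uniform placeholder every factor sits in exactly half of the models, so each cumulative probability is 0.5; with real posteriors, the three factors returned by `top_factors` would define the second round of analyses.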

Pipeline specifications

Software tools GenAlEx, Arlequin, adegenet, PrISM.1
Application Population genetic analysis
Organisms Ochotona princeps