Computational protocol: Multilocus genetic analyses and spatial modeling reveal complex population structure and history in a widespread resident North American passerine (Perisoreus canadensis)

Protocol publication

[…] We edited and aligned sequences from chromatograms using MEGA v 5.0 (Tamura et al., ). To assess population structure and evaluate relationships among haplotypes, we constructed a statistical parsimony network (95% probability) using TCS v 1.21 (Clement, Posada, & Crandall, ). We measured genetic variation within populations and haplogroups by calculating haplotype (Hd) and nucleotide (π) diversity using Arlequin v 3.11 (Excoffier, Laval, & Schneider, ). To examine population structure and assess genetic differentiation among populations and haplogroups, we calculated pairwise ΦST values (an analogue of Wright's fixation index FST) using Arlequin v 3.11 (Excoffier et al., ). We corrected significance values using a Benjamini–Hochberg correction (Benjamini & Hochberg, ) to control the false discovery rate (FDR). We examined genetic structure within and among populations by performing an analysis of molecular variance (AMOVA) in Arlequin v 3.11 (Excoffier et al., ) and used a spatial analysis of molecular variance (SAMOVA; Dupanloup, Schneider, & Excoffier, ) to assess barriers between gray jay populations.

To reconstruct the phylogenetic relationships among populations, we used the Bayesian inference program MrBayes 3.2 (Ronquist et al., ). We analyzed all CR haplotypes using a GTR+G+I model, as this was the best‐fit model determined in jModelTest (version 0.1.1; Posada, ). We ran the analyses for 10 million generations using four chains, sampling every 100th generation. We discarded the first 25% of samples as burn‐in and used the remaining trees to construct consensus trees, which we viewed using FigTree 1.3.1 (Rambaut & Drummond, ).

[...] Allelic richness was calculated in FSTAT v2.9.3 (Goudet, ). Allele frequencies, observed (Ho) and expected (He) heterozygosities, and pairwise FST values (Wright, ) were calculated with 1000 permutations using Arlequin v 3.11 (Excoffier et al., ). We corrected p values for multiple tests using a Benjamini–Hochberg correction (Benjamini & Hochberg, ) to control the FDR.

Bayesian clustering analyses were conducted using STRUCTURE v2.3.3 (Falush, Stephens, & Pritchard, ; Pritchard, Stephens, & Donnelly, ). For our initial run examining all 27 populations, we used the following settings: a burn‐in of 100,000 followed by 500,000 MCMC iterations, admixture assumed, and correlated allele frequencies, with no prior population information. Ten replicates were performed for each value of K. In STRUCTURE, it can be difficult to decide when K captures the major structure in the data because lnP(X|K) values are often similar across values of K; we therefore used STRUCTURE HARVESTER (Earl & von Holdt, ) to confirm the most parsimonious clustering of groups. Following our initial run that included all 27 populations, we tested for hierarchical structure, following the procedure used by Adams and Burg (). For these runs, we used the same settings as our initial run, except that we used a burn‐in of 50,000 followed by 100,000 MCMC iterations.

[...] We used two separate approaches to examine the factors that influence genetic structure. First, we used the program BARRIER to identify potential barriers that may contribute to genetic structure. BARRIER uses Delaunay triangulation and Monmonier's maximum‐difference algorithm to locate potential barriers from a distance matrix. We identified the first 10 genetic barriers using both our mtDNA and microsatellite datasets; distance matrices were created using pairwise ΦST and FST values, respectively. We identified barriers with each dataset separately so that we could compare patterns between markers and determine whether similar barriers influence historical and contemporary genetic patterns.

Next, we used a distance‐based redundancy analysis (dbRDA) to test the role of ecological variables on genetic variation. We ran two separate analyses, one for mtDNA genetic variation and a second for microsatellite genetic variation.
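
Both the mtDNA (ΦST) and microsatellite (FST) comparisons described above involve many pairwise tests, so the Benjamini–Hochberg adjustment is applied to the whole set of p values at once. The following is a minimal Python sketch of that adjustment, included only for illustration. The function name and the example p values are hypothetical and are not taken from the study, and the authors may have computed the correction with different software.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical permutation p values from five pairwise comparisons.
pairwise_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_p = benjamini_hochberg(pairwise_p)
significant = [q <= 0.05 for q in adjusted_p]
print(adjusted_p)
print(significant)
```

In this made-up example, two comparisons with raw p values just below 0.05 are no longer significant after adjustment, which is the behavior the FDR correction is meant to provide.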
DbRDA is a multivariate approach to test the effect of multiple predictor variables on one or more response variables (Legendre & Legendre, ). Although Mantel tests are often used to measure the relationship between genetic matrices and other distance matrices, recent studies have suggested that canonical statistical approaches like dbRDA are better suited to questions where distance matrices are not applicable (Legendre & Fortin, ). This approach is especially useful for studies examining the influence of environmental variation or other abiotic factors because it allows those variables to be tested directly.

To construct our dbRDA models, we used the “capscale” function in the R package vegan (R Core Team, ). We performed this analysis at the individual level so that we could examine the full extent of genetic variation in both mtDNA and microsatellite patterns. For our response variable, we calculated Nei's genetic distance between all individuals for the mtDNA and microsatellite datasets using GenAlEx (Peakall & Smouse, ). We examined six predictor variables in our models, including geographic location (latitude and longitude) for each individual and geographic distance. For geographic distance, we used the first principal coordinate for each individual; as with our genetic response variables, we performed a principal coordinate analysis in GenAlEx on a geographic distance matrix, following the approach of Kierepka and Latch (). For our remaining four variables, we used information obtained from our spatial distribution models. We examined the influence of mean annual temperature and precipitation during the coldest quarter, as these were the two most important variables that predicted gray jay distributions in those models. Additionally, we examined the role of altitude, which we obtained from the BIOCLIM dataset. All three variables were obtained using the “point sampling” tool in QGIS (Quantum GIS Team, ). Finally, we examined the effect of glaciation by scoring an area as glaciated or unglaciated based on our spatial distribution modeling results from the last interglacial. […]
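
The general idea behind dbRDA can be sketched in a few lines: a distance matrix is converted to principal coordinates, and those coordinates are then regressed on the predictor variables. The Python sketch below is a simplified NumPy illustration of that idea and is not the vegan “capscale” code used in the protocol. It drops axes with non-positive eigenvalues, uses a plain row permutation for the significance test, and runs on simulated data. All variable names, the simulated distances, and the choice of two predictors are assumptions made for the example.

```python
import numpy as np


def pcoa(dist):
    """Principal coordinate analysis of a square distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (d ** 2) @ centering  # Gower double-centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    keep = eigvals > 1e-10 * eigvals.max()  # drop zero and negative axes
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])


def dbrda(dist, predictors, n_perm=999, seed=0):
    """Constrained fraction of variation, pseudo-F, and permutation p value."""
    coords = pcoa(dist)
    x = np.asarray(predictors, dtype=float)
    x = x - x.mean(axis=0)  # centered predictors, so no intercept is needed
    n = x.shape[0]
    q = np.linalg.matrix_rank(x)
    total = (coords ** 2).sum()

    def pseudo_f(xmat):
        # Least-squares fit of every principal coordinate on the predictors.
        beta, *_ = np.linalg.lstsq(xmat, coords, rcond=None)
        explained = ((xmat @ beta) ** 2).sum()
        f_stat = (explained / q) / ((total - explained) / (n - 1 - q))
        return f_stat, explained

    f_obs, explained = pseudo_f(x)
    rng = np.random.default_rng(seed)
    # Shuffle predictor rows to break any link with the genetic coordinates.
    hits = sum(pseudo_f(x[rng.permutation(n)])[0] >= f_obs for _ in range(n_perm))
    return {"r2": explained / total, "pseudo_f": f_obs,
            "p_value": (hits + 1) / (n_perm + 1)}


# Simulated (hypothetical) data: 40 individuals whose genetic positions
# track temperature closely and are unrelated to precipitation.
rng = np.random.default_rng(1)
n_individuals = 40
temperature = rng.normal(size=n_individuals)
precipitation = rng.normal(size=n_individuals)
position = np.column_stack([
    temperature + 0.1 * rng.normal(size=n_individuals),
    0.1 * rng.normal(size=n_individuals),
])
genetic_dist = np.linalg.norm(position[:, None, :] - position[None, :, :], axis=2)
predictors = np.column_stack([temperature, precipitation])

result = dbrda(genetic_dist, predictors, n_perm=199)
print(result)
```

In a real analysis, `genetic_dist` would be replaced by the individual-level genetic distance matrix and `predictors` by the location, geographic-distance, climate, altitude, and glaciation variables described above.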

Pipeline specifications