Computational protocol: Sex-specific genetic analysis indicates low correlation between demographic and genetic connectivity in the Scandinavian brown bear (Ursus arctos)

Protocol publication

[…] We analyzed genetic population structure using STRUCTURE v.2.3.4 [, ] in a hierarchical manner to identify both genetic clusters and substructure within clusters. Assuming population admixture and correlated allele frequencies, we set the maximum number of populations to K = 10, with ten independent runs for each K. The burn-in period was 100,000 Markov chain Monte Carlo (MCMC) iterations, followed by 1,000,000 MCMC sampling iterations. We processed the results using Structure Harvester [], which implements the ad hoc approach of Evanno et al. [], and assigned the individuals to one of the inferred clusters, using a membership value of q ≥ 0.7 as the threshold [, ]. We reanalyzed each inferred cluster separately to test for additional substructure. In this analysis, we set the maximum number of inferred populations to K = 5; all other parameters were as described above.

[...] We analyzed genetic differentiation between pairs of individuals in a hierarchical manner by estimating the relationship between genetic and geographic distance at the small scale (within genetic clusters) and at the large scale (across the entire sampling area). Across the entire sampling area, we used the kinship coefficient of Loiselle et al. [], implemented in the program SPAGeDi v.1.4c [], with a distance class size of 40 km. This coefficient is supposed to suffer less bias in the presence of low allele frequencies [] and has low sampling variance [].

Within clusters, we performed a spatial autocorrelation analysis using GenAlEx 6.501 [, ], which also offers a heterogeneity test for the detection of sex-biased dispersal and has been shown to work well [, ]. A within-cluster analysis may underestimate the strength of the relationship between genetic and geographic distance, because it excludes admixed individuals and first-generation migrants.
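The cluster-selection and assignment steps of the STRUCTURE analysis described above can be illustrated with a minimal sketch. It assumes the log-likelihoods ln P(X|K) of each run have already been extracted from the STRUCTURE output; the function names, the input layout, and the numbers in the usage lines are hypothetical and are not taken from Structure Harvester. ΔK is computed here from the mean ln P(X|K) per K, which is one common reading of the Evanno approach.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both K-1 and K+1 available.

    lnp_by_k maps each tested K to the list of ln P(X|K) values from the
    independent runs (hypothetical input layout).
    """
    mean_l = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in mean_l or k + 1 not in mean_l:
            continue  # the second difference is undefined at the ends
        second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
        sd = stdev(lnp_by_k[k])  # spread of ln P(X|K) across runs
        if sd > 0:
            delta[k] = second_diff / sd
    return delta


def assign_clusters(q_rows, threshold=0.7):
    """Assign each individual to its best cluster if q >= threshold.

    q_rows holds one list of membership coefficients per individual.
    Individuals below the threshold are returned as None (admixed).
    """
    assignments = []
    for q in q_rows:
        best = max(range(len(q)), key=lambda i: q[i])
        assignments.append(best if q[best] >= threshold else None)
    return assignments


# Illustrative values only, not data from the study.
example_lnp = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -892, -888],
    4: [-885, -887, -883],
}
example_delta = evanno_delta_k(example_lnp)
example_best_k = max(example_delta, key=example_delta.get)
example_assignments = assign_clusters([[0.9, 0.1], [0.55, 0.45], [0.3, 0.7]])
```

Individuals returned as `None` correspond to the admixed individuals that fall below the q ≥ 0.7 threshold and are therefore left out of the within-cluster analyses.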
To test whether excluding these individuals was a problem in our approach, we reran the analysis after pooling the individuals according to sampling location into five regions (). Similarly, pooling data from spatially distant and genetically differentiated populations may inflate the genetic correlation coefficient, r, of neighboring samples, because genetic distances between individuals across the genetically differentiated sampling area are comparatively much larger []. To tackle this issue within the clusters where we found substantial substructure, we reran the analysis of spatial autocorrelation using the multiple-populations approach [], treating the detected subclusters and admixed individuals as population units.

[...] We calculated the number of alleles and the observed and expected heterozygosity using GenAlEx 6.501 [, ], and the inbreeding coefficient using Genetix 4.05.2 []. For the estimation of population differentiation, it is recommended to calculate several different estimators and to exercise caution in their interpretation in order to avoid erroneous conclusions [–]. Following this recommendation, we used GenAlEx 6.501 [, ] to estimate FST, GST [–], G’ST [], and D []. The program calculates GST using the corrections proposed by Nei & Chesser [], calculates G’ST according to Hedrick [], and follows eq. 2 in Meirmans & Hedrick [] to calculate Dest [, ]. To test for the dependency of GST on locus diversity, and thus gain insight into the influence of the high mutation rates of microsatellite markers on population differentiation, we used the software CoDiDi, developed by Wang [].

[...] To estimate migration and gene flow among clusters, we used Genepop v4.0 [], which implements the private allele method to estimate the number of effective migrants, Nm [], and corrects the estimate for sample size using a regression line, as described by Barton and Slatkin [].
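To make the relationship between the differentiation estimators named above (GST, G’ST, and D) concrete, the following is a minimal single-locus sketch. It uses the uncorrected textbook forms based on within-population (HS) and total (HT) heterozygosity, and it omits the Nei & Chesser sample-size corrections that GenAlEx applies, so its values will not match GenAlEx output exactly. The function name and input layout are hypothetical, and the locus is assumed to be polymorphic (HT > 0 and HS < 1).

```python
def differentiation_stats(freqs):
    """Compute uncorrected GST, G'ST and Jost's D for one locus.

    freqs is a list of populations; each population is a list of allele
    frequencies in the same allele order, summing to 1 (hypothetical layout).
    """
    k = len(freqs)  # number of populations
    n_alleles = len(freqs[0])

    # Mean expected heterozygosity within populations.
    hs = sum(1 - sum(p * p for p in pop) for pop in freqs) / k

    # Expected heterozygosity of the pooled population (mean frequencies).
    mean_p = [sum(pop[a] for pop in freqs) / k for a in range(n_alleles)]
    ht = 1 - sum(p * p for p in mean_p)

    gst = (ht - hs) / ht
    # Hedrick's standardization: GST divided by its maximum given HS and k.
    gst_prime = gst * (k - 1 + hs) / ((k - 1) * (1 - hs))
    # Jost's D.
    jost_d = ((ht - hs) / (1 - hs)) * (k / (k - 1))
    return {"HS": hs, "HT": ht, "GST": gst, "GST_prime": gst_prime, "D": jost_d}


# Two populations that share no alleles, each with two equally common alleles.
example_stats = differentiation_stats([[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]])
```

In this example the two populations are completely differentiated, yet GST is only 1/3, whereas G’ST and D both equal 1. This illustrates why GST depends on locus diversity and why several estimators are compared.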
The private allele method is a global estimate, and thus may better reflect long-term rather than current gene flow, even though it is expected to react more quickly to an increase in gene flow rates than FST-based estimates []. Therefore, we also applied the Bayesian software BAYESASS 3.0 [], which is based on the population assignment method and is supposed to better reflect current migration and gene flow [, ]. We estimated the migration rates among geographic regions (see for regions) in ten independent runs using 21 × 10^6 iterations with a burn-in period of 2 × 10^6 iterations. We adjusted the delta values to 0.07 (allele frequency), 0.05 (inbreeding coefficient), and 0.15 (migration), started each run with a random seed [], and enabled the trace file option so that we could afterwards test for convergence of the Bayesian estimates of migration rates []. Although relatively free from assumptions, this method may have some limiting factors regarding the reliability of the results, such as low population structure and/or a high migration rate, which lead to convergence problems []. Many empirical studies indeed show signs of convergence problems, and it is recommended to perform multiple runs and to use the Bayesian deviance as a criterion to find the run with the best fit; we followed this recommendation []. […]
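The run-selection step described above can be sketched as follows. The sketch assumes that the Bayesian deviance of a run is taken as -2 times the mean log-probability of the sampled states after burn-in, and that the run with the lowest deviance is treated as the best fit; the source does not spell out the formula, so this definition is an assumption. The function names and the in-memory trace layout (pairs of iteration number and log-probability) are hypothetical, and parsing the actual BAYESASS trace file is left out.

```python
def bayesian_deviance(trace, burn_in):
    """Return -2 * mean log-probability of the post-burn-in samples.

    trace is a list of (iteration, log_probability) pairs read from one
    run's trace output (hypothetical in-memory layout).
    """
    post = [log_prob for iteration, log_prob in trace if iteration > burn_in]
    if not post:
        raise ValueError("no samples left after discarding the burn-in")
    return -2.0 * sum(post) / len(post)


def best_run(traces, burn_in):
    """Return the name of the lowest-deviance run and all deviances."""
    deviances = {name: bayesian_deviance(t, burn_in) for name, t in traces.items()}
    return min(deviances, key=deviances.get), deviances


# Illustrative values only; a real run would use burn_in=2_000_000.
example_traces = {
    "run1": [(1, -500.0), (3, -100.0), (4, -102.0)],
    "run2": [(1, -400.0), (3, -110.0), (4, -112.0)],
}
example_best, example_deviances = best_run(example_traces, burn_in=2)
```

Comparing the deviances of all ten runs, rather than only picking the minimum, also shows how much the runs disagree, which is itself a sign of convergence problems.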

Pipeline specifications

Software tools: Structure Harvester, SPAGeDi, GenAlEx, GENETIX, Genepop
Applications: Phylogenetics, Population genetic analysis
Organisms: Ursus arctos
Diseases: Genetic Diseases, Inborn