Computational protocol: Do stressful conditions make adaptation difficult? Guppies in the oil-polluted environments of southern Trinidad

Protocol publication

[…] Our goal in conducting genetic analysis was to examine population structure, gene flow, and evolutionary history. For this purpose, highly variable microsatellite markers were deemed appropriate. Future analyses of the genomic basis of adaptation to oil pollution would require other approaches.

Genomic DNA was extracted from fin clips collected in 2011 from the four southern sites (nMR.oil = 38, nMR.np = 21, nVR.oil = 65, nVR.np = 36) using the protocol of Elphinstone et al., with modifications to accommodate use of a Perkin Elmer MPII liquid-handling robot. All individuals were genotyped at 10 highly variable tetranucleotide microsatellite markers (Pre8, Pre9, Pre15, Pre26, Pre27, Pre28, Pre38, Pre80, g145, and g289; for details see Paterson et al.). Within-population tests for linkage disequilibrium for all pairwise locus combinations and for deviations from Hardy–Weinberg equilibrium (HWE) were performed using the software GENEPOP (Raymond and Rousset; Rousset), with all P-values corrected for multiple comparisons based on the false discovery rate (Benjamini and Hochberg). None of the pairs of loci showed significant linkage disequilibrium (all P > 0.4, exact test for genotypic disequilibrium), and summary statistics for all 10 loci are given in Table S1. Two loci (Pre27 and Pre28) showed significant deviations from HWE in two populations (Table S1), and we therefore performed all analyses twice: once for the full set of 10 loci and once for the subset of eight loci confirmed to be in HWE (excluding Pre27 and Pre28). Results did not change qualitatively between the two analyses, and we therefore only report the results based on the full set of 10 loci.

Genetic variation and population structure among the four southern populations were examined in several analyses. First, FST (Weir and Cockerham) and Jost's D (Jost) were calculated between all population pairs and tested for significance using permutations (n = 1000) of genotypes across individuals.
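The permutation test for pairwise differentiation can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the simple plug-in form of Jost's D for a single locus and two populations, without any sample-size correction, and does not attempt the Weir and Cockerham FST estimator. The genotypes, function names, and random seed are hypothetical and serve only to show how shuffling individuals between populations yields a null distribution.

```python
import random
from collections import Counter


def allele_freqs(genotypes):
    """Allele frequencies at one locus from diploid genotypes (a1, a2)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def jost_d(pop_a, pop_b):
    """Plug-in Jost's D for two populations at one locus (no bias correction)."""
    fa, fb = allele_freqs(pop_a), allele_freqs(pop_b)
    alleles = set(fa) | set(fb)
    # Mean expected heterozygosity within populations.
    hs = 1 - 0.5 * (sum(p * p for p in fa.values()) + sum(p * p for p in fb.values()))
    # Expected heterozygosity of the pooled (mean) allele frequencies.
    ht = 1 - sum(((fa.get(a, 0.0) + fb.get(a, 0.0)) / 2) ** 2 for a in alleles)
    n = 2  # number of populations compared
    return (ht - hs) / (1 - hs) * n / (n - 1)


def permutation_test(pop_a, pop_b, n_perm=1000, seed=0):
    """Shuffle individuals between the two populations and recompute D."""
    rng = random.Random(seed)
    observed = jost_d(pop_a, pop_b)
    pooled = list(pop_a) + list(pop_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(pop_a)], pooled[len(pop_a):]
        if jost_d(perm_a, perm_b) >= observed:
            hits += 1
    # Add one to numerator and denominator so the P-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical genotypes (allele sizes in bp): the populations share no alleles.
pop_oil = [(100, 104)] * 10
pop_np = [(108, 112)] * 10
d_obs, p_value = permutation_test(pop_oil, pop_np)
print(f"D = {d_obs:.3f}, P = {p_value:.4f}")
```

Whole individuals are permuted rather than single alleles, which matches the description of permuting genotypes across individuals and avoids assuming HWE within populations. A multilocus version would average or combine the per-locus components, which this sketch leaves out.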
Second, discriminant analysis of principal components (DAPC) was implemented using the R package adegenet (Jombart; R Development Core Team; Jombart et al.). DAPC first centers and scales the genetic data and performs a principal component analysis, from which the axes of maximal variance are extracted. These components are then subjected to linear discriminant analysis, allowing the representation of populations in genotypic space. DAPC is robust against HWE deviations and makes no assumptions regarding the underlying data structure or population genetic model (Jombart et al.).

Third, we conducted a Bayesian admixture model analysis using STRUCTURE 2.3 (Pritchard et al.) with sample sites as informative priors (LOCPRIOR; Hubisz et al.). STRUCTURE was run 10 times for each k from k = 1 to k = 6 (the assumed number of populations plus two), each run consisting of a burn-in of 50 000 MCMC iterations followed by 100 000 iterations. The most likely number of clusters k was estimated based on the Δk criterion (Evanno et al.), and the respective MCMC runs were merged using the software CLUMPP (Jakobsson and Rosenberg).

Gene flow between oil-polluted and not-polluted habitats within each river system was assessed as the ratio of immigration rate to mutation rate (M = m/μ). We used a Bayesian framework in the software package MIGRATE (Beerli and Felsenstein; Beerli and Palczewski), which estimates the mutation-scaled population size parameter Θ and the mutation-scaled migration rate parameter M. Together with the mutation rate (μ), these estimates can be used to calculate effective population size (Ne) via Θ = 4 Ne μ and the immigration rate (m) via M = m/μ. For our calculations of Ne, we used a mutation rate of μ = 5.56 × 10⁻⁴, a common value for microsatellites (Whittaker et al.; Yue et al.).
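As a worked illustration of the two conversions, the sketch below turns a Θ estimate into Ne and an M estimate into m using the mutation rate stated in the text. The Θ and M values are hypothetical placeholders, not results from the study, and the function names are invented for this example.

```python
MU = 5.56e-4  # microsatellite mutation rate used in the text


def effective_population_size(theta, mu=MU):
    """Solve Θ = 4·Ne·μ for Ne."""
    return theta / (4 * mu)


def immigration_rate(m_scaled, mu=MU):
    """Solve M = m/μ for m."""
    return m_scaled * mu


# Hypothetical estimates, chosen only to make the arithmetic easy to follow.
theta_example = 2.0
m_scaled_example = 10.0

ne = effective_population_size(theta_example)  # 2.0 / (4 × 5.56e-4) ≈ 899.3
m = immigration_rate(m_scaled_example)         # 10.0 × 5.56e-4 = 0.00556
# Algebraically, Ne·m = Θ·M / 4, so μ cancels out of this product.
immigrants_per_generation = ne * m

print(f"Ne ≈ {ne:.1f}, m ≈ {m:.5f}, Ne·m ≈ {immigrants_per_generation:.2f}")
```

Because μ cancels in the product Ne·m, that quantity does not depend on the assumed mutation rate, whereas Ne and m individually scale with the choice of μ.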
We applied a Brownian motion approximation to the stepwise mutation model for microsatellite data and used a Bayesian inference search strategy with a constant mutation rate and an exponentially distributed prior. A slice-sampling MCMC algorithm was used with a burn-in of 100 000 iterations and 10 000 recorded steps. We tested the migration models with a Θ value estimated from FST calculations, and migration between oil-polluted and not-polluted sites in the respective rivers was free to vary across all loci.

An additional analysis of migration rates within each river was conducted based on an isolation-with-migration model implemented in IMa2 (Hey and Nielsen), applying a geometric heating scheme for 10⁶ steps after a burn-in period of 10⁶ steps. Five independent IMa2 runs started with different random seeds produced similar posterior distributions. […]
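Returning to the STRUCTURE analysis described earlier, the Δk criterion can be sketched as follows. This is an illustrative reading of the criterion, not the authors' script: it takes the absolute second-order difference of the mean log-likelihood across replicate runs and divides it by the standard deviation of the log-likelihood at that k. The log-likelihood values and the function name are hypothetical. Note that Δk is undefined at the smallest and largest k tested, so with k = 1 to k = 6 only k = 2 to k = 5 can be compared.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {k: Δk} from {k: [ln P(D|k) for each replicate run]}."""
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        # Δk needs both neighbours, so the end points are skipped.
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # Δk is undefined when replicate runs are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / sd
    return result


# Hypothetical log-likelihoods from three replicate runs per k.
lnp = {
    1: [-5000, -5002, -4998],
    2: [-4500, -4501, -4499],
    3: [-4480, -4490, -4470],
    4: [-4475, -4495, -4455],
    5: [-4470, -4500, -4440],
}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, "best k:", best_k)
```

In this toy input the likelihood rises steeply from k = 1 to k = 2 and then plateaus with growing variance among runs, which is the pattern the criterion is designed to detect.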

Pipeline specifications

Software tools Genepop, adegenet, CLUMPP
Application Population genetic analysis
Organisms Poecilia reticulata