Computational protocol: Geographic variation in hybridization across a reinforcement contact zone of chorus frogs (Pseudacris)

Protocol publication

[…] We examined characteristics of the microsatellite markers for each species in all populations with n = 20 or larger (Table ). Samples were pooled by species and county for all analyses, with four exceptions. These were cases where a single individual was obtained from a county and was therefore pooled with the sample from a neighboring county (ECM0180 pooled with Harford Co., MD; ECM5125 pooled with P. feriarum from Dorchester Co., SC; ECM5100 and ECM5095 pooled with P. nigrita from Dorchester Co., SC). The 51 resulting groups are referred to as populations in the analyses below. Detailed analyses were conducted for each microsatellite locus in the two largest reference allopatric populations: one of P. feriarum from Macon Co., Alabama (n = 83) and one of P. nigrita from Walton Co., Florida (n = 36; Tables ).

We tested the assumption of linkage equilibrium (i.e., the absence of linkage disequilibrium, LD) across loci using GENEPOP version 4.2 (Raymond & Rousset, ; Rousset, ; 1,000 dememorizations and one million steps of the Markov chain, 1,000 batches with 1,000 iterations per batch). We tested the assumption of Hardy–Weinberg equilibrium (HWE) in GenoDive version 2.0b25 (Meirmans & Van Tienderen, ) using the heterozygosity‐based Gis statistic (Nei, ). Expected and observed heterozygosities as well as inbreeding coefficients were also calculated in GenoDive (Tables S2–). We used Micro‐Checker version 2.2.3 (Van Oosterhout, Hutchinson, Wills, & Shipley, ) to assess genotyping errors, such as allelic dropout, stuttering, or null alleles.

[...] Hybridization frequencies were estimated for all 1,118 individuals across populations using the basic admixture model in STRUCTURE (Pritchard, Stephens, & Donnelly, ) with the following settings: no linkage, correlated allele frequencies, a burn‐in length of 50,000, and 150,000 steps after burn‐in; default settings were used for all other parameters. Analyses were run from K = 1 to K = 10, with 10 replicates for each assumed number of clusters.
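As an illustration of how replicate runs across K values are commonly summarized, the following is a minimal sketch of the ΔK statistic of Evanno et al. in its widely used form, |mean L(K+1) − 2·mean L(K) + mean L(K−1)| divided by the standard deviation of L(K) across replicates. The function name `evanno_delta_k` and the log-likelihood values are hypothetical and are not taken from this study; it is not a substitute for the Clumpak implementation.

```python
from statistics import mean, stdev

def evanno_delta_k(lnp_by_k):
    """Return {K: delta_K} for every K that has both neighbors K-1 and K+1.

    lnp_by_k maps each K to a list of replicate ln P(X|K) values.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at the smallest and largest K
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        sd = stdev(lnp_by_k[k])
        if sd > 0:  # skip K values whose replicates are identical
            delta[k] = second_diff / sd
    return delta

# Made-up replicate log-likelihoods (illustration only, not study data)
example = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4000.0, -4001.0, -3999.0],
    3: [-3990.0, -3995.0, -3985.0],
    4: [-3985.0, -3975.0, -3995.0],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)
```

In this toy input the likelihood rises sharply from K = 1 to K = 2 and then plateaus, so ΔK peaks at K = 2.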
The optimal K value was determined using the method of Evanno, Regnaut, and Goudet (), implemented in Clumpak (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, ). STRUCTURE plots were visualized using the Distruct for many K's feature in Clumpak.

A hybrid index was estimated for each of the five focal sympatric regions with large sample sizes (R2; R5–R6; R8–R9), as well as for five sympatric regions of n < 30 (R1 n = 10; R3 n = 24; R4 n = 7; R7 n = 29; R10 n = 26; Table ), using the maximum likelihood‐based method developed by Buerkle () as implemented in GenoDive. Briefly, this method uses the allele frequency distributions of two parental species (a reference population and an alternative population) and the genotype of the putative hybrid to estimate the hybrid index. To set the reference and alternative populations, all allopatric P. feriarum samples were pooled into one reference group (n = 188), and all allopatric P. nigrita samples were pooled into a second reference group (n = 80). A hybrid index was then estimated for each of the ten sympatric focal regions with these references in separate analyses.

Individuals were classified as either a hybrid or a parental species using two methods. In the first method, laboratory‐created F1 hybrids from Lemmon and Lemmon (; parental P. feriarum females × P. nigrita males from Liberty Co., FL, USA) were genotyped for the same microsatellite loci used in this study, and their hybrid index was estimated using the same methods and reference populations as above. The boundaries of F1 hybrid versus pure genotypes were then set based on the range of hybrid index values exhibited by these control samples (hybrid index of laboratory hybrids ranged from 0.5 to 0.75; therefore, the boundaries were set at 0.25–0.75). 
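To make the idea of a likelihood‐based hybrid index concrete, here is a simplified sketch that treats each allele copy as drawn from the second parental group with probability h and from the first with probability 1 − h, then picks the h that maximizes the log‐likelihood over a grid. This is an illustrative approximation and not the GenoDive implementation; the function names, the grid search, the probability floor, and the inclusive treatment of the 0.25–0.75 boundaries are all assumptions.

```python
import math

def hybrid_index(genotype, freq_p0, freq_p1, grid=1000, floor=1e-6):
    """Grid-search maximum likelihood estimate of h in [0, 1].

    genotype: list of (allele_a, allele_b) tuples, one per locus.
    freq_p0, freq_p1: per-locus dicts of allele -> frequency in the
    parental group scored as 0 and as 1, respectively.
    """
    best_h, best_ll = 0.0, -math.inf
    for step in range(grid + 1):
        h = step / grid
        ll = 0.0
        for locus, alleles in enumerate(genotype):
            for allele in alleles:
                p0 = freq_p0[locus].get(allele, 0.0)
                p1 = freq_p1[locus].get(allele, 0.0)
                # floor avoids log(0) when the mixture probability is zero
                ll += math.log(max((1 - h) * p0 + h * p1, floor))
        if ll > best_ll:
            best_h, best_ll = h, ll
    return best_h

def classify(h, low=0.25, high=0.75):
    """Label by the index boundaries (inclusive bounds are an assumption)."""
    return "putative F1 hybrid" if low <= h <= high else "parental"

# Toy example with two fully diagnostic loci (made-up frequencies)
p0 = [{"A": 1.0}, {"C": 1.0}]
p1 = [{"B": 1.0}, {"D": 1.0}]
h_f1 = hybrid_index([("A", "B"), ("C", "D")], p0, p1)
h_pure = hybrid_index([("A", "A"), ("C", "C")], p0, p1)
```

With real microsatellite data the parental allele frequencies overlap, so estimates are less sharp than in this toy case and confidence intervals matter.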
Thus, all wild‐sampled individuals with hybrid indices falling within the range of the laboratory hybrid controls were classified as putative F1 hybrids, although we are aware that this hybrid index range may also have included some backcross and introgressed progeny. Precision in estimation of the F1 hybrid index is expected to improve with the inclusion of additional markers (Buerkle, ). In the second method, individuals were classified as hybrids of undetermined class (including but not limited to F1 hybrids) if their 95% confidence intervals, estimated in GenoDive using the Buerkle () approach, did not extend to 0 or 1, where 0 represents the index of the first parental species and 1 represents the index of the second parental species (following Als et al., ). Hybridization frequency was also estimated using NewHybrids (Anderson & Thompson, ) under default settings. Although this program additionally provides estimates of hybrid class, we do not present these results because our data, with their small number of markers, lack the power to provide robust estimates.

To determine whether the frequencies of hybridization differed across the five large focal regions, we conducted a series of pairwise randomization tests. In these analyses, we compared the proportion of individuals classified as (1) F1 hybrids and (2) any type of hybrid, using the two methods above, across the five regions. Tests were performed in the R statistical environment version 3.1.0 (R Core Team, ). Test statistics were calculated as the difference in the proportion of hybrids between pairs of populations and compared against null distributions generated from 100,000 randomizations. For each replicate of the null distribution, individuals were randomized between the pair of focal regions without replacement. 
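The pairwise randomization test described above can be sketched as follows. This is a minimal Python illustration rather than the authors' R code; the two‐sided comparison on the absolute difference, the (count + 1)/(permutations + 1) p‐value convention, the function name, and the counts in the example are all assumptions.

```python
import random

def proportion_randomization_test(hyb_a, n_a, hyb_b, n_b,
                                  n_perm=100_000, seed=1):
    """Permutation test for a difference in hybrid proportion.

    Individuals (1 = hybrid, 0 = not) are reshuffled between the two
    regions without replacement, keeping the region sizes fixed.
    """
    rng = random.Random(seed)
    total_hyb = hyb_a + hyb_b
    labels = [1] * total_hyb + [0] * (n_a + n_b - total_hyb)
    observed = abs(hyb_a / n_a - hyb_b / n_b)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        perm_a = sum(labels[:n_a])        # hybrids assigned to region A
        perm_b = total_hyb - perm_a       # remainder go to region B
        if abs(perm_a / n_a - perm_b / n_b) >= observed - 1e-12:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_perm + 1)

# Made-up counts (not from the study); fewer permutations to keep it quick
obs_diff, p_value = proportion_randomization_test(30, 100, 10, 100,
                                                  n_perm=2000)
```

Fixing the seed makes a run reproducible; the default of 100,000 permutations mirrors the number of randomizations stated in the text.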
A total of 10 pairwise tests were conducted using each hybrid classification method, and a sequential Bonferroni correction was applied to correct for multiple (10) tests (Rice, ).

Although exact dating of hybrid zone formation is beyond the scope of this study, relative timing of contact between species across regions was derived from phylogeographic data, which support recent expansion of P. feriarum northward into Virginia and surrounding areas and suggest relatively younger contacts in Regions 7–9 (R7–R9; Lemmon & Lemmon, ). This interpretation is based upon multiple statistical analyses of P. feriarum mitochondrial data using a spatially explicit random‐walk model of migration across a landscape (Lemmon & Lemmon, ). Moreover, all contact regions examined are at least 100 years old, based on morphological examination of early records of both species in museum collections (Lemmon, Lemmon, Collins, & Cannatella, ). In terms of the age of RCD in different populations, acoustic data obtained in the 1960s and 1970s for both species (Fouquette, ) indicate that RCD of male acoustic signals to current levels occurred a minimum of 50 years ago (H. Milthorpe and E. M. Lemmon, unpub. data).

[...] For individuals identified as F1 hybrids using the microsatellite‐based method above (for either the 12‐locus or 10‐locus datasets), the maternal parent was characterized through Sanger sequencing of a fragment of the 16S rRNA gene of the maternally inherited mitochondrion. The methods follow Moriarty and Cannatella (), Lemmon, Lemmon, and Cannatella (), and Lemmon, Lemmon, Collins, Lee‐Yaw, and Cannatella (). Briefly, a partial sequence of the 16S gene (~700 bp) was obtained through amplification via polymerase chain reaction using the 16sc/16sd primers (Moriarty & Cannatella, ). Sequencing was performed with the 16sc primer using the ABI Big Dye terminator ready‐mix on an ABI 3730 Genetic Analyzer (Applied Biosystems). 
Sequences were aligned using MAFFT 7.127b (Katoh, Misawa, Kuma, & Miyata, ; Katoh & Standley, ) to the large number of previously published sequences for the two species for this gene region (Lemmon, Lemmon, & Cannatella, ; Lemmon, Lemmon, Collins, Lee‐Yaw, & Cannatella, ; Moriarty & Cannatella, ), and a genus‐wide phylogeny was generated using RAxML‐III version 8.0.0 (Stamatakis, Ludwig, & Meier, ; GTRCAT model, 1,000 bootstrap replicates, Hyla chrysoscelis as outgroup) with up to five published reference sequences per species to establish the species of origin for the mitochondrial genome in each F1 hybrid. Of the 190 F1 hybrids identified using microsatellites, sufficient DNA remained to sequence 185 for the 16sc mitochondrial regions. Five additional putative F1 hybrids (based on morphology and acoustic data) from R6 (n = 1) and R8 (n = 4) were also sequenced, though not genotyped. To determine whether there was evidence for asymmetric introgression (i.e., whether the two possible maternal parents occur in unequal frequencies), exact binomial tests were performed on regions with more than 15 F1 hybrids: (1) Florida R2 (n = 109), (2) Virginia R8 (n = 26), (3) Virginia R9 (n = 16), (4) Georgia R10 (n = 18), and (5) all regions combined (n = 190).

[...] To further examine genetic differentiation within species, principal coordinates analyses (PCoAs) were performed on microsatellite data (binned fragment lengths) from: (1) both species together, (2) P. feriarum only, and (3) P. nigrita only. Analyses were conducted in GenAlEx 6.5 (Peakall & Smouse, , ) on a genetic distance matrix (Rst; Slatkin, ) using the covariance‐standardized PCoA method. 
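For the exact binomial tests of maternal parentage described above, a minimal sketch is given here. It assumes a null probability of 0.5 for each maternal species and a two‐sided p‐value that sums the probabilities of all outcomes no more likely than the observed one (the convention used by many statistics packages). The function name and the counts in the example are hypothetical, since the maternal counts are not given in this excerpt.

```python
from math import comb

def exact_binomial_two_sided(successes, n, p=0.5):
    """Two-sided exact binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed count under the null proportion p.
    """
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    # small relative tolerance so ties are not lost to rounding
    threshold = pmf[successes] * (1 + 1e-7)
    return min(1.0, sum(x for x in pmf if x <= threshold))

# Made-up example: 8 of 10 hybrids carrying one species' mitochondrion
p_example = exact_binomial_two_sided(8, 10)
```

A small p‐value would indicate that one species serves as the maternal parent more often than expected under equal frequencies.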
Scores from the first three PCoA axes were saved, and graphs were plotted in R.

The degree of isolation‐by‐distance (IBD; correlation between genetic and geographic distance) was tested using a Mantel test (Mantel, ; Smouse, Long, & Sokal, ; with 10,000 permutations) in Arlequin version 3.5 (Excoffier & Lischer, ). The test was performed using Fst values (Wright, , , ) between populations with n ≥ 5 calculated in Arlequin and Euclidean geographic distances between populations calculated from GPS coordinates in Geographic Distance Matrix Generator v1.2.3 (Ersts, ). Prior to analysis, all hybrids identified using both hybrid index methods described above were removed from sympatric populations. IBD analyses were performed separately on the two species. To test for significantly lower IBD among allopatric population pairs, a randomization test was performed in which the residual Fst values from the IBD analysis were computed and the test statistic was calculated as the difference between the average residual of comparisons involving sympatric populations (sympatric–sympatric or sympatric–allopatric) and the average residual of comparisons involving only allopatric populations (allopatric–allopatric). The null distribution was estimated by recomputing the test statistic after randomizing the assignment of allopatry or sympatry to each locality. A total of 200 randomizations were performed, and the test statistic was compared to the null distribution. […]

Pipeline specifications

Software tools Genepop, Genodive, Clumpak, NewHybrids, MAFFT, RAxML, GenAlEx, Arlequin
Applications Phylogenetics, Population genetic analysis
Organisms Pseudacris feriarum, Pseudacris nigrita