Computational protocol: Investigating Hybridization between the Two Sibling Bat Species Myotis myotis and M. blythii from Guano in a Natural Mixed Maternity Colony

Protocol publication

[…] Multilocus genotypes were analysed using a Bayesian clustering method implemented in the NEWHYBRIDS software (version 1.1 beta; []). This method assigns individual multilocus genotypes to genetic clusters, using a Markov chain Monte Carlo (MCMC) simulation procedure to estimate the posterior distribution of each individual's membership. The model assumes that the sample is drawn from a mixture of pure individuals and hybrids []. All individuals were genotyped with the same set of microsatellites, which amplify both M. myotis and M. blythii DNA. However, allele frequencies are known to vary between the two parental species and their potential hybrids []. The programme estimates the allele frequencies in the two putative parental populations, which it determines without prior information. The posterior probability of being of pure or hybrid origin is then estimated for each genotype. Posterior distributions were obtained in NEWHYBRIDS from an MCMC procedure with a burn-in of 10^5 steps followed by a sampling period of 10^6 steps. Under this model, the posterior probability q describes the probability that an individual belongs to each of the different genetic clusters. Two threshold values (Tq ≥ 0.75 or 0.90) were used with two different rules of assignment [, ]: (1) all individuals with q ≥ Tq for a purebred category were considered purebred parentals, and all others were considered hybrids (no individual remained unassigned; 3rd criterion); (2) all hybrid categories (F1, F2, backcrosses) were combined to identify admixed individuals without distinguishing hybrid categories, and individuals with q < Tq for both the purebred and the combined hybrid categories were left unassigned (2nd criterion).
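As an illustration, the two assignment rules can be sketched in a few lines of Python. This is a minimal sketch, not the authors' script: the category labels (`pure_a`, `pure_b`, `f1`, `f2`, `bx_a`, `bx_b`) and the function name `assign` are hypothetical, and the sketch assumes that Tq is above 0.5, so that at most one category can reach the threshold.

```python
from typing import Mapping

# Hypothetical labels for the two parental clusters and the hybrid categories.
PURE = ("pure_a", "pure_b")
HYBRID = ("f1", "f2", "bx_a", "bx_b")


def assign(q: Mapping[str, float], tq: float, criterion: int) -> str:
    """Assign one individual from its posterior probabilities q.

    criterion 3: purebred if a purebred q reaches tq, otherwise hybrid.
    criterion 2: hybrid categories are summed; an individual reaching tq
                 for neither a purebred nor the pooled hybrid class is
                 left unassigned.
    """
    if criterion not in (2, 3):
        raise ValueError("criterion must be 2 or 3")
    for label in PURE:
        if q[label] >= tq:
            return label
    if criterion == 3:
        return "hybrid"
    q_hybrid = sum(q[label] for label in HYBRID)
    return "hybrid" if q_hybrid >= tq else "unassigned"


# Example individual with most of its posterior mass on one parental cluster.
example = {"pure_a": 0.80, "pure_b": 0.0, "f1": 0.05,
           "f2": 0.05, "bx_a": 0.10, "bx_b": 0.0}
print(assign(example, 0.75, 3))  # purebred at the lower threshold
print(assign(example, 0.90, 3))  # falls into the hybrid class
print(assign(example, 0.90, 2))  # left unassigned
```

The same individual can therefore change class with the threshold and the criterion, which is why both were evaluated on simulated data.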
We omitted the most restrictive criterion (1st criterion), in which the threshold value is applied to each category (purebreds, F1, F2, backcrosses) separately, because only 12 markers were used in the study, which is too few to confidently assign all of the hybrid categories [].

The possibility that the results obtained from the NEWHYBRIDS analyses could be observed by chance was tested by simulation studies following the protocol used by Burgarella et al. []. Simulated datasets were used to determine which method (Tq ≥ 0.75 or Tq ≥ 0.90; 2nd or 3rd criterion) provided the most reliable results, so as to avoid the false assignment of individuals given the characteristics of the observed dataset []. Two subsamples were created from the individuals with the highest q-values (30 for M. myotis and 11 for M. blythii). Datasets were simulated with HYBRIDLAB 1.0 from the allele frequencies calculated in the two subsamples: 10,000 genotypes were generated for each parental species and 10,000 for each type of hybrid (F1, F2, and backcross). Genotypes were then randomly selected without replacement in R 3.1.0 to create samples of 200 individuals with different proportions of hybrids (0%, 5%, 10%). For each hybrid proportion, 20 different simulated datasets were generated. The size of the simulated sample (200 individuals) and the hybrid proportions were chosen to represent the sample collected in this study. Each simulated sample was analysed with NEWHYBRIDS using the same settings, threshold values and criteria as described above.
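The resampling step can be sketched as follows. The authors did this in R; the Python below is only an illustration under stated assumptions. The integer IDs stand in for the HYBRIDLAB genotypes, and the per-category counts (how purebreds are split between species and how hybrids are split between F1, F2 and backcross) are hypothetical, since the text gives only the total sample size and the hybrid proportion.

```python
import random


def draw_sample(pools, counts, seed=None):
    """Draw counts[c] genotypes without replacement from pools[c] for each category c."""
    rng = random.Random(seed)
    sample = []
    for category, n in counts.items():
        # random.sample draws without replacement within a category
        sample.extend((category, genotype) for genotype in rng.sample(pools[category], n))
    rng.shuffle(sample)  # so that row order carries no category information
    return sample


# Placeholder pools: integer IDs stand in for the 10,000 simulated genotypes per category.
categories = ["M. myotis", "M. blythii", "F1", "F2", "backcross"]
pools = {c: list(range(10_000)) for c in categories}

# Hypothetical split for a 5% hybrid proportion in a sample of 200 individuals.
counts = {"M. myotis": 140, "M. blythii": 50, "F1": 4, "F2": 3, "backcross": 3}

# Twenty replicate datasets for this hybrid proportion, one seed per replicate.
datasets = [draw_sample(pools, counts, seed=i) for i in range(20)]
print(len(datasets), len(datasets[0]))
```

Keeping the true category next to each drawn genotype is what later allows the assignments to be scored against the known status.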
The following measures were then calculated to evaluate the performance of the methods [, ]: (1) the hybrid proportion (HP) (i.e., the number of individuals classified as hybrids over the total number of individuals in the sample); (2) the efficiency in detecting the true hybrid/purebred status of individuals (i.e., the number of correctly identified individuals for a category over the actual number of individuals of that category in the sample); (3) the accuracy (i.e., the number of correctly identified individuals for a category over the total number of individuals assigned to that category); and (4) the type I error (i.e., the number of individuals wrongly identified as hybrids over the total number of actual purebreds in the sample).

Because we collected faeces instead of examining individuals, we used bat corpses to help assign the purebred clusters to the two Myotis species. Bat corpses were collected as soon as they were found within the roost on the different sampling dates. Morphometric and morphological criteria were used to identify the species of each bat corpse [, ]. The multilocus genotypes of the four individuals collected (2 M. myotis and 2 M. blythii) were included in a clustering analysis to assign clusters to the parental species and were excluded from the analysis of guano samples. [...]

Once all of the bats were assigned as M. myotis, M. blythii or hybrids, genetic diversity was assessed for the three inferred bat types. The allelic richness per bat type was estimated based on a rarefaction procedure implemented in the R package hierfstat []. Genotypic linkage disequilibria between all pairs of loci and conformity to Hardy-Weinberg equilibrium (HWE), for each locus separately and over all loci, were tested within each bat type by exact tests using Markov chain methods in GENEPOP version 4.1.4 []. Corrections for multiple tests were performed in R using the false discovery rate (FDR) approach.
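The four performance measures defined above (HP, efficiency, accuracy and type I error) reduce to simple counts once each simulated individual has a true status and an assigned status. The sketch below is illustrative only: the function name `performance` and the status labels `"pure"` and `"hybrid"` are assumptions, and any other assigned label (for example `"unassigned"`) simply counts as not matching either class.

```python
def performance(true, assigned):
    """Compute HP, efficiency, accuracy and type I error from two label lists."""
    if len(true) != len(assigned) or not true:
        raise ValueError("true and assigned must be non-empty and of equal length")
    pairs = list(zip(true, assigned))
    nan = float("nan")
    # (1) hybrid proportion: individuals classified as hybrids over sample size
    result = {"hybrid_proportion": sum(a == "hybrid" for a in assigned) / len(true)}
    for cat in ("pure", "hybrid"):
        correct = sum(t == cat and a == cat for t, a in pairs)
        actual = sum(t == cat for t in true)       # truly in the category
        called = sum(a == cat for a in assigned)   # assigned to the category
        # (2) efficiency: correct over the actual number in the category
        result[f"efficiency_{cat}"] = correct / actual if actual else nan
        # (3) accuracy: correct over the number assigned to the category
        result[f"accuracy_{cat}"] = correct / called if called else nan
    # (4) type I error: purebreds wrongly called hybrids over actual purebreds
    n_pure = sum(t == "pure" for t in true)
    false_hybrids = sum(t == "pure" and a == "hybrid" for t, a in pairs)
    result["type_I_error"] = false_hybrids / n_pure if n_pure else nan
    return result


# Toy example: 8 purebreds and 2 hybrids, with one error in each direction.
true = ["pure"] * 8 + ["hybrid"] * 2
assigned = ["pure"] * 7 + ["hybrid"] + ["hybrid", "pure"]
print(performance(true, assigned))
```

In the 0% hybrid scenario no individual is truly hybrid, so the hybrid efficiency is undefined; the sketch returns NaN in that case rather than dividing by zero.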
The genetic differentiation between the three bat types was then quantified by computing the Weir and Cockerham [] estimator of FST using GENEPOP. […]
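For the multiple-test correction applied to the linkage disequilibrium and HWE tests, the text states only that an FDR approach was run in R. The sketch below assumes the Benjamini-Hochberg procedure, which is what R's `p.adjust` provides under the method names "BH" and "fdr"; whether the authors used this exact variant is not stated. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests.
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvals)
significant = [p <= 0.05 for p in adjusted]
print(adjusted, significant)
```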

Pipeline specifications