Computational protocol: Population Genetic Structure and Demographic History of Atrina pectinata Based on Mitochondrial DNA and Microsatellite Markers

Protocol publication

[…] The COI sequences were first aligned using CLUSTAL X2. Molecular diversity indices, namely haplotype diversity (h) and nucleotide diversity (π), were calculated from the COI sequences in DnaSP 5.10. Genealogical relationships among haplotypes were further assessed using a minimum spanning tree constructed in Arlequin 3.5. Microsatellite alleles were scored using GENEMARKER version 2.2.0 (SoftGenetics, State College, PA, USA). The number of unique alleles (U) and the observed (HO) and expected (HE) heterozygosities were calculated using the Excel Microsatellite Toolkit. Tests for deviation from Hardy-Weinberg equilibrium (HWE) and for genotypic linkage disequilibrium (LD) were performed in GENEPOP 4.0. Allelic richness (AR) and the inbreeding coefficient (FIS) were calculated with FSTAT 2.9. MICRO-CHECKER 2.2.0 was used to test for technical artefacts such as null alleles, stuttering and large allele dropout. To investigate genetic differentiation among populations, analysis of molecular variance (AMOVA) was performed for both the nuclear microsatellite and mitochondrial COI data using Arlequin 3.5 with 10,000 permutations. Pairwise genetic divergence between populations was estimated in Arlequin 3.5 using FST values for the microsatellite data and ΦST values for the COI sequences, and significance levels were adjusted using the Benjamini–Yekutieli correction, which is based on the false discovery rate approach. DA distances were calculated for the microsatellite data using POPTREE2. Population pairwise FST values and DA distances for the microsatellite data were displayed in two dimensions via multidimensional scaling in SPSS 16.0. To identify population structure, STRUCTURE 2.3 was used to identify clusters of genetically similar populations from the nuclear microsatellite data using a Bayesian approach. 
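
The Benjamini–Yekutieli adjustment applied to the pairwise FST and ΦST tests can be illustrated with a minimal sketch. This is not the code used in the protocol; the function name and the example p-values are hypothetical and serve only to show how the correction rescales a set of p-values.

```python
def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values, in the input order."""
    m = len(p_values)
    # Harmonic-sum factor that makes the procedure valid under dependence.
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity and a cap of 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        value = p_values[idx] * m * c_m / rank
        running_min = min(running_min, value)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from four pairwise population comparisons.
raw_p = [0.01, 0.04, 0.03, 0.5]
adjusted_p = benjamini_yekutieli(raw_p)
significant = [p < 0.05 for p in adjusted_p]
```

A comparison is then treated as significant when its adjusted p-value falls below the chosen threshold (0.05 is assumed here for illustration).
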
Ten replicates were run for each value of the maximum number of clusters (K) up to K = 9; each run comprised 1,000,000 iterations after a burn-in period of 100,000 iterations. To determine the number of genetically homogeneous groups (K) that best fit the data, we used the Structure Harvester website, which implements the Evanno method. An assignment test was also used to clarify geographical differentiation. Assignment methods have proven to be useful tools for detecting the influence of marine currents on population genetic structure. The likelihood of an individual originating from a given population was estimated using a Bayesian method implemented in GeneClass2. To obtain a conservative estimate of recent migration, an individual was excluded from its sampling site when the probability of exclusion was greater than 99% (P or α < 0.01). Potential source localities of the excluded individuals were identified based on probabilities larger than 0.1. Isolation-by-distance (IBD) analyses were conducted for both COI and microsatellite data. Shoreline distances between sampled populations were estimated in km using Google Earth version 4.3 and plotted against genetic distance, expressed as pairwise ΦST/(1–ΦST) for COI and FST/(1–FST) for microsatellites. IBD regression analysis was performed online using the IBD web service with 10,000 randomizations of the data. Historical demography was inferred from neutrality tests, mismatch distributions and a Bayesian skyline plot based on the COI data. For the neutrality tests, Tajima’s D and Fu’s Fs were calculated in Arlequin 3.5 with 10,000 permutations. A mismatch distribution was constructed for each geographic population to test a model of exponential population growth. 
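
The isolation-by-distance analysis described above can be sketched as a Mantel-type permutation test on the linearized genetic distance FST/(1–FST). This is a minimal illustration under stated assumptions, not the IBD web service's own code: the site positions, the FST values and all function names are hypothetical, and a one-sided test on the Pearson correlation is assumed.

```python
import itertools
import random


def linearize(fst):
    """Transform FST (or PhiST) into the genetic distance F / (1 - F)."""
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(geo, gen, n_perm=999, seed=1):
    """One-sided Mantel test: correlation of two distance matrices and its p-value."""
    n = len(geo)
    pairs = list(itertools.combinations(range(n), 2))
    geo_vec = [geo[i][j] for i, j in pairs]
    observed = pearson(geo_vec, [gen[i][j] for i, j in pairs])
    rng = random.Random(seed)
    labels = list(range(n))
    hits = 0
    for _ in range(n_perm):
        # Permute population labels, i.e. rows and columns of gen together.
        rng.shuffle(labels)
        perm_vec = [gen[labels[i]][labels[j]] for i, j in pairs]
        if pearson(geo_vec, perm_vec) >= observed:
            hits += 1
    # Count the observed arrangement itself among the randomizations.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical shoreline positions (km) of four sampling sites.
positions = [0.0, 100.0, 250.0, 600.0]
geo = [[abs(a - b) for b in positions] for a in positions]
# Hypothetical pairwise FST values that increase with shoreline distance.
fst = [[0.0001 * d for d in row] for row in geo]
gen = [[linearize(f) for f in row] for row in fst]

r_obs, p_val = mantel(geo, gen)
```

The protocol used 10,000 randomizations; a smaller default is used here only to keep the sketch quick to run. With so few sites the number of distinct permutations is small, so the p-value cannot become very low regardless of the correlation.
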
A goodness-of-fit test was performed to assess the validity of the sudden expansion model, using a parametric bootstrap approach based on the sum of squared deviations (SSD) between the observed and expected mismatch distributions. The raggedness index, which measures the smoothness of the mismatch distribution, was calculated for each distribution. The demographic expansion parameter (τ) was calculated with Arlequin 3.5. Changes in effective population size through time were also inferred using the Bayesian skyline method implemented in BEAST 1.7.5 with 20 groups. Chains were run for 100 million steps, which yielded effective sample sizes (ESS) of at least 200, and the first 10% was discarded as burn-in. The analysis used the TN93 substitution model selected with Modeltest 3.7, a strict molecular clock and a stepwise skyline model. All operators were optimized automatically. Results were visualized using Tracer 1.5. To investigate whether any of the populations had experienced recent genetic bottlenecks, the Wilcoxon signed-rank test for heterozygosity excess was applied under three mutation models, namely the infinite alleles model (IAM), the two-phase model (TPM) and the stepwise mutation model (SMM), using Bottleneck 1.2.02. In addition, the qualitative mode-shift test, which examines the allele frequency distribution, was performed in Bottleneck 1.2.02. […]
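
The mismatch distribution and raggedness index mentioned above can be computed from aligned sequences as in the following minimal sketch. It assumes Harpending's raggedness definition (the sum of squared differences between successive mismatch-class frequencies, with one trailing empty class); the toy sequences and function names are hypothetical and are not taken from the study's data or from Arlequin.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of pairs differing at 0, 1, ..., d sites."""
    counts = Counter(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    total = sum(counts.values())
    d = max(counts)  # largest observed number of pairwise differences
    return [counts.get(i, 0) / total for i in range(d + 1)]


def raggedness(freqs):
    """Harpending's raggedness index of a mismatch distribution."""
    padded = list(freqs) + [0.0]  # class d + 1 is empty by definition
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Hypothetical aligned sequences of equal length.
toy_seqs = ["AAAA", "AAAT", "AATT", "AAAA"]
freqs = mismatch_distribution(toy_seqs)
r_index = raggedness(freqs)
```

Smooth, unimodal distributions give low raggedness values and are typically read as consistent with past expansion, whereas multimodal distributions give higher values.
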

Pipeline specifications