Computational protocol: Population subdivision of the surf clam Mactra chinensis in the East China Sea: Changjiang River outflow is not the sole driver

Protocol publication

[…] The mitochondrial COI gene was amplified for a subset of individuals (14 to 23 specimens) in each population with the primers LCO-1490 and HCO-2198 (). Each polymerase chain reaction (PCR) was performed in a 50-µL volume containing 2 U Taq DNA polymerase (Takara, Otsu, Shiga, Japan), 50–100 ng of genomic DNA, 0.25 µM of each primer, 0.2 mM dNTP mix, 2 mM MgCl2 and 5 µL 10× PCR buffer. PCR was carried out on a GeneAmp® 9700 PCR System (Applied Biosystems, Carlsbad, California, USA) based on the conditions in . Amplification products were confirmed by 1.5% TBE agarose gel electrophoresis and then purified using the EZ Spin Column PCR Product Purification Kit (Sangon, Shanghai, China) following the described protocol. The cleaned product was prepared for sequencing using the BigDye Terminator Cycle Sequencing Kit (ver. 3.1; Applied Biosystems) and finally analysed on an ABI PRISM 3730 automatic sequencer. Sequences were assembled and aligned using the DNAstar software suite (DNASTAR, Madison, Wisconsin, USA). Haplotypes were defined using DnaSP 5 (), and their relationships were inferred using a maximum parsimony network in the TCS 1.21 package (). The best-fit model of sequence evolution was determined with jModelTest 2 (). The GTR + I model was selected under the Akaike information criterion and used in subsequent analyses. Molecular diversity indices, namely the number of haplotypes (n), haplotype diversity (h) and nucleotide diversity (π), were calculated for each population in ARLEQUIN 3.5 ().

A spatial analysis of molecular variance (SAMOVA) was used to define the best population grouping strategy based on FCT values (number of groups K: 2 to 8) in SAMOVA 1.0 (). A hierarchical analysis of molecular variance (AMOVA; ) was conducted with 10,000 permutations in ARLEQUIN 3.5 to estimate the partitioning of genetic variation. As the GTR model was not available in ARLEQUIN, the closest model was used.
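As an illustration of the molecular diversity indices mentioned above, the following is a minimal sketch of how haplotype diversity (h) and nucleotide diversity (π) can be computed from aligned sequences. It is not the ARLEQUIN implementation: the function names and the toy alignment are hypothetical, and π is computed here from uncorrected pairwise differences rather than under a substitution model.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)  # identical aligned sequences = same haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (uncorrected p-distance)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


# Hypothetical toy alignment (not data from the study).
toy_population = ["AAAA", "AAAA", "AAAT", "AATT"]
print("haplotypes:", len(set(toy_population)))
print("h:", haplotype_diversity(toy_population))
print("pi:", nucleotide_diversity(toy_population))
```

Real COI alignments would also need a decision on how to treat gaps and ambiguous bases, which this sketch simply counts as mismatches.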
Pairwise ΦST was also calculated under this model with 1,000 random replicates followed by a standard Bonferroni correction (). The Mantel test (1,000 randomizations) for isolation by distance (IBD) was performed between genetic similarity (FST/(1 − FST)) () and Euclidean geographical distances using IBDWS version 3.23 ().

Historical demography of each population was investigated using Tajima’s D () and Fu’s FS neutrality tests () as implemented in ARLEQUIN 3.5 with 10,000 bootstrap replicates. When a test yielded a value that was significantly different from zero, a mismatch distribution analysis was performed to further characterize the expansion. The sum-of-squared-differences (SSD) statistic was used to test the goodness-of-fit between the observed mismatch distribution and that expected under a sudden expansion model (10,000 replicates).

The coalescent approach implemented in IMa () was used to estimate gene flow and divergence time among three groups (G1, G2 and G3; see results). Divergence time (t) was estimated separately for the two group pairs G1:G2 and G1:G3 because G2 and G3 appeared to derive from G1 independently. Initial runs were analysed to determine appropriate upper bounds for the migration (m) and divergence time (t) parameters. Twenty heated Markov chains were run with a burn-in period of 10 million steps, and each run consisted of 100 million steps (recording every 1,000 steps). Heating parameters (g1 = 0.8 and g2 = 0.9) were used to provide good mixing of the Markov chains. Each procedure was repeated at least three times with different random seeds. The analyses were considered to have converged on a stationary distribution if the independent runs reported similar posterior distributions () and the effective sample size (ESS) for each run was >200 (). The mutation-scaled parameter t can be converted into real time (T) with the following formula: T = t × g/u (g, generation time; u, mutation rate per locus per year) (). For M. chinensis, the generation time was 1 year ().
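The conversion T = t × g/u stated above can be sketched as follows. This is only an illustration of the arithmetic: the function names, the scaled time t = 0.5 and the locus length of 600 bp are hypothetical placeholders, and reading a rate quoted in percent per million years as a per-site fraction is an assumption that should be checked against the rate definition actually used.

```python
def per_locus_rate_per_year(per_site_fraction_per_myr, locus_length_bp):
    """Turn a per-site rate per million years into a per-locus rate per year.

    Assumption: the rate is a per-site fraction (e.g. 0.12 for 12% per myr).
    """
    return per_site_fraction_per_myr / 1e6 * locus_length_bp


def scaled_to_real_time(t, generation_time, u_per_locus_per_year):
    """Apply T = t * g / u exactly as the formula is written in the text."""
    if u_per_locus_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return t * generation_time / u_per_locus_per_year


# Hypothetical inputs for illustration only.
u = per_locus_rate_per_year(0.12, 600)  # locus length is a placeholder
T = scaled_to_real_time(t=0.5, generation_time=1.0, u_per_locus_per_year=u)
print(f"u = {u:.3e} per locus per year, T = {T:.0f} years")
```

Because T is inversely proportional to u, a tenfold faster rate gives a tenfold more recent divergence estimate for the same value of t.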
However, there was neither an accurate mutation rate nor a clear fossil record available for the species. Earlier molluscan studies that used mutation rates estimated from deep splits of interspecific phylogenies have recently been questioned, because molecular rate estimates appear to be accelerated over short evolutionary timescales, known as the “time dependency of molecular rates” hypothesis (). Under this hypothesis, the mutation rate can be an order of magnitude faster than that based on a phylogenetic calibration (). We therefore adopted a mutation rate of 12% myr−1, tenfold faster than the upper boundary (1.2% myr−1) used in former studies (e.g., ; ), to shed light on a recent demographic scenario. [...]

Microsatellite data were analysed to validate the population structure revealed by mitochondrial COI. In our previous study, we genotyped eight populations (DD, ZH, QH, PL, WD, HY, RZ and LY) with nine polymorphic microsatellites (). Here we screened 60 individuals from the PT population with the same microsatellite loci. A detailed methodology of PCR and genotyping conditions can be found in .

The expected heterozygosity (HE) and observed heterozygosity (HO) were calculated for each population using the program MICROSATELLITE ANALYSER (MSA; ), and the mean allele richness (AR) was calculated with FSTAT version 2.9.3 (). The FST estimator θ () was calculated in MSA (). Significance of values was tested using 1,000 permutations with the Bonferroni correction (). The Mantel test (1,000 randomizations) for IBD was performed using the combined data set of 501 individuals in IBDWS.

We used two different kinds of analyses to assess the population structure. First, the chord distance DC was calculated and an unrooted neighbor-joining (NJ) tree was generated using the software POPULATIONS v1.2.31 (). Support for nodes was assessed by bootstrapping with 1,000 replicates. Second, population structure was inferred with a Bayesian algorithm as implemented in STRUCTURE v.2.3 ().
Tested K ranged from 1 to 12 (the number of sampled populations plus three). For each value of K, 20 replicates were run using the admixture model, correlated allele frequencies and prior population information, with a burn-in period of 10,000 steps followed by 100,000 steps. The most appropriate value of K was determined with the ΔK statistic () using Structure Harvester v0.6.92 (). We averaged all 20 replicates for the best K with the Greedy method implemented in CLUMPP (), and finally visualized the results with DISTRUCT (). AMOVA was conducted with 10,000 replicates in ARLEQUIN to check for hierarchical structure of variability. […]
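The ΔK selection step described above can be sketched as follows. This is a hedged illustration, not the Structure Harvester code: the function name and the toy ln P(D) values are hypothetical, and the sketch assumes ΔK is computed as the absolute second-order difference of the mean log likelihoods divided by the sample standard deviation across replicates at that K.

```python
from statistics import mean, stdev


def delta_k(ln_prob):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    ln_prob maps K to the list of ln P(D) values from replicate runs.
    Assumed form: |L(K+1) - 2*L(K) + L(K-1)| / sd(L(K)), with L the
    replicate mean and sd the sample standard deviation.
    """
    means = {k: mean(values) for k, values in ln_prob.items()}
    result = {}
    for k in sorted(ln_prob):
        if k - 1 not in means or k + 1 not in means:
            continue  # Delta K is undefined at the smallest and largest K
        sd = stdev(ln_prob[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical replicate log likelihoods (two runs per K for brevity).
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-800.0, -804.0],
    4: [-790.0, -794.0],
}
scores = delta_k(toy_runs)
best_k = max(scores, key=scores.get)
print(scores, "best K:", best_k)
```

Because the statistic needs both neighbouring values of K, it cannot evaluate K = 1 or the largest tested K, which is one reason to test a range wider than the number of sampled populations.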

Pipeline specifications

Software tools: DnaSP, jModelTest, Arlequin, SAMOVA, IBDWS, Structure Harvester, CLUMPP, DISTRUCT
Applications: Phylogenetics, Population genetic analysis
Organisms: Mactra chinensis