Computational protocol: Genetic Diversity and Population Genetics of Mosquitoes (Diptera: Culicidae: Culex spp.) from the Sonoran Desert of North America

Protocol publication

[…] Total genomic DNA was extracted from each mosquito using the DNeasy (QIAGEN Inc., Valencia, CA, USA) protocol. Samples not analyzed immediately were stored at −20°C. A segment of the COI gene was amplified by polymerase chain reaction (PCR) using the primer pair LCO1490f/HCO2198r and standard assay conditions. Sequencing reactions were performed with the amplifying primers on an Applied Biosystems (Foster City, CA, USA) ABI 3730XL DNA sequencer at the Genomic Analysis and Technology Core Facility, University of Arizona, Tucson, USA. Sequences were proofread and aligned in either Sequencher 4.1 (GeneCodes Corp., Ann Arbor, MI, USA) or ClustalX 1.81, followed by manual editing. Sequences were trimmed to remove ambiguous sites, resulting in a final segment of 624 bp in 23 of the 25 Cx. tarsalis and 611 bp in Cx. quinquefasciatus and Culex sp. 1 and sp. 2. The first nucleotide in the 624 bp segment of Cx. tarsalis corresponds to position no. 1527 in the complete mitochondrial genome of Drosophila yakuba (GenBank accession no. NC_001322). The first nucleotide position in Cx. quinquefasciatus and Culex sp. 1 and sp. 2 corresponds to position no. 1515 in D. yakuba. GenBank accession numbers for the new Culex COI sequences obtained here are JX297260–JX297304.
With one exception, all individuals of Cx. quinquefasciatus possessed the same COI haplotype. To obtain a preliminary estimate of population structure in Cx. quinquefasciatus, we therefore also analyzed four microsatellite loci (CQ16, CQ26, CQ29, and CQ41) as described by Fonseca et al. Most of the 134 specimens of Cx. quinquefasciatus analyzed for microsatellites were not the same individuals as those used for the COI analyses. Several individuals from Hermosillo (N = 6), Guaymas (N = 2), and Santa Rosalía (N = 2), however, were analyzed for both molecular markers.
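The observation that nearly all Cx. quinquefasciatus shared a single COI haplotype amounts to collapsing identical aligned sequences and counting them. The sketch below illustrates that step and the standard haplotype diversity formula, h = n/(n − 1) × (1 − Σp²). The function names and the toy sequences are hypothetical and are not taken from the study, which used dedicated software for these calculations.

```python
from collections import Counter


def haplotype_counts(seqs):
    """Collapse identical aligned sequences into haplotypes and count each one."""
    return Counter(s.upper() for s in seqs)


def haplotype_diversity(seqs):
    """Haplotype diversity: h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = haplotype_counts(seqs)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_p2)


# Toy data (assumption, not from the study): nine identical sequences, one variant.
toy = ["ACGTAC"] * 9 + ["ACGTAT"]
toy_counts = haplotype_counts(toy)
toy_h = haplotype_diversity(toy)
```

A sample dominated by one haplotype gives a diversity close to zero, which is why a second marker type is needed to detect any structure.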
Genetic diversity for each locus in each of the seven populations, as well as over all loci and populations, was quantified using Microsatellite Analyser (MSA) version 4.00 and ARLEQUIN version 3.5.1.3. Deviations from Hardy-Weinberg equilibrium (HWE) were tested for each locus and over all loci in ARLEQUIN using a Markov chain approximation. All estimates were assessed for significance using a test analogous to Fisher's exact test, with 100,000 steps in the Markov chain and 5000 dememorization steps. The significance threshold for all estimates was set at 0.05. Other details on the microsatellite protocol are given elsewhere.
Kimura's 2-parameter genetic distances (d) were calculated in MEGA version 5.0.5. Genetic diversity indices were calculated in DnaSP version 5.00.07. Neutrality tests (Tajima's D and Fu's FS) were carried out in ARLEQUIN. Fu's FS test is also useful for detecting signatures of population expansion, which lead to large negative values of the test statistic. FS is considered significant at the 0.05 level when P values are <0.02. Networks for COI haplotypes were constructed using statistical parsimony as implemented in TCS version 1.21. The connection limit among haplotypes was set to the default value of 95%, unless indicated otherwise. [...]
Relationships among COI haplotypes in Sonoran Desert Culex were examined using maximum parsimony (MP) and Bayesian inference. For all phylogenetic analyses, sequences for Cx. tarsalis were trimmed from 624 to 611 bp to correspond to the sequence length of the other samples. We also incorporated GenBank sequences for several other species of Culex into the data matrix, including Cx. (Neoculex) territans Walker and Cx. (Culiciomyia) nigropunctatus Edwards. All other Culex species treated here are presently assigned to the subgenus Culex.
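For readers unfamiliar with the Kimura 2-parameter distance mentioned above, the following is a minimal sketch of the standard formula, d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitional and transversional differences. The function name is hypothetical, and skipping sites with ambiguous bases or gaps is an assumption made here; the study itself used MEGA, whose handling of such sites depends on the options chosen.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with anything other than A/C/G/T are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    w1 = 1 - 2 * p - q
    w2 = 1 - 2 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# Toy example (not study data): one transition (A<->G) in ten sites.
d_toy = k2p_distance("AAAAACCCCC", "GAAAACCCCC")
```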
Culiseta inornata (Williston) from the tribe Culisetini was used as the outgroup based on results of previous molecular studies of Culicidae. Maximum parsimony analyses were carried out in MEGA using the CNI heuristic search option and 100 random additions of sequences. Relative support for tree topology was obtained by bootstrapping with 1000 pseudoreplicates. Bayesian analyses were implemented in MrBayes version 3.1. The model of nucleotide substitution that best fitted the data set, determined with jModelTest 0.1.1 using the Akaike Information Criterion, was TVM + G. The substitution model was set to nst = “2” and rates = “gamma”, and the analysis was run for 1,000,000 generations, sampled every 250th generation (4,000 trees sampled), using the default random tree option to begin the analysis. We also conducted an analysis with nst = “6”, used for the more highly parameterized GTR substitution model, and obtained the same tree topology and similar clade support values. Clade support, expressed as posterior probabilities, was estimated using a Markov chain Monte Carlo (MCMC) algorithm. [...]
Analysis of molecular variance (AMOVA), performed in ARLEQUIN, was used to test for population structure among populations of Cx. quinquefasciatus and Cx. tarsalis. The significance of population pairwise comparisons of the fixation indices, ΦST for COI and FST for microsatellites, was based on 10,000 permutations of the data matrix. Significance was assessed at α = 0.05 for Cx. tarsalis, and with a sequential Bonferroni correction for the multiple comparisons in Cx. quinquefasciatus. Estimates of the number of migrants per generation (Nm) among populations were also calculated in ARLEQUIN.
The demographic history of Cx. tarsalis from the Sonoran Desert was inferred by performing three different tests of the sequence data. For all demographic tests, we chose a value of 2.3% pairwise sequence divergence per million years for COI.
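The conversion from a pairwise divergence rate to a per-site, per-generation mutation rate can be sketched as follows. The pairwise rate is halved because divergence between two sequences accumulates along both lineages, and the result is divided by the number of generations per year. The function name is hypothetical; the inputs are the values stated in the text.

```python
def per_lineage_mutation_rate(pairwise_pct_per_myr, generations_per_year=1.0):
    """Convert % pairwise divergence per million years to a per-site, per-generation rate."""
    # Percent per million years -> proportion per site per year (pairwise).
    pairwise_per_year = pairwise_pct_per_myr / 100 / 1e6
    # Divergence accumulates along both lineages, so each contributes half.
    per_lineage_per_year = pairwise_per_year / 2
    return per_lineage_per_year / generations_per_year


# Values from the text: 2.3% per million years, one generation per year.
mu = per_lineage_mutation_rate(2.3, generations_per_year=1.0)
```

Assuming more generations per year would lower the per-generation rate proportionally, so the one-generation-per-year assumption directly affects any time estimates scaled by μ.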
This divergence rate corresponds to a neutral mutation rate per site per generation (μ) of 1.15 × 10⁻⁸, assuming a single generation per year. A mismatch distribution analysis of COI sequence data was performed in ARLEQUIN. The fit of the sudden expansion model to the mismatch distribution was assessed with the sum of squared deviations (SSD) statistic and the raggedness statistic (rg) and their corresponding P values. The sudden expansion model is rejected at P < 0.05. A Bayesian skyline analysis, which estimates changes in effective population size through time using MCMC sampling of sequence data, was conducted in BEAST version 1.3. Because the TVM substitution model is not available in BEAST, analyses were run using both the HKY + G and GTR + G substitution models (four gamma categories) for five million iterations, sampled every 1000 iterations. Bayesian skyline plots generated with TRACER version 1.5 were essentially identical in the two analyses. A maximum-likelihood estimate of the exponential population growth parameter (g) and the mutation parameter θ in Cx. tarsalis was obtained with the program FLUCTUATE version 1.4 using the program settings described previously. […]
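To make the mismatch distribution and raggedness statistic more concrete, here is a minimal sketch that tabulates pairwise differences among aligned sequences and computes the raggedness index as the sum of squared differences between successive frequency classes (the commonly cited definition of Harpending's index). The function names and toy sequences are hypothetical. The sketch does not reproduce the SSD statistic or the P values, which ARLEQUIN obtains by simulation under the fitted expansion model.

```python
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    diffs = [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]
    if not diffs:
        raise ValueError("need at least two sequences")
    n_pairs = len(diffs)
    return [diffs.count(k) / n_pairs for k in range(max(diffs) + 1)]


def raggedness(freqs):
    """Sum of squared differences between successive classes, with a trailing empty class."""
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Toy data (assumption, not from the study).
toy_seqs = ["AAAA", "AAAT", "AATT", "ATTT"]
toy_freqs = mismatch_distribution(toy_seqs)
toy_rg = raggedness(toy_freqs)
```

A smooth, unimodal distribution gives a low raggedness value, which is the pattern expected after a sudden expansion; multimodal distributions give higher values.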

Pipeline specifications

Software tools: Sequencher, Clustal W, Arlequin, MEGA, DnaSP, MrBayes, jModelTest, BEAST
Applications: Phylogenetics, Population genetic analysis
Organisms: Culex quinquefasciatus
Diseases: Encephalitis, California; Mitochondrial Diseases