Computational protocol: High Cryptic Diversity across the Global Range of the Migratory Planktonic Copepods Pleuromamma piseki and P. gracilis

Protocol publication

[…] Sequences were edited using Geneious (v.5.1.6, Biomatters), and base calls were confirmed by aligning both strands. To assess phylogenetic relationships among specimens, maximum likelihood (ML) and Bayesian trees were inferred from 259 unique haplotypes that appeared to be functional gene copies, based on the absence of premature stop codons in the amino acid translation. ML analyses were conducted using RAxML v.7.2.8. The general time reversible (GTR) model of nucleotide substitution with a proportion of invariant sites (I) was selected as the best-fit model for the data (using MEGA version 5). The data set was partitioned into two regions, mtCOII (396 bp) and the intergenic spacer plus 7 bp of ATP8 (152 to 177 bp), so that substitution rates could be estimated and optimized for each partition individually. Nodal support was assessed by bootstrapping across nucleotide sites with 1000 replicates.
Bayesian trees were inferred with MrBayes v.3.2.0. We selected substitution models for the two partitions using the Akaike information criterion corrected for finite sample sizes (AICc), as implemented in MEGA5. The best-fitting model was GTR with gamma-distributed rates (G) for the first partition and GTR+G+I for the second partition. Four independent chains were run with a heating parameter of 0.2, and 25% of trees were discarded as burn-in. Clades with bootstrap support or Bayesian posterior probabilities of >70% were considered well supported. A second Bayesian tree was inferred that included cDNA sequences that are known to be functional. The purpose of this tree was to verify which clades represent real taxonomic entities by comparing the placement of cDNA and gDNA sequences. All trees were rooted with Pareucalanus attenuatus (Calanoida, Eucalanidae).
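The screen for functional gene copies mentioned above (no premature stop codons in the amino acid translation) can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure, which relied on Geneious. It assumes that the invertebrate mitochondrial genetic code (NCBI translation table 5, in which TAA and TAG are the only stop codons) applies to the mtCOII fragment, that sequences are ungapped, and that the reading-frame offset is known. The function name and the example haplotypes are hypothetical.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI translation table 5).
# Assumption: this table is the appropriate one for the copepod mtCOII fragment.
INVERT_MITO_STOPS = frozenset({"TAA", "TAG"})


def stop_codon_positions(seq, frame=0, stops=INVERT_MITO_STOPS):
    """Return 0-based codon indices of in-frame stop codons in an ungapped sequence.

    `frame` is the offset (0, 1 or 2) of the first complete codon.
    A trailing partial codon is ignored.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [idx for idx, codon in enumerate(codons) if codon in stops]


# Hypothetical toy haplotypes (not data from the study).
haplotypes = {
    "hap1": "ATGGCTTGATTA",  # TGA encodes Trp under table 5, so no stop codon
    "hap2": "ATGTAAGCTTTA",  # TAA at codon index 1 is an in-frame stop
}

# Keep only haplotypes with no in-frame stop codon.
putatively_functional = [
    name for name, seq in haplotypes.items() if not stop_codon_positions(seq)
]
```

Only the protein-coding partition would be screened in this way; the intergenic spacer has no reading frame to check.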
To quantify genetic distances between clades, evolutionary divergence over sequence pairs was averaged between and within clades using the Kimura 2-parameter model (in MEGA5). Evolutionary divergence values are reported as the percentage of base substitutions per site, averaged over all sequence pairs within/between clades. [...]
Global distribution patterns of P. piseki – P. gracilis genetic clades were examined in three ways. First, we mapped the distribution of all clades in the global ocean by plotting the presence/absence of each genetic clade for each collection site (using m_map, Matlab v. 7.14). These plots give initial insights into the biogeographic distribution of these genetic populations, but are preliminary, particularly in ocean regions where clades were absent and sampling coverage was low. Second, a contingency analysis was conducted to statistically test for non-random spatial structure in the frequency of genetic clades across all oceans (using R v.2.10.1). Clades A, B, C, F, and G were included, and geographic regions were categorized as Indian, North Pacific, South Pacific, North Atlantic, and South Atlantic Ocean.
Finally, we examined the population structure within clade A across the Indian, Pacific, and Atlantic Oceans, given the cosmopolitan distribution and higher sample size for this clade (total N = 204). We used the estimator θST to make pairwise comparisons among our 13 localities and an analysis of molecular variance (AMOVA) to quantify the amount of genetic variation partitioned among different levels of hierarchical subdivision. Populations were grouped by ocean basin: Indian Ocean (four stations), North Pacific Ocean (six stations), and North Atlantic Ocean (three stations). The number of permutations used for hypothesis testing was 20,000. Analyses were conducted using Arlequin v3.5.1.3. Q-values were determined for all P-values <0.05 to determine the false discovery rate (FDR) in tables of pairwise comparisons. To calculate Q-values, we used the software Q-VALUE, written in R and available from John Storey (http://genomics.princeton.edu/storeylab/qvalue/). […]
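The idea behind the FDR correction can be illustrated with a short Python sketch. It does not reproduce the Q-VALUE software: Storey's method estimates the proportion of true null hypotheses (π0) from the P-value distribution, whereas the sketch below uses the Benjamini-Hochberg adjustment, which corresponds to fixing π0 = 1 and is therefore somewhat more conservative. The function name and the example P-values are hypothetical and are not taken from the study.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    This is a stand-in for Storey q-values with pi0 fixed at 1 (an assumption
    made here for simplicity; Q-VALUE estimates pi0 from the data).
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P-values from a table of pairwise comparisons.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = bh_adjust(example_p)
```

A comparison would then be reported as significant at a chosen FDR level (for example 5%) when its adjusted value falls below that level.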

Pipeline specifications

Software tools: Geneious, RAxML, MEGA, MrBayes, Arlequin
Applications: Phylogenetics, Population genetic analysis
Organisms: Pleuromamma piseki, Pleuromamma gracilis