Computational protocol: Molecular Phylogeny and Historical Biogeography of the Neotropical Swarm-Founding Social Wasp Genus Synoeca (Hymenoptera: Vespidae)

Protocol publication

[…] The mesosoma and/or hind legs were removed from each specimen, and DNA was extracted using the phenol-chloroform method following the Han and McPheron protocol []. We amplified fragments of three mitochondrial genes (16S, cytochrome b, and cytochrome c oxidase I) and one nuclear gene (wingless) by PCR using specific primers and amplification conditions (). PCR products were purified using exonuclease I and shrimp alkaline phosphatase and directly sequenced on an ABI Prism 3730 (Applied Biosystems) sequencer (Laboratório de Biotecnologia da FCAV-UNESP de Jaboticabal, SP). PCR products were sequenced in both directions and sequence contigs were assembled using Sequencher 5.1 (Gene Codes Corp., Ann Arbor, MI, USA). DNA sequences were aligned using MUSCLE 3.7 [] (with default parameters) in MEGA 5.10 [], with each of the four genes aligned separately. All sequences are deposited in GenBank and accession numbers are listed in .

[...] We included DNA data from 26 Synoeca samples, with each of the five Synoeca species represented by multiple collection localities. We used species from four other genera as outgroups. Most phylogenetic analyses were based on a concatenated data matrix (1829 base pairs) of the four gene fragments. The most appropriate model of nucleotide evolution and the best-fitting partitioning scheme were selected using PartitionFinder v1.1.1 [] under the Bayesian information criterion (BIC) (). Phylogenies were inferred by Bayesian inference (BI) in MrBayes v3.2.2 [] and by maximum likelihood (ML) in GARLI v2.0 []. BI was also performed separately on each of the three mitochondrial gene fragments (wingless contained little phylogenetic information) to examine potential conflicts in phylogenetic signal among genes. All BI analyses consisted of two independent runs of 50 million generations each, with four chains (temp = 0.1), sampled every 1000 generations.
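The concatenation step described above can be illustrated with a minimal sketch. It assumes each gene has already been aligned separately and is held as a mapping from sample name to aligned sequence. The function name `concatenate`, the sample names, and the toy sequences are hypothetical and are not taken from the study. Samples missing a gene are padded with `?`, and the returned 1-based coordinate ranges are the kind of partition boundaries a partitioning tool would take as input.

```python
def concatenate(alignments):
    """Join per-gene alignments into one matrix and record partition ranges.

    alignments: dict mapping gene name -> {sample name: aligned sequence}.
    Returns (matrix, partitions), where partitions maps each gene to its
    1-based inclusive (start, end) columns in the concatenated matrix.
    """
    samples = sorted({s for aln in alignments.values() for s in aln})
    matrix = {s: "" for s in samples}
    partitions = {}
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned to one length")
        length = lengths.pop()
        for s in samples:
            # Pad samples that lack this gene with missing-data symbols.
            matrix[s] += aln.get(s, "?" * length)
        partitions[gene] = (start, start + length - 1)
        start += length
    return matrix, partitions


# Toy data only (hypothetical sequences, far shorter than real fragments).
toy = {
    "16S": {"sample1": "ACGT", "sample2": "ACGA"},
    "CytB": {"sample1": "GGC", "sample2": "GGT"},
    "COI": {"sample1": "TTAA", "sample2": "TTAG"},
    "wingless": {"sample1": "CCG"},  # sample2 has no wingless sequence
}
matrix, partitions = concatenate(toy)
for name, seq in matrix.items():
    print(name, seq)
print(partitions)
```

In the study the concatenated matrix spans 1829 base pairs. The toy matrix here is 14 columns wide and only shows how the gene boundaries are tracked.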
The burn-in, convergence, and stationarity were assessed using Tracer v1.5 []. We removed the first 20% of sampled generations and combined the remaining generations to produce the maximum credibility tree. We conducted 1000 ML bootstrap replicates in GARLI under the same partitions and nucleotide models as in BI. Trees from all analyses were visualized using FigTree v1.4.0 [].

We conducted an additional analysis in which we combined the information from all gene trees into a single tree (species tree), since data from multiple genes and multiple individuals per species can be useful for resolving species trees [–]. We used *BEAST v1.8.0 [] to infer the species tree. The five recognized species and the S. septentrionalis samples from AF were assigned as "species" in the analysis. The *BEAST run consisted of 50 million generations, a Yule process for the species tree prior, a piecewise linear and constant root model for population size, randomly generated starting trees for each gene, and a burn-in of 20%.

[...] We inferred divergence times under a Bayesian framework using BEAST v1.8.0 []. We generated the input file in BEAUti using the two mitochondrial protein-coding genes (COI + CytB) and the substitution model (GTR + Γ) selected by PartitionFinder. Nucleotide data from only ten specimens were included in these analyses to avoid missing data. We employed an uncorrelated lognormal relaxed clock model []. Clock models were unlinked, and substitution and tree models were linked among partitions. A Yule speciation process with a random starting tree was used for the tree prior. Given the poorly known fossil record for social wasps in general, with none belonging to Synoeca [], we applied the Brower [] mutation rate for mitochondrial genes (under a normally distributed prior). This mutation rate, estimated at 2.3% My⁻¹, was based on a set of seven studies that provided age estimates of lineage splits ranging from 300 to 3,250,000 years ago.
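To give a feel for what a rate of 2.3% My⁻¹ implies, the sketch below converts an uncorrected pairwise distance into a rough divergence time under a strict clock. This is a back-of-envelope illustration only and is not the relaxed-clock BEAST analysis described above. It assumes the 2.3% figure is read as pairwise sequence divergence per million years (about 1.15% per lineage), and the function names and toy sequences are hypothetical.

```python
PAIRWISE_RATE_PER_MY = 0.023  # assumed: 2.3% pairwise divergence per million years


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def divergence_time_my(distance, pairwise_rate=PAIRWISE_RATE_PER_MY):
    """Strict-clock estimate: time (My) = pairwise distance / pairwise rate."""
    return distance / pairwise_rate


# Toy sequences (hypothetical): 1 difference over 20 comparable sites.
a = "ACGTACGTACGTACGTACGT"
b = "ACGTACGTACGTACGTACGA"
d = p_distance(a, b)
print(f"p-distance: {d:.3f}")
print(f"approximate split: {divergence_time_my(d):.2f} My")
```

A relaxed-clock analysis instead lets the rate vary among branches and reports a posterior distribution of node ages, so its estimates will generally differ from this simple ratio.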
Two independent Markov chain Monte Carlo (MCMC) searches were conducted with 100 million generations each, with parameters sampled every 10,000 steps and a burn-in of 20%. We checked for convergence between runs and analysis performance with Tracer v1.5 using effective sample size (ESS) scores. The resulting trees were combined using TreeAnnotator v1.8.0 and the consensus tree with the divergence times was visualized in FigTree v1.4.0.

[...] We applied the Bayesian Binary MCMC (BBM) method of biogeographical and ancestral state reconstruction implemented in RASP (Reconstruct Ancestral State in Phylogenies) 2.1b []. We used the tree obtained from *BEAST (species tree) and published occurrence data for the analyzed species [] as input files for RASP. We assigned species distribution areas to geographical regions as follows: (A) Amazonian forest, (B) Middle America, (C) Atlantic forest, (D) Dry Diagonal (Cerrado, Chaco and Caatinga). The BBM analysis was run under the F81 + Γ model with no outgroup defined. We ran the analysis for 5 million generations, sampling every 1000 generations and discarding the first 1000 samples as burn-in. […]
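The chain lengths, sampling intervals, and burn-in settings quoted in this protocol determine how many posterior samples each analysis retains. The sketch below works through that arithmetic. It makes the simplifying assumption that the initial state (generation 0) is not counted as a sample, and the function name `retained_samples` is hypothetical rather than part of any of the programs named above.

```python
def retained_samples(generations, sample_every, burnin_percent=None, burnin_samples=None):
    """Return (total, discarded, retained) sample counts for one run.

    Give the burn-in either as a percentage of samples or as a sample count.
    Assumes generation 0 is not counted as a sample.
    """
    if (burnin_percent is None) == (burnin_samples is None):
        raise ValueError("specify exactly one of burnin_percent or burnin_samples")
    total = generations // sample_every
    if burnin_percent is not None:
        discarded = total * burnin_percent // 100
    else:
        discarded = burnin_samples
    if discarded > total:
        raise ValueError("burn-in exceeds the number of samples")
    return total, discarded, total - discarded


# Settings as stated in the text, per run.
bi_run = retained_samples(50_000_000, 1_000, burnin_percent=20)       # MrBayes BI
dating_run = retained_samples(100_000_000, 10_000, burnin_percent=20)  # BEAST dating
bbm_run = retained_samples(5_000_000, 1_000, burnin_samples=1_000)     # RASP BBM

for label, counts in [("BI", bi_run), ("dating", dating_run), ("BBM", bbm_run)]:
    total, discarded, kept = counts
    print(f"{label}: {total} sampled, {discarded} discarded, {kept} retained")
```

Under these assumptions each BI run keeps 40,000 samples, each dating run keeps 8,000, and the BBM run keeps 4,000. With two independent runs combined, the BI and dating totals double.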

Pipeline specifications