Computational protocol: Gradual and contingent evolutionary emergence of leaf mimicry in butterfly wing patterns

Protocol publication

[…] To take phylogenetic and branch-length uncertainty into account in our analyses, we generated Bayesian trees by combining three recently published datasets [,,] and confirmed that our phylogeny was consistent with that proposed previously (Additional file : Figure S3). We used sequences of eight nuclear genes (wingless, ef-1α, RpS5, GAPDH, ArgKin, CAD, IDH and MDH) and one mitochondrial gene (cox1) to reconstruct the phylogenetic tree of the species included in the analysis. Multiple alignment was performed using ClustalW [] in MEGA5 [] as previously described []. In brief, we aligned the nucleotide sequences based on their translated amino acid sequences, and the aligned sets of genes were concatenated for use in subsequent analyses. Species names and GenBank accession numbers of the sequences used in this study are provided in Additional file : Table S3. The original images of voucher specimens are cited in the NSG’s DNA sequences database (http://nymphalidae.utu.fi/db.php). Six species (Adelpha bredowii, Apatura iris, Asterocampa idyja, Eurytela dryope, Hamadryas februa, and Heliconius hecale) were used as the outgroup taxa. We constructed datasets composed of 7,342 nucleotide sites from nine concatenated genes.
We used PartitionFinder [] to identify nucleotide substitution models and partitioning strategies for the dataset. Breaking down the nucleotide data by gene and codon position resulted in 27 partitions (the first, second, and third codon positions for each of the nine genes), which were then combined into nine partitions. A nucleotide substitution model was selected for each partition based on the Bayesian information criterion (BIC), using the number of sites as the sample size (see Additional file : Data S1: the attached nexus file for the alignment, partitioning and substitution models). The sequence data and the phylogenetic analysis are also available at TreeBASE (Submission ID: 16541).
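The codon-position split described above can be sketched as follows. This is a minimal illustration, not the authors' script: the gene lengths are placeholders (the real coordinates are in the study's nexus file), the function names are hypothetical, and each gene is assumed to start at a first codon position in the concatenated alignment.

```python
def codon_position_partitions(gene_lengths):
    """Map each gene/codon-position pair to (start, end, step), 1-based inclusive."""
    partitions = {}
    offset = 0  # number of alignment columns that precede the current gene
    for gene, length in gene_lengths.items():
        for position in (1, 2, 3):
            # Every third column, starting at this codon position within the gene.
            partitions[f"{gene}_pos{position}"] = (offset + position, offset + length, 3)
        offset += length
    return partitions


def format_partition(name, start, end, step):
    r"""Render one partition in the common 'name = start-end\step;' notation."""
    return f"{name} = {start}-{end}\\{step};"


# Placeholder lengths in nucleotides, NOT the real gene lengths from the study.
gene_lengths = {
    "wingless": 300, "ef1a": 900, "RpS5": 600, "GAPDH": 600, "ArgKin": 600,
    "CAD": 750, "IDH": 600, "MDH": 600, "cox1": 1200,
}

partitions = codon_position_partitions(gene_lengths)
for name, (start, end, step) in partitions.items():
    print(format_partition(name, start, end, step))
```

Nine genes with three codon positions each give the 27 starting partitions; merging them into the final nine is the step left to the partitioning software.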
We used MrBayes 3.1.2 for the Bayesian inference of phylogenetic trees, which includes the assumption of proportional branch lengths among the partitions. We ran four concurrent analyses of 2 × 10⁷ generations with eight chains each (seven heated and one cold) using different random starting trees, and sampled every 100 generations. All runs were checked for stationarity, convergence, and adequate mixing of the Markov chains using Tracer version 1.5 []. From each data set, we discarded the first 60,000 samples as burn-in and combined the resulting MCMC tree samples for subsequent estimation of posteriors.
[...] Reconstruction of ancestral character states was performed in a Bayesian framework using BayesTraits ver. 2.0 (www.evolution.rdg.ac.uk/BayesTraits.html) []. In contrast to methods based on optimality criteria (parsimony and likelihood), the Bayesian Markov chain Monte Carlo (MCMC) method has the advantage of accounting for uncertainty in both the phylogeny and the parameters of the model of trait evolution []. BayesTraits implements the program MULTISTATE, which calculates, across the posterior distribution of trees, the posterior probability of each state at every node that is a hypothetical ancestor of the taxa of interest. This calculation uses reversible-jump (rj)-MCMC simulations to combine uncertainty about the existence of a node with uncertainty about its character state, which enables sampling of all possible models of evolution (rather than just the rate parameters, as in conventional MCMC) in proportion to their posterior probabilities [,]. Reconstructions were performed using the most recent common ancestor (MRCA) approach; when the node of interest did not exist in a given tree, the smallest node that contained all terminal taxa of the clade defined by our node of interest (plus one or more extra taxa) was reconstructed instead.
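The MRCA fallback can be illustrated with a small sketch. It is not BayesTraits code: trees are represented here as nested tuples of tip names, the function names are hypothetical, and the taxa are generic labels rather than species from the study.

```python
def clades_of(tree):
    """Return the tip sets of all internal nodes of a nested-tuple tree."""
    clades = set()

    def tips(node):
        if isinstance(node, str):  # a terminal taxon
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        clades.add(members)
        return members

    tips(tree)
    return clades


def mrca_clade(tree, taxa):
    """Smallest clade of `tree` that contains every taxon in `taxa`."""
    taxa = frozenset(taxa)
    candidates = [clade for clade in clades_of(tree) if taxa <= clade]
    if not candidates:
        raise ValueError("some taxa are missing from the tree")
    # Clades sharing a taxon are nested, so the smallest one is unique.
    return min(candidates, key=len)


# In the first tree the clade (A, B) exists; in the second it does not,
# so the smallest containing clade picks up the extra taxon C.
tree_with_node = ((("A", "B"), "C"), "D")
tree_without_node = ((("A", "C"), "B"), "D")

print(sorted(mrca_clade(tree_with_node, {"A", "B"})))
print(sorted(mrca_clade(tree_without_node, {"A", "B"})))
```

Applying the same lookup to every tree in a posterior sample means a state is always reconstructed, either at the node of interest or at the smallest node that stands in for it.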
In these analyses, polymorphic character states were accounted for by treating each observed state as an occurrence with equal probability in the calculation [].
To run the rj-MCMC chain, 4,000 trees were subsampled from each of the four codon-partitioned MrBayes runs (a total of 2 × 10⁵ trees). To allow adequate mixing and to achieve stationarity, the rj-MCMC chain was run for 5.005 × 10⁷ iterations, with the first 5 × 10⁴ iterations discarded as burn-in and a sampling interval of 1,000 iterations, for a final sample of 5 × 10⁴ iterations. We used a uniform prior for the analyses. To avoid autocorrelation and to allow exploration of ample parameter space, the ratedev parameter, which sets the amount by which the rate parameters are allowed to change between iterations of the Markov chain, was automatically adjusted for each analysis to maintain an acceptance rate of 30%, as recommended in the BayesTraits manual []. We examined the output in Tracer version 1.5 [] to confirm the stationarity of the log-likelihood. Manipulation of trees was conducted using the ‘ape’ package [] in R. […]
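The sampling figures quoted in this protocol can be checked with a few lines of arithmetic. The sketch below is only bookkeeping under simple assumptions: the function name is hypothetical, samples are counted by integer division, and the initial state that some programs also record is ignored.

```python
def retained_samples(total_iterations, burn_in_iterations, sampling_interval):
    """Samples kept when sampling every `sampling_interval` iterations after burn-in."""
    return (total_iterations - burn_in_iterations) // sampling_interval


# MrBayes: 2 x 10^7 generations per run, sampled every 100 generations.
mrbayes_samples_per_run = retained_samples(20_000_000, 0, 100)

# The burn-in is stated in samples (60,000), not in generations.
mrbayes_after_burn_in = mrbayes_samples_per_run - 60_000

# 4,000 trees subsampled from each of the four runs.
subsampled_trees = 4 * 4_000

# rj-MCMC: 5.005 x 10^7 iterations, 5 x 10^4 burn-in, sampled every 1,000.
rj_samples = retained_samples(50_050_000, 50_000, 1_000)

print(mrbayes_samples_per_run, mrbayes_after_burn_in, subsampled_trees, rj_samples)
```

Under these assumptions each MrBayes run yields 2 × 10⁵ sampled trees before burn-in, and the rj-MCMC settings reproduce the stated final sample of 5 × 10⁴.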

Pipeline specifications

Software tools: Clustal W, MEGA, PartitionFinder, MrBayes, BayesTraits, APE
Databases: TreeBASE
Application: Phylogenetics