Computational protocol: Phylogeography of a Habitat Specialist with High Dispersal Capability: The Savi’s Warbler Locustella luscinioides

Protocol publication

[…] The ND2 sequences were aligned and edited in Geneious 5.1 and showed no evidence of double peaks or unexpected stop codons, suggesting that they were mitochondrial rather than nuclear copies. Molecular diversity indices and the Tajima’s D and Fu’s Fs neutrality tests were calculated in Arlequin using a bootstrap procedure with 1000 replicates, whereas the R2 test was calculated in DnaSP. Mismatch distributions and the time since population expansion (τ = 2µT) were also calculated in Arlequin and compared with models of sudden population expansion (estimated from 100 bootstrap replicates) using the sum of squared deviations test. A haplotype network was calculated in TCS using a parsimony algorithm and all Savi’s Warbler sequences. A maximum likelihood tree including all 57 haplotypes found in Savi’s Warblers and GenBank sequences of its sister species was inferred in PhyML using the best model of molecular evolution (TrN + I), as determined by the program jModelTest according to the Akaike information criterion. Maximum parsimony and medium-joining trees calculated in Geneious produced similar results and are not shown. In addition, we used the coalescent analysis implemented in Beast 1.6.1 to assess whether the ND2 sequences evolved in a clock-like manner, for which the mean and 95% HPD of ucld.stdev were compared to zero in a model where the relaxed uncorrelated lognormal clock was used as a prior. Beast was also used to determine the age of the major nodes in two models: (1) a constant population model that includes all Savi’s Warbler haplotypes plus the River Warbler sequences; and (2) a population expansion model, in which all sequences of Savi’s Warblers (including repeated haplotypes) were analysed. For both models, lognormal and strict molecular clocks were implemented with a uniform prior for substitution rates varying from 0.0095 to 0.0115.
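To make the effect of the uniform rate prior concrete, the sketch below converts a pairwise sequence divergence into a range of node ages under a strict clock, using age = d / (2r). It is an illustration only and not the Beast analysis itself: it assumes the prior bounds are per-lineage rates in substitutions per site per million years (the excerpt does not state the units), and both the function name `node_age_bounds` and the divergence value are hypothetical.

```python
def node_age_bounds(pairwise_divergence, rate_min, rate_max):
    """Return (youngest, oldest) node age in million years under a strict clock.

    pairwise_divergence: substitutions per site between two lineages.
    rate_min, rate_max: per-lineage rate bounds (substitutions/site/My);
    this unit is an assumption, the source only gives the numbers.
    """
    if not 0 < rate_min <= rate_max:
        raise ValueError("rates must satisfy 0 < rate_min <= rate_max")
    # Two lineages accumulate change independently, hence the factor 2.
    youngest = pairwise_divergence / (2 * rate_max)
    oldest = pairwise_divergence / (2 * rate_min)
    return youngest, oldest


# Bounds of the uniform prior quoted in the text.
RATE_MIN, RATE_MAX = 0.0095, 0.0115

# Hypothetical divergence of 2.1% between two haplotype clades.
example_divergence = 0.021
youngest, oldest = node_age_bounds(example_divergence, RATE_MIN, RATE_MAX)
print(f"node age between {youngest:.2f} and {oldest:.2f} My")
```

Because the age scales inversely with the rate, widening the upper bound of the prior pulls the youngest plausible age downward while leaving the oldest age unchanged.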
A uniform distribution varying from 0.0095 to 0.025 was also used because of the uncertainty associated with the molecular evolution of ND2, which has been reported to be greater than the 2.1% divergence rate of the cytochrome b gene [34–37; but see 38]. The models were run multiple times in Beast and evaluated in the program Tracer 1.5 in order to assess the influence of different priors and chain lengths on the effective sample size and convergence. Final models were run for 20 million generations and sampled every 2000 generations, which resulted in 10000 trees; after discarding the first 10% of trees, these were summarized with TreeAnnotator 1.6.1 and visualized with FigTree 1.3.1. Beast was also used to model the demographic changes of the two major clades using the Bayesian skyline analysis, for which a chain length of 80 million generations was used for Clade A and 50 million for clades B and B1. Results were analysed and visualized in Tracer. [...] The microsatellites were first evaluated with Micro-checker to determine the presence of null alleles, scoring errors and large allele dropout, and then analysed for Hardy-Weinberg equilibrium (HWE), genetic diversity, population differentiation, AMOVA and isolation-by-distance (Mantel test) in GenAlEx 6.1. Only 16 out of the 112 tests for HWE were significant (P<0.05). In one population (Valencia), four out of eight loci were not in HWE, which could have been due to the presence of migratory birds in our sample. The fixation index and the estimate of null alleles were low except for Gf05, which we showed earlier to have null alleles. Because analyses excluding this locus and the Valencia population produced similar results, both were retained in the final analysis.
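As an illustration of what a single HWE test in the microsatellite screening involves, the sketch below computes a conventional chi-square goodness-of-fit statistic for one locus in one population. It is a minimal sketch and may differ from the exact procedure in GenAlEx; the function name `hwe_chi_square` and the allele sizes are hypothetical, and the P value would still have to be looked up from a chi-square distribution with the returned degrees of freedom.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes):
    """Chi-square statistic and degrees of freedom for Hardy-Weinberg equilibrium.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual,
    all scored at the same locus in the same population.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Allele frequencies estimated from the 2n gene copies in the sample.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    # Observed genotype counts, ignoring the order of alleles within a genotype.
    observed = Counter(tuple(sorted(g)) for g in genotypes)

    chi2 = 0.0
    alleles = sorted(freqs)
    for a, b in combinations_with_replacement(alleles, 2):
        # Expected count: n*p^2 for homozygotes, n*2pq for heterozygotes.
        prob = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
        expected = n * prob
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected

    k = len(alleles)
    degrees_of_freedom = k * (k - 1) // 2
    return chi2, degrees_of_freedom


# Hypothetical two-allele locus: an excess of heterozygotes.
all_heterozygous = [(150, 154)] * 4
chi2_het, df_het = hwe_chi_square(all_heterozygous)

# Hypothetical sample matching HWE proportions exactly (1 : 2 : 1).
balanced = [(150, 150), (150, 154), (150, 154), (154, 154)]
chi2_bal, df_bal = hwe_chi_square(balanced)
```

With eight loci tested in each population, some significant results are expected by chance alone at P<0.05, which is one reason to interpret the count of significant tests rather than any single test.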
D_est was calculated with SMOGD 1.2.5, and all distance matrices were subjected to Principal Component Analyses (PCA) performed in GenAlEx for better visualization of the genetic relationships among populations. The individual population assignment method implemented in Structure 2.3.3 was also used to detect population structure. This program was run three times for each value of K from 1 to 8 clusters using the admixture model with correlated allele frequencies, the sampling location information prior, an initial burn-in of 10^4 iterations and 10^6 Markov chain Monte Carlo (MCMC) iterations. Multiple runs varying the priors and the number of iterations resulted in identical conclusions. The number of clusters was determined using the ad hoc statistic ΔK following Evanno et al., as implemented in the program Structure Harvester 0.3. […]
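The ΔK statistic used above to choose the number of clusters can be sketched as follows. This is a minimal sketch of one common reading of the Evanno et al. statistic, ΔK = |L(K+1) − 2L(K) + L(K−1)| / s[L(K)], computed from the mean log-likelihood over replicate runs and the standard deviation at each K; it is not the Structure Harvester code. The function name `evanno_delta_k`, the use of the sample standard deviation, and all log-likelihood values are assumptions made for illustration.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta_K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to a list of log-likelihoods, one per replicate run
    (at least two runs per K are needed to estimate a standard deviation).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the smallest and largest K
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        spread = stdev(lnp_by_k[k])  # sample SD across runs (assumption)
        if spread == 0:
            continue  # identical runs would give a division by zero
        delta[k] = second_difference / spread
    return delta


# Hypothetical log-likelihoods for three runs at each K (not from the study).
example_runs = {
    1: [-1000.0, -1001.0, -999.0],
    2: [-900.0, -902.0, -898.0],
    3: [-890.0, -895.0, -885.0],
    4: [-888.0, -880.0, -896.0],
}
delta_k = evanno_delta_k(example_runs)
best_k = max(delta_k, key=delta_k.get)
```

Note that ΔK cannot be evaluated at the lowest or highest K tested, so with K = 1 to 8 the statistic is available only for K = 2 to 7, and it can never select K = 1.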

Pipeline specifications