Computational protocol: Testing mitochondrial sequences and anonymous nuclear markers for phylogeny reconstruction in a rapidly radiating group: molecular systematics of the Delphininae (Cetacea: Odontoceti: Delphinidae)

Protocol publication

[…] Sequences were aligned by eye (362 bp) and unique haplotypes were identified. A model and its parameters for phylogenetic reconstruction were determined empirically by likelihood using Modeltest 3.7. The Akaike Information Criterion (AIC) indicated that the Tamura-Nei model of DNA evolution with a gamma correction (α = 0.3409), a proportion of invariable sites of 0.3735, and empirical base frequencies (A = 0.3423, C = 0.2218, G = 0.1035, T = 0.3324) was most appropriate given the data.

The alignment of control region haplotypes was analyzed in a likelihood framework using GARLI 0.96. The model and parameters determined in Modeltest 3.7 were applied in GARLI (TrN+I+G). Two replicates were performed to assess convergence on a topology. Stop generation and stop time were set at 5,000,000; genthreshfortopoterm was set at 20,000; scorethreshforterm was set at 0.05. The remaining options were left at their defaults. The analysis was bootstrapped for 500 iterations.

Additionally, the aligned control region haplotypes were analyzed in PAUP* 4.0b10 using distance methods. Using the above model and the neighbor-joining algorithm, a phylogenetic reconstruction was produced and bootstrapped (with replacement) for 1000 iterations. Within-species, between-species, and corrected between-species genetic distances were estimated using MEGA 3.1 and the Tamura-Nei model with the parameters described above.

The trees were rooted with a single outgroup haplotype (Lagenorhynchus acutus), although six more outgroup taxa (14 haplotypes) were included in the analysis (Tables , ).

[...] An initial binary data matrix was compiled for all individuals and dominant AFLP markers. This presence-absence matrix was used as the basis for all AFLP analyses.
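Two mechanical steps of the control-region workflow described above, collapsing aligned sequences into unique haplotypes and drawing one bootstrap pseudoreplicate by resampling alignment columns with replacement, can be sketched in Python. This is a minimal illustration only: the study itself used PAUP* and GARLI for bootstrapping, the sequences below are hypothetical toy data, and the function names `collapse_haplotypes` and `bootstrap_columns` are invented for this example. Haplotype identity is assumed here to mean exact string identity of the aligned sequences, which ignores how gaps or ambiguity codes might have been treated.

```python
import random
from collections import defaultdict


def collapse_haplotypes(aligned):
    """Group sample names by identical aligned sequence (exact string match)."""
    groups = defaultdict(list)
    for name, seq in aligned.items():
        groups[seq.upper()].append(name)
    return dict(groups)


def bootstrap_columns(aligned, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement."""
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(aligned[n]) != length for n in names):
        raise ValueError("sequences must be aligned to equal length")
    # The same resampled column indices are applied to every sequence.
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(aligned[n][c] for c in cols) for n in names}


# Hypothetical toy alignment (not data from the study).
toy = {
    "ind1": "ACGTACGTAC",
    "ind2": "ACGTACGTAC",
    "ind3": "ACGTTCGTAC",
    "ind4": "ACGAACGTAC",
}

haplotypes = collapse_haplotypes(toy)
replicate = bootstrap_columns(toy, random.Random(1))
```

Each pseudoreplicate has the same length as the original alignment; a tree would be rebuilt from every replicate and support values tallied across them.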
Species-specific markers were defined directly from the raw binary data; a species-specific marker was shared by all individuals sampled from a particular species to the exclusion of all other taxa (synapomorphy).

Relationships among taxa and individuals were defined using a neighbor-joining phylogram, which was built using Nei-Li distances derived from the binary data matrix and bootstrapped (with replacement) 1000 times using PAUP* 4.0b10. Distance-based methods were included because the parsimony criterion, in particular, may be inappropriate for dominant, anonymous markers: it inherently and incorrectly assumes homology among shared absent markers, and it can yield a parsimonious but incorrect reconstruction in which no markers are assigned to the ancestor at a given internal node.

We also used Bayesian phylogenetic inference in MrBayes 3.1. Because of the binary nature of the AFLP data and the difference in probabilities between 1-to-0 and 0-to-1 state changes, we chose the restriction "noinvariantsites" option and the "noabsentsites" option. The analysis was run in 2 replicates to assess convergence on a topology; each run comprised 10,000,000 generations (sampled every 100 generations), with burn-in set at 250,000 generations. The remaining options were left at their defaults.

Non-metric multidimensional scaling (NMDS) was used to further clarify and visualize relationships among taxa outside the context of a bifurcating tree. NMDS is an ordination technique; here it portrays relationships, as defined by a Jaccard similarity matrix, in three-dimensional space. A Jaccard similarity matrix was created from the initial binary data using NTSYS-pc.
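The species-specific marker definition and the Nei-Li coefficient mentioned earlier in this section can be illustrated on a small presence-absence matrix. The sketch below rests on stated assumptions: the data are hypothetical, the helper names `species_specific_markers` and `nei_li_similarity` are invented here, and only the Nei-Li similarity S = 2n_xy / (n_x + n_y) is computed, with 1 - S used as a simple stand-in dissimilarity. That stand-in need not match the exact distance PAUP* derives from the same coefficient.

```python
def species_specific_markers(matrix, species):
    """Return, per species, the markers present in all of its individuals
    and absent from every individual of all other species."""
    n_markers = len(next(iter(matrix.values())))
    result = {}
    for sp in sorted(set(species.values())):
        inside = [ind for ind in matrix if species[ind] == sp]
        outside = [ind for ind in matrix if species[ind] != sp]
        result[sp] = [
            m for m in range(n_markers)
            if all(matrix[i][m] == 1 for i in inside)
            and all(matrix[i][m] == 0 for i in outside)
        ]
    return result


def nei_li_similarity(x, y):
    """Nei-Li (Dice) similarity: twice the shared bands over total bands."""
    shared = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    total = sum(x) + sum(y)
    if total == 0:
        raise ValueError("both profiles are empty")
    return 2 * shared / total


# Hypothetical AFLP profiles: rows are individuals, columns are markers.
matrix = {
    "A1": [1, 1, 0, 1, 0],
    "A2": [1, 0, 0, 1, 0],
    "B1": [0, 1, 1, 1, 0],
    "B2": [0, 1, 1, 1, 1],
}
species = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

specific = species_specific_markers(matrix, species)
similarity = nei_li_similarity(matrix["A1"], matrix["A2"])
dissimilarity = 1 - similarity  # stand-in distance (assumption, see above)
```

Shared absences never enter the Nei-Li coefficient, which is the property that makes it a reasonable choice for dominant markers.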
Jaccard similarity values range from 0 (no similarity) to 1 (identical) and are based on shared presence of markers:

$$J_{xy} = \frac{a}{a + b + c}$$

where a is the number of polymorphic markers shared by individuals x and y, b is the number of markers present in x but absent in y, and c is the number of markers present in y but absent in x.

NMDS plots were created using NTSYS-pc; three sets of principal coordinates analysis values were used as the initial configuration for better fit. The goodness of fit of the NMDS model was measured using a stress value ranging from 0 to 1 (0 = excellent fit, 1 = poor fit). […]
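The Jaccard coefficient defined above, a / (a + b + c), is straightforward to compute from binary marker profiles. The following Python sketch builds a pairwise similarity matrix of the kind that would be passed to an ordination program. It is not the NTSYS-pc implementation; the profiles are hypothetical and the function names `jaccard` and `jaccard_matrix` are chosen for this example.

```python
def jaccard(x, y):
    """Jaccard similarity a / (a + b + c) for two 0/1 marker profiles."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)  # present in both
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)  # present in x only
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)  # present in y only
    if a + b + c == 0:
        raise ValueError("no marker is present in either profile")
    return a / (a + b + c)


def jaccard_matrix(profiles):
    """Return (names, square matrix) of pairwise Jaccard similarities."""
    names = list(profiles)
    rows = [[jaccard(profiles[i], profiles[j]) for j in names] for i in names]
    return names, rows


# Hypothetical presence-absence profiles (not data from the study).
profiles = {
    "x": [1, 1, 0, 1, 0],
    "y": [1, 0, 1, 1, 0],
    "z": [0, 0, 0, 0, 1],
}

names, sim = jaccard_matrix(profiles)
```

Markers absent from both individuals do not appear in a, b, or c, so shared absences have no effect on the similarity.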

Pipeline specifications

Software tools: ModelTest-NG, GARLI, PAUP*, MrBayes
Application: Phylogenetics