Computational protocol: Marine Incursion: The Freshwater Herring of Lake Tanganyika Are the Product of a Marine Invasion into West Africa

Protocol publication

[…] The orthologous DNA sequences obtained were aligned with CLUSTALW using default settings and then optimized by eye. Optimization of the rDNA gene fragment alignments was facilitated by secondary structure models for teleost long and short subunit RNAs. Regions that could not be reliably aligned were excluded from analysis (data alignment available upon request), leaving 525 bp for 16S, 520 bp for 12S and 1,149 bp for Cytb, for a total of 2,194 bp. Data partitions were tested for substitution saturation using a non-parametric statistical test implemented in DAMBE 4.5.47. Prior to concatenating the three sequence alignments, the congruency of data partitions was tested with a likelihood-based congruency test (α = 0.05; 10,000 RELL bootstrap replicates), using maximum likelihood (ML) topologies generated from the individual gene analyses as well as the overall ML tree (see below).

Neighbor-joining (NJ) distance and maximum parsimony analyses were performed with PAUP* v4b10, with indels coded as missing data. Maximum parsimony analyses included a full heuristic search with random addition (50 replicates), the TBR branch-swapping algorithm and the MULPARS option. For parsimony analyses, a transversion/transition weighting of three was used. Neighbor-joining analyses applied a GTR+I+G model of substitution, with the substitution rate matrix (1.9150 9.8250 3.6271 0.8214 17.2997), gamma shape parameter (0.5214), proportion of invariable sites (0.4838) and nucleotide frequencies (A: 0.2764; C: 0.2780; G: 0.2168; T: 0.2288) estimated from the dataset using Modeltest v3.7. Reliability of the phylogenetic signal was tested using 500 bootstrap replicates for both parsimony and NJ distance analyses.
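
To make the reported GTR parameters concrete, the following minimal Python sketch assembles them into a normalized instantaneous rate matrix. It is an illustration only, not the authors' PAUP* workflow. It assumes that the five reported rates are given in the usual Modeltest order (A-C, A-G, A-T, C-G, C-T) with the G-T rate fixed at 1.0, which the excerpt does not state explicitly. The function name `gtr_rate_matrix` is hypothetical, and the gamma shape and invariable-sites parameters are left out for brevity.

```python
# Sketch: build a GTR rate matrix Q from the reported parameters.
# ASSUMPTION: rates are ordered A-C, A-G, A-T, C-G, C-T and G-T = 1.0
# (the common Modeltest convention; not stated in the excerpt).
BASES = "ACGT"

FREQS = {"A": 0.2764, "C": 0.2780, "G": 0.2168, "T": 0.2288}

RATES = {
    ("A", "C"): 1.9150,
    ("A", "G"): 9.8250,
    ("A", "T"): 3.6271,
    ("C", "G"): 0.8214,
    ("C", "T"): 17.2997,
    ("G", "T"): 1.0,  # assumed reference rate
}


def gtr_rate_matrix(freqs, rates):
    """Return Q as a nested dict, scaled to one expected substitution per unit time."""
    q = {x: {} for x in BASES}
    for x in BASES:
        for y in BASES:
            if x != y:
                # Exchangeabilities are symmetric, so look up either ordering.
                r = rates[(x, y)] if (x, y) in rates else rates[(y, x)]
                q[x][y] = r * freqs[y]
        # Each row of a rate matrix sums to zero.
        q[x][x] = -sum(q[x].values())
    # Scale so that the frequency-weighted mean substitution rate equals 1.
    mean_rate = -sum(freqs[x] * q[x][x] for x in BASES)
    return {x: {y: q[x][y] / mean_rate for y in BASES} for x in BASES}


Q = gtr_rate_matrix(FREQS, RATES)
for x in BASES:
    print(x, " ".join(f"{Q[x][y]:8.4f}" for y in BASES))
```

Under these assumptions the two transition-type exchanges (A-G and C-T) carry the largest rates, which is the usual pattern for mitochondrial data. A full GTR+I+G distance would additionally apply the reported gamma shape (0.5214) and proportion of invariable sites (0.4838).
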
A single random addition of taxa was used for each replicate of the parsimony bootstrap.

The overall ML tree topology for each gene and for the concatenated dataset was determined using GARLI v0.951, with model parameters as estimated by Modeltest. The initial tree topology was constructed by random addition; the stopgen and stoptime parameters were both set to 10,000,000, and search termination settings were left at their default values. Four independent runs of each tree search produced final likelihood values that varied by less than 3.5. The tree with the highest likelihood value was used for subsequent analyses. Phylogenetic reliability of the overall ML tree was tested using 500 bootstrap replicates.

Phylogenetic relationships were also estimated with the Bayesian method of phylogenetic inference implemented in MrBayes v3.1.2. Posterior probabilities of phylogenetic trees were approximated by a 1,000,000-generation Metropolis-coupled Markov chain Monte Carlo simulation (MCMCMC; four chains, chain temperature = 0.2) under a GTR+I+G model of sequence evolution, with simultaneous estimation of parameters and sampling every 1,000th generation. A 50% majority-rule consensus tree was constructed following a 100,000-generation burn-in to allow the chains to reach stationarity. Three separate runs of MrBayes v3.1.2 under these parameter settings generated qualitatively similar results.

To test morphology-based hypotheses on the taxonomic relationships among clupeiform fishes, the ML topology and branch lengths were recalculated as above, with major groupings constrained to be monophyletic. The deviation between these alternative topologies and the unconstrained ML topology was tested using a Shimodaira-Hasegawa (SH) test with 10,000 RELL bootstrap replicates. […]
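
The Bayesian sampling settings imply a specific number of trees behind the consensus, which the short Python sketch below works out. It is bookkeeping only and does not call MrBayes. The helper name `mcmc_sample_counts` is hypothetical, and the sketch assumes that the starting state is written out as a sample and that the burn-in removes exactly the samples drawn during the first 100,000 generations; the excerpt confirms neither detail.

```python
def mcmc_sample_counts(generations, sample_every, burnin_generations, count_start=True):
    """Return (total, discarded, retained) numbers of sampled trees.

    ASSUMPTION: with count_start=True the initial state counts as one sample.
    """
    total = generations // sample_every + (1 if count_start else 0)
    discarded = burnin_generations // sample_every
    return total, discarded, total - discarded


# Settings reported in the protocol: 1,000,000 generations, sampling every
# 1,000th generation, 100,000-generation burn-in.
total, discarded, retained = mcmc_sample_counts(1_000_000, 1_000, 100_000)

# A clade enters a 50% majority-rule consensus only if it occurs in more
# than half of the retained trees.
min_clade_count = retained // 2 + 1

print(total, discarded, retained, min_clade_count)  # 1001 100 901 451
```

Under these assumptions each run contributes 901 post-burn-in trees, and a clade needs to appear in at least 451 of them to be shown in the consensus.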

Pipeline specifications

Software tools Clustal W, DAMBE, Modeltest, MrBayes
Application Phylogenetics
Organisms Hemisus marmoratus, cellular organisms
Diseases Goiter, Endemic