Computational protocol: Global Diversification at the Harsh Sea-Land Interface: Mitochondrial Phylogeny of the Supralittoral Isopod Genus Tylos (Tylidae, Oniscidea)

Protocol publication

[…] We obtained tissue samples from 16 of the 21 currently recognized species of Tylos and used published sequences from one additional valid species from California, T. punctatus Holmes & Gay, 1909, and its close relatives, with which it forms T. punctatus sensu lato, distributed from California to the western coast of Mexico. Most of the samples were obtained from the Museo di Storia Naturale “La Specola”, Zoological section, in Florence, Italy (MZUF); other researchers and museums kindly provided the remaining samples. The sample from Puerto Rico, for which no specific permissions were required, was collected by LAH. None of the fieldwork involved endangered or protected species. Photographs of the ventral plates of the fifth pleonite, regarded as a species-diagnostic character in Tylos, are shown in the original publication for most of the lineages examined (photographs for additional lineages of T. punctatus sensu lato are published elsewhere). We used a sample of Helleria brevicornis Ebner, 1868 as the outgroup in a subset of the phylogenetic analyses. The monotypic Helleria, endemic to the northern Tyrrhenian area, is the only other genus of the family Tylidae. Genomic DNA was isolated from 2–4 legs per specimen with the DNeasy kit (Qiagen, Inc., Valencia, CA). We PCR-amplified segments of four mitochondrial genes: 16S rDNA, 12S rDNA, Cytochrome Oxidase Subunit I (COI), and Cytochrome b (Cytb); primer sequences and amplification conditions are provided in the original publication. PCR-amplified products were cleaned with Exonuclease and Shrimp Alkaline Phosphatase, and subsequently cycle-sequenced at the University of Arizona Genetics Core. We used Sequencher 4.8 (Gene Codes, Ann Arbor, MI) for sequence editing and primer removal. None of the protein-coding sequences had premature stop codons or frame shifts, suggesting that they are not pseudogenes. All sequences were deposited in GenBank (Accession Numbers KJ468109–KJ468188). [...]
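The check for premature stop codons mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure (they inspected sequences in Sequencher); it assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), in which only TAA and TAG are stop codons, and the function names are hypothetical. Because the amplified segments are internal gene fragments, any in-frame stop is treated as premature.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI table 5).
# Assumption: TGA codes for Trp and AGA/AGG for Ser in this table,
# so they are not counted as stops.
STOP_CODONS = {"TAA", "TAG"}


def stop_codon_positions(seq, frame=0):
    """Return 0-based codon indices of in-frame stop codons.

    `seq` is an unaligned nucleotide string; gap characters are removed.
    `frame` is the reading-frame offset (0, 1, or 2).
    """
    seq = seq.upper().replace("-", "")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [i for i, codon in enumerate(codons) if codon in STOP_CODONS]


def best_frame(seq):
    """Return (frame, stops) for the reading frame with the fewest stops.

    Ties are resolved in favour of the lowest frame offset.
    """
    frames = ((f, stop_codon_positions(seq, f)) for f in range(3))
    return min(frames, key=lambda item: len(item[1]))
```

A fragment of a functional gene is expected to have at least one frame for which `best_frame` returns an empty list of stops; a fragment with stops in all three frames would be a candidate pseudogene.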
Non-protein-coding sequences were aligned with MAFFT v.6.0, as implemented online (accessed 2014 Feb 4), with the Q-INS-i strategy, which considers the secondary structure of RNA, and with the L-INS-i strategy with default parameters (e.g. Gap Opening penalty = 1.53). Resulting alignments were edited manually within MacClade v.4.06. Regions for which homology could not be confidently established were identified with GBlocks v.0.91b and excluded from the phylogenetic analyses. The following GBlocks parameters were used: “Allowed Gap Positions” = half; “Minimum Length Of A Block” = 5 or 10; and “Maximum Number Of Contiguous Nonconserved Positions” = 4 or 8. Alignments showing included and excluded positions are available with the original publication. The initial dataset included Helleria brevicornis as the outgroup. High divergences between H. brevicornis and Tylos, however, rendered many positions in the two ribosomal genes unusable. To increase the number of usable positions in the two ribosomal genes and reduce noise due to substitution saturation, we subsequently generated a dataset in which H. brevicornis was removed, and the above MAFFT and GBlocks procedures were repeated (see details about rooting of this dataset in the Results section). [...] Phylogenetic analyses were conducted with the sequences of the four loci concatenated into a single dataset. We used jModelTest v0.1.1 to determine the most appropriate model of DNA substitution among 88 candidate models on a fixed BioNJ-JC tree, under the Akaike Information Criterion (AIC), the corrected AIC (AICc), and the Bayesian Information Criterion (BIC). In the corresponding Maximum Likelihood (ML) and Bayesian analyses, we used the closest more complex model (based on the BIC) available in each program (models are listed in the original publication), except that when jModelTest selected both a proportion of invariable sites (I) and a Gamma distribution of rates among sites (G), we excluded parameter I to avoid problems related to dependency between the two parameters (see the RAxML manual).
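The three model-selection criteria named above have standard definitions, sketched here for reference: AIC = 2k − 2 ln L, AICc = AIC + 2k(k + 1)/(n − k − 1), and BIC = k ln n − 2 ln L, where k is the number of free parameters and n the sample size. The sketch assumes that n is the alignment length, and the function names are hypothetical; it also illustrates the rule of dropping I when both I and G are selected. It is not jModelTest's code.

```python
import math


def information_criteria(log_likelihood, n_params, n_sites):
    """Return AIC, AICc and BIC for one fitted substitution model.

    Assumption: the sample size is the number of alignment sites.
    Requires n_sites > n_params + 1 for AICc to be defined.
    """
    aic = 2 * n_params - 2 * log_likelihood
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_sites - n_params - 1)
    bic = n_params * math.log(n_sites) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


def drop_invariant_sites(model_name):
    """Remove the +I term when the model also has +G (e.g. GTR+I+G -> GTR+G)."""
    parts = model_name.split("+")
    if "I" in parts and "G" in parts:
        parts.remove("I")
    return "+".join(parts)


def best_model_by_bic(fits, n_sites):
    """Pick the lowest-BIC model from {name: (log_likelihood, n_params)}."""
    def bic(name):
        log_likelihood, n_params = fits[name]
        return information_criteria(log_likelihood, n_params, n_sites)["BIC"]
    return min(fits, key=bic)
```

For example, `drop_invariant_sites(best_model_by_bic(fits, n_sites))` would return the model name to carry into the ML or Bayesian analysis, given a dictionary `fits` of per-model log-likelihoods and parameter counts taken from the model-selection output.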
In addition, to assess the robustness of the results to the substitution model, we also used the complex model GTR+G. The following two data partitioning schemes were implemented: (a) all positions within a single partition; and (b) the best partitioning scheme according to the BIC implemented in PartitionFinder v.1.0. The following parameters were used in PartitionFinder: branch lengths = linked; models = mrbayes; model selection = BIC; search = greedy; and a priori partitioning by a combination of each gene and codon position.
For the ML analyses, three approaches were employed: (a) RAxML v.8.0.7 (“GTRGAMMA” model; standard bootstrap search); (b) GARLI v.2.0.1 implemented in the CIPRES server, which uses genetic algorithms for the ML search; and (c) PhyML v.3.1 (search = SPR & NNI). Clade support within ML analyses was examined by: (a) the approximate Likelihood Ratio Test (aLRT) using the Shimodaira-Hasegawa-like (SH-like) procedure, as implemented in PhyML; and (b) non-parametric bootstrap analyses (100–1000 replicates) in all three ML programs, summarized as 50% majority-rule consensus trees computed with the SumTrees script (v.3.3.1) implemented in DendroPy v.3.10.1.
For the Bayesian analyses, two programs were used. The first was MrBayes v.3.2.2, but such analyses have been reported to return biased clade posterior probabilities in certain cases (e.g. the “star-tree paradox”). Therefore, we also applied two of the proposed strategies to alleviate such biases: the polytomy prior as implemented in Phycas v.1.2.0; and a Gamma prior on the tree length as implemented in MrBayes v.3.2.2.
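The idea behind a 50% majority-rule summary of bootstrap replicates can be shown with a small sketch. It is not the SumTrees or DendroPy API: it assumes each replicate tree has already been reduced to a collection of rooted clades (sets of taxon names), and the function names and toy taxa are hypothetical.

```python
from collections import Counter


def clade_support(trees):
    """Return {clade: frequency} across replicate trees.

    `trees` is a list of trees; each tree is an iterable of clades,
    and each clade is an iterable of taxon names.
    """
    counts = Counter()
    for tree in trees:
        # A set guards against counting a clade twice within one tree.
        counts.update({frozenset(clade) for clade in tree})
    n_trees = len(trees)
    return {clade: count / n_trees for clade, count in counts.items()}


def majority_rule(trees, threshold=0.5):
    """Keep clades found in strictly more than `threshold` of the trees."""
    support = clade_support(trees)
    return {clade: freq for clade, freq in support.items() if freq > threshold}


# Toy example with four hypothetical replicates over taxa A-D.
replicates = [
    [("A", "B"), ("C", "D")],
    [("A", "B"), ("C", "D")],
    [("A", "B"), ("A", "B", "C")],
    [("A", "C"), ("B", "D")],
]
consensus = majority_rule(replicates)
```

Requiring a frequency strictly above 0.5 guarantees that all retained clades are mutually compatible, which is why a clade present in exactly half of the replicates is left out of the consensus.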
The following criteria were used to evaluate convergence and adequate sampling of the posterior distribution: (a) stable posterior probability values; (b) a high correlation between the split frequencies of independent runs as implemented in AWTY; (c) a small and stable average standard deviation of the split frequencies of independent runs; (d) a Potential Scale Reduction Factor (PSRF) close to 1; and (e) an Effective Sample Size (ESS) >200 for the posterior probabilities and parameters, as evaluated in Tracer v.1.5. Tree samples obtained before the posterior distribution reached stationarity were discarded as “burnin”, and the remaining samples were used to generate majority-rule consensus trees with SumTrees (note: the tree summary function of Phycas was not used, as it returned incorrect clade posterior probabilities).
Pairwise genetic distances with Kimura-2-parameter (K2P) correction were estimated with PAUP* v.4.0b10 for the four concatenated mitochondrial genes and for the COI gene separately; missing/ambiguous positions were removed for each pairwise sequence comparison. […]
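The K2P distance with pairwise removal of missing or ambiguous positions can be sketched as follows. This is an illustration of the standard formula, d = −½ ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of transitions and transversions among the compared sites; it is not PAUP*'s implementation, and the function name is hypothetical. Only positions where both sequences carry an unambiguous A, C, G, or T are compared.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
UNAMBIGUOUS = PURINES | PYRIMIDINES


def k2p_distance(seq1, seq2):
    """Kimura-2-parameter distance between two aligned sequences.

    Positions with a gap, missing data, or an ambiguity code in either
    sequence are skipped for this pair only (pairwise deletion).
    Raises ValueError if no sites can be compared or if the divergence
    is too high for the correction to be defined.
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in UNAMBIGUOUS or b not in UNAMBIGUOUS:
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Because sites are dropped per pair rather than across the whole alignment, each pairwise distance may be based on a different number of positions.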

Pipeline specifications