Computational protocol: Decrypting the Hebeloma crustuliniforme complex: European species of Hebeloma section Denudata subsection Denudata (Agaricales)

Protocol publication

[…] Sequence data were obtained for five different DNA regions: ITS, RPB2, MCM7 (a DNA replication licensing factor) and the variable regions V6 and V9 of the mitochondrial SSU rDNA. Not all data could be obtained for all collections; for some, mostly older, collections, no or only partial ITS sequences could be obtained. Sequences were submitted to GenBank with the accession numbers KM390027–KM390104, KM390107–KM390759, KM390763–KM390775 (newly obtained for this study) and AY312982, JN943848–JN943881, KF309396–KF309406 and KF309426–KF309498.
Details of DNA extraction, PCR and sequencing primers have been provided earlier. Raw sequence data were edited in Sequencher (v. 4.9, Gene Codes Corporation, Ann Arbor, MI, USA). Ambiguous base calls were regularly encountered in sequences from nuclear ribosomal and protein-coding loci. Length-deviant ITS copies within the same amplicon were treated as described in . In these cases an attempt was made to separate the two constituent sequences, presumably representing different nuclei. Sequences with more than one indel were treated under the assumption that the two most likely constituent sequences were the two most similar ones, i.e. those minimizing the number of assumed base exchanges. For analyses of concatenated alignments, the intragenomic consensus with the fewest ambiguous positions was used.
Sequence alignments were done in MAFFT v. 7 as implemented on http://mafft.cbrc.jp/alignment/software/, using the FFT-NS-i option for coding genes and the ITS, and the E-INS-i option for the variable mitochondrial SSU regions. Gap recoding following Simmons & Ochoterena (2000) was done using FastGap v. 1.2 for the V6 and V9 sequence alignments. PartitionFinder in combination with RAxML (v. 7.2.8-alpha) was used to determine the most efficient partitioning scheme for protein-coding data and concatenated alignments.
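The handling of heterokaryotic data described above (merging two constituent sequences into one consensus and preferring the consensus with the fewest ambiguous positions) can be illustrated with a minimal Python sketch. This is not the authors' procedure, which was carried out manually in Sequencher; the function names `consensus`, `count_ambiguous` and `least_ambiguous` are hypothetical, and the rule that a gap opposite a base is written as `N` is an assumption made only for this example.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES = {code: frozenset(bases) for code, bases in IUPAC.items()}
CODE = {bases: code for code, bases in BASES.items()}


def consensus(seq_a, seq_b):
    """Merge two aligned, equal-length sequences into one IUPAC consensus."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            # Assumption for this sketch: a shared gap stays a gap,
            # a gap opposite a base becomes fully ambiguous (N).
            out.append("-" if a == b else "N")
        else:
            # Union of the two base sets, written back as one IUPAC code.
            out.append(CODE[BASES[a] | BASES[b]])
    return "".join(out)


def count_ambiguous(seq):
    """Count positions that are neither A, C, G, T nor a gap."""
    return sum(1 for ch in seq.upper() if ch not in "ACGT-")


def least_ambiguous(candidates):
    """Return the candidate sequence with the fewest ambiguous positions."""
    return min(candidates, key=count_ambiguous)
```

For example, `consensus("ACGT", "GCGC")` merges A/G into `R` and T/C into `Y`, and `least_ambiguous` then selects among alternative consensus sequences for the same collection and locus.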
Concatenation of alignments was done in SequenceMatrix, using only one sequence per collection and locus, i.e. the consensus sequence in case of heterokaryotic data. Prior to the concatenation of different datasets, their compatibility was tested following the principle of , assuming a conflict to be significant if two different relationships for the same set of taxa, one being monophyletic and the other non-monophyletic, are supported by bootstrap with more than 70 % in ML analyses.
ML analyses for the compatibility test were done with RAxML v. 7.2.8-alpha on a local computer or RAxML-HPC BlackBox (v. 7.6.3) through the CIPRES Science Gateway. Maximum likelihood searches for tree building were carried out locally with 100 replicates using the GTRGamma model, selecting the best solution for each analysis. Fast Bootstrap searches were done locally or on the CIPRES server, with 1 000 replicates. Trees were visualized using FigTree v. 1.4.0. The assignment of collections and sequences to species follows morphology.
Distance values of ITS sequences were calculated in Mesquite (v. 2.75, http://mesquiteproject.org) as ‘uncorrected p’ distances based on ambiguity differences, discounting gaps, on the same alignment that was also used for concatenation, considering the spacer regions and the 5.8S rRNA (650 bp). […]
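As an illustration of the distance measure named above, the following Python sketch computes an uncorrected p-distance between two aligned sequences while skipping gapped positions. It is not Mesquite's implementation: the function name `p_distance` is hypothetical, and the treatment of ambiguity codes (two codes count as different only if their base sets do not overlap) is an assumption that may differ from Mesquite's own rule for ambiguity differences.

```python
# IUPAC nucleotide codes mapped to the set of bases they stand for.
IUPAC_SETS = {
    "A": set("A"), "C": set("C"), "G": set("G"), "T": set("T"),
    "R": set("AG"), "Y": set("CT"), "S": set("CG"), "W": set("AT"),
    "K": set("GT"), "M": set("AC"), "B": set("CGT"), "D": set("AGT"),
    "H": set("ACT"), "V": set("ACG"), "N": set("ACGT"),
}
GAP_OR_MISSING = set("-?")


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: differing sites / compared sites.

    Positions with a gap or missing data in either sequence are not
    compared (gaps are discounted).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_OR_MISSING or b in GAP_OR_MISSING:
            continue
        compared += 1
        # Assumption: overlapping ambiguity codes are treated as compatible.
        if not (IUPAC_SETS[a] & IUPAC_SETS[b]):
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared
```

With this rule, `p_distance("ACGT", "ACGA")` counts one difference over four compared sites, whereas `p_distance("AC-T", "ACGT")` compares only three sites and finds no difference.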

Pipeline specifications