Computational protocol: Evaluation of the DNA Barcodes in Dendrobium (Orchidaceae) from Mainland Asia

Protocol publication

[…] Sequences for each region were aligned with Clustal X v1.8.7 and adjusted manually in BioEdit v7.1.3.0. For ITS, after alignment with Clustal X, we adjusted the regions flanking the 5.8S rDNA (ITS1 and ITS2) according to the parsimony principle. The sequence character-based method was applied to the aligned matrix of each barcode using the ‘polymorphic sites’ function of DnaSP5. Pairwise genetic distances were computed with the K2P model in MEGA5.
Differences between intra- and inter-specific distances for each pair of the five single barcodes were compared with Wilcoxon signed-rank tests in IBM SPSS Statistics v19.0. Barcoding gaps, which compare the distributions of pairwise intra- and inter-specific distances for each candidate barcode at 0.005 distance intervals, were estimated with the ‘pairwise summary’ function of TaxonDNA. To test the accuracy of the barcode regions for species identification, the proportion of correct identifications was calculated with the ‘Best match’, ‘Best close match’ and ‘All species barcodes’ functions of TaxonDNA. To further evaluate the candidate barcodes, we conducted a tree-based analysis to determine whether each species was recovered as monophyletic for each barcode. Phylogenetic trees were estimated with the neighbor-joining (NJ) method in MEGA5, and node support was assessed by a bootstrap test with 1000 pseudo-replicates under the K2P distance option. Liparis kumokiri was used as the outgroup for the tree-based analysis, following the procedure described by Xiang et al. (2013).
Singh et al. indicated that the species identification success rate changes with the number of samples. To predict the relationship between the number of species sampled and the species identification success rate more accurately, gradient evaluation was used.
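The K2P distances and the intra- versus inter-specific comparison described above can be illustrated with a minimal Python sketch. It is not the MEGA5 or TaxonDNA implementation: the function names `k2p_distance` and `intra_inter_distances` are hypothetical, and skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumption, since the protocol does not state how such sites were treated.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def intra_inter_distances(records):
    """Split all pairwise K2P distances into intra- and inter-specific lists.

    records: list of (species_name, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            d = k2p_distance(records[i][1], records[j][1])
            (intra if records[i][0] == records[j][0] else inter).append(d)
    return intra, inter
```

A barcoding gap in this sense would appear as inter-specific distances that are mostly larger than the intra-specific ones; binning both lists at 0.005 intervals gives a histogram comparable in spirit to the ‘pairwise summary’ output.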
Gradient evaluation samples species at a series of sizes (gradients) and evaluates, for each gradient, the species identification success of ITS+matK with the tree-based (NJ) method. Based on the sample sizes of previous studies (Table S3), we chose eight species gradients: 5, 17, 36, 52, 60, 70, 80, and 91 species.
Our primary results indicated that ITS+matK had the highest species identification success rate. To test the universality of ITS+matK as a DNA barcode for species identification in large flowering plant genera, we searched Google Scholar and Web of Science for recent literature on DNA barcoding. Four large plant genera were found (Tables S4-S7): Paphiopedilum (approximately 80 species), Ficus (approximately 500 species), Pedicularis (approximately 600 species) and Lysimachia (approximately 200 species). We evaluated the effectiveness of ITS+matK for species identification in these genera by calculating genetic distances, constructing NJ trees and conducting analyses in TaxonDNA, and then compared the results with the core barcode proposed by the previous study. […]
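The gradient evaluation described above can be sketched as a subsampling loop. This is only an illustration under stated assumptions: the protocol scores success by species monophyly on NJ trees, whereas this sketch substitutes a simpler nearest-neighbour criterion (a species counts as identified when the closest sequences to each of its sequences are all conspecific, so singletons count as failures). Random selection of species at each gradient, the uncorrected p-distance, and the names `p_distance`, `species_success_rate` and `gradient_evaluation` are likewise assumptions, not part of the original method.

```python
import random


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites (gaps/ambiguities skipped)."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def species_success_rate(records):
    """Fraction of species identified by the nearest-neighbour stand-in.

    records: list of (species_name, aligned_sequence) tuples.
    """
    by_species = {}
    for idx, (name, _) in enumerate(records):
        by_species.setdefault(name, []).append(idx)
    identified = 0
    for name, indices in by_species.items():
        ok = len(indices) > 1  # a singleton has no conspecific to match
        for i in indices:
            if not ok:
                break
            dists = [
                (p_distance(records[i][1], records[j][1]), records[j][0])
                for j in range(len(records))
                if j != i
            ]
            best = min(d for d, _ in dists)
            # Ties with another species count as a failed identification.
            ok = all(other == name for d, other in dists if d == best)
        identified += ok
    return identified / len(by_species)


def gradient_evaluation(records, gradients, seed=0):
    """Success rate for each gradient (number of species sampled)."""
    rng = random.Random(seed)
    species = sorted({name for name, _ in records})
    results = {}
    for k in gradients:
        if k > len(species):
            raise ValueError(f"gradient {k} exceeds {len(species)} species")
        chosen = set(rng.sample(species, k))
        subset = [r for r in records if r[0] in chosen]
        results[k] = species_success_rate(subset)
    return results
```

With a real ITS+matK alignment, a call such as `gradient_evaluation(records, [5, 17, 36, 52, 60, 70, 80, 91])` would return one success rate per gradient, which can then be plotted against the number of species sampled.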

Pipeline specifications

Software tools Clustal W, BioEdit, DnaSP, MEGA, SPSS
Applications Miscellaneous, Phylogenetics