Computational protocol: Illumina sequencing-based community analysis of bacteria associated with different bryophytes collected from Tibet, China

Protocol publication

[…] Paired-end reads were merged into longer sequences by identifying their overlap with the FLASH software []. Singleton sequences were removed and low-quality sequences were filtered out using QIIME (version 1.17) []. Reads that could not be assembled were also discarded. The reads were sorted by barcode sequence and sample source, and the number of sequences in each sample was counted. Sequences covering the V6-V8 region of the bacterial 16S rRNA gene were clustered into operational taxonomic units (OTUs) at 97% sequence similarity using UPARSE (version 7.1, http://drive5.com/uparse/) [], and chimeric sequences were identified and removed using UCHIME [].
To assess alpha diversity, species richness (Chao), species coverage (Coverage) and species diversity (Shannon-Wiener index and Simpson’s diversity index) were calculated, and rarefaction analyses were performed, using Mothur (version 1.30.1) []. The Chao estimator was used to estimate bacterial richness []. The Shannon and Simpson indexes were used to estimate the biodiversity of the bacterial communities. These alpha diversity indexes were compared between sample groups using the Two Independent Samples nonparametric tests in SPSS version 16.0 for Windows (SPSS Inc., Chicago, IL).
We assigned taxonomy to each 16S rRNA gene sequence with the RDP Classifier [] (http://rdp.cme.msu.edu/) against the Silva 16S rRNA database using a confidence threshold of 70% []. Community structure was analyzed at the phylum and genus levels. Heatmaps were generated from the relative abundance of phyla and genera, respectively, using R (version 2.15; The R Project for Statistical Computing, http://www.R-project.org).
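The alpha diversity indexes named above (Chao, Coverage, Shannon, Simpson) can be illustrated with a minimal Python sketch. It is not the Mothur implementation used in the study; it assumes the commonly cited textbook forms (bias-corrected Chao1, Good's coverage, natural-log Shannon, and Simpson's index computed without replacement), and the function names and the example OTU counts are hypothetical.

```python
from collections import Counter
from math import log


def _positive(counts):
    """Keep only OTUs that were actually observed (count > 0)."""
    kept = [c for c in counts if c > 0]
    if not kept:
        raise ValueError("at least one OTU must have a non-zero count")
    return kept


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + n1*(n1-1) / (2*(n2+1))."""
    c = _positive(counts)
    freq = Counter(c)
    n1, n2 = freq[1], freq[2]  # numbers of singleton and doubleton OTUs
    return len(c) + n1 * (n1 - 1) / (2 * (n2 + 1))


def good_coverage(counts):
    """Good's coverage: 1 - (singleton OTUs / total sequences)."""
    c = _positive(counts)
    return 1 - c.count(1) / sum(c)


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i)."""
    c = _positive(counts)
    n = sum(c)
    return -sum((x / n) * log(x / n) for x in c)


def simpson(counts):
    """Simpson index D = sum(n_i*(n_i-1)) / (N*(N-1)); lower means more diverse."""
    c = _positive(counts)
    n = sum(c)
    if n < 2:
        raise ValueError("Simpson's index needs at least two sequences")
    return sum(x * (x - 1) for x in c) / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical OTU counts for one sample (not data from the study).
    otu_counts = [10, 5, 3, 1, 1, 2]
    for name, fn in [("Chao1", chao1), ("Coverage", good_coverage),
                     ("Shannon", shannon), ("Simpson", simpson)]:
        print(name, round(fn(otu_counts), 4))
```

Because every function takes a plain list of per-OTU sequence counts, the same calls can be repeated per sample before comparing groups with a nonparametric test.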
For phylogeny-based cluster comparisons, the composition of the microbial communities in the samples of the ten liverworts, the ten mosses and all twenty bryophytes was analyzed based on the Bray-Curtis distance, and principal coordinate analysis (PCoA) plots were generated. Bacterial taxa specific to each of the two groups of bryophytes (identified from the PCoA plots) were detected using the linear discriminant analysis (LDA) effect size (LEfSe) method (http://huttenhower.sph.harvard.edu/lefse/) for biomarker discovery, which emphasizes both statistical significance and biological relevance. With a normalized relative abundance matrix, LEfSe uses the Kruskal-Wallis rank sum test to detect features with significantly different abundances between assigned taxa and performs LDA to estimate the effect size of each feature. A significance alpha of 0.05 and an effect size threshold of 4 were used for all biomarkers discussed in this study. All tests for significance were two-sided, and p values below 0.05 were considered statistically significant. […]
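Two building blocks of the comparison described above can be sketched in plain Python: the Bray-Curtis distance that feeds the PCoA, and the Kruskal-Wallis rank sum test that LEfSe applies as its first screening step. This is only an illustration under stated assumptions, not the LEfSe software or the authors' scripts: the function names and sample values are hypothetical, the test is restricted to two groups (so the chi-square p value has one degree of freedom), and the PCoA ordination and the LDA effect size are not reproduced.

```python
from math import erfc, sqrt


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(samples):
    """Pairwise Bray-Curtis matrix for a {sample_name: abundances} mapping."""
    names = sorted(samples)
    matrix = [[bray_curtis(samples[a], samples[b]) for b in names] for a in names]
    return names, matrix


def kruskal_wallis_two_groups(x, y):
    """Tie-corrected Kruskal-Wallis H and its chi-square p value (1 d.o.f.)."""
    if not x or not y:
        raise ValueError("both groups need at least one observation")
    pooled = sorted((value, g) for g, grp in enumerate((x, y)) for value in grp)
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = (12 / (n * (n + 1))
         * (rank_sums[0] ** 2 / len(x) + rank_sums[1] ** 2 / len(y))
         - 3 * (n + 1))
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:  # all values identical: no evidence of a difference
        return 0.0, 1.0
    h = max(h / correction, 0.0)  # guard against tiny negative rounding error
    return h, erfc(sqrt(h / 2))   # survival function of chi-square, 1 d.o.f.


if __name__ == "__main__":
    # Hypothetical relative abundances of three taxa in three samples.
    samples = {"moss_1": [0.5, 0.3, 0.2],
               "moss_2": [0.4, 0.4, 0.2],
               "liverwort_1": [0.1, 0.2, 0.7]}
    print(distance_matrix(samples))
    # Hypothetical abundances of one taxon in the two groups.
    print(kruskal_wallis_two_groups([0.10, 0.12, 0.15], [0.30, 0.28, 0.35]))
```

In a fuller workflow the distance matrix would be passed to an ordination routine to obtain PCoA coordinates, and only taxa passing the rank sum test at alpha 0.05 would go on to the effect size estimation.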

Pipeline specifications