Computational protocol: Distinct soil bacterial communities along a small-scale elevational gradient in alpine tundra

Protocol publication

[…] Sequences obtained by pyrosequencing were processed and analyzed with Mothur v.1.27.0, following the standard operating procedure described on the website. Denoising was performed with the shhh.flows command, the Mothur implementation of the PyroNoise component of the AmpliconNoise suite. Barcode and primer sequences were removed; in the same step, sequences shorter than 200 bp or containing homopolymers longer than 8 bp were discarded. The sequences were then aligned against the SILVA-compatible alignment database and trimmed so that subsequent analyses were constrained to the same portion of the 16S rRNA gene. Chimeric sequences were detected with the chimera.uchime command, which uses the sequences as their own reference for de novo detection, and the identified chimeras were removed. The remaining reads were preclustered with the pre.cluster command to remove erroneous sequences derived from sequencing errors and then clustered using Mothur’s average-neighbor algorithm. Each OTU (clustered at 97% sequence similarity) was assigned a taxonomy by classifying the alignments against the Silva reference bacterial taxonomy files using the classify command, with an 80% Bayesian bootstrap cutoff and 1000 iterations. Sequences were deposited in the MG-RAST metagenomics analysis server and are publicly available (accession numbers 4565119.3 to 4565142.3).

For community-level composition and each calculated metric, we accounted for differences in sampling effort among samples by randomly subsampling 4,900 sequences per sample. This rarefaction depth was set by the sample that yielded the fewest sequences after quality filtering (Supplementary Table). [...] The number of phylotypes (the number of OTUs) was used to estimate community richness.
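The random subsampling and richness count described above can be illustrated with a minimal Python sketch. It is not the Mothur procedure the authors ran: the function names `rarefy` and `richness`, the dictionary layout, and the toy read counts are assumptions made for illustration. Only the depth of 4,900 reads comes from the text.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Randomly subsample `depth` reads without replacement from one sample.

    `otu_counts` maps an OTU label to its read count (hypothetical layout).
    """
    total = sum(otu_counts.values())
    if total < depth:
        raise ValueError(f"sample has {total} reads, fewer than depth {depth}")
    # Expand the counts into one label per read, then draw without replacement.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))


def richness(otu_counts):
    """Number of OTUs represented by at least one read."""
    return sum(1 for n in otu_counts.values() if n > 0)


# Toy, made-up read counts; only the depth of 4,900 is taken from the text.
sample = {"OTU1": 6000, "OTU2": 1500, "OTU3": 40, "OTU4": 2}
sub = rarefy(sample, depth=4900)
print(sum(sub.values()), richness(sub))
```

Rare OTUs such as the hypothetical `OTU4` may drop out of the subsample, which is why richness is compared only after all samples have been brought to the same depth.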
We chose the phylogenetic diversity index (calculated as the sum of branch lengths between root and tips for a community) to estimate phylogenetic community diversity.

To determine whether samples from different elevations formed distinct, phylogenetically related clusters, principal coordinates analysis (PCoA) of the UniFrac distance matrices was performed. The UniFrac algorithm computes the overall phylogenetic distance (across all taxonomically resolved levels) between every pair of sample communities in the dataset from neighbor-joining trees, using either unweighted (i.e., presence/absence) or weighted (i.e., accounting for taxon relative abundance) data. In addition, we tested for significant differences in community composition among elevations using analysis of similarities (ANOSIM) in R. Canonical correspondence analysis (CCA) was performed to visualize the relationship between environmental factors and bacterial distributions. To further identify the environmental and biogeochemical factors that correlated significantly with community composition, we used Mantel tests on Bray–Curtis distances calculated from the presence/absence of OTUs within each sample, using the vegan package in R v.3.1.1.

For the phylogenetic community structure, we calculated the mean nearest taxon distance (MNTD) of all species pairs occurring in a community, based on the observed community dataset. MNTD is an estimate of the mean phylogenetic relatedness between each OTU in a bacterial community and its nearest relative. To infer the underlying ecological processes from MNTD, the phylogenetic signal in habitat association was tested using Mantel correlograms, with 999 randomizations for significance tests. An environmental optimum for each OTU was found for each environmental variable as in .
Between-OTU differences in environmental optima were calculated as Euclidean distances using the optima for all environmental variables. We further calculated the difference in phylogenetic distance between the observed communities and randomly generated null communities, and standardized it by the standard deviation of the phylogenetic distances in 1000 null communities. These null communities were generated under the assumption that all species present along the elevational gradient are equally able to colonize any elevation, without dispersal limitation at local spatial scales, so that each species has the same expected prevalence. The total species richness at each elevation was held constant, and the species at each elevation were drawn randomly without replacement from the pool of species present along the gradient. The resulting standardized effect size (ses.MNTD) can be used to test for phylogenetic clustering or overdispersion. Negative ses.MNTD values and low quantiles (P < 0.05) indicate that co-occurring species are more closely related than expected by chance (clustering), whereas positive values and high quantiles (P > 0.95) indicate that co-occurring species are less closely related than expected by chance (overdispersion). These analyses were implemented in R with the package Picante 1.6-2.

To correlate the observed biodiversity patterns with the environmental variables, we used multiple ordinary least squares (OLS) regression. Beforehand, strongly correlated variables were dereplicated: when the Pearson correlation between two variables exceeded 0.7, only one of them was retained, usually the most ecologically relevant of the significantly correlated variables. All environmental variables and biodiversity metrics were standardized to a mean of 0 and an SD of 1. Akaike’s information criterion was used to identify the most parsimonious model.
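The two preprocessing steps before the regression, dropping one variable from each strongly correlated pair and standardizing the rest, can be sketched in Python as follows. This is an illustration rather than the authors' R code. The names `drop_collinear`, `zscore` and `pearson`, the `priority` list (a stand-in for the authors' ecological judgment about which variable to keep), the use of the absolute correlation, the sample (n - 1) standard deviation, and all measurements are assumptions.

```python
import math


def zscore(values):
    """Standardize to mean 0 and SD 1 (sample SD, n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_collinear(variables, priority, threshold=0.7):
    """Walk `priority` in order and keep a variable only if its |r| with
    every variable already kept does not exceed `threshold`."""
    kept = []
    for name in priority:
        if all(abs(pearson(variables[name], variables[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Toy, made-up measurements for five plots.
env = {
    "pH": [5.1, 5.4, 5.9, 6.3, 6.8],
    "total_C": [12.0, 10.5, 9.1, 7.2, 6.0],  # strongly correlated with pH
    "moisture": [30.0, 42.0, 28.0, 45.0, 33.0],
}
kept = drop_collinear(env, priority=["pH", "total_C", "moisture"])
scaled = {name: zscore(env[name]) for name in kept}
print(kept)
```

Because every retained predictor ends up on the same scale, the fitted regression coefficients can be compared directly with one another.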
The regression analyses were performed in the R environment with the package MASS 7.3–33. […]
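To make the ses.MNTD calculation described earlier in this section concrete, here is a minimal Python sketch of MNTD and its standardized effect size under a richness-preserving null model. It is not the Picante implementation. The function names, the toy distance matrix (two hypothetical clades of four OTUs each), and the quantile definition (the fraction of null values below the observed value, which may differ in detail from Picante's rank-based P value) are assumptions. The 1000 null communities and the draw without replacement from the species pool follow the text.

```python
import random
import statistics


def mntd(dist, members):
    """Mean distance from each member OTU to its nearest co-occurring OTU."""
    nearest = [min(dist[i][j] for j in members if j != i) for i in members]
    return sum(nearest) / len(nearest)


def ses_mntd(dist, members, pool, n_null=1000, seed=0):
    """Standardized effect size of MNTD against random communities with the
    same richness, drawn without replacement from `pool`."""
    rng = random.Random(seed)
    observed = mntd(dist, members)
    null = [mntd(dist, rng.sample(pool, len(members))) for _ in range(n_null)]
    ses = (observed - statistics.mean(null)) / statistics.stdev(null)
    # Simplified quantile: share of null values strictly below the observed one.
    quantile = sum(v < observed for v in null) / n_null
    return ses, quantile


# Toy phylogenetic distances: OTUs 0-3 form one clade, OTUs 4-7 another.
n = 8
dist = [[0.0 if i == j else (0.1 if (i < 4) == (j < 4) else 1.0)
         for j in range(n)] for i in range(n)]
pool = list(range(n))
community = [0, 1, 2]  # all members from the same clade

ses, quantile = ses_mntd(dist, community, pool)
print(ses, quantile)
```

In this toy case the community contains only close relatives, so its observed MNTD sits at the low end of the null distribution and the effect size is negative, the pattern the text interprets as phylogenetic clustering.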

Pipeline specifications