Computational protocol: Large Blooms of Bacillales (Firmicutes) Underlie the Response to Wetting of Cyanobacterial Biocrusts at Various Stages of Maturity

Protocol publication

[…] Pairs of forward and reverse reads were aligned and merged using the usearch (v7.0.1090) fastq_mergepairs command with -fastq_maxdiffs set to 3. The merged reads were quality filtered with the usearch fastq_filter command, with -fastq_trunclen set to 250 bp and -fastq_maxee set to 0.1. Reads from all samples (rRNA gene and rRNA) were pooled in a single FASTA file, and singletons were removed using the usearch sortbysize command (minsize=2). The resulting sequences were used for OTU clustering with the UPARSE pipeline, setting the OTU cutoff threshold to 97%. Chimeric sequences were filtered with UCHIME (usearch -uchime_ref) using the ChimeraSlayer reference database downloaded from [...] OTU abundances across individual samples were calculated by mapping chimera-filtered OTUs against the quality-filtered reads using the usearch usearch_global command (-strand plus -id 0.97). [...]

We used SILVA reference files (release 123) available from mothur for taxonomic classification and phylogeny inference. Taxonomy was assigned with a naive Bayes classifier (classify.seqs command in mothur) trained with SILVA full-length sequences and taxonomic references. The exception was cyanobacteria, for which manual taxonomic classification was performed following the most recent classification system outlined by Komarek et al., as follows. The 16S rRNA V4 sequences and the cyanobacterial representative OTU sequences from the present study were placed into a phylogenetic tree. For each cyanobacterial OTU, the taxonomic name of the closest genome relative with at least 97% sequence similarity was assigned.

For phylogeny inference, the region bounded by the primers was first determined using mothur (v.1.37.0), and the SILVA SEED alignment was sliced accordingly (start=10264, end=25298). Representative sequences for each OTU were aligned to the sliced alignment with PyNAST using default parameters. The alignment was filtered with a QIIME script using default parameters.
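
The quality-filtering step in the read processing above rests on the expected number of errors in a read, which is the sum of the per-base error probabilities implied by the Phred scores. The following is a minimal sketch of that calculation, not the usearch implementation. It assumes Phred+33 quality encoding, and the function names `expected_errors` and `truncate_and_filter` are hypothetical.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities, p = 10^(-Q/10).

    Assumes Phred+33 encoded quality strings (an assumption, not stated
    in the protocol).
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def truncate_and_filter(seq: str, qual: str, trunclen: int = 250, maxee: float = 0.1):
    """Truncate a read to `trunclen` bases, then apply the expected-error cutoff.

    Returns the truncated (seq, qual) pair, or None if the read is shorter
    than `trunclen` or its expected errors exceed `maxee`.
    """
    if len(seq) < trunclen:
        return None
    seq, qual = seq[:trunclen], qual[:trunclen]
    if expected_errors(qual) > maxee:
        return None
    return seq, qual


# Toy reads: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
good = truncate_and_filter("A" * 260, "I" * 260)   # 250 bases at Q40 -> EE = 0.025
bad = truncate_and_filter("A" * 260, "5" * 260)    # 250 bases at Q20 -> EE = 2.5
short = truncate_and_filter("A" * 100, "I" * 100)  # shorter than 250 bp
```

With a cutoff of 0.1 expected errors over 250 bases, a read has to average better than roughly Q34 to pass, which makes this a stringent filter.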
Phylogeny was inferred with FastTree 2. [...]

Sample metadata, the OTU table, the phylogenetic tree, taxonomic assignments, and representative OTU sequences were imported into R as a phyloseq object. All downstream analyses were conducted in R and plotted with ggplot2. Tables S1 to S6 are available at the GitHub repository at [...]

We used the abundance-weighted mean pairwise distances (“mpd” in Picante) to calculate the net relatedness index (NRI) and nearest taxon index (NTI) for microbial communities at each crust maturity level in each replicate subplot pre- and postwetting. NRI and NTI are standardized metrics of phylogenetic relatedness that describe whether an observed community is a phylogenetically biased subset of the taxa that could coexist in the source pool. NRI is calculated from the mean pairwise distance (MPD) between all OTUs in an observed sample, as measured by branch lengths, compared to random draws from the OTU pool. NTI is calculated similarly for the nearest relatives based on the terminal branch lengths and as such is much more sensitive to uncertainty in terminal-level tree resolution. Positive values of NRI/NTI are indicative of phylogenetic clustering, while negative values are indicative of phylogenetic evenness (overdispersion). Many null models, each specifying how random draws of communities from the taxon pool are performed, can be used in calculating NRI/NTI, and the choice of null model for significance testing affects type I and type II error rates. In this study, we used the phylogeny shuffle model (“taxa.labels” in the Picante package in R), which shuffles taxon labels across the phylogeny while keeping the phylogenetic relationships intact, hence fixing the total abundance of taxa within and across communities, the occurrence frequencies of taxa, taxon alpha and beta diversity, and patterns of spatial contagion of taxa. [...]
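
To make the NRI calculation concrete, the sketch below computes an abundance-weighted MPD for one community, builds a null distribution by shuffling taxon labels across the pool, and reports NRI as the negative standardized effect size. It is an illustration under stated assumptions, not the Picante code used in the study. Pairs are weighted by the product of their abundances over distinct taxa only, which may differ from Picante's handling. The six-taxon pool, its distances, and the function names `weighted_mpd` and `nri` are hypothetical.

```python
import random
import statistics
from itertools import combinations


def weighted_mpd(abund, dist):
    """Abundance-weighted mean pairwise distance among taxa present in a sample.

    abund: dict taxon -> abundance; dist: dict frozenset({a, b}) -> distance.
    Each pair is weighted by the product of the two abundances (assumption).
    """
    taxa = [t for t, a in abund.items() if a > 0]
    num = den = 0.0
    for a, b in combinations(taxa, 2):
        w = abund[a] * abund[b]
        num += w * dist[frozenset((a, b))]
        den += w
    return num / den if den else float("nan")


def nri(abund, dist, pool, runs=199, seed=0):
    """NRI = -(MPD_obs - mean(MPD_null)) / sd(MPD_null).

    The null model shuffles taxon labels across the whole pool while the
    distance structure stays fixed.
    """
    rng = random.Random(seed)
    obs = weighted_mpd(abund, dist)
    null = []
    for _ in range(runs):
        shuffled = pool[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(pool, shuffled))
        null.append(weighted_mpd({relabel[t]: a for t, a in abund.items()}, dist))
    return -(obs - statistics.mean(null)) / statistics.stdev(null)


# Hypothetical pool: two clades of three taxa, close within and distant between.
clade = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
pool = sorted(clade)
dist = {frozenset((a, b)): (0.1 if clade[a] == clade[b] else 1.0)
        for a, b in combinations(pool, 2)}

clustered = nri({"A": 5, "B": 3, "C": 2}, dist, pool)  # all taxa from one clade
dispersed = nri({"A": 4, "D": 4}, dist, pool)          # one taxon from each clade
```

In this toy pool the single-clade community yields a positive NRI (clustering) and the two-clade community a negative one (overdispersion), matching the sign convention described above.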
For wetting-responsive OTUs, we repeated the phylogeny estimations within their respective families using a more accurate but slower algorithm (RAxML). For each noncyanobacterial wetting-responsive OTU, 16S rRNA sequences from the associated family were downloaded from the NCBI taxonomy database and aligned together with the OTU representative sequence using MUSCLE with default parameters. Alignments were trimmed to cover the V4 region. The phylogeny of OTUs was inferred by RAxML using the GTRGAMMA model with 500 bootstrap replicates. RAxML was called as follows: raxmlHPC-PTHREADS-SSE3 -# 500 -m GTRGAMMA -p 777 -x 2000 -f an -s inputalignment.V4 -n outputtree -T 4. [...]

Rare OTUs are prone to cause artifacts in network analysis. To avoid spurious correlations, we first removed OTUs with a maximum relative abundance below 0.5% of the total number of reads across all samples. Co-occurrence metrics were estimated using SparCC based on rRNA gene relative abundances in dry and wet (25.5 h after wetting) samples. OTU pairs whose SparCC correlation had an absolute value of ≥0.3 were considered to exhibit a co-occurrence relationship. Co-occurrence patterns were visualized using Cytoscape as an undirected graph in which each OTU was represented by a node and each co-occurrence by an edge. […]
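
The two filtering rules in the network analysis above (dropping rare OTUs, then keeping only pairs with |correlation| ≥ 0.3) can be sketched as follows. This is a simplified illustration, not the SparCC or Cytoscape workflow itself. It assumes the correlations have already been computed elsewhere, and it reads the rarity rule as "maximum relative abundance in any single sample below 0.5%", which is one interpretation of the text. All counts, correlation values, and function names are hypothetical.

```python
def abundant_otus(counts, min_rel=0.005):
    """Return OTUs whose relative abundance reaches `min_rel` in at least one sample.

    counts: dict sample -> dict OTU -> read count.
    """
    keep = set()
    for otu_counts in counts.values():
        total = sum(otu_counts.values())
        for otu, n in otu_counts.items():
            if total and n / total >= min_rel:
                keep.add(otu)
    return keep


def cooccurrence_edges(corr, keep, threshold=0.3):
    """Build an undirected edge list from precomputed pairwise correlations.

    corr: dict (otu_a, otu_b) -> correlation. An edge is kept when both OTUs
    passed the abundance filter and |correlation| >= threshold.
    """
    return sorted(
        (a, b, r)
        for (a, b), r in corr.items()
        if a in keep and b in keep and abs(r) >= threshold
    )


# Hypothetical read counts for one dry and one wet sample.
counts = {
    "dry": {"otu1": 600, "otu2": 398, "otu3": 2},
    "wet": {"otu1": 300, "otu2": 699, "otu3": 1},
}
# Placeholder correlations standing in for SparCC output.
corr = {("otu1", "otu2"): -0.45, ("otu1", "otu3"): 0.80, ("otu2", "otu3"): 0.10}

keep = abundant_otus(counts)
edges = cooccurrence_edges(corr, keep)
```

Here `otu3` never reaches 0.5% of the reads in either sample, so its strong placeholder correlation with `otu1` does not produce an edge. An edge list of this shape can be written to a delimited text file and imported into a network viewer.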
