Computational protocol: Benthic Algal Community Structures and Their Response to Geographic Distance and Environmental Variables in the Qinghai-Tibetan Lakes With Different Salinity

Protocol publication

[…] The raw 23S rRNA gene sequences were processed with a pipeline coupling USEARCH and QIIME. Paired reads were joined with FLASH (fast length adjustment of short reads) using default settings, and forward and reverse primers were removed from the joined reads. The remaining reads were then de-multiplexed and quality filtered in QIIME v1.9.0 with the split_libraries_fastq.py script. Briefly, reads with more than three consecutive low-quality bases (Phred quality score <30) were removed, reads containing ambiguous bases were discarded, and reads in which consecutive high-quality bases made up less than 75% of the total read length were also discarded. Chimera checking was performed de novo with the UCHIME module in USEARCH. Singletons and reads shorter than 200 bases were discarded, and operational taxonomic units (OTUs) were defined at the 97% similarity cutoff using the UCLUST algorithm. OTU representative sequences were then selected, and their taxonomy was assigned against the SILVA 128 LSU database using parallel_assign_taxonomy_blast.py in QIIME with default settings (sequence similarity >90% and BLAST expectation value (E-value) <10⁻³). Sequences that could not be assigned to Cyanobacteria or eukaryotic algae were removed. To validate these taxonomic assignments, OTU representative sequences were also BLASTed locally against the NCBI database. The BLAST results are provided in a Supplementary Table. The final OTU table was rarefied to an equal number of sequences per sample (n = 8843), with rarefaction repeated 1000 times, and alpha diversity was then calculated at the 97% identity level in QIIME. The alpha diversity indices calculated were Simpson, Shannon, Equitability and Chao1.
All environmental variables in this study were normalized to values ranging between 1 and 100 as described previously.
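The rarefaction and alpha-diversity step described above can be illustrated with a minimal sketch in plain Python. It is not the QIIME implementation used in the study: the function names, the toy sample, the base-2 logarithm in the Shannon index, the 1 - sum(p²) form of the Simpson index and the bias-corrected form of Chao1 are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from an {OTU: count} mapping."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("rarefaction depth exceeds the sample's read count")
    return Counter(rng.sample(pool, depth))


def shannon(counts):
    """Shannon index; log base 2 is an assumption of this sketch."""
    total = sum(counts.values())
    return -sum(n / total * math.log2(n / total) for n in counts.values() if n)


def simpson(counts):
    """Simpson index expressed as 1 - sum(p_i^2) (assumed form)."""
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def equitability(counts):
    """Shannon index divided by its maximum for the observed richness."""
    richness = sum(1 for n in counts.values() if n)
    return shannon(counts) / math.log2(richness) if richness > 1 else 0.0


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for n in counts.values() if n)
    singles = sum(1 for n in counts.values() if n == 1)
    doubles = sum(1 for n in counts.values() if n == 2)
    return observed + singles * (singles - 1) / (2 * (doubles + 1))


def mean_alpha(counts, depth, iterations, seed=0):
    """Average each index over repeated rarefactions of one sample."""
    rng = random.Random(seed)
    indices = {"simpson": simpson, "shannon": shannon,
               "equitability": equitability, "chao1": chao1}
    sums = dict.fromkeys(indices, 0.0)
    for _ in range(iterations):
        sub = rarefy(counts, depth, rng)
        for name, fn in indices.items():
            sums[name] += fn(sub)
    return {name: value / iterations for name, value in sums.items()}


# Hypothetical sample; the study used depth 8843 and 1000 iterations.
sample = {"otu_a": 50, "otu_b": 30, "otu_c": 15, "otu_d": 4, "otu_e": 1}
print(mean_alpha(sample, depth=40, iterations=100))
```

Averaging the indices over many subsamples, rather than relying on a single draw, reduces the noise introduced by random subsampling of rare OTUs.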
Non-metric multidimensional scaling (NMDS) ordination with 500 random starts was performed to depict the differences in algal community composition among lakes, based on the Bray-Curtis dissimilarity, using the R package “vegan.” Cluster analysis was performed on the Bray-Curtis dissimilarity among samples using PAST software. Simple Mantel tests were performed with the “vegan” package to assess the Spearman’s correlations between algal community composition and geographic distance or environmental variables. Geographic distances among sampling sites were calculated from the GPS location of each site using the Euclidean method in PAST software (Supplementary Table). Canonical correspondence analysis (CCA) was also performed to explore the relationships between algal communities and environmental and spatial variables. Before the CCA, a set of spatial variables was generated by principal coordinates of neighbor matrices (PCNM) analysis from the longitude and latitude coordinates of the sampling sites. Subsequently, a forward selection procedure was used to select environmental and spatial variables through the ‘ordiR2step’ function in the R package “vegan.” Only significant (p < 0.05) environmental and spatial variables are shown in the CCA ordination.
To discern the difference between benthic and planktonic algal community composition in lakes, planktonic algal 23S rRNA gene sequences were collected from two published studies. To avoid any bias resulting from different primers, only 23S rRNA gene sequences derived from the same primer set (p23SrV_f1 and p23SrV_r1) and the same PCR protocol were included in this analysis. Sequences were processed according to the procedures described above.
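As an illustration of the simple Mantel test described above, the following plain-Python sketch correlates a Bray-Curtis community dissimilarity matrix with a Euclidean distance matrix using Spearman's rank correlation and a permutation test. It is a sketch under stated assumptions, not the “vegan” implementation: all function names, the toy data, the number of permutations and the one-sided p-value formula are assumptions.

```python
import math
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def euclidean(p, q):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(rows, metric):
    """Full symmetric distance matrix for a list of observations."""
    n = len(rows)
    return [[metric(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def upper(matrix, order=None):
    """Upper-triangle values, optionally after permuting rows and columns."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0


def mantel_spearman(d1, d2, permutations=999, seed=0):
    """Spearman Mantel statistic and a one-sided permutation p-value."""
    rng = random.Random(seed)
    x = ranks(upper(d1))
    r_obs = pearson(x, ranks(upper(d2)))
    hits = 0
    for _ in range(permutations):
        perm = list(range(len(d2)))
        rng.shuffle(perm)  # permute the sites of the second matrix
        if pearson(x, ranks(upper(d2, perm))) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical sites: coordinates and OTU abundance vectors.
coords = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0), (3.0, 1.0), (4.0, 0.5)]
otus = [[40, 10, 0], [30, 15, 5], [20, 20, 10], [10, 20, 20], [0, 15, 35]]
geo = distance_matrix(coords, euclidean)
com = distance_matrix(otus, bray_curtis)
print(mantel_spearman(com, geo, permutations=199))
```

Permuting rows and columns together, rather than shuffling individual matrix entries, preserves the dependence among distances that share a site, which is why the Mantel test is used in place of an ordinary correlation test.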
NMDS ordination with 500 random starts was conducted to discern the difference between benthic (this study) and planktonic (previous studies) algal community composition in lakes, based on the Bray-Curtis dissimilarity. In addition, the representative sequences of dominant cyanobacterial OTUs (average relative abundance >0.1%) were BLASTed against the 23S rRNA gene sequences available in GenBank, and their closest reference sequences were retrieved for constructing a phylogenetic tree. All OTU representative sequences were aligned with their references using Clustal W as implemented in the BioEdit program. A maximum-likelihood tree was constructed from the representative cyanobacterial 23S rRNA sequences and their references using MEGA 6.0. […]
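The selection of dominant OTUs by average relative abundance (>0.1%) mentioned above could be sketched as follows. The table layout, the function name and the toy counts are hypothetical, and the sketch omits the taxonomic filter to Cyanobacteria that the study applied.

```python
def dominant_otus(otu_table, threshold=0.001):
    """Return {OTU: mean relative abundance} for OTUs above `threshold`.

    `otu_table` maps sample name -> {OTU: read count}; 0.001 equals 0.1%.
    """
    all_otus = sorted({otu for counts in otu_table.values() for otu in counts})
    n_samples = len(otu_table)
    means = {}
    for otu in all_otus:
        total_rel = 0.0
        for counts in otu_table.values():
            depth = sum(counts.values())
            # Relative abundance of this OTU within one sample.
            total_rel += counts.get(otu, 0) / depth
        means[otu] = total_rel / n_samples
    return {otu: m for otu, m in means.items() if m > threshold}


# Hypothetical two-sample table with 10,000 reads per sample.
table = {
    "lake_1": {"otu_1": 9000, "otu_2": 995, "otu_3": 5},
    "lake_2": {"otu_1": 8000, "otu_2": 1990, "otu_3": 10},
}
print(sorted(dominant_otus(table)))
```

In this toy table, otu_3 averages 0.075% across the two samples and is therefore excluded, while otu_1 and otu_2 are retained.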

Pipeline specifications

Software tools USEARCH, QIIME, FLASH, UCHIME, UCLUST, Clustal W, BioEdit
Applications Phylogenetics, 16S rRNA-seq analysis
Chemicals Carbon, Nitrogen, Phosphorus