Computational protocol: Impacts of chemical gradients on microbial community structure

Protocol publication

[…] For 16S ribosomal RNA (rRNA) gene tag sequencing, primers S-D-Arch-0519-a-S-15 and S-D-Bact-0785-b-A-18, covering the V4 region, were used and extended with multiplex identifiers (MIDs) of 7 to 10 bp in length for multiplexing. PCR was performed using 25–125 ng template DNA and the Phusion High-Fidelity PCR Master Mix (Finnzymes, Thermo Fisher Scientific, Waltham, MA, USA) with the following protocol: 30 s initial denaturation (98 °C); 30 cycles of 10 s denaturation (98 °C), 30 s primer annealing (66 °C) and 12 s extension (72 °C); and a final extension step at 72 °C for 10 min. The amplicons (approximately 300 bp in length) were purified and pooled, an end-repair step was performed, adaptors for PGM sequencing were ligated, and the resulting Ion Torrent libraries were sequenced on the PGM platform using 400 bp read chemistry. A total of 14 samples taken at different time points were sequenced, yielding 729 581 reads. After demultiplexing and filtering with Trimmomatic (quality >20) and mothur (bdiffs=1, pdiffs=2, minlength=250, maxlength=350, maxambig=0, maxhomop=8), 349 170 reads remained. These were classified against the assembled 16S rRNA gene sequences, one for each bin, with USEARCH at 92% and 97% identity thresholds. The number of operational taxonomic units (OTUs) was determined for each sample and each clade by clustering with USEARCH at 97% identity.
The RNA samples were fixed immediately with RNAlater (Ambion, Austin, TX, USA). Two-milliliter cultures were used to extract DNA or RNA. The isolated DNA and RNA were sequenced with the Ion Personal Genome Machine (PGM) System (Thermo Fisher Scientific, San Francisco, CA, USA) on 316 and 318 chips, respectively. The combined reads of all three DNA samples were assembled from the sff files generated by the Torrent Suite software 2.0.1 with the GS De Novo Assembler 2.6 (454 Life Sciences, Branford, CT, USA; default settings for genomic DNA). To recover 16S rRNA genes, an additional assembly was performed.
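The length, ambiguity and homopolymer thresholds quoted for the amplicon reads can be illustrated with a minimal Python sketch. This is not mothur's implementation and is not a substitute for it; the function names are hypothetical, any non-ACGT character is assumed to count as ambiguous, and the barcode and primer mismatch settings (bdiffs, pdiffs) are not modelled.

```python
import re


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of a single repeated character."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def passes_filters(seq: str, minlength: int = 250, maxlength: int = 350,
                   maxambig: int = 0, maxhomop: int = 8) -> bool:
    """Apply the read-level thresholds quoted in the text to one read.

    Assumption: every character other than A, C, G or T is ambiguous.
    """
    seq = seq.upper()
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return (minlength <= len(seq) <= maxlength
            and ambiguous <= maxambig
            and longest_homopolymer(seq) <= maxhomop)


if __name__ == "__main__":
    read = "ACGT" * 70  # a 280 bp toy read, not real data
    print(passes_filters(read))
```

Defaults mirror the values stated in the protocol, so a caller only needs to pass the read sequence.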
Briefly, the complementary DNA option was selected during sff file loading, the ‘Minimum overlap identity’ was set to 99%, and ‘Extend low depth overlaps’ as well as ‘Reads limited to one contig’ were selected. Contigs were binned on the basis of tetranucleotide frequencies with MetaWatt using its most relaxed settings, as previously described. Binning yielded four bins, representing three of the five abundant clades (for one of the clades, two bins were detected). Detection and phylogenetic analysis of 16S rRNA genes were performed as previously described. Abundances of all clades were estimated on the basis of sequenced DNA and RNA by mapping sequencing reads to the assembled contigs with BBMap (k=13, minid=0.6). For the two clades that assembled poorly (clades E and F), genomes of closely related bacteria were used as the template for mapping (D. salexigens, GenBank CP001649 and BioProject PRJNA246767/Bin A). The assembled contigs were annotated separately for each bin with Prokka. Transcriptional per-gene activities were computed for all predicted open reading frames (ORFs) from the mapped transcriptomes by dividing (number of reads mapped to ORF/ORF length) by (total number of reads mapped to all ORFs in bin/total length of all ORFs in bin). This way, a transcriptional activity of 1.0 corresponded to the average transcriptional activity of the bin. For Bin D (Vibrionales), less conserved regions of some key genes were assembled incompletely. For those genes, gene completeness was validated independently of assembly by performing blastx of all reads (both transcriptomes and metagenomes) against the genome of a related reference organism (Vibrio tubiashii).
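The per-gene transcriptional activity defined above is a ratio of two read densities, which a short Python sketch may make concrete. The function name and the input layout (a dictionary of ORF identifier to mapped read count and ORF length for a single bin) are assumptions made for illustration, and the numbers in the usage example are invented.

```python
def transcriptional_activities(orfs: dict) -> dict:
    """Compute relative transcriptional activity for each ORF in one bin.

    orfs maps an ORF identifier to (mapped_reads, orf_length_bp).
    Activity = (reads on ORF / ORF length)
               / (reads on all ORFs in bin / total length of all ORFs in bin).
    """
    total_reads = sum(reads for reads, _ in orfs.values())
    total_length = sum(length for _, length in orfs.values())
    if total_reads == 0 or total_length == 0:
        raise ValueError("bin has no mapped reads or no ORF length")
    bin_density = total_reads / total_length
    return {orf_id: (reads / length) / bin_density
            for orf_id, (reads, length) in orfs.items()}


if __name__ == "__main__":
    # Invented toy bin with two ORFs of equal length.
    toy_bin = {"orf_a": (100, 1000), "orf_b": (300, 1000)}
    print(transcriptional_activities(toy_bin))
```

In the toy bin, the two ORFs receive activities of 0.5 and 1.5, so a value of 1.0 sits at the bin-wide read density, as the text describes.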
RNA was extracted from the culture at two time points during tidal cycling (RNA1 at 0.3 h, at the peak of the oxygen profile, and RNA2 at 2 h, at the base of the oxygen profile) and from anoxic (RNA3) and oxic (RNA4) incubations of filtered cells shown in Figure 4, at 0.0 mm and 0.3 mm O2, respectively.
All sequencing reads generated in this study, including amplicon libraries and transcriptomes, as well as the assembled contigs, are available at the NCBI (PRJNA255238).
Proteomic analyses of two technical replicates were performed as previously described. Population abundances were estimated from the proteomic results by counting unambiguously identified peptides for each bin.
Sequencing reads from metagenomes and transcriptomes from the Chilean oxygen minimum zone and the Janssand tidal flat (Sequence Read Archive SRP021900) were translated in six frames and scanned for key functional genes with hmmscan (version 3.0), using hidden Markov models constructed from aligned reference amino acid sequences, one for each gene (pflAB, bd-I, bd-II, the heme-copper oxidase superfamily, dsrAB, nirS and rpoBC), with an e-value cut-off of 1e−10. Read counts for each functional gene were normalized against counts for rpoBC. Gene abundances were inferred from metagenomes, gene activities from transcriptomes. […]
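The counting and normalization step for functional genes can be sketched in Python as follows. The input format (a list of gene name and e-value pairs, one per read hit) and both function names are hypothetical, the cut-off is treated as inclusive by assumption, and parsing of real hmmscan output is not shown.

```python
from collections import Counter


def count_hits(hits, max_evalue: float = 1e-10) -> Counter:
    """Count read hits per gene, keeping hits at or below the e-value cut-off.

    hits is an iterable of (gene_name, evalue) pairs (hypothetical format).
    """
    return Counter(gene for gene, evalue in hits if evalue <= max_evalue)


def normalize_to_reference(counts, reference: str = "rpoBC") -> dict:
    """Divide each gene's read count by the count of the reference gene."""
    ref_count = counts.get(reference, 0)
    if ref_count <= 0:
        raise ValueError(f"no reads counted for reference gene {reference}")
    return {gene: n / ref_count for gene, n in counts.items() if gene != reference}


if __name__ == "__main__":
    # Invented hits for illustration only.
    toy_hits = [("rpoBC", 1e-30), ("rpoBC", 1e-12), ("nirS", 1e-20),
                ("nirS", 1e-5), ("dsrAB", 1e-11)]
    print(normalize_to_reference(count_hits(toy_hits)))
```

Applied to a metagenome, the resulting ratios would correspond to relative gene abundances; applied to a transcriptome, to relative gene activities.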

Pipeline specifications