Computational protocol: Microbial analysis of Zetaproteobacteria and co-colonizers of iron mats in the Troll Wall Vent Field, Arctic Mid-Ocean Ridge

Protocol publication

[…] Raw sequence data were processed with MOTHUR (version 1.33.2) using a pipeline based on the publicly available 454 SOP as accessed on 11/11/2015 (http://www.mothur.org/wiki/454_SOP) [,]. Filtering of sequences was performed with AmpliconNoise as implemented by MOTHUR (‘shhh.flows’ command). After removal of barcodes and primer sequences, all reads were merged and aligned to a SILVA reference alignment database (silva.nr_v119.align). The default Needleman-Wunsch alignment algorithm was applied with k-mer template searching, scoring +1 per match and -1, -2 and -1 for each mismatch, gap opening and gap extension, respectively. The alignment was cropped to a minimum length of 210 bp and further optimized by selecting the start and end positions at which 90% of the sequences started or ended. Sequences with homopolymers longer than 6 bp were removed. Further filtering and preclustering of sequencing data was carried out as described in []. Chimeras were removed from the dataset using default settings of the UCHIME program [] as implemented in MOTHUR. More detailed information on the filtering of reads is given in the […].

Classification of reads and clustering into operational taxonomic units (OTUs) at 97% sequence similarity was carried out as described in []. A total of 5573 OTUs were identified, of which 1877 remained after removal of singletons and doubletons. Rarefaction plots were generated using the MOTHUR command ‘rarefaction.single’. In addition, richness [] and diversity (inverse Simpson) indices were generated with the ‘summary.single’ command.

Non-metric multi-dimensional scaling (NMDS) plots were generated using the Vegan package in R [] and were based on Bray-Curtis distances, obtained using the average neighbour clustering algorithm.
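The OTU filtering, diversity index and distance measure named above can be illustrated with a minimal Python sketch. It is not the MOTHUR or Vegan implementation. The function names, the toy OTU table and the reading of "singletons and doubletons" as OTUs with a total of one or two reads across all samples are assumptions made for illustration. The inverse Simpson index is written in its finite-sample form, D = Σ n_i(n_i − 1) / (N(N − 1)), which is assumed here to match what ‘summary.single’ reports.

```python
def drop_rare_otus(counts, min_total=3):
    """Keep OTUs whose total read count across all samples is >= min_total.

    counts maps an OTU name to a list of per-sample read counts.
    min_total=3 drops singletons (1 read) and doubletons (2 reads);
    treating these as dataset-wide totals is an assumption.
    """
    return {otu: row for otu, row in counts.items() if sum(row) >= min_total}


def inverse_simpson(abundances):
    """Inverse Simpson index 1/D, with D = sum(n_i*(n_i-1)) / (N*(N-1)).

    Undefined (division by zero) if every OTU has a single read.
    """
    n_total = sum(abundances)
    d = sum(n * (n - 1) for n in abundances) / (n_total * (n_total - 1))
    return 1.0 / d


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples' count vectors."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))


# Hypothetical toy table: four OTUs counted in two samples.
otu_table = {
    "Otu1": [10, 0],
    "Otu2": [5, 5],
    "Otu3": [1, 0],  # singleton, dropped
    "Otu4": [1, 1],  # doubleton, dropped
}
filtered = drop_rare_otus(otu_table)
sample_a = [row[0] for row in filtered.values()]  # [10, 5]
sample_b = [row[1] for row in filtered.values()]  # [0, 5]

invsimpson_a = inverse_simpson(sample_a)            # 210 / 110, about 1.909
dissimilarity = bray_curtis(sample_a, sample_b)     # 0.5
```

A pairwise matrix of such Bray-Curtis values is the kind of input an NMDS ordination works from; the ordination itself is not sketched here.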
Analysis of similarities (ANOSIM) [] was used to assess the significance of differences between microbial mats from the rift valley and the rift margin. The R test statistic provides a measure of the separation of community structures: R = 0 designates no separation, R < 0.25 barely separable, R > 0.5 separated but overlapping, and R > 0.75 well-separated community structures [].

Heatmaps showing the compositional profile of the different iron mat communities were generated in R with the ‘heatmap.2’ function of the gplots package version 2.11.0.1 []. Samples (columns) were clustered hierarchically by complete linkage with a Euclidean distance measure, and OTUs (rows) were clustered phylogenetically by incorporating a phylogenetic tree from MEGA version 5.2.2. This neighbour-joining tree, comprising the 50 main OTUs, was built with default settings of the incorporated maximum composite likelihood algorithm, using default ClustalW parameters (gap opening penalty = 15 and gap extension penalty = 6.66).

To contribute to the effort of mapping the global distribution of Zetaproteobacteria, all Zetaproteobacteria reads from the Troll Wall Vent Field (TWVF) were assigned to predefined OTUs (ZetaOtus) using the curated ZetaHunter application (https://github.com/mooreryan/ZetaHunter, last accessed 20/06/2017) []. ZetaHunter uses the SILVA v123 phylogenetic reference database to assign query sequences to ZetaOtus [, , –]. For comparison, all MAR sff-files provided by [] were reanalysed using MOTHUR and ZetaHunter as described above. Due to differences in the targeted 16S rRNA gene region, we were not able to directly compare de novo ZetaOtus constructed for TWVF with those constructed from the other MAR samples. […]
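To make the ANOSIM R statistic described earlier in this section concrete, the sketch below computes it from a distance matrix in plain Python. It is an illustration of the standard definition, R = (mean between-group rank − mean within-group rank) / (M/2) with M = n(n − 1)/2 sample pairs, and is not the implementation used in the study. The function names, the label-permutation p-value and the four-sample distance matrix are hypothetical.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..len(values); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R from a symmetric distance matrix and one group label per sample."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2.0)


def anosim_p(dist, groups, permutations=999, seed=0):
    """Permutation p-value: share of label shufflings with R >= observed R."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        if anosim_r(dist, labels) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


# Hypothetical distances: two rift valley and two rift margin samples.
groups = ["valley", "valley", "margin", "margin"]
dist = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.70, 0.85],
    [0.80, 0.70, 0.00, 0.20],
    [0.90, 0.85, 0.20, 0.00],
]
r_value = anosim_r(dist, groups)  # 1.0: every between-group distance outranks every within-group one
p_value = anosim_p(dist, groups)
```

With only four samples the permutation p-value cannot become small, because few distinct label arrangements exist; the toy data only demonstrate the arithmetic.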

Pipeline specifications

Software tools: mothur, UCHIME, gplots, Clustal W, ZetaHunter
Applications: Phylogenetics, 16S rRNA-seq analysis
Chemicals: Iron