[…] For each genomic DNA sample, independent triplicates were extracted as templates to amplify the amoA gene, using the Arch-amoAF and Arch-amoAR primers and following the PCR protocols previously described (). For comparative purposes, cDNA from samples collected at 200 m at stations 1, 5, and 7 (i.e., Stn1-200 m, Stn5-200 m, Stn7-200 m) was also amplified. To enable sample multiplexing during sequencing, barcodes were incorporated between the adapter and the forward primer. Nuclease-free water was used as the negative control in each reaction. Triplicate PCRs were performed for each sample, and the amplicons were pooled and subsequently purified with the illustra™ GFX™ PCR DNA and Gel Band Purification kit (GE Healthcare, Little Chalfont, Bucks, United Kingdom). An amplicon library was constructed with equimolar concentrations of the amplicons, and emPCR was conducted according to the Rapid Library preparation kit instructions (Roche, Basel, Switzerland). DNA beads were deposited onto the PicoTiterPlate and sequenced with a GS Junior system (Roche).
The amoA sequences generated in this study were processed using the microbial ecology community software program Mothur (). The sequences were de-noised and the barcode and forward primer sequences were removed simultaneously with the shhh.seqs (sigma value = 0.01) and trim.seqs commands, and chimeric sequences were identified with chimera.uchime (). Reads shorter than 400 bp and sequences containing undetermined nucleotides were removed. The remaining sequences were aligned with the amoA DNA sequences from the NCBI nucleotide database, and any sequences that could not be aligned with the previously discovered amoA sequences were removed. The phylogenetic distances between these high-quality sequences were calculated with Mothur (), and operational taxonomic units (OTUs) were generated with 97% DNA sequence similarity as the cutoff value. 
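The length and ambiguity filters described above can be illustrated with a minimal Python sketch. It is not the Mothur implementation used in the study. The function names (`parse_fasta`, `passes_filter`, `filter_reads`) are hypothetical, and the sketch assumes that an "undetermined nucleotide" is any character other than A, C, G, or T.

```python
# Illustrative sketch only: mirrors the two read filters described in the text
# (minimum length and no undetermined nucleotides), not Mothur's own code.

MIN_LENGTH = 400            # reads shorter than this are discarded (from the text)
VALID_BASES = set("ACGT")   # assumption: any other character is "undetermined"


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def passes_filter(sequence, min_length=MIN_LENGTH):
    """Return True if the read is long enough and has no ambiguous bases."""
    return len(sequence) >= min_length and set(sequence) <= VALID_BASES


def filter_reads(records, min_length=MIN_LENGTH):
    """Keep only the (header, sequence) records that pass both filters."""
    return [(h, s) for h, s in records if passes_filter(s, min_length)]


if __name__ == "__main__":
    # Hypothetical reads: one good, one too short, one containing N.
    demo = (
        ">read_ok\n" + "ACGT" * 100 + "\n"
        ">read_short\n" + "ACGT" * 50 + "\n"
        ">read_ambiguous\n" + "ACGN" * 100 + "\n"
    )
    for name, seq in filter_reads(parse_fasta(demo)):
        print(name, len(seq))
```

A read of exactly 400 bp is retained here, because the text removes only reads shorter than 400 bp.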
The OTUs that contained just one sequence were removed. The richness estimator (Chao1), diversity (Shannon–Weaver index, H′), and Good’s coverage were calculated with 97% sequence similarity as the cutoff value. To evaluate the number of shared OTUs among the samples, a Venn diagram was generated with Mothur (), using 97% DNA sequence similarity as the cutoff value. A rarefaction curve was also generated, again with 97% sequence similarity as the cutoff value. OTUs with relative abundances > 0.1% of the whole dataset were regarded as the principal (or top) OTUs and were selected for subsequent analysis. The remaining OTUs were treated as a minor group.
To identify the phylogenetic affiliation of the amoA sequences, representative sequences of the top OTUs were used as queries in nucleotide BLAST (BLASTn) searches against the NCBI nucleotide sequence database. The representative sequences of the top OTUs, the selected reference sequences, and the environmental sequences of the amoA gene from the NCBI database were used to construct a maximum-likelihood (ML) tree using the MEGA 6.0 (Molecular Evolutionary Genetics Analysis) software (). The DNA sequences were codon-aligned, and a model test was conducted to select the best-fit DNA substitution model for construction of the ML tree. Based on the Bayesian Information Criterion, the Tamura 3-parameter model with a discrete Gamma distribution and the assumption that a certain proportion of sites is evolutionarily invariable (T92+G+I) was selected. The ML tree was further edited with iTOL (), with the relative abundances of the top OTUs displayed. To evaluate the number of shared OTUs among samples, the normalized OTU data were also used to generate a Venn diagram with Mothur (). [...] 
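The richness, diversity, and coverage measures named above can be sketched from a list of per-OTU read counts. The code below is an illustration under stated assumptions, not the Mothur calculators themselves. It uses the bias-corrected form of Chao1, the natural logarithm for H′, and Good's coverage defined as 1 minus the fraction of reads in singleton OTUs. The function names and the example counts are hypothetical.

```python
import math


def remove_singletons(otu_counts):
    """Drop OTUs that contain just one sequence (dict of OTU name -> count)."""
    return {otu: n for otu, n in otu_counts.items() if n > 1}


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for n in counts if n > 0)
    f1 = sum(1 for n in counts if n == 1)   # OTUs seen exactly once
    f2 = sum(1 for n in counts if n == 2)   # OTUs seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i); assumes at least one read."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def goods_coverage(counts):
    """Good's coverage = 1 - F1 / N; assumes at least one read."""
    total = sum(counts)
    f1 = sum(1 for n in counts if n == 1)
    return 1 - f1 / total


def top_otus(otu_counts, threshold=0.001):
    """Return OTUs whose share of all reads exceeds the threshold (0.1%)."""
    total = sum(otu_counts.values())
    return {otu: n / total for otu, n in otu_counts.items() if n / total > threshold}


if __name__ == "__main__":
    # Hypothetical counts for one sample.
    counts = {"OTU1": 5, "OTU2": 3, "OTU3": 2, "OTU4": 1, "OTU5": 1}
    values = list(counts.values())
    print("Chao1:", chao1(values))
    print("Shannon H':", round(shannon(values), 4))
    print("Good's coverage:", round(goods_coverage(values), 4))
    print("Without singletons:", remove_singletons(counts))
```

The indices are computed on whatever counts are passed in, so the effect of removing singleton OTUs beforehand can be checked directly by calling the functions on the output of `remove_singletons`.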
To assess the dissimilarity among multiple groups, a Newick-formatted tree was generated using the tree.shared command in Mothur, with the Bray–Curtis calculator used for UPGMA (unweighted pair group method with arithmetic mean) clustering. In addition, Pearson’s correlation coefficients between the environmental variables and both the proportions of the different clusters and the abundances of WCA and WCB genes at the different stations were calculated using the SPSS software package (SPSS, Chicago, IL, United States) after the data were square-root transformed. Values of p < 0.05 and p < 0.01 were considered to indicate different levels of statistical significance. […]
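The two analyses in this paragraph can be sketched in plain Python. The sketch does not reproduce Mothur's tree.shared output or the SPSS procedure. It assumes that UPGMA is run on pairwise Bray–Curtis dissimilarities of per-sample OTU abundance vectors, it emits a topology-only Newick string without branch lengths, and it applies the square-root transform to both variables before computing Pearson's r, which the text does not specify. All function names and input values are hypothetical, and no p-values are computed.

```python
import math
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def upgma_newick(samples):
    """Cluster samples (name -> abundance vector); return a Newick topology."""
    sizes = {name: 1 for name in samples}   # cluster label -> member count
    dist = {frozenset((a, b)): bray_curtis(samples[a], samples[b])
            for a, b in combinations(samples, 2)}
    while len(sizes) > 1:
        a, b = sorted(min(dist, key=dist.get))   # closest pair of clusters
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        # Average linkage: size-weighted mean distance to every other cluster.
        for other in sizes:
            d_a = dist[frozenset((a, other))]
            d_b = dist[frozenset((b, other))]
            dist[frozenset((merged, other))] = (na * d_a + nb * d_b) / (na + nb)
        # Discard distances that involve the two clusters just merged.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        sizes[merged] = na + nb
    return next(iter(sizes)) + ";"


def sqrt_transform(values):
    """Square-root transform a list of non-negative values."""
    return [math.sqrt(v) for v in values]


def pearson_r(x, y):
    """Pearson's correlation coefficient; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical OTU abundance vectors for three samples.
    samples = {"A": [10, 0, 0], "B": [9, 1, 0], "C": [0, 0, 10]}
    print(upgma_newick(samples))

    # Hypothetical environmental variable and cluster proportion.
    env = [1, 4, 9, 16]
    prop = [4, 16, 36, 64]
    print(pearson_r(sqrt_transform(env), sqrt_transform(prop)))
```

Keying each cluster by its Newick label keeps the bookkeeping short, at the cost of omitting branch lengths.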

