Computational protocol: Coastal urbanisation affects microbial communities on a dominant marine holobiont

[…] Raw sequencing data were quality filtered, standardised, classified and then clustered into operational taxonomic units (OTUs) using the sequence analysis software mothur. Briefly, forward and reverse reads (251 bp per read) were first combined into contigs. Sequences that contained ambiguous (N) bases or homopolymer runs longer than 8 bases were filtered out. The remaining sequences were aligned against the Silva 16S rRNA gene database, and sequences that did not align were excluded. Aligned sequences were pre-clustered (diffs = 2) and checked for chimeras using UCHIME. Singleton and doubleton sequence reads were removed from the data set to further reduce noise caused by Illumina sequencing error. The remaining sequence counts were rarefied to 44,225 reads per sample to account for differences in sequencing depth. Sequences were then taxonomically classified against the Silva 16S rRNA gene database with a 60% confidence cut-off and clustered into OTUs at a minimum of 97% sequence identity. Rarefaction curves of the processed sequences were generated to estimate sampling efficiency (Supplementary Fig. ). The high-quality sequences selected from the raw data set yielded 13,978 OTUs clustered at 97% similarity. To focus analyses on the abundant OTUs and reduce the effect of potentially spurious OTUs, those contributing less than 0.01% of relative abundance were removed from the data set, leaving 475 OTUs for further analyses. […]
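The filtering logic described above can be illustrated with a minimal Python sketch. This is not the mothur implementation used in the study; it only mirrors three of the stated rules (rejecting sequences with N bases or homopolymer runs longer than 8, rarefying to a fixed depth, and dropping OTUs below 0.01% relative abundance). All function names and the table layout (a dict of per-sample OTU counts) are hypothetical. The sketch also assumes that relative abundance is computed over the pooled data set, which the text does not state explicitly.

```python
import random
import re
from collections import Counter

MAX_HOMOPOLYMER = 8         # from the text: runs longer than 8 bases are rejected
MIN_REL_ABUNDANCE = 0.0001  # 0.01% expressed as a fraction


def longest_homopolymer(seq):
    """Return the length of the longest run of one repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def passes_quality(seq):
    """Keep a sequence only if it has no N bases and no over-long homopolymer."""
    seq = seq.upper()
    return "N" not in seq and longest_homopolymer(seq) <= MAX_HOMOPOLYMER


def rarefy(counts, depth, seed=0):
    """Subsample one sample's OTU counts, without replacement, to `depth` reads."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return Counter(rng.sample(pool, depth))


def filter_rare_otus(table, min_fraction=MIN_REL_ABUNDANCE):
    """Drop OTUs whose share of all reads is below `min_fraction`.

    `table` maps sample name -> {OTU name -> read count}.
    Assumption: abundance is pooled across samples, not judged per sample.
    """
    totals = Counter()
    for counts in table.values():
        totals.update(counts)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {sample: {} for sample in table}
    keep = {otu for otu, n in totals.items() if n / grand_total >= min_fraction}
    return {
        sample: {otu: n for otu, n in counts.items() if otu in keep}
        for sample, counts in table.items()
    }
```

In the study itself the rarefaction depth was 44,225 reads per sample; the sketch takes the depth as a parameter so it can be tried on small toy tables.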

Pipeline specifications

Software tools: mothur, UCHIME
Application: 16S rRNA-seq analysis
Organisms: Escherichia coli