Computational protocol: Environmental metabarcoding reveals heterogeneous drivers of microbial eukaryote diversity in contrasting estuarine ecosystems

Protocol publication

[…] In all, 104 benthic samples were collected from the Thames (20 sampling stations) and Mersey (15 sampling stations) estuaries (UK) in June–July 2008. In both estuaries, benthic communities were sampled at the low-tide mark, accessed either on foot (Thames) or by boat (Mersey). At each station, three sediment cores were collected ∼10 m apart using Perspex tubes (4.4 cm in diameter, 10 cm deep) for metabarcoding analysis of meiofauna, each being stored in 500 ml of DESS (20% dimethyl sulphoxide and 0.25 M disodium EDTA, saturated with NaCl, pH 8.0). A fourth core sample was collected for granulometric analysis.
In the laboratory, the meiofaunal size fraction and organisms up to 1 mm in size were mechanically separated from the sediment and immobilised on a 45 μm filter, then separated from fine silt by repeated centrifugation in LUDOX TM-40 solution at a specific gravity of 1.16 (Sigma-Aldrich Company Ltd., Gillingham, UK). Following this step, each sample was retained on a distinct mesh sieve that was then folded, sliced, placed in a 15 ml Falcon tube and kept at −80 °C until DNA extraction. After overnight lysis at 55 °C, community DNA was extracted with the QIAamp DNA Blood Maxi Kit (Qiagen, Manchester, UK) according to identical protocols set out in .
The highly conserved metabarcoding primers SSU_F_04 and SSU_R_22 were used because they amplify broadly across meiofaunal organisms (in addition to protists and fungi) and flank the ∼450 bp nSSU gene region that is most variable in meiofaunal taxa. The nSSU gene region was then PCR amplified in triplicate from community DNA using Pfu DNA polymerase (Promega, Southampton, UK) and forward and reverse MID-tagged fusion primers. Amplicons were visualised by gel electrophoresis, purified using the QIAquick Gel Extraction Kit (Qiagen), quantified on an Agilent Bioanalyser 2100 (Agilent Technologies, Stockport, UK) and pooled in equimolar quantities.
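The equimolar pooling step mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the standard approximation of 660 g/mol per base pair of double-stranded DNA, and the function names, sample names, concentrations, amplicon length and target amount are all hypothetical.

```python
# Hypothetical helper for equimolar pooling of purified amplicons.
# Assumption: ~660 g/mol per base pair of double-stranded DNA.
AVG_BP_MASS = 660.0


def nanomolar(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/ul) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def equimolar_volumes(concs_ng_per_ul, length_bp, fmol_per_sample):
    """Return the volume (ul) of each amplicon that contains the same molar amount."""
    volumes = {}
    for name, conc in concs_ng_per_ul.items():
        # 1 nM is numerically 1 fmol/ul, so volume = target fmol / (fmol/ul).
        volumes[name] = fmol_per_sample / nanomolar(conc, length_bp)
    return volumes


if __name__ == "__main__":
    # Made-up concentrations for two amplicon libraries of equal length.
    concs = {"sample_A": 6.6, "sample_B": 13.2}
    for name, vol in equimolar_volumes(concs, length_bp=1000, fmol_per_sample=50).items():
        print(f"{name}: {vol:.2f} ul")
```

Because the more concentrated library contributes proportionally less volume, every sample ends up with the same number of template molecules in the pool, which is the point of pooling by molarity rather than by mass.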
The purified amplicon pools were then sequenced in a single direction (A-Amplicon) on four half plates using the Roche 454 GS FLX (454 Life Sciences, Roche Applied Science, Branford, CT, USA) sequencing platform at Liverpool University's Centre for Genomic Research (Liverpool, UK). All protocols were identical to those presented in .
Raw sequence reads were filtered and denoised using FlowClus (Gaspar and Thomas, submitted; freely available on GitHub at jsh58/FlowClus). The criteria used for the filtering step were: minimum sequence length 150 bp; maximum sequence length 500 bp; truncate reads before the first N; truncate before a window of 25 bp whose average quality score is <20; truncate before a set of four flows whose values are <0.40 (criteria recommended by ). The denoising step corrects pyrosequencing errors by clustering the flowgrams; a constant denoising value of 0.50 was used. The data were then analysed using the QIIME pipeline: (1) chimeras were removed using UCHIME, with the abundance information generated by FlowClus; (2) OTUs were clustered at 96% sequence similarity using UCLUST, as this threshold most closely reproduced species richness in analyses of control nematode communities using nSSU; (3) a representative sequence was picked for each OTU; (4) taxonomy was assigned using the Silva 111 database; and (5) an OTU table was generated. To allow direct ecological comparisons among samples with different coverage (that is, number of reads), the percentage of reads in each sample was used instead of read counts, and downstream analyses focused on meiofauna and the dominant protist groups occupying shallow sediment habitats.
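The sequence-level filtering criteria listed above (length limits, truncation at the first N, and truncation at a low-quality window) can be sketched as follows. This is an illustration only, not FlowClus code: the function name is hypothetical, the flowgram-based criterion is omitted because it needs flow values, and the order of operations and the decision to discard (rather than truncate) reads longer than the maximum are assumptions.

```python
def filter_read(seq, quals, min_len=150, max_len=500, window=25, min_avg_q=20):
    """Return the truncated read, or None if it fails the length limits.

    seq   -- nucleotide string
    quals -- list of per-base quality scores, same length as seq
    """
    end = len(seq)

    # Truncate before the first ambiguous base (N), if any.
    n_pos = seq.find("N")
    if n_pos != -1:
        end = n_pos

    # Truncate before the first window whose mean quality is below the threshold.
    for start in range(0, end - window + 1):
        if sum(quals[start:start + window]) / window < min_avg_q:
            end = start
            break

    # Assumption: reads outside the length limits after truncation are discarded.
    if end < min_len or end > max_len:
        return None
    return seq[:end]


if __name__ == "__main__":
    # A 300 bp read whose quality drops from 30 to 10 after position 200.
    read = "A" * 300
    quals = [30] * 200 + [10] * 100
    kept = filter_read(read, quals)
    print(len(kept) if kept else "discarded")
```

In this toy read the first 25 bp window with a mean quality below 20 starts at position 188, so the read is cut there and still passes the 150 bp minimum.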
Raw sequence reads were additionally analysed using the OCTUPUS pipeline (available at http://octupus.sourceforge.net/), and OTUs were annotated against the downloaded NCBI (National Center for Biotechnology Information) nucleotide database. This analysis was run on both the raw data set and a rarefied data set (1102 randomly picked sequences from each sample). […]
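Rarefying to 1102 sequences per sample amounts to drawing reads at random, without replacement, until every sample has the same depth. The sketch below shows one way to do this on a per-sample table of OTU counts; it is not the OCTUPUS implementation, and the function name, OTU labels, seed and the choice to drop samples with fewer reads than the target depth are assumptions.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth=1102, seed=0):
    """Subsample `depth` reads without replacement from one sample's OTU counts.

    otu_counts -- dict mapping OTU label to read count
    Returns a Counter of subsampled counts, or None if the sample is too shallow.
    """
    total = sum(otu_counts.values())
    if total < depth:
        return None  # assumption: samples below the target depth are dropped

    # Expand the counts into one label per read, then draw without replacement.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(reads, depth))


if __name__ == "__main__":
    # Made-up sample with three OTUs.
    sample = {"OTU_1": 2000, "OTU_2": 500, "OTU_3": 3}
    rarefied = rarefy(sample)
    print(sum(rarefied.values()), dict(rarefied))
```

Fixing the seed makes the subsample reproducible; in practice one might repeat the draw with several seeds to check that conclusions do not hinge on a single random subsample.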

Pipeline specifications

Software tools: FlowClus, QIIME, UCHIME, UCLUST
Application: 16S rRNA-seq analysis