Computational protocol: Mucosal adherent bacterial dysbiosis in patients with colorectal adenomas

[…] Amplicons were extracted from 2% agarose gels, purified using the AxyPrep DNA Gel Extraction Kit (Axygen Biosciences, Union City, CA, USA) according to the manufacturer’s instructions, and quantified using QuantiFluor™-ST (Promega, USA). Purified amplicons were pooled in equimolar amounts. Index PCR and sequencing were performed according to the Illumina MiSeq 16S Metagenomic Sequencing Library Preparation protocol ( preparation.html). Briefly, samples were multiplexed using a dual-index approach with the Nextera XT Index kit (Illumina Inc., San Diego, CA, USA) according to the manufacturer’s instructions. The final library was paired-end sequenced at 2 × 250 bp using a MiSeq Reagent Kit v2 on the Illumina MiSeq platform.
Raw fastq files were demultiplexed and quality-filtered using Trimmomatic and FLASH with the following criteria: (I) the 300-bp reads were truncated at any site receiving an average quality score of <20 over a 50-bp sliding window, and truncated reads shorter than 50 bp were discarded; (II) exact barcode matching was required, up to 2 nucleotide mismatches were allowed in primer matching, and reads containing ambiguous characters were removed; and (III) only sequences that overlapped by more than 10 bp were assembled according to their overlap sequence. Reads that could not be assembled were discarded. [...] The high-quality sequences were assigned to samples according to barcodes. Operational taxonomic units (OTUs) were clustered with a 97% similarity cutoff using UPARSE (version 7.1), and chimeric sequences were identified and removed using UCHIME. The taxonomy of each 16S rRNA gene sequence was analyzed by RDP Classifier against the SILVA 119 16S rRNA database using a confidence threshold of 70%.
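
Criterion (I) above can be illustrated with a minimal Python sketch. It is a simplified reading of the rule, not Trimmomatic's actual implementation: the function name `truncate_read`, its parameters, and the choice to cut at the start of the first failing window are assumptions made for illustration. Quality scores are taken as already-decoded Phred integers.

```python
from typing import List, Optional


def truncate_read(seq: str, quals: List[int], window: int = 50,
                  min_avg_q: float = 20.0, min_len: int = 50) -> Optional[str]:
    """Truncate a read at the first window whose mean quality is below min_avg_q.

    Hypothetical helper: returns the truncated sequence, or None when the
    result is shorter than min_len (the read is discarded).
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    n = len(seq)
    cut = n  # default: keep the whole read
    if n >= window:
        window_sum = sum(quals[:window])
        for start in range(n - window + 1):
            if start > 0:
                # Slide the window one base to the right.
                window_sum += quals[start + window - 1] - quals[start - 1]
            if window_sum / window < min_avg_q:
                cut = start  # assumption: cut at the start of the failing window
                break
    trimmed = seq[:cut]
    return trimmed if len(trimmed) >= min_len else None


if __name__ == "__main__":
    read = "A" * 250
    scores = [30] * 150 + [2] * 100  # quality collapses after base 150
    kept = truncate_read(read, scores)
    print(None if kept is None else len(kept))
```

In practice the trimming is done by Trimmomatic itself; the sketch only makes the window arithmetic explicit.
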
OTUs clustered at 97% similarity were used for alpha diversity estimation, which included diversity (Shannon, Simpson), richness (Chao1), Good’s coverage, and rarefaction curve analysis, using mothur (version 1.30.2). A heat map was constructed using the gplots package of R. PCoA was conducted on the Bray-Curtis distance matrix calculated from the OTU information of each sample. To show differences among samples, a cluster tree was generated with the ape package of R using the average method. LEfSe is a metagenomic analysis approach that uses linear discriminant analysis to estimate the effect size of each differentially abundant taxon or OTU; the cladogram is displayed according to effect size. [...] Metastats and the Mann-Whitney test were performed using R software and Python scripts, respectively. Student’s t-test was performed using SPSS version 20 for Windows. […]
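
As a rough illustration of the indices named above, the following Python sketch computes them from per-sample OTU count vectors. It is not the mothur or R code used in the study. The function names are hypothetical, and the formulas are the standard textbook forms: Shannon with natural log, Simpson as the sum of n_i(n_i − 1) over N(N − 1), bias-corrected Chao1, Good's coverage as 1 − F1/N, and Bray-Curtis as the sum of absolute differences over the sum of totals.

```python
import math
from typing import Dict, Sequence


def alpha_diversity(counts: Sequence[int]) -> Dict[str, float]:
    """Alpha diversity indices for one sample's OTU counts (hypothetical helper)."""
    observed = [c for c in counts if c > 0]
    n = sum(observed)
    if n < 2:
        raise ValueError("need at least two reads in the sample")
    s_obs = len(observed)
    f1 = sum(1 for c in observed if c == 1)  # singleton OTUs
    f2 = sum(1 for c in observed if c == 2)  # doubleton OTUs
    return {
        "shannon": -sum((c / n) * math.log(c / n) for c in observed),
        "simpson": sum(c * (c - 1) for c in observed) / (n * (n - 1)),
        "chao1": s_obs + f1 * (f1 - 1) / (2 * (f2 + 1)),
        "goods_coverage": 1 - f1 / n,
    }


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two samples over the same OTU order."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same OTUs")
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


if __name__ == "__main__":
    sample_a = [10, 5, 2, 1, 1, 0]  # made-up counts for illustration only
    sample_b = [4, 4, 4, 0, 3, 4]
    print(alpha_diversity(sample_a))
    print(bray_curtis(sample_a, sample_b))
```

A pairwise matrix of `bray_curtis` values over all samples is the kind of input a PCoA or an average-linkage cluster tree would take.
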

Pipeline specifications

Software tools Trimmomatic, UPARSE, UCHIME, RDP Classifier, mothur, LEfSe, Metastats
Applications Metagenomic sequencing analysis, 16S rRNA-seq analysis
Organisms Bacteria, Homo sapiens, Firmicutes, Bacteroidetes, Lactococcus