Computational protocol: Promising Prebiotic Candidate Established by Evaluation of Lactitol, Lactulose, Raffinose, and Oligofructose for Maintenance of a Lactobacillus-Dominated Vaginal Microbiota

[…] The V4 variable regions of the 16S rRNA genes were amplified in a PCR mixture containing 50 μl light mineral oil, 1 μl sample DNA, and 10 μl each of barcoded primers at 3.2 pmol/μl, which together were brought to 85°C before adding 20 μl of GoTaq master mix (Promega, Madison, WI). Reactions were primed at 95°C for 3 min, followed by 25 cycles of 95°C for 1 min, 52°C for 1 min, and 72°C for 1 min. Samples, including amplified no-template controls, were subsequently prepared and sequenced at the London Regional Genomics Centre (London, Ontario, Canada). First, DNA was quantified using a Qubit 2.0 fluorometer (Thermo-Fisher), pooled at an equal volume of each sample, and purified using the QIAquick PCR purification kit (Qiagen, Hilden, Germany). Purified amplicons were then paired-end sequenced with 250 cycles on an Illumina MiSeq platform (San Diego, CA) with 5% PhiX. A total of 1.91 × 10^7 reads were obtained, with a median of 115,072 reads/sample.

The protocol for initial processing of reads was adapted from the dada2 workflow by Greg Gloor, using the dada2 and ShortRead packages in R version 3.2.2. Read quality was determined by plotting quality profiles for both the forward and reverse reads. Reads were trimmed to remove primer barcodes and the low-quality ends of forward and reverse reads, which began at lengths of 183 and 174, respectively. Filtering was also performed to remove any reads with unidentified nucleotides or more than 2 expected errors, leaving 1.44 × 10^7 reads with a median of 88,758 reads/sample. Dereplication was performed to summarize individual sequence units (ISUs) by abundance in each sample. Reads derived from PCR or sequencing errors were detected and removed using joint sample inference and error rate estimation on dereplicated ISUs. Pairs of forward and reverse reads were then overlapped and merged into complete sequences. Merged sequences were then summarized by length, and length outliers were removed.
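The read-filtering criteria described above (no unidentified nucleotides, no more than 2 expected errors) can be illustrated with a minimal sketch. It is written in Python rather than R and does not reproduce the dada2 code used in the study. It assumes the standard definition of expected errors as the sum of per-base error probabilities derived from Phred quality scores. The function names and the example reads are hypothetical.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities, p = 10^(-Q/10), for Phred scores Q."""
    return sum(10 ** (-q / 10) for q in quals)


def truncate(seq, quals, trunc_len):
    """Cut a read to trunc_len bases; return None if the read is too short."""
    if len(seq) < trunc_len:
        return None
    return seq[:trunc_len], quals[:trunc_len]


def passes_filter(seq, quals, max_ee=2.0):
    """Keep a read only if it has no ambiguous base and at most max_ee expected errors."""
    if "N" in seq.upper():
        return False
    return expected_errors(quals) <= max_ee


# Hypothetical toy reads: (sequence, Phred scores)
reads = [
    ("ACGTACGT", [38] * 8),   # high quality, kept
    ("ACGNACGT", [38] * 8),   # contains an unidentified nucleotide, dropped
    ("ACGTACGT", [2] * 8),    # very low quality, expected errors > 2, dropped
]
kept = [seq for seq, quals in reads if passes_filter(seq, quals)]
```

In the study itself, truncation lengths of 183 (forward) and 174 (reverse) would correspond to the `trunc_len` argument of the hypothetical `truncate` helper.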
Five 200-bp sequences, one 201-bp sequence, and two 209-bp sequences were identified and removed, leaving only sequences of 238 to 240 bp in length. Following chimeric sequence identification and removal, a final output of 139 unique sequences across all samples remained. These sequences and their abundances were made available online. Taxonomy was assigned to the genus level by comparison of best hits to the Silva rRNA database v123, and to the species level, where possible, according to the Silva species assignment database v123. Species-level taxonomy was designated when a sequence matched the species with 100% identity and there were no other matches above 97% identity. Operational taxonomic units (OTUs) were then created by grouping at the genus level. Only OTUs with an abundance of at least 1% across all samples were included for further analysis, leaving 10 unique genera. To calculate centered log ratios (CLRs) for compositional analysis, zero-value OTU counts were first replaced with an estimated value, and the CLRs were then calculated. Resulting CLRs were graphed in stacked bar plots using R software. […]
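The CLR step can be sketched as follows. This is a minimal illustration in Python, not the authors' R code. The source does not state which estimate replaced the zero counts, so a fixed pseudocount of 0.5 is swapped in here purely as an assumption. The function name and the sample counts are hypothetical.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's count vector.

    Zero counts are replaced by `pseudocount` (an assumed stand-in for the
    unspecified estimate in the protocol) so that the logarithm is defined.
    """
    adjusted = [c if c > 0 else pseudocount for c in counts]
    logs = [math.log(x) for x in adjusted]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]


# Hypothetical genus-level counts for a single sample
sample = [120, 30, 0, 850]
values = clr(sample)
```

Each CLR value is the log of a count relative to the geometric mean of that sample, so the values within a sample sum to zero, and positive values mark genera that are more abundant than the sample's geometric mean.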

Software tools: DADA2, ShortRead
Application: 16S rRNA-seq analysis