Computational protocol: Successful collection of stool samples for microbiome analyses from a large community-based population of elderly men

Protocol publication

[…] Six hundred specimens from unique participants were sent to the Alkek Center for Metagenomics and Microbiome Research (CMMR) at Baylor College of Medicine in Houston, Texas for microbiome analysis. Samples were arrayed in boxes and shipped overnight on dry ice with an accompanying sample manifest that included de-identified sample IDs and box positions. Upon delivery, samples were reconciled with the provided manifest and stored at −80 °C until further processing. For bacterial genomic DNA extraction, samples were thawed at room temperature to re-liquefy them, and 200 μL of stool suspension was transferred to the extraction deep-well plate. For samples where the fecal material was too thick to pipette, an equivalent volume was transferred using a sterile, disposable spatula. DNA extraction was carried out on the Hamilton STARlet platform following the standard MoBio PowerMag Soil DNA extraction protocol. Extracted DNA was subjected to 16S (v4) rDNA amplification using primers 515F and 806R containing Illumina adapters and a single-end barcode, allowing pooling and direct sequencing of PCR products. Amplicons were visualized via gel electrophoresis and quantified via automated Quant-iT PicoGreen assay. Quantified amplicons were normalized and pooled at a DNA mass of 100 ng per sample, and the resulting amplicon pool was cleaned using the ChargeSwitch PCR Clean-up Kit (Invitrogen). The amplicon pool was sequenced on two lanes with the Illumina MiSeq reagent kit v2 (2 × 250 bp). The resulting sequences were demultiplexed based on the unique molecular barcodes, and reads were merged using USEARCH v7.0.1090, allowing zero mismatches and a minimum overlap of 50 bases. Merged reads were trimmed at the first base with Q5.
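The trimming rule in the last sentence above can be illustrated with a small Python sketch. This is not the CMMR code or USEARCH itself: the function names are hypothetical, Phred+33 quality encoding is assumed, and "first base with Q5" is interpreted here as the first base whose quality score is 5 or lower.

```python
def phred33_to_scores(qual_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(c) - 33 for c in qual_string]


def truncate_at_low_quality(seq, quals, threshold=5):
    """Cut the read just before the first base with quality <= threshold.

    Assumption: 'trimmed at first base with Q5' means truncation at the
    first position with Q <= 5; the source does not spell this out.
    """
    for i, q in enumerate(quals):
        if q <= threshold:
            return seq[:i], quals[:i]
    # No low-quality base found: the read is returned unchanged.
    return seq, quals


# Toy read: 'I' decodes to Q40 and '%' decodes to Q4.
seq, quals = truncate_at_low_quality("ACGTACGT", phred33_to_scores("IIII%III"))
print(seq)  # ACGT
```

Everything from the first low-quality base onward is dropped, so a single poor base in the middle of a merged read shortens it rather than discarding it.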
In addition, a quality filter was applied to the resulting merged reads, and reads with more than 0.05 expected errors were discarded.

The analytic pipeline for 16S rDNA analysis leverages custom analytic packages and pipelines developed at the CMMR to provide summary statistics and quality control measurements for each sequencing run, as well as multi-run reports and data-merging capabilities for validating built-in controls (known and blank) and characterizing microbial communities across large numbers of samples or sample groups.

16Sv4 rDNA gene sequences were clustered into Operational Taxonomic Units (OTUs) at a similarity cutoff value of 97% using the UPARSE algorithm. OTUs were mapped to an optimized version of the SILVA Database containing only the 16Sv4 region to determine taxonomies. Abundances were recovered by mapping the demultiplexed reads to the UPARSE OTUs. A custom script constructed a rarefied OTU table from the output files generated in the previous two steps for downstream analyses of alpha-diversity, beta-diversity and phylogenetic trends. […]
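Two steps in the passage above lend themselves to a short illustration: the expected-error filter and the rarefaction of the OTU table. The Python sketch below is not the CMMR pipeline. The function names, the toy counts and the rarefaction depth are hypothetical, and it assumes the 0.05 threshold applies to the total expected errors of each read (the text does not say whether the value is per read or per base). Expected errors are computed in the usual way, as the sum of the per-base error probabilities 10^(-Q/10).

```python
import random
from collections import Counter


def expected_errors(quals):
    """Sum of per-base error probabilities, 10^(-Q/10), for one read."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_filter(quals, max_ee=0.05):
    """Keep a read only if its total expected errors do not exceed max_ee.

    Assumption: the 0.05 cutoff is a per-read total, not a per-base rate.
    """
    return expected_errors(quals) <= max_ee


def rarefy(counts, depth, seed=0):
    """Subsample one sample's OTU counts to a fixed depth without replacement.

    Returns None when the sample has fewer reads than the requested depth,
    so the caller can drop it from the rarefied table.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand counts into one entry per read, then draw `depth` reads.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


# A 250-base read at Q40 has about 0.025 expected errors and is kept;
# the same read at Q30 has about 0.25 and is discarded.
print(passes_filter([40] * 250), passes_filter([30] * 250))

# Hypothetical sample with 100 reads, rarefied to 40 reads.
sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20}
print(rarefy(sample, depth=40, seed=1))
```

Rarefying every sample to the same depth makes alpha- and beta-diversity comparisons less sensitive to differences in sequencing effort, at the cost of discarding reads from deeper samples.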

Pipeline specifications

Software tools: USEARCH, UPARSE
Applications: Phylogenetics, Metagenomic sequencing analysis, 16S rRNA-seq analysis
Organisms: Homo sapiens