Computational protocol: Characterization of microbial community structure during continuous anaerobic digestion of straw and cow manure

[…] The 16S rRNA gene sequences were processed using the QIIME pipeline (Caporaso et al., ). In summary, the dataset was first quality-trimmed by removing sequences that were shorter than 400 bp or longer than 600 bp, contained ambiguous bases, had a mean quality score < 25, contained a homopolymer run exceeding 6 bp, or lacked a primer or barcode sequence. Chimeric sequences were removed using the USEARCH quality filter (Edgar, ). OTUs were determined using UCLUST at a threshold of 97% (Edgar, ). The most abundant sequence in each OTU was selected as its representative and aligned against the Greengenes core set (gg_13_8) using PyNAST (Caporaso et al., ; McDonald et al., ) with a minimum identity of 75%. Taxonomy was assigned to each OTU using the Ribosomal Database Project classifier (Wang et al., ) with a minimum confidence threshold of 80%. The alignment was filtered to remove gaps and hypervariable regions using a Lane mask, and a maximum-likelihood tree was constructed from the filtered alignment using FastTree (Price et al., ). The OTU tables were rarefied to the depth of the sample containing the fewest sequences, to equalize sampling depth and avoid bias from unequal sampling effort. From the OTU tables and the phylogenetic tree, an unweighted UniFrac distance matrix was constructed and visualized by principal coordinates analysis (PCoA) (Lozupone and Knight, ). Rarefaction curves and the observed species, Chao1 (Hill et al., ), Shannon (Spellerberg and Fedor, ) and Simpson (Simpson, ) indices were computed using the QIIME alpha diversity analysis script (Caporaso et al., 2010). […]
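The quality criteria, the rarefaction step and the alpha diversity indices described above can be illustrated with a minimal Python sketch. This is not the QIIME code used in the study. The function names and the toy OTU counts are hypothetical, the primer and barcode check is omitted, and the specific index definitions (bias-corrected Chao1, Shannon in base 2, Simpson as 1 minus the sum of squared proportions) are assumptions about which variants were computed.

```python
import math
import random
import re
from collections import Counter


def passes_quality_filter(seq, quals, min_len=400, max_len=600,
                          min_mean_q=25, max_homopolymer=6):
    """Return True if a read meets the length, ambiguity, mean-quality and
    homopolymer criteria from the text (primer/barcode check not modeled)."""
    seq = seq.upper()
    if not (min_len <= len(seq) <= max_len):
        return False
    if re.search(r"[^ACGT]", seq):           # any ambiguous base, e.g. N
        return False
    if sum(quals) / len(quals) < min_mean_q:
        return False
    # A run "exceeding 6 bp" is 7 or more identical bases in a row.
    if re.search(r"(.)\1{%d,}" % max_homopolymer, seq):
        return False
    return True


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from {OTU: count}."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    return dict(Counter(rng.sample(pool, depth)))


def observed_otus(counts):
    return sum(1 for n in counts.values() if n > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed variant)."""
    f1 = sum(1 for n in counts.values() if n == 1)   # singletons
    f2 = sum(1 for n in counts.values() if n == 2)   # doubletons
    return observed_otus(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts, base=2):
    total = sum(counts.values())
    props = [n / total for n in counts.values() if n > 0]
    return -sum(p * math.log(p, base) for p in props)


def simpson(counts):
    """Simpson index expressed as 1 - sum(p_i^2) (assumed variant)."""
    total = sum(counts.values())
    return 1 - sum((n / total) ** 2 for n in counts.values())


if __name__ == "__main__":
    # Hypothetical OTU counts for two samples.
    samples = {
        "S1": {"otu1": 40, "otu2": 25, "otu3": 10, "otu4": 1},
        "S2": {"otu1": 12, "otu2": 30, "otu5": 2, "otu6": 1},
    }
    # Rarefy every sample to the size of the smallest one.
    depth = min(sum(c.values()) for c in samples.values())
    rng = random.Random(0)
    for name, counts in samples.items():
        sub = rarefy(counts, depth, rng)
        print(name, observed_otus(sub), chao1(sub), shannon(sub), simpson(sub))
```

Rarefying to the smallest sample discards reads from deeper samples, which is the trade-off accepted here in exchange for comparable diversity estimates across samples.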

Pipeline specifications

Software tools: QIIME, USEARCH, UCLUST, PyNAST, RDP Classifier, FastTree, UniFrac
Databases: Greengenes
Applications: Phylogenetics, 16S rRNA-seq analysis
Organisms: Bos taurus
