Computational protocol: Rumen microbial community composition varies with diet and host, but a core microbiome is found across a wide geographical range

Protocol publication

[…] Pyrosequence data were processed and analysed using the QIIME software package version 1.8. Sequences over 400 bp in length with an average quality score over 25 were assigned to a specific sample via the barcodes. The numbers of bacterial, archaeal, and ciliate protozoal sequencing reads available for analysis are summarised in . Sequence data were grouped into operational taxonomic units (OTUs) sharing over 97% (bacteria – UCLUST), 99% (archaea – UCLUST) or 100% (ciliate protozoa – prefix_suffix option in QIIME) sequence similarity. Sequences were assigned to phylogenetic groups by BLAST. Bacterial 16S rRNA genes were assigned using the Greengenes database version 13_5, archaeal 16S rRNA genes using RIM-DB version 13_11_13, and ciliate protozoal 18S rRNA genes using an in-house database. Bacterial and ciliate protozoal data were summarised at the genus level, and archaeal data at the species level. Samples for which low read numbers were obtained or that contained high proportions of sequences from “exogenous” bacteria (i.e., likely environmental contaminants such as Stenotrophomonas) were excluded from further analyses (). The identity of the most abundant and prevalent OTUs was determined using BLAST against sequences from type material and against all sequences (excluding sequences from model organisms or environmental samples) in the nt database. Bellerophon (version 3, 200 bp window, Huber-Hugenholtz correction) was used to identify chimeric OTU sequences. Sequence similarities greater than 97% and 93% were used as cut-offs to classify OTUs at the species and genus level, respectively. The rationale for these cut-offs was discussed by Kenters et al. [...]
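The read filter and the similarity cut-offs described above can be illustrated with a minimal Python sketch. It is not the QIIME implementation: the function names, the representation of quality scores as a list of integers, and the use of `None` for OTUs below the genus cut-off are assumptions made for illustration only.

```python
from statistics import mean

# Thresholds taken from the protocol text.
MIN_LENGTH = 400           # reads must be longer than this (bp)
MIN_MEAN_QUALITY = 25      # average quality score must exceed this
SPECIES_CUTOFF = 97.0      # percent similarity for species-level classification
GENUS_CUTOFF = 93.0        # percent similarity for genus-level classification


def passes_quality_filter(sequence, quality_scores):
    """Return True if a read is over 400 bp with a mean quality over 25.

    Hypothetical helper: `quality_scores` is assumed to be a list of
    per-base quality values for `sequence`.
    """
    return len(sequence) > MIN_LENGTH and mean(quality_scores) > MIN_MEAN_QUALITY


def classification_level(percent_similarity):
    """Map a best-hit similarity to the level at which an OTU is classified.

    Returns "species" above 97%, "genus" above 93%, and None otherwise
    (the None label is an assumption; the text does not name this case).
    """
    if percent_similarity > SPECIES_CUTOFF:
        return "species"
    if percent_similarity > GENUS_CUTOFF:
        return "genus"
    return None
```

Both comparisons are strict, following the wording "over" and "greater than" in the text, so a hit at exactly 97% similarity is classified at the genus level in this sketch.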
The resulting dataset allowed us to establish whether animal or dietary factors relate to rumen and camelid foregut microbial community composition, identify the dominant microbes and their potential associations, and describe the degree of similarity of rumen and camelid foregut microbial communities worldwide. Statistical analyses of microbial data were performed using GenStat for Windows, R software, and QIIME. Principal coordinate analysis of Bray-Curtis dissimilarity matrices, analysis of variance, sparse partial least squares discriminant analysis (sPLS-DA, using a sPLS regression approach), and canonical discriminant analyses (CDA) of microbial community composition data in the context of the metadata () were used to identify the impacts of factors such as host lineage and diet on rumen and camelid foregut microbial communities and to identify the microbial groups associated with these factors. Pearson, Spearman, SparCC, and regularised canonical correlation analyses (CCA) were used to identify associations within and between archaeal, bacterial, and protozoal groups. Association scores were visualised as relevance networks and clustered image maps (CIM, heatmaps) representing the first two dimensions. González et al. provide a comprehensive overview of sPLS-DA, CCA and the corresponding ‘pairwise associations’, network and CIM techniques and their application. […]
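Two of the measures named above, Bray-Curtis dissimilarity and Spearman correlation, are simple enough to sketch in plain Python. This is an illustrative sketch only, not the GenStat, R, or QIIME code used in the study; all function names are hypothetical, and returning 0.0 for two empty samples is an assumption.

```python
from math import sqrt


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors.

    Computed as sum(|u_i - v_i|) / sum(u_i + v_i). Two all-zero samples
    are treated as identical (0.0), which is an assumption of this sketch.
    """
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def dissimilarity_matrix(samples):
    """Square matrix of pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(a, b) for b in samples] for a in samples]


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A matrix from `dissimilarity_matrix` is the kind of input a principal coordinate analysis takes, and `spearman` applied to the abundance profiles of two taxa across samples gives one pairwise association score. SparCC, sPLS-DA, and regularised CCA are more involved and are not covered by this sketch.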

Pipeline specifications

Software tools QIIME, UCLUST, SparCC
Applications Phylogenetics, 16S rRNA-seq analysis
Organisms Homo sapiens, Bacteria
Chemicals Methane