Computational protocol: Fecal Microbial Diversity in Pre-Weaned Dairy Calves as Described by Pyrosequencing of Metagenomic 16S rDNA. Associations of Faecalibacterium Species with Health and Growth

Protocol publication

[…] The resulting FASTA sequence file was uploaded to the Ribosomal Database Project (RDP) pipeline initial processor, which trimmed the 16S primers, tag-sorted the sequences, and filtered out additional low-quality sequences. DECIPHER was used to identify chimeric sequences. The RDP Classifier in the RDP Pyrosequencing Pipeline was used to assign the 16S rRNA gene sequences of each sample to the new phylogenetically consistent higher-order bacterial taxonomy. The resulting FASTA files were also uploaded to the RDP aligner, which uses INFERNAL, a secondary-structure-aware aligner based on stochastic context-free grammars (SCFGs), and were then processed by the complete linkage clustering tool, which clustered the aligned sequences into OTUs. The resulting cluster file was then used to evaluate sample richness and diversity by estimating the Chao1 index, again using the RDP Pyrosequencing Pipeline. Chao1 is a nonparametric estimator of the minimum richness (number of OTUs) and is based on the number of rare OTUs (singletons and doublets) within a sample. The same cluster files were also used to obtain rarefaction curves for each sample, again using the RDP Pyrosequencing Pipeline.
The following procedure was used to select representative sequences for the Faecalibacterium spp. that were found to have significant effects on weight gain and diarrhea. The original FASTA file containing all the sequences was uploaded to the RDP pipeline initial processor, which trimmed the 16S primers and filtered out additional low-quality sequences. The resulting file was uploaded to the RDP aligner and then processed by the complete linkage clustering tool, which clustered the aligned sequences into OTUs. Finally, the dereplicate function was used to create one representative sequence for each OTU.
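The Chao1 estimator defined above can be illustrated with a short sketch. This is not the RDP pipeline's code: the function name `chao1` and the example counts are hypothetical, and the sketch assumes the classic form S_obs + F1^2 / (2 * F2), where F1 and F2 are the numbers of OTUs seen exactly once and exactly twice, with the bias-corrected form used as a fallback when no OTU is seen exactly twice.

```python
from collections import Counter


def chao1(otu_counts):
    """Estimate minimum OTU richness for one sample from per-OTU read counts.

    Assumption: classic Chao1 when F2 > 0, bias-corrected form otherwise.
    """
    counts = [c for c in otu_counts if c > 0]  # ignore OTUs absent from the sample
    s_obs = len(counts)                        # observed number of OTUs
    freq = Counter(counts)
    f1 = freq[1]                               # OTUs observed exactly once
    f2 = freq[2]                               # OTUs observed exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected fallback avoids division by zero when F2 == 0.
    return s_obs + f1 * (f1 - 1) / 2.0


# Hypothetical sample: 7 observed OTUs, 3 singletons, 2 OTUs seen twice.
example_counts = [1, 1, 1, 2, 2, 5, 10]
estimate = chao1(example_counts)  # 7 + 3*3 / (2*2) = 9.25
```

Because the correction term depends only on the rarest OTUs, a sample with many singletons and few OTUs seen twice yields an estimate well above the observed richness.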
A new file of representative sequences was thus created, and the RDP Classifier was used again to classify them. Sequences classified as Faecalibacterium spp. were selected, and the Basic Local Alignment Search Tool (BLASTn algorithm) from the National Center for Biotechnology Information (NCBI) (ncbi.nlm.nih.gov/BLAST/) was then used to search the nucleotide collection (EMBL/GenBank/DDBJ/PDB) databases for sequences with high similarity to these representative sequences. Sequences obtained from this project were submitted to GenBank (accession numbers JX635481-JX643978). [...]
Discriminant analysis was performed in JMP Pro (SAS Institute Inc., North Carolina) using bacterial genus prevalences as covariates and week of life as the categorical variable. In this way the microbial transition from week one until week seven was illustrated. Discriminant analysis was also used to describe differences between the fecal microbiomes of samples by weight-gain group during the first and second weeks of calf life. Bacterial genus prevalences were used as covariates, and the interaction of week (1 or 2) and weight-gain group (low or high) was used as the categorical variable. Finally, discriminant analysis was used to describe differences between the fecal microbiomes of samples during the first week of calf life by diarrhea incidence. Bacterial genus prevalences were used as covariates and diarrhea incidence during the pre-weaning period as the categorical variable.
Prevalences of genera that were significant in the discriminant analysis separating the high and low weight-gain groups of calves, or in the discriminant analysis separating healthy and diarrheic calves, were analyzed further. MedCalc (version 12.3.0, Ostend, Belgium) was used to create terciles for each genus, which were subsequently used as class variables in multivariable models.
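The tercile step described above can be sketched as follows. The cut-point convention MedCalc applies is not stated in the source, so this sketch assumes the default "exclusive" method of Python's `statistics.quantiles`; the function name `assign_terciles` and the prevalence values are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles


def assign_terciles(prevalences):
    """Map each genus prevalence to tercile 1 (lowest), 2, or 3 (highest).

    Assumption: cut points come from statistics.quantiles(n=3); MedCalc may
    use a different interpolation rule near the boundaries.
    """
    low_cut, high_cut = quantiles(prevalences, n=3)  # two cut points, three groups
    cuts = [low_cut, high_cut]
    # Values at or below a cut point fall into the lower tercile.
    return [bisect_left(cuts, value) + 1 for value in prevalences]


# Hypothetical prevalences of one genus across nine samples.
example = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
classes = assign_terciles(example)  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

Converting a continuous prevalence to three ordered classes lets the downstream models treat each genus as a class variable without assuming a linear effect.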
Effects on weekly weight measurements were evaluated with a general linear mixed model fitted using the MIXED procedure of SAS. Body weight at birth and the prevalence terciles of the different genera were offered to the model. Body weight measurements were collected longitudinally and were therefore treated as repeated measurements; the error term was modeled with a first-order autoregressive covariance structure to account for the within-calf correlation of weight measurements. Similar models were used to evaluate differences in the Chao1 index between calves that did or did not have pneumonia or diarrhea, and between calves in the high and low weight-gain groups. Diarrhea incidence was estimated for the first four weeks of calf life. Effects on diarrhea incidence during this period were evaluated with a logistic regression model fitted using the GLIMMIX procedure of SAS. Genus prevalence terciles and body weight at birth were offered to the model. Variables were removed from the models manually in a stepwise manner, and only variables with P-values < 0.05 were kept in the final models. The design of this study and the analysis pipeline followed are illustrated in . […]
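The first-order autoregressive covariance structure mentioned above can be made concrete with a small sketch. Under AR(1), the covariance between the errors of two equally spaced measurements on the same calf is sigma^2 * rho^|i - j|, so correlation decays with the number of weeks separating them. The function name `ar1_covariance` and the parameter values are hypothetical; the study's fitted estimates are not given in the source.

```python
def ar1_covariance(n_times, rho, sigma2):
    """Build the n_times x n_times AR(1) covariance matrix for one subject.

    Entry (i, j) equals sigma2 * rho ** |i - j|, assuming equally spaced
    measurement occasions (for example, weekly weighings).
    """
    return [
        [sigma2 * rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


# Hypothetical values: three weekly weights, rho = 0.5, residual variance 2.0.
matrix = ar1_covariance(3, 0.5, 2.0)
# [[2.0, 1.0, 0.5],
#  [1.0, 2.0, 1.0],
#  [0.5, 1.0, 2.0]]
```

Only two parameters (sigma^2 and rho) describe the whole matrix, which is why this structure is a common choice for regularly spaced longitudinal measurements.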

Pipeline specifications

Software tools: RDP Classifier, BLASTN, JMP Pro
Applications: Miscellaneous, 16S rRNA-seq analysis
Organisms: Bos taurus, Faecalibacterium prausnitzii
Diseases: Pneumonia