Computational protocol: Population- and Individual-Level Dynamics of the Intestinal Microbiota of a Small Primate

Similar protocols

Protocol publication

[…] We collected fecal samples from September to November in 2012 and 2013 in Ranomafana National Park (RNP) in southeastern Madagascar (21°16′S latitude and 47°20′E longitude). RNP consists of primary and secondary forests situated on lowland to montane areas (500 to 1,500 m above sea level), and it is surrounded by a peripheral zone with restrictions on land use (). Mouse lemurs have their mating period in October, and practically all females are gestating from October to January ().We collected samples from two different transects: inside the National Park in Talatakely and in the peripheral zone in the campsite of Centre Valbio. The Talatakely transect was in the secondary forest with continuous canopy, and it was heavily used by tourist groups, whereas the campsite transect was a heavily degraded area with both trees and bushes that were both endemic and nonendemic and without continuous canopy. Each night before sunset, we set 40 to 50 traps to catch mouse lemurs and collected the traps approximately 2 h after sunset. The trapping and feces collection procedures are detailed elsewhere (). We microchipped all mouse lemurs to allow for identification and longitudinal surveying of individuals. We estimated the age with validated methods () and divided mouse lemurs into young (1 to 2 years), midage (3 to 4 years), and old (≥5 years) groups. The research was approved by the ethics committee of Viikki Campus, University of Helsinki, Helsinki, Finland, and by the trilateral commission (CAFF/CORE) in Madagascar (permit 203/12/MEF/SG/DGF/DCB.SAP/SCBSE).We could not measure mouse lemur body temperature and thus cannot directly assess their level of heterothermy. Nevertheless, mouse lemurs are known to be heterothermic during periods of inactivity and low ambient temperatures (). We calculated the mean temperature for all trapping days while mouse lemurs are most likely in torpor (from 02h00 to 14h00). In addition, we calculated the monthly mean temperatures and total rainfall based on weather data (, data set 20150601041318_1609) we acquired from the Tropical Ecology Assessment and Monitoring (TEAM) network, which has weather stations run by Centre Valbio in the area ().We obtained approximately 0.1 to 0.2 g of feces per sample and stored the feces in 1.5 ml of RNAlater (Ambion, Inc., Austin, TX, USA) in −18°C. DNA was isolated with the PowerSoil DNA isolation kit (Mo Bio Laboratories, Inc., Carlsbad, CA, USA), according to the standard protocol. We amplified the 16S gene in V1-V2 region with primer pairs pA_Illum_FP_1 (ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGAGTTTGATCMTGGCTCAG) and pA_Illum_RP_1 (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGTATTACCGCGGCTGCTG), pA_Illum_FP_2 ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTAGAGAGTTTGATCMTGGCTCAG) and pA_Illum_RP_2 GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCGTATTACCGCGGCTGCTG), and pA_Illum_FP_3 ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCTCTAGAGTTTGATCMTGGCTCAG) and pA_Illum_RP_3 GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTAGTATTACCGCGGCTGCTG) and Phusion enzyme (New England BioLabs, Inc., Ipswich, MA, USA). The PCR protocol included an initial denaturation at 98°C for 30 s and 15 cycles of denaturation at 98°C for 10 s, annealing at 62°C for 15 s, and extension at 72°C for 15 s, followed by 10 min of final extension at 72°C. We verified the success of amplification by gel electrophoresis. For each sample, two replicates were separately amplified and then pooled. The amplicons were paired-end sequenced with the Illumina MiSeq platform in the sequencing facility of the Institute of Biotechnology, University of Helsinki.Samples were isolated and amplified in 7 different batches and sequenced in 2 different batches. The amplification was performed in numerical order of the samples, which corresponded to chronological order, and sequencing was performed partly in chronological order: late samples from 2012 and early samples from 2013 were sequenced together, and early samples from 2012 and late samples from 2013 were sequenced together.The amplicon sequences were demultiplexed, and subsequent sequence processing was performed using the mothur pipeline, along standard operating procedures (SOP) (; when not otherwise mentioned. To purge unsuccessful contigs, only contigs between 439 and 511 bp were retained. The alignment was made against aligned SILVA bacterial references (release 102). Preclustering of the sequences was performed with a maximum difference of 5 bp. To reduce the number of unique sequences, all unique sequences composed of only one sequence were discarded. Sequences were classified by using Bayesian classifier with a training set (version 9) from the Ribosomal Database Project (; We used 97% similarity to determine the operational taxonomic units (OTUs).We performed the initial statistical analysis in line with MiSeq SOP with mothur. We rarified the amplicons to the lowest sample size, 4,316 sequences. We estimated the alpha diversity of the samples with the inverse Simpson diversity index. Both the richness, defined as the number of OTUs per sample, and the Simpson diversity were normally distributed and homoscedastic. To explore the structures of communities, we calculated the Yue & Clayton () dissimilarity metric based on the proportions of OTUs in different samples and the Jaccard distance from presence-absence data (). We used Jaccard distance, as it is a widely used intuitive metric, and the Yue & Clayton dissimilarity metric, as it takes into account (in comparison to widely used Bray-Curtis dissimilarity) both shared and nonshared species in each population and emphasizes the shared species with similar species proportions in communities. We visualized the dissimilarity metrics with nonmetric multidimensional scaling (NMDS), using three dimensions. To calculate how many community types there are in the samples, we used the Dirichlet multinomial mixtures method in mothur with Laplace approximation (). The method describes communities as a vector of taxon probabilities created from Dirichlet mixture components. Mixture components are clustered, and the fit is evaluated with Laplace approximation.For subsequent analysis, we used R () with the vegan package (). We created repeated-measures analysis of variance models for rank-transformed richness counts and diversity indices. For each model, we began with a full model including interactions and simplified models, where terms that were not significant were removed until we were left with variables with a P value of <0.05. For the remaining analysis, we considered the variables site, age, year, week, and sex. There was no significant effect with sequencing batches, but we found a significant effect on isolation and amplification batches, and these batches were also included as variables in subsequent analysis. As post hoc tests for multivariate repeated-measure analysis of variance (ANOVA) are complex, we investigated the differences graphically. Nevertheless, we also performed t test comparisons for significant variables for diversity and corrected them with the Holm-Bonferroni method.We tested if the communities overlap using permutational multivariate ANOVA (MANOVA) with dissimilarity matrices (), taking into account the repeated sampling. Permutational MANOVA measures whether the groups are significantly different from each other. We also performed an analysis of multivariate homogeneity of group dispersions (), which tells us if the within-group variation differs between groups. As analysis of multivariate homogeneity cannot be performed with multiple variables, we divided our samples into eight groups based on site, sex, and year and performed pairwise post hoc comparisons. To take repeated measures into account, we randomly sampled one sample per individual for analysis with 100 iterations.To study the temporal effects on microbiota, we calculated variation in population-level and individual-level dissimilarity and used the Mann-Whitney U test to test dissimilarities between trapping years. For the individuals from which we had more than two samples, we plotted the proportions of bacterial phyla and the most common 20 OTUs on graphs for visual inspection. Additionally, we plotted NMDS axis loadings as a function of time and calculated and plotted the dissimilarity indices as a function of the temporal distance of samplings to see if there are temporal trends in the microbiota within the host. The dissimilarity indices were normally distributed; therefore, we explored significant variables by performing repeated-measures ANOVA on dissimilarity indices for mouse lemurs we caught several times as the response variable; the number of days between trapping, temperature, trapping year, site, sex, and age as explanatory variables; and mouse lemur individuals as the repeated factor.We used several methods to explore which OTUs are driving the trends between groups. First, we chose all the OTUs that are present in >10% of the samples and, as they are not normally distributed, we used rank-transformed repeated-measures ANOVA to see if they differed significantly between the groups. Second, we used Dufrêne-Legendre indicator analysis, which uses the maximum indicator value among the groups as a test statistic, without the need for multiple tests (). To further explore the difference between years, we performed a random forest approach with R package randomForest () to assess how well trapping year can be predicted from the community composition. We also used the mean decrease in accuracy as a proxy for the importance of single OTUs in differentiating separate groups. […]

Pipeline specifications

Software tools mothur, vegan, randomforest
Applications Miscellaneous, Amplicon sequencing analysis
Organisms Bacteria, Mus musculus
Diseases Ectromelia, Infectious