Metagenomic studies have dramatically expanded our knowledge of the microbial world, and the amount of sample that can be sequenced has increased significantly with the development of high-throughput sequencing technologies. However, capturing every DNA sequence carried by every microorganism in an environment is still impossible. Estimating a reasonable and practical sequencing amount that meets the study objectives is therefore necessary.
Provides a list of candidate taxa that correspond to chromatogram peaks. T-RFPred classifies terminal restriction fragments (T-RFs) generated by terminal restriction fragment length polymorphism (T-RFLP) analysis by taking advantage of the wealth of 16S rRNA gene sequence data. It is based on in silico simulation of the digestion of reference sequences. The tool helps users choose the most appropriate restriction enzyme(s) when designing experiments to evaluate community structure.
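The in silico digestion idea can be illustrated with a minimal Python sketch. It is not T-RFPred's code: the function names, the toy reference sequences, and the peak tolerance are all hypothetical. It assumes the 5' end of each amplicon carries the label, that the recognition site contains no ambiguity codes, and it uses MspI (C^CGG, so one base of the site stays on the labeled fragment) as the example enzyme.

```python
def terminal_fragment_length(sequence, site, cut_offset):
    """Length of the 5'-labeled terminal fragment, or None if the enzyme never cuts.

    sequence   : amplicon, 5'->3', starting at the labeled primer
    site       : recognition sequence (plain A/C/G/T only in this sketch)
    cut_offset : number of bases of the site left on the 5' side of the cut
    """
    seq = sequence.upper()
    pos = seq.find(site.upper())  # first site from the labeled end
    if pos == -1:
        return None
    return pos + cut_offset


def candidates_for_peak(references, site, cut_offset, peak_length, tolerance=1):
    """Names of reference taxa whose predicted T-RF lies within `tolerance` bases of a peak."""
    matches = []
    for taxon, sequence in references.items():
        length = terminal_fragment_length(sequence, site, cut_offset)
        if length is not None and abs(length - peak_length) <= tolerance:
            matches.append(taxon)
    return sorted(matches)


# Hypothetical toy "reference database" (real 16S amplicons are far longer).
toy_references = {
    "taxon_A": "AAAAACCGGTTTTT",   # MspI site starts at index 5
    "taxon_B": "AAAAAAAAACCGGTT",  # MspI site starts at index 9
    "taxon_C": "AAAATTTTAAAA",     # no MspI site
}

print(candidates_for_peak(toy_references, "CCGG", 1, peak_length=10))
```

Running the same prediction for several enzymes and comparing how well the predicted T-RF lengths separate the reference taxa is one simple way to reason about enzyme choice.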
A framework for PERMANOVA power estimation tailored to marker-gene microbiome studies that will be analyzed using pairwise distances. It includes: (i) a method for distance matrix simulation that permits modeling of within-group pairwise distances according to pre-specified population parameters; (ii) a method to incorporate effects of different sizes within the simulated distance matrix; (iii) a simulation-based method for estimating PERMANOVA power from simulated distance matrices; and (iv) an R statistical software package that implements the above. Matrices of pairwise distances can be efficiently simulated to satisfy the triangle inequality and incorporate group-level effects, which are quantified by the adjusted coefficient of determination, omega-squared. From the simulated distance matrices, the available PERMANOVA power or the necessary sample size can be estimated for a planned microbiome study.
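The quantities involved can be written out explicitly. For a distance matrix over N samples in a groups, PERMANOVA uses SS_T = (1/N) Σ d_ij² over all pairs, SS_W = Σ_g (1/n_g) Σ d_ij² over pairs within group g, and SS_A = SS_T − SS_W; the pseudo-F statistic is [SS_A/(a−1)] / [SS_W/(N−a)], and omega-squared is [SS_A − (a−1)·MS_W] / [SS_T + MS_W] with MS_W = SS_W/(N−a). The Python sketch below computes these and estimates power by simulation. It is not the R package described above: the function names are hypothetical, and the toy simulator (Gaussian points with one group shifted, converted to Euclidean distances, which satisfy the triangle inequality by construction) is an assumption standing in for the framework's own distance-matrix simulation.

```python
import numpy as np


def permanova_stats(dist, groups):
    """Return (pseudo_F, omega_squared) for a square distance matrix and group labels."""
    dist = np.asarray(dist, dtype=float)
    groups = np.asarray(groups)
    n = dist.shape[0]
    labels = np.unique(groups)
    a = len(labels)
    sq = dist ** 2
    ss_total = sq[np.triu_indices(n, k=1)].sum() / n
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        sub = sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    ms_within = ss_within / (n - a)
    pseudo_f = (ss_among / (a - 1)) / ms_within
    omega_sq = (ss_among - (a - 1) * ms_within) / (ss_total + ms_within)
    return pseudo_f, omega_sq


def permanova_pvalue(dist, groups, n_perm, rng):
    """Permutation p-value: share of label shuffles with pseudo-F >= observed."""
    observed, _ = permanova_stats(dist, groups)
    hits = 0
    for _ in range(n_perm):
        f_perm, _ = permanova_stats(dist, rng.permutation(groups))
        hits += int(f_perm >= observed)
    return (hits + 1) / (n_perm + 1)


def simulate_distances(n_per_group, shift, rng, n_dims=5):
    """Toy simulator (assumption): two Gaussian groups, the second shifted on axis 0."""
    x = rng.normal(size=(2 * n_per_group, n_dims))
    x[n_per_group:, 0] += shift
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    groups = np.repeat([0, 1], n_per_group)
    return dist, groups


def estimate_power(n_per_group, shift, n_sim=100, n_perm=199, alpha=0.05, seed=0):
    """Fraction of simulated studies in which PERMANOVA rejects at level alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        dist, groups = simulate_distances(n_per_group, shift, rng)
        rejections += int(permanova_pvalue(dist, groups, n_perm, rng) <= alpha)
    return rejections / n_sim
```

Calling `estimate_power` over a grid of `n_per_group` values, for a fixed effect, gives a rough power curve from which a sample size can be read off. With `shift=0` the estimate should sit near `alpha`.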
Models stable isotope probing of nucleic acids (DNA-SIP) data and assesses how changes in key SIP experimental parameters are predicted to affect DNA-SIP accuracy. SIPSim can simulate the distribution of gDNA fragments in isopycnic gradients at sedimentation equilibrium. It can generate the DNA-SIP datasets that would be obtained by fractionating isopycnic gradients and performing high-throughput sequencing on many of the gradient fractions.
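A heavily simplified Python sketch of the underlying idea follows. It is not SIPSim's code. It assumes the commonly used linear relation between G+C content and CsCl buoyant density, ρ ≈ 1.660 + 0.098 × GC (g/ml, GC as a fraction), and a maximum shift of about 0.036 g/ml for fully 13C-labeled DNA. The Gaussian spread around the expected density, the fraction edges, and all function names are hypothetical choices made for illustration.

```python
import numpy as np


def buoyant_density(gc_fraction, atom_excess_13c=0.0, max_shift=0.036):
    """Approximate CsCl buoyant density (g/ml) of a DNA fragment.

    gc_fraction     : G+C content as a fraction in [0, 1]
    atom_excess_13c : degree of 13C labeling in [0, 1]
    max_shift       : assumed density increase at full 13C labeling
    """
    return 1.660 + 0.098 * gc_fraction + max_shift * atom_excess_13c


def fraction_counts(gc_fraction, atom_excess_13c, n_fragments, edges,
                    spread=0.004, seed=0):
    """Scatter fragments around their expected density and bin them into fractions.

    `spread` is a hypothetical standard deviation standing in for diffusion and
    other sources of band broadening; `edges` are the fraction boundaries.
    """
    rng = np.random.default_rng(seed)
    center = buoyant_density(gc_fraction, atom_excess_13c)
    densities = rng.normal(center, spread, size=n_fragments)
    counts, _ = np.histogram(densities, bins=edges)
    return counts


# Twelve hypothetical gradient fractions between 1.66 and 1.78 g/ml.
edges = np.linspace(1.66, 1.78, 13)
unlabeled = fraction_counts(0.5, 0.0, 10_000, edges)
labeled = fraction_counts(0.5, 1.0, 10_000, edges)
print("unlabeled peak fraction:", int(unlabeled.argmax()))
print("labeled peak fraction:  ", int(labeled.argmax()))
```

Because a high-GC unlabeled genome and a low-GC labeled genome can land in the same fractions, even this toy model shows why experimental parameters such as labeling level and fraction width affect how reliably incorporators can be detected.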
Models bacterial compositions derived from 16S rRNA sequencing. GPMicrobiome is a Stan implementation of the Temporal Gaussian Process Model for Compositional Data Analysis (TGP-CODA). The model integrates temporal, overdispersion, and zero-inflation components for analyzing longitudinal 16S rRNA sequencing data. It can accommodate different experimental designs, such as non-equidistant sampling over time, missing time points, and variable sequencing depth. It also quantifies the uncertainty of the final estimates, which is an important property in integrated microbiome studies.
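To make the three components concrete, the Python sketch below generates toy data with the same ingredients: a smooth temporal trend drawn from a Gaussian process, extra per-sample noise for overdispersion, random structural zeros, and multinomial counts at varying sequencing depth. It is a generic generative illustration, not the TGP-CODA model or its Stan code. The squared-exponential kernel, the softmax link, and every parameter value and function name are assumptions.

```python
import numpy as np


def rbf_kernel(times, length_scale=2.0, variance=1.0):
    """Squared-exponential covariance; works for non-equidistant time points."""
    t = np.asarray(times, dtype=float)
    d = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def simulate_counts(times, depths, n_taxa=4, noise_sd=0.3, zero_prob=0.1, seed=0):
    """Return (probs, counts), each of shape (len(times), n_taxa)."""
    rng = np.random.default_rng(seed)
    n_t = len(times)
    cov = rbf_kernel(times) + 1e-8 * np.eye(n_t)  # jitter for numerical stability
    # Temporal component: one smooth latent trajectory per taxon.
    latent = rng.multivariate_normal(np.zeros(n_t), cov, size=n_taxa).T
    # Overdispersion component: independent noise on top of the smooth trend.
    latent = latent + rng.normal(0.0, noise_sd, size=latent.shape)
    # Zero-inflation component: taxa randomly absent from a sample.
    present = rng.random(latent.shape) >= zero_prob
    present[:, 0] = True  # keep at least one taxon so each row can be normalized
    weights = np.exp(latent - latent.max(axis=1, keepdims=True)) * present
    probs = weights / weights.sum(axis=1, keepdims=True)
    # Variable sequencing depth: one multinomial draw per time point.
    counts = np.vstack([rng.multinomial(d, p) for d, p in zip(depths, probs)])
    return probs, counts


# Non-equidistant sampling days and unequal sequencing depths (hypothetical values).
times = [0, 1, 3, 4, 8, 15]
depths = [5000, 12000, 800, 20000, 3000, 9500]
probs, counts = simulate_counts(times, depths)
print(counts)
```

Fitting a model of this kind runs the process in reverse: given the counts, it infers the latent trajectories together with their posterior uncertainty.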