Direct analysis of microbial communities in the environment and the human body has become more convenient and reliable owing to advances in high-throughput sequencing techniques for 16S rRNA gene profiling. Inferring correlations among members of microbial communities is of fundamental importance for genomic survey studies.
A suite of algorithms for inferring dynamical systems models from microbiome time-series data and predicting temporal behaviors. MDSINE performs all analysis steps, from reading data files through to generating figures. MDSINE implements methods that not only outperform previous approaches but also provide novel functionality, including the ability to estimate confidence in model parameters and predicted dynamics. Application of MDSINE to two new gnotobiotic experimental datasets demonstrates that it can generate predictive hypotheses that standard microbiome analysis methods cannot and, moreover, suggests new strategies for the rational design of bacteriotherapies.
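As an illustration of what a dynamical systems model of a microbial community can look like, the sketch below integrates generalized Lotka-Volterra equations with a forward-Euler step. The entry above does not state which model form is used, so the equations, the function name `simulate_glv`, and all parameter values are assumptions chosen for illustration only.

```python
def simulate_glv(x0, growth, interactions, dt=0.01, steps=1000):
    """Forward-Euler integration of dx_i/dt = x_i * (growth_i + sum_j A_ij * x_j).

    Hypothetical helper: a generic generalized Lotka-Volterra simulator,
    not the inference procedure of any particular tool.
    """
    n = len(x0)
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        rates = [
            x[i] * (growth[i] + sum(interactions[i][j] * x[j] for j in range(n)))
            for i in range(n)
        ]
        # Abundances cannot be negative, so clamp at zero after each step.
        x = [max(x[i] + dt * rates[i], 0.0) for i in range(n)]
        trajectory.append(list(x))
    return trajectory


# Two competing taxa with made-up growth rates and interaction strengths.
growth = [1.0, 0.8]
interactions = [[-1.0, -0.5],
                [-0.3, -1.0]]
trajectory = simulate_glv([0.1, 0.1], growth, interactions, dt=0.01, steps=5000)
final_state = trajectory[-1]
```

With these illustrative parameters both taxa settle at a coexistence equilibrium; inferring `growth` and `interactions` from observed time series is the harder inverse problem and is not attempted here.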
Analyzes distance correlations between genomic elements. GenomeInspector calculates distance correlations between elements from at least two inputs or between elements in one input file and annotated genomic elements. It allows the extraction of elements from large sets that fulfill distance requirements.
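The distance-based extraction described above can be sketched as follows: for each element in one set, find the nearest element in a second set and keep those that meet a distance requirement. This is a minimal sketch assuming elements are single integer positions on the same sequence; the function names and coordinates are hypothetical and do not reflect GenomeInspector's own interface.

```python
from bisect import bisect_left


def nearest_distances(query_positions, reference_positions):
    """Pair each query position with the signed distance to its nearest reference.

    Assumes reference_positions is non-empty.
    """
    refs = sorted(reference_positions)
    result = []
    for q in query_positions:
        i = bisect_left(refs, q)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = refs[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda r: abs(r - q))
        result.append((q, nearest - q))
    return result


def within_distance(pairs, min_dist, max_dist):
    """Keep query positions whose nearest reference lies in [min_dist, max_dist]."""
    return [q for q, d in pairs if min_dist <= abs(d) <= max_dist]


pairs = nearest_distances([100, 480, 2000], [90, 500, 1500])
selected = within_distance(pairs, 0, 50)
```

Sorting the reference set once and using binary search keeps each lookup logarithmic, which matters when extracting elements from large sets.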
A method based on least squares with an L1 penalty, applied after log-ratio transformation of raw compositional data, to infer the correlations among microbes through a latent variable model. Simulation results show that CCLasso outperforms existing methods, e.g. SparCC, in edge recovery for compositional data.
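To make the log-ratio step concrete, the sketch below applies a centered log-ratio (CLR) transformation to one sample of counts. The entry does not say which log-ratio transformation is used, so CLR is an assumption here, as are the function name `clr` and the pseudocount default.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample.

    The pseudocount (an illustrative default) avoids log(0) for unobserved taxa.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


sample = [120, 30, 0, 850]
transformed = clr(sample)
```

CLR values always sum to zero within a sample, and without a pseudocount the transform is unchanged when every count is multiplied by the same factor, which is why it is a common way to remove the effect of sequencing depth.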
Deduces correlation networks from large microbiome datasets. FastSpar is a standalone method that exploits variances of log ratios to assess linear Pearson correlation between operational taxonomic units (OTUs). The application is a re-implementation of the SparCC software, with the addition of an unbiased p-value estimator coupled with parallelization, aimed at processing larger datasets.
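The quantity at the heart of this description, the variance of the log ratio between two OTUs across samples, can be computed as in the sketch below. It shows only that building block, not the full correlation estimate or the p-value estimator; the function name and the toy fractions are hypothetical, and all abundances are assumed to be strictly positive.

```python
import math
from statistics import variance


def log_ratio_variance(samples, i, j):
    """Sample variance of log(x_i / x_j) across samples.

    Assumes every abundance is strictly positive; zeros would need a pseudocount.
    """
    return variance(math.log(sample[i] / sample[j]) for sample in samples)


# Toy relative abundances: OTU 1 is always twice OTU 0, OTU 2 varies freely.
samples = [
    [0.10, 0.20, 0.70],
    [0.20, 0.40, 0.40],
    [0.05, 0.10, 0.85],
]
proportional = log_ratio_variance(samples, 0, 1)
unrelated = log_ratio_variance(samples, 0, 2)
```

A log-ratio variance near zero indicates that two OTUs move in proportion to each other, while a large value indicates they do not; the ratio is unaffected by the normalization that makes the data compositional.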
Reconstructs 2-dimensional co-exclusion networks. CoEx determines the scores and p-values for 2-, 3-, and 4-dimensional co-exclusion patterns. It can run bootstrapping simulations for each pairwise co-exclusion pattern and for the resulting network in a single run. This tool is able to determine the correspondence between p-value thresholds and the expected number or fraction of false-positive co-exclusion patterns encountered.
An approach for estimating correlation values from compositional data. Additionally, SparCC contains a script for calculating the distance between samples using the Jensen-Shannon divergence (JSD) metric, its square root, and many other distance measures.
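For reference, the Jensen-Shannon divergence and its square root can be computed between two samples as sketched below. This is a generic implementation using base-2 logarithms, not SparCC's own script; the function names are hypothetical, and both inputs are assumed to be relative abundances that each sum to one.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two probability vectors (0 to 1 in base 2)."""
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)


def jsd_distance(p, q):
    """Square root of the JSD, which satisfies the triangle inequality."""
    return math.sqrt(max(jsd(p, q), 0.0))


identical = jsd([0.5, 0.5], [0.5, 0.5])
disjoint = jsd([1.0, 0.0], [0.0, 1.0])
```

Because the midpoint distribution is positive wherever either input is, the divergence stays finite even when a taxon is absent from one of the two samples.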
Infers ecological associations between microbial populations. SPIEC-EASI uses algorithms for sparse neighbourhood and inverse covariance selection to reconstruct networks. It can produce a synthetic benchmark in the absence of an experimentally validated gold-standard network. The tool was tested on a large-scale 16S rRNA gene sequencing dataset sampled from the human gut. The results show that it outperforms state-of-the-art methods in recovering edges and network properties on synthetic data.
Finds correlations in compositional data. BAnOCC is a Bayesian method for inferring the log-basis correlation structure. It quantifies uncertainty through the associated posterior and estimates both the log-basis correlation and precision matrices by modeling the composition directly. This tool has been used to assess microbial relationships in the human microbiome, confirming established interactions and suggesting novel ones for future validation.
A Poisson-multivariate normal hierarchical model to learn direct interactions from the count-based output of standard metagenomics sequencing experiments. MInt controls for confounding predictors at the Poisson layer, and captures direct taxon-taxon interactions at the multivariate normal layer using an L1 penalized precision matrix.
Estimates the sparse structure of the inverse covariance for latent normal variables. gCoda addresses the high dimensionality of microbiome data by using a penalized maximum likelihood method. It infers the sparse direct interaction network among microbes from the logistic normal distribution of observed compositional data. The tool outperforms existing methods in edge recovery of inverse covariance for compositional data under a variety of scenarios.
Discovers associations among microbes and between microbes and their environmental factors. mLDM is based on a hierarchical Bayesian model with sparsity constraints. It accounts for both the compositional bias and the variance of metagenomic data and can estimate the absolute abundance of microbes. The tool can also be used to analyse time-series data by learning time-varying network structures.
Identifies viral and bacterial genomes that are similar to a sequenced human genome. FALCON employs a relative algorithmic entropy method based on model freezing and on information drawn exclusively from a reference. It builds multiple finite-context models while processing the reference and freezes them at the end of the reference sequence. The tool then measures target reads using a mixture of the frozen models.
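The idea of freezing a finite-context model and then scoring reads against it can be sketched as follows. This is a simplified illustration with a single model of fixed order rather than a mixture of several; the function names, the order, and the smoothing constant `alpha` are assumptions and do not reflect FALCON's own implementation.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def build_frozen_model(reference, order=3):
    """Count which symbol follows each context of length `order` in the reference.

    The returned dictionaries are never updated again, i.e. the model is frozen.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(order, len(reference)):
        counts[reference[i - order:i]][reference[i]] += 1
    return {context: dict(symbols) for context, symbols in counts.items()}


def bits_per_base(read, model, order=3, alpha=1.0):
    """Average code length of a read under the frozen model, in bits per base.

    Lower values mean the read is better explained by the reference.
    """
    total_bits = 0.0
    scored = 0
    for i in range(order, len(read)):
        context_counts = model.get(read[i - order:i], {})
        total = sum(context_counts.values())
        # Additive smoothing keeps unseen symbols at a non-zero probability.
        probability = (context_counts.get(read[i], 0) + alpha) / (
            total + alpha * len(ALPHABET)
        )
        total_bits -= math.log2(probability)
        scored += 1
    return total_bits / scored if scored else float("nan")


model = build_frozen_model("ACGT" * 50)
similar_score = bits_per_base("ACGTACGTACGTACGT", model)
dissimilar_score = bits_per_base("AAAAAAAAAAAAAAAA", model)
```

A read that never matches a context seen in the reference costs two bits per base under this sketch, the same as a uniform guess over four nucleotides, whereas a read that follows the reference closely costs far less.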
Optimizes various methods for Lasso inference with a matrix wrapper. LOL provides a breast cancer data set of genome-wide merged copy number data and the expression of some important genes, a function to find the lambda value that yields a given number of non-zero coefficients, and a function that wraps various optimization methods for Lasso inference, such as cross-validation, randomised lasso, and simultaneous lasso. It is specifically designed for multicollinear predictor variables.
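The relationship between lambda and the number of non-zero coefficients is easiest to see in the special case of orthonormal predictors, where the Lasso solution reduces to soft-thresholding of the unpenalized coefficients. The sketch below covers only that simplified case, which is an assumption made for illustration; it does not handle the multicollinear setting the package targets, and the function names are hypothetical rather than part of LOL.

```python
def soft_threshold(z, lam):
    """Lasso estimate of one coefficient when predictors are orthonormal."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lambda_for_k_nonzero(unpenalized, k):
    """Pick a lambda that leaves exactly k coefficients non-zero.

    Any lambda strictly between the k-th and (k+1)-th largest magnitudes works;
    the midpoint is returned.
    """
    magnitudes = sorted((abs(z) for z in unpenalized), reverse=True)
    if not 0 < k <= len(magnitudes):
        raise ValueError("k must be between 1 and the number of coefficients")
    upper = magnitudes[k - 1]
    lower = magnitudes[k] if k < len(magnitudes) else 0.0
    if upper == lower:
        raise ValueError("tie at the cut-off: no lambda gives exactly k non-zeros")
    return (upper + lower) / 2.0


coefficients = [3.0, -0.5, 1.2, 0.1]  # made-up unpenalized estimates
lam = lambda_for_k_nonzero(coefficients, 2)
shrunk = [soft_threshold(z, lam) for z in coefficients]
nonzero_count = sum(1 for b in shrunk if b != 0.0)
```

With correlated predictors no such closed form exists, and the lambda value has to be found by fitting the model along a path of penalties instead.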