
AMPHORA / AutoMated PHylogenomic infeRence Application

Allows genome tree reconstruction and metagenomic phylotyping. AMPHORA is an application for large-scale protein phylogenetic analysis. The software supports the analysis of DNA sequences, which means that users can apply AMPHORA2 directly to metagenomic reads without first annotating the sequences. It can phylotype metagenomic sequences from a mixed population of bacteria and archaea and should be useful for the study of microbial evolution and ecology in the genomic era. A web application and a flavor of AMPHORA2 are also available.

phyloseq

A package dedicated to the object-oriented representation and analysis of microbiome census data in R. Phyloseq supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics, all in a manner that is easy to document, share, and modify. It simplifies many of the common data management and preprocessing tasks required during analysis of phylogenetic sequencing data. The phyloseq package also provides a set of powerful analysis and graphics functions, building upon related packages available in R and Bioconductor. It includes or supports some of the most commonly-needed ecology and phylogenetic tools, including a consistent interface for calculating ecological distances and performing dimensional reduction.

IMP / Integrated Meta-omic Pipeline

Developed to perform large-scale, reproducible, and automated integrative reference-free analysis of metagenomic and metatranscriptomic data. IMP incorporates robust read preprocessing, iterative co-assembly, analyses of microbial community structure and function, automated binning, as well as genomic signature-based visualizations. The IMP-based data integration strategy enhances data usage, output volume, and output quality, as demonstrated using relevant use cases. Finally, IMP is encapsulated within a user-friendly implementation using Python and Docker.

iVirus

Leverages the CyVerse cyberinfrastructure to provide access to viromic tools and data sets. The iVirus Data Commons contains both raw and processed data from 1866 samples and 73 projects derived from global ocean expeditions, as well as existing and legacy public repositories. Through the CyVerse Discovery Environment, users can interrogate these data sets using existing analytical tools (software applications known as ‘Apps’) for assembly, open reading frame prediction and annotation, as well as several new Apps specifically developed for analyzing viromes.

Rhea

Encodes a series of well-documented choices for the downstream analysis of operational taxonomic unit (OTU) tables, including normalization steps, alpha- and beta-diversity analysis, taxonomic composition, and many others. Rhea is primarily a straightforward starting point for beginners, but can also be a framework for advanced users who can modify and expand the tool. As the community standards evolve, Rhea will adapt to always represent the current state of the art in microbial profile analysis in the clear and comprehensive way allowed by the R language.
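To make the normalization and diversity steps named above concrete, here is a minimal sketch in Python. It is not Rhea's code (Rhea is written in R); the function names and the toy OTU table are assumptions chosen purely for illustration.

```python
import math


def relative_abundance(counts):
    """Scale the raw OTU counts of one sample so that they sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    return [c / total for c in counts]


def richness(counts):
    """Alpha diversity as the number of OTUs observed in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Alpha diversity as Shannon entropy (natural log) of the proportions."""
    return -sum(p * math.log(p) for p in relative_abundance(counts) if p > 0)


def bray_curtis(counts_a, counts_b):
    """Beta diversity between two samples, computed on relative abundances."""
    a = relative_abundance(counts_a)
    b = relative_abundance(counts_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


# Hypothetical OTU table: one list of OTU counts per sample.
otu_table = {"sample_A": [10, 10, 10, 10], "sample_B": [37, 1, 1, 1]}
for name, counts in otu_table.items():
    print(name, richness(counts), round(shannon_index(counts), 3))
print("Bray-Curtis:", round(bray_curtis(*otu_table.values()), 3))
```

Both toy samples have the same richness, but the evenly distributed sample has the higher Shannon index, which is why the two measures are usually reported together.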

ShortBRED / Short Better Representative Extract Dataset

Facilitates fast, accurate functional profiling of metagenomic samples. ShortBRED consists of two components: (i) a method that reduces reference proteins of interest to short, highly representative amino acid sequences (“markers”) and (ii) a search step that maps reads to these markers to quantify the relative abundance of their associated proteins. Its markers are applicable to other homology-based search tasks and can be used to profile a wide variety of protein families of interest.
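The quantification step (ii) can be sketched as follows. This is an illustrative approximation, not ShortBRED's implementation: the function name, the input layout, and the normalization (hits per 1000 marker residues per million reads) are assumptions.

```python
from collections import Counter, defaultdict


def family_abundance(hits, marker_info, total_reads):
    """Turn per-read marker hits into a depth- and length-normalized value.

    hits        -- list of marker IDs, one entry per read that matched a marker
    marker_info -- dict: marker ID -> (protein family, marker length in residues)
    total_reads -- number of reads in the sample
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    hit_counts = Counter(hits)
    family_hits = defaultdict(int)
    family_length = defaultdict(int)
    # Pool hits and marker lengths over all markers of the same family.
    for marker_id, (family, length) in marker_info.items():
        family_hits[family] += hit_counts.get(marker_id, 0)
        family_length[family] += length
    per_million = total_reads / 1_000_000
    return {
        family: family_hits[family] * 1000 / family_length[family] / per_million
        for family in family_length
    }


# Hypothetical markers and hits.
markers = {"m1": ("famA", 50), "m2": ("famA", 50), "m3": ("famB", 200)}
reads_hitting = ["m1", "m1", "m2", "m3"]
print(family_abundance(reads_hitting, markers, total_reads=1_000_000))
```

Dividing by marker length keeps families with long or numerous markers from appearing more abundant simply because they offer more sequence for reads to hit.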

drVM / detect and reconstruct known viral genomes from metagenomes

Allows rapid viral read identification, genus-level read partition, read normalization, de novo assembly, sequence annotation and coverage profiling. In drVM, the first two procedures and sequence annotation rely on known viral genomes as a reference database. drVM has been tested on over 300 previously published sequencing runs, to provide complete viral genome assemblies for a variety of virus types including DNA viruses, RNA viruses and retroviruses. drVM is available for free download and is also assembled as a Docker container, an Amazon machine image and a virtual machine to facilitate seamless deployment.

WebMGA

A customizable web server for fast metagenomic analysis. WebMGA includes over 20 commonly used tools for tasks such as ORF calling, sequence clustering, quality control of raw reads, removal of sequencing artifacts and contamination, taxonomic analysis, and functional annotation. WebMGA provides users with rapid metagenomic data analysis using fast and effective tools, which have been implemented to run in parallel on the developers' local computer cluster. Users can access WebMGA through web browsers or programming scripts to perform individual analyses or to configure and run customized pipelines. WebMGA offers researchers many fast and unique tools and great flexibility for complex metagenomic data analysis.

Microbiome Helper

Contains several resources to help researchers working with microbial sequencing data. Microbiome Helper comprises scripts that help process and automate various microbiome and metagenomic bioinformatic tools; workflows or standard operating procedures (SOPs) for analyzing 16S/18S rRNA and metagenomic data; tutorials (with test data, example output, and questions for different microbiome analyses); and a VirtualBox image that can be used to run the workflows and tutorials with little or no configuration.

DECARD / Detailed Evaluation Creation and Analysis of Read Data

Simulates amplicon-based microbiome experiments and tests classification software. DECARD generates realistic synthetic datasets in which the source of every sequence is known, so that they can serve as a gold standard when evaluating microbiome analysis software. For each classification pipeline considered, the software has modules that convert the pipeline output to a common table mapping each sequence to an operational taxonomic unit (OTU) and classification.
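The idea of scoring a classifier against a gold standard can be illustrated with a short sketch. It assumes the pipeline output has already been converted to a mapping from sequence ID to taxon label; the function name and the example data are hypothetical and do not reflect DECARD's actual modules.

```python
def evaluate_classification(truth, predicted):
    """Compare predicted taxon labels with the known source of each sequence.

    truth     -- dict: sequence ID -> true taxon (the gold standard)
    predicted -- dict: sequence ID -> taxon assigned by the pipeline;
                 sequences absent from this dict count as unclassified
    Returns the fraction of sequences in each outcome category.
    """
    if not truth:
        raise ValueError("gold standard is empty")
    outcome = {"correct": 0, "misclassified": 0, "unclassified": 0}
    for seq_id, true_taxon in truth.items():
        call = predicted.get(seq_id)
        if call is None:
            outcome["unclassified"] += 1
        elif call == true_taxon:
            outcome["correct"] += 1
        else:
            outcome["misclassified"] += 1
    return {key: count / len(truth) for key, count in outcome.items()}


# Hypothetical simulated dataset with four sequences.
gold = {"s1": "Bacteroides", "s2": "Bacteroides", "s3": "Prevotella", "s4": "Blautia"}
calls = {"s1": "Bacteroides", "s2": "Prevotella", "s3": "Prevotella"}
print(evaluate_classification(gold, calls))
```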

ICoVeR / Interactive Contig-bin Verification and Refinement

Visualizes genome bins. ICoVeR lets users curate bin assignments based on multiple binning algorithms. It was tested on the refinement of disparate genome bins automatically generated by other binning algorithms for an anaerobic digestion metagenomic dataset. The tool makes the bin refinement process faster and more reproducible. It also captures the provenance of changes made in the course of an exploratory task.

FMAP / Functional Mapping and Analysis Pipeline

A stand-alone functional analysis pipeline for analyzing whole metagenomic and metatranscriptomic sequencing data. FMAP performs alignment, gene family abundance calculations, and statistical analysis (three levels of analyses are provided: differentially-abundant genes, operons and pathways). The resulting output can be easily visualized with heatmaps and functional pathway diagrams. FMAP functional predictions are consistent with currently available functional analysis pipelines.

FGP / FunGene Pipeline

Houses tools for researchers to process and analyze their own functional gene sequencing data. FGP offers a pipeline where researchers can assemble a set of analysis tools to process a nucleotide sequence file, filter chimeric sequences, translate the nucleotide sequences, align and cluster the protein sequences, and optionally run the cluster file analysis tools. FGP allows libraries of sequence reads to be analyzed through either reference-based or unsupervised approaches after common initial processing steps. Reference-based approaches, such as the FrameBot frameshift correction and nearest neighbor tool offered by FGP, require a set of representative sequences, which can be compiled using the FunGene Repository (FGR).

BMPOS / Brazilian Microbiome Project Operating System

Allows analysis for metagenomic studies (phylogenetic marker genes). BMPOS is effective for sequence processing, sequence clustering, alignment, taxonomic annotation, statistical analysis, and plotting of metagenomic data. It aims to help researchers handle the most widely used bioinformatics packages dedicated to the study of microbial ecology. The tool can be used as a starting point by any researcher interested in performing microbiome studies based on next-generation sequencing (NGS) data.

A-GAME / A GAlaxy suite for functional MEtagenomics

Incorporates tools and workflows for the analysis of environmental DNA (eDNA) sequence data. A-GAME is a general bioinformatics workflow management system implemented within Galaxy. The software contains pre-designed workflows that utilize standard tools for data pre-processing, sequence assembly, and annotation, as well as custom utilities dedicated to the analysis of functional metagenomics data. It allows the incorporation of the most widely used bioinformatics tools. A-GAME can be used to build and customize bioinformatics workflows.

CITESspeciesDetect

Identifies species listed by CITES (the Convention on International Trade in Endangered Species of Wild Fauna and Flora) using Illumina paired-end sequencing technology. CITESspeciesDetect is a pipeline composed of five linked tools. It consists of three phases: (1) preprocessing of paired-end Illumina data involving quality trimming and filtering of reads, followed by sorting by DNA barcode, (2) operational taxonomic unit (OTU) clustering by barcode, and (3) taxonomy prediction and CITES identification. The web interface allows stakeholders to perform the next-generation sequencing (NGS) data analysis of their own samples.

themetagenomics

Provides a topic model framework for microbiome abundance data, as well as functional prediction for 16S rRNA survey data. Themetagenomics is a package that offers an R implementation of PICRUSt and wraps Tax4Fun, giving users a choice for their functional prediction strategy. Both Greengenes and SILVA taxonomic annotations are accepted. The user provides an abundance table, sample metadata, and taxonomy information, and this method infers the association between topics and sample features, as well as between topics and predicted functional content.

MetaMeta

Classifies metagenomic datasets. MetaMeta allows users to obtain more precise or more sensitive results through a single parameter that has a default value. It executes and integrates results from metagenome analysis tools. The tool facilitates execution in many computational environments using Snakemake and BioConda. It can handle multiple large samples at the same time, with options to delete intermediate files and keep only the necessary ones. MetaMeta is well suited to large-scale projects.

MetaABC / Metagenomics platform for data Adjustment Binning and Clustering

Integrates different steps for better estimation of the taxonomic assignment. MetaABC is an integrated metagenomics platform for data adjustment, binning and clustering. This method incorporates (i) two means for removing artifacts, (ii) five tools for taxonomic binning, (iii) an approach to reanalyze unassigned reads using conserved gene adjacency, and (iv) an option to control sampling biases via genome length normalization.
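Genome length normalization, option (iv) above, corrects for the fact that a larger genome yields more reads at the same cell abundance. The sketch below shows the general idea only; the function name, inputs, and example values are assumptions, not MetaABC's implementation.

```python
def length_normalized_abundance(read_counts, genome_lengths):
    """Convert per-taxon read counts into length-corrected relative abundances.

    read_counts    -- dict: taxon -> number of reads assigned to it
    genome_lengths -- dict: taxon -> genome length in base pairs
    """
    # Reads per base pair approximates per-genome coverage.
    coverage = {
        taxon: count / genome_lengths[taxon]
        for taxon, count in read_counts.items()
    }
    total = sum(coverage.values())
    if total == 0:
        raise ValueError("no reads were assigned to any taxon")
    return {taxon: value / total for taxon, value in coverage.items()}


# Hypothetical sample: equal read counts, but one genome is twice as long.
counts = {"taxon_small": 1000, "taxon_large": 1000}
lengths = {"taxon_small": 2_000_000, "taxon_large": 4_000_000}
print(length_normalized_abundance(counts, lengths))
```

With equal read counts, the taxon with the shorter genome comes out at about two thirds of the community, whereas raw read proportions would report an even split.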

Fungal ITS Pipeline

Accelerates the processing of large numbers of query sequences. Fungal internal transcribed spacer (ITS) Pipeline is a package developed to obtain extended functionality helpful for complementary, in-depth analyses. It assigns large sets of fungal query sequences to their respective best matches in the international sequence databases and places them in a larger biological context. This pipeline is easily modified to operate on other molecular regions and organism groups.

MePIC / Metagenomic Pathogen Identification for Clinical specimens

Detects pathogen sequences in metagenome data derived from patient specimens. MePIC trims low-quality data, removes host (i.e. patient) sequence data, and then performs a MegaBLAST search against an NCBI BLAST database, or a BWA-SW or BWA aln search against a single sequence specified by the user. The result can be further analyzed with third-party software such as MEGAN, Tablet, or GenomeJack. The use of the MePIC pipeline will promote metagenomic pathogen identification and improve the understanding of infectious diseases.

MICRA / Microbial Identification and Characterization through Reads Analysis

Identifies and characterizes microbes via read analysis. MICRA employs read mapping methods to make use of the increasing number of sequenced microbial genomes. The workflow consists of four parts: (1) pre-processing, (2) sequence identification, (3) identification of the closest reference genome and plasmids by the core part and (4) post-analysis. This pipeline software is available as a downloadable version and as a web interface.

Anvi'o / analysis and visualization platform for 'omics data

Brings together many aspects of today’s cutting-edge genomic, metagenomic, and metatranscriptomic analysis practices to address a wide array of needs. Anvi’o is an advanced analysis and visualization platform that offers automated and human-guided characterization of microbial genomes in metagenomic assemblies, with interactive interfaces that can link ‘omics data from multiple sources into a single, intuitive display. It empowers researchers without extensive bioinformatics skills to perform and communicate in-depth analyses on large ‘omics datasets.

MetaGeniE

Generates an all-against-all comparison dataset between the reads and the reference database and then uses these results to generate cumulative statistics from combined local and global alignment. MetaGeniE is a pipeline designed for accurate, sensitive, and specific detection of taxa in complex microbial samples and to address limitations of typical metagenomic analyses. It also incorporates features such as comprehensive human read filtration and scalability to search large reference databases such as the microbial RefSeq database.

Nephele

Aims to address a major challenge facing researchers today, namely analyzing, transferring, and storing biomedical "big data", through the use of cloud-based resources. Nephele is a project from the National Institutes of Health (NIH) that brings together microbiome data and analysis tools in a cloud computing environment. Nephele's advanced analysis pipelines include multiple stages of data processing, many of which can be configured by modifying parameters provided in the submission form.

SIAMCAT / Statistical Inference of Associations between Microbial Communities And host phenoTypes

Aims to identify changes in community composition that are related to environmental factors. SIAMCAT analyzes the relationship between microbial communities and host phenotypes. It supports data pre-processing, statistical association testing, and statistical modelling. This tool provides functions for the evaluation and interpretation of statistical models, such as cross-validation, parameter selection, ROC analysis and diagnostic model plots.
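As an illustration of the ROC analysis mentioned above, the area under the ROC curve can be computed as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below is a generic Python example with hypothetical data, not SIAMCAT's R code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels -- list of 1 (case) or 0 (control), one per sample
    scores -- list of model scores, higher meaning more case-like
    Ties between a case and a control count as half a win.
    """
    cases = [s for l, s in zip(labels, scores) if l == 1]
    controls = [s for l, s in zip(labels, scores) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical host phenotypes and classifier scores for four samples.
phenotype = [1, 1, 0, 0]
model_score = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(phenotype, model_score))
```

An AUC of 0.5 corresponds to a model that separates cases from controls no better than chance, and 1.0 to perfect separation.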

META-pipe

Allows for marine metagenomics analysis. META-pipe offers preprocessing, assembly, taxonomic classification and functional analysis. To reduce the effort needed to develop and deploy it, it has been integrated with existing biological analysis frameworks and with compute and storage infrastructure resources. The META-pipe web service provides integration with identity provider services, distributed storage, computation on a supercomputer, Galaxy workflows, and interactive data visualizations. The Galaxy-based META-pipe is a powerful analysis pipeline for metagenomic samples that is intuitive and easy to use for biologists without extensive programming competence. META-pipe is flexible and modular, and it is integrated with the large-scale computer systems and identity providers needed to operate a service with a large user base.

RTG Metagenomics

Delivers comprehensive results on the composition of microbial communities and their associated metabolic pathways, genes and genomes. Real Time Genomics’ metagenomic pipeline was developed to estimate the abundance or frequency of a particular genome (typically a bacterial species) in a complex metagenomic sample. The calculations are performed on standard SAM files after reads have been mapped to a reference genome set containing thousands of genomes, many of which are highly related at the nucleotide level.