Provides rapid data mining on taxonomy and metabolic function across a large number of metagenome datasets. Parallel-META is useful for: (1) vector-graph-based visualization and parallel computing; (2) interaction network construction; (3) bio-marker selection; (4) diversity statistics; (5) 16S rRNA-based functional prediction; (6) 16S rRNA copy number calibration; and (7) 16S rRNA extraction from shotgun sequences.
Aims to be a comprehensive software package that allows users to use a single piece of software to analyze community sequence data. mothur builds upon previous tools to provide a flexible and powerful software package for analyzing sequencing data. Extensive community-supported documentation and support are available through a MediaWiki-based wiki and a discussion forum.
Identifies eukaryotic sequences and can be applied to environmental samples. EukRep enables genome recovery, genome completeness evaluation and prediction of metabolic potential. This classifier uses the k-mer composition of assembled sequences to detect eukaryotic genome fragments prior to gene prediction. It can also flag scaffolds whose analysis would benefit from a eukaryotic gene prediction algorithm.
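To make "k-mer composition" concrete, the following is a minimal sketch of how a sequence can be turned into a normalized k-mer frequency vector, which is the kind of feature a classifier could be trained on. It is not EukRep's implementation: the function name `kmer_composition`, the default `k=3`, and the choice to skip k-mers containing ambiguous bases are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_composition(sequence, k=3):
    """Return the normalized frequency of every DNA k-mer in `sequence`.

    Hypothetical helper for illustration only. K-mers containing
    characters other than A, C, G, T (for example N) are skipped.
    """
    alphabet = "ACGT"
    sequence = sequence.upper()
    counts = Counter()
    # Slide a window of width k over the sequence.
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if all(base in alphabet for base in kmer):
            counts[kmer] += 1
    total = sum(counts.values())
    # Report all 4**k possible k-mers so every sequence yields a
    # vector of the same length, including k-mers that never occur.
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}
```

Calling `kmer_composition(scaffold, k=5)` on each assembled scaffold would give fixed-length vectors (1,024 entries for `k=5`) that can be passed to any standard classifier.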
Identifies and characterizes microbes through read analysis. MICRA employs read mapping methods to make use of the increasing number of sequenced microbial genomes. The workflow consists of four parts: (1) pre-processing, (2) sequence identification, (3) identification of the closest reference genome and plasmids by the core part and (4) post-analysis. This pipeline is available as a downloadable version and through a web interface.
A user-friendly Galaxy pipeline for the analysis of high-throughput sequencing data that is pre-packaged for use with the MEGARes database. AmrPlusPlus not only increases the accessibility of resistome analysis, but also provides users with three integrated tools (ResistomeAnalyzer, RarefactionAnalyzer and SNPFinder) that help to bridge the gap between the bioinformatics and the statistical analysis of metagenomics data.
Performs automated, reference-independent binning and visualization of metagenomic data in the form of assembled contigs or long reads. The BusyBee website supports population-level resolved analyses of metagenomic data. This tool helps the user to build confidence in the individual bins while simultaneously facilitating the identification of sequence groups requiring special attention.
Houses tools for researchers to process and analyze their own functional gene sequencing data. FGP offers a pipeline where researchers can assemble a set of analysis tools to process a nucleotide sequence file, filter chimeric sequences, translate the nucleotide sequences, align, and cluster the protein sequences and additionally run the optional cluster file analysis tools. FGP allows libraries of sequence reads to be analyzed through either reference-based or unsupervised approaches after common initial processing steps. Reference-based approaches, such as the FrameBot frameshift correction and nearest neighbor tool offered by FGP, require a set of representative sequences, which can be compiled using the FunGene Repository (FGR).
Performs common tasks in metagenomic data analysis, from raw read quality control to bin extraction and analysis. MetaWRAP provides a collection of modules, each being a standalone program addressing one aspect of whole metagenomic (WMG) data processing or analysis, including read quality control (QC), assembly, visualization, taxonomic profiling, and binning. Users can follow the intuitive workflow or use only specific functions. Its modularity gives the investigator flexibility in their analysis approach.
Classifies metagenomic datasets. MetaMeta allows users to obtain more precise or more sensitive results by setting a single parameter. It executes and integrates results from metagenome analysis tools. The tool facilitates execution in many computational environments using Snakemake and BioConda. It can handle multiple large samples at the same time, with options to delete intermediate files and keep only necessary ones. MetaMeta is well suited to large-scale projects.
Provides a lightweight back-end pipeline that supports multiple dynamically loaded plugin extensions. PluMA intends to offer a solution for the lack of a standardized framework for developing, testing and integrating plugins that are heterogeneous with respect to programming language. This software can assemble pipelines where stages can be plugged in and out.
Allows metagenomic sequence data to be analyzed with the fast, accurate RNA-Seq abundance estimator kallisto. Metakallisto contains Python scripts and offers functions that compare the output of a range of metagenomic analysis tools, such as kallisto, to the ground truth of the Illumina 100 metagenomic dataset. Both taxa identification and abundance estimation can be performed at the exact-genome level.
Integrates different steps for better estimation of the taxonomic assignment. MetaABC is an integrated metagenomics platform for data adjustment, binning and clustering. This method incorporates (i) two means for removing artifacts, (ii) five tools for taxonomic binning, (iii) an approach to reanalyze unassigned reads using conserved gene adjacency, and (iv) an option to control sampling biases via genome length normalization.
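The idea behind genome length normalization in item (iv) can be sketched as follows: longer genomes attract more reads at equal cell abundance, so read counts are divided by genome length before relative abundances are computed. This is only an illustration of the general technique under that assumption, not MetaABC's code; the function name `length_normalized_abundance` and the input dictionaries are hypothetical.

```python
def length_normalized_abundance(read_counts, genome_lengths):
    """Convert per-taxon read counts to length-normalized relative abundances.

    Hypothetical helper for illustration only.
    read_counts:    {taxon: number of reads assigned}
    genome_lengths: {taxon: genome length in base pairs}
    """
    # Reads per base pair: removes the advantage of long genomes.
    per_base = {
        taxon: read_counts[taxon] / genome_lengths[taxon]
        for taxon in read_counts
    }
    total = sum(per_base.values())
    # Rescale so the abundances sum to 1 (all zeros if there are no reads).
    return {
        taxon: (value / total if total else 0.0)
        for taxon, value in per_base.items()
    }
```

For example, two taxa with 100 reads each but genomes of 1 Mbp and 2 Mbp come out at relative abundances of about 0.67 and 0.33 rather than 0.5 each.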
Performs rarefaction analysis of large count matrices, as well as estimation and visualization of diversity, richness and evenness. RTK computes estimates of ecological diversity and provides appropriate visualizations of the results. It rarefies large, high-count datasets quickly and returns diversity measures. The tool can be applied to state-of-the-art microbiomics applications and scales better than presently available tools.
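As a rough illustration of the quantities named above, the sketch below rarefies one sample to a fixed depth by subsampling without replacement and then computes richness, Shannon diversity and Pielou's evenness. It is a deliberately naive version that expands every count into a list, so it does not reflect how RTK achieves its speed or memory use; all function names and the `seed` parameter are assumptions for this example.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from {taxon: count}."""
    # Naive approach: one list entry per read (memory-hungry for big samples).
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("rarefaction depth exceeds the sample's total count")
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    rarefied = {}
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


def richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Shannon diversity H' = -sum(p * ln p) over observed taxa."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def pielou_evenness(counts):
    """Pielou's evenness J' = H' / ln(richness); 0.0 if fewer than two taxa."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0
```

A typical use would be to rarefy every sample in a count matrix to the same depth and compare the resulting diversity values across samples.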
Scans viral metagenomes from hundreds of next-generation sequencing (NGS) samples. ViraPipe employs a data-parallel computation strategy. It is able to process genomic data in partitions at many levels. This tool can avoid the false mappings that occur when the sample reads are merged before the alignment. It is based on existing tools such as the BWA aligner, the MegaHit de novo assembler, BLAST and HMMER3.
Identifies species listed by CITES (the Convention on International Trade in Endangered Species of Wild Fauna and Flora) using Illumina paired-end sequencing technology. CITESspeciesDetect is a pipeline composed of five linked tools. It consists of three phases: (1) preprocessing of paired-end Illumina data involving quality trimming and filtering of reads, followed by sorting by DNA barcode, (2) Operational Taxonomic Unit (OTU) clustering by barcode, and (3) taxonomy prediction and CITES identification. The web interface allows stakeholders to perform the next-generation sequencing (NGS) data analysis of their own samples.
Allows rapid viral read identification, genus-level read partition, read normalization, de novo assembly, sequence annotation and coverage profiling. In drVM, the first two procedures and sequence annotation rely on known viral genomes as a reference database. drVM has been tested on over 300 previously published sequencing runs and provided complete viral genome assemblies for a variety of virus types, including DNA viruses, RNA viruses and retroviruses. drVM is available for free download and is also packaged as a Docker container, an Amazon machine image and a virtual machine to facilitate seamless deployment.
Assembles short Illumina reads into full-length COI barcode sequences. SOAPBarcode is a sequencing pipeline that was coupled with the HiSeq 2000 to achieve a high recovery rate, assemble full COI barcodes and, consequently, deliver reliable and taxonomically informative metabarcoding outcomes for environmental bulk samples.
A package dedicated to the object-oriented representation and analysis of microbiome census data in R. Phyloseq supports importing data from a variety of common formats, as well as many analysis techniques. These include calibration, filtering, subsetting, agglomeration, multi-table comparisons, diversity analysis, parallelized Fast UniFrac, ordination methods, and production of publication-quality graphics, all in a manner that is easy to document, share, and modify. It simplifies many of the common data management and preprocessing tasks required during analysis of phylogenetic sequencing data. The phyloseq package also provides a set of powerful analysis and graphics functions, building upon related packages available in R and Bioconductor. It includes or supports some of the most commonly needed ecology and phylogenetic tools, including a consistent interface for calculating ecological distances and performing dimensional reduction.
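As one example of what an "ecological distance" between two samples looks like, the sketch below computes the Bray-Curtis dissimilarity from two taxon count tables. It is written in plain Python for illustration and is not phyloseq's R interface; the function name `bray_curtis` and the dictionary inputs are assumptions.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two {taxon: count} samples.

    Hypothetical helper for illustration only.
    Returns 0.0 for identical samples and 1.0 when no taxa are shared.
    """
    taxa = set(sample_a) | set(sample_b)
    # Sum of the smaller count for each taxon (the shared abundance).
    shared = sum(min(sample_a.get(t, 0), sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1 - 2 * shared / total
```

Computing this value for every pair of samples gives a distance matrix of the kind that ordination methods take as input.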
Facilitates fast, accurate functional profiling of metagenomic samples. ShortBRED consists of two components: (i) a method that reduces reference proteins of interest to short, highly representative amino acid sequences (“markers”) and (ii) a search step that maps reads to these markers to quantify the relative abundance of their associated proteins. Its markers are applicable to other homology-based search tasks and can be applied to profile a wide variety of protein families of interest.