Metagenomic data, which contain sequenced DNA reads of uncultured microbial species from environmental samples, provide a unique opportunity to thoroughly analyze microbial species that have never been identified before. Reconstructing 16S ribosomal RNA, a phylogenetic marker gene, is usually required to analyze the composition of metagenomic data. However, the massive volume of data, high sequence similarity between related species, skewed microbial abundance and a lack of reference genes make 16S rRNA reconstruction difficult.
Reconstructs full-length small subunit (SSU) sequences from metagenomic data from a microbial community of interest, accurate to the species level. In addition, the method also provides accurate SSU sequence abundance estimates. EMIRGE is robust to errors and omissions in the reference database, and is broadly applicable to any dataset produced with short read sequencing technology.
Allows the reconstruction of metagenomes. Virtual Metagenome reflects real functional compositions and actual transitions of gene pools even though they are virtually reconstructed from denaturing gradient gel electrophoresis (DGGE) data. This tool provides an opportunity to re-evaluate massive volumes of information on species diversity by using 16S rRNA gene sequence data accumulated in previous experiments performed by microbial ecologists. It also allows the data to be re-analysed in terms of genes/genomes, providing a deeper view of microbial functions.
Reconstructs 16S rRNA genes from metagenomic data. REAGO is able to accurately identify 16S rRNA from error-containing metagenomic datasets at the sequence level. The algorithms are robust even if the genera of the underlying genes are not included in the covariance model (CM) training set. It can be readily applied to any metagenomic dataset containing paired-end reads. Several components in REAGO work better with increasing read length. In particular, the homology search stage and the bad edge removal step both benefit from increased sequence length, which is the trend for next-generation sequencing technologies.
Identifies reads originating from a marker gene and assembles them into nearly full-length 16S rRNA sequences. MATAM is an approach based on the construction and exploitation of an overlap graph, carefully designed to minimize the error rate and the risk of chimera formation. It is robust to variations in sequencing depth as well as community complexity.
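As a rough illustration of the overlap graph idea mentioned above, the following minimal Python sketch links two reads whenever a suffix of one exactly matches a prefix of the other. It is not MATAM's implementation: the function names, the `min_len` threshold and the exact-match criterion are assumptions chosen for brevity, and a real tool would tolerate sequencing errors and use indexed search rather than all-against-all comparison.

```python
from itertools import permutations


def overlap_length(a, b, min_len):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Build a directed graph: graph[a][b] = overlap length between a and b.

    Hypothetical helper for illustration; compares every ordered pair of
    reads, which is only practical for toy inputs.
    """
    graph = {}
    for a, b in permutations(reads, 2):
        k = overlap_length(a, b, min_len)
        if k:
            graph.setdefault(a, {})[b] = k
    return graph


# Toy reads (made up for this example, not real 16S data).
reads = ["ACGTAC", "GTACGG", "CGGTTA"]
graph = build_overlap_graph(reads, min_len=3)
```

Walking the edges of such a graph in order gives one candidate path for assembly. Raising `min_len` is the simplest lever here for reducing spurious links between reads from closely related species.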
Integrates taxonomic tree search and Dirichlet process clustering to reconstruct full-length 16S gene sequences from metagenomic sequencing data with high accuracy. RAMBL gives access to full-length 16S gene sequences in near-terabase-scale metagenomic shotgun datasets. It is able to generate a more accurate determination of environmental microbial diversity and yield better disease classification, suggesting that full-length 16S gene assemblies are a powerful alternative to marker gene sets and 16S short reads.
Allows microbial metataxonomic investigation. BTW is composed of a set of tools for processing 16S rRNA data from raw sequencing reads. It permits statistical studies, including data quality filtering, clustering and taxonomic assignment. This tool optimizes the utilization of next-generation sequencing (NGS) amplicon data by simplifying access to bioinformatics tools for Windows users.
Determines bacterial species by computing the molecular weights of terminal restriction fragments (T-RFs). TRFMA is a web application that allows users to decrease the variability to an average discrepancy of less than one base and displays candidate bacterial species for the samples. In addition, researchers are able to confirm the obtained T-RF profile by digestion with additional restriction enzymes.
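The T-RF computation described above can be sketched in a few lines of Python. This is only an illustrative approximation, not TRFMA's actual method: the helper names are hypothetical, the residue masses are approximate average values for single-stranded DNA, the input sequence is a made-up toy string, and the mass of any fluorescent label or terminal group is ignored. The example uses the recognition site of MspI (C^CGG), which cuts after the first base of the site.

```python
# Approximate average masses (g/mol) of nucleotide residues in
# single-stranded DNA; assumed values, for illustration only.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def terminal_fragment(seq, site, cut_offset):
    """Return the 5' terminal fragment left by the first cut at `site`.

    `cut_offset` is the number of bases of the recognition site that stay
    on the 5' fragment. If the site is absent, the sequence is uncut.
    """
    pos = seq.find(site)
    if pos == -1:
        return seq
    return seq[:pos + cut_offset]


def fragment_mass(fragment):
    """Sum of residue masses; ignores labels and terminal groups."""
    return sum(RESIDUE_MASS[base] for base in fragment)


# Toy amplicon (not a real 16S sequence) digested with MspI (C^CGG).
amplicon = "AGAGTTTGATCCGGCTCAG"
trf = terminal_fragment(amplicon, "CCGG", cut_offset=1)
trf_mass = fragment_mass(trf)
```

Because fragments of equal length can differ in base composition, and therefore in mass, comparing masses rather than lengths alone is one plausible way to narrow down candidate species.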