SUPER-FOCUS / SUbsystems Profile by databasE Reduction using FOCUS

An agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1,000 times faster than other tools.

BLASTX / Translated BLAST: blastx

Searches a protein database using a translated nucleotide query. BLASTX is a BLAST search application that compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database. This application can also work in Blast2Sequences mode and can send BLAST searches over the network to the public NCBI server if desired.
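The six-frame conceptual translation mentioned above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and an uppercase A/C/G/T alphabet; the function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical and are not part of BLASTX itself.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to conceptual translations."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
for label, protein in frames.items():
    print(label, protein)
```

Each of the six translated strings would then be compared against the protein database; that search step is not shown here.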

BLAT / BLAST-Like Alignment Tool

Finds genomic sequences that match a protein or DNA sequence submitted by the user. BLAT is a very fast sequence alignment tool similar to BLAST, typically used for searching similar sequences within the same or closely related species. It was developed to align millions of expressed sequence tags and mouse whole-genome random reads to the human genome at a higher speed. BLAT is commonly used to look up the location of a sequence in the genome or determine the exon structure of an mRNA, but expert users can run large batch jobs and adjust internal sensitivity parameters by installing the command-line version on a Linux server.

Taxonomer

An ultrafast web-tool for comprehensive metagenomics data analysis and interactive results visualization. Taxonomer is unique in providing integrated nucleotide and protein-based classification and simultaneous host messenger RNA (mRNA) transcript profiling. Using real-world case-studies, we show that Taxonomer detects previously unrecognized infections and reveals antiviral host mRNA expression profiles. Taxonomer enables rapid, accurate, and interactive analyses of metagenomics data on personal computers and mobile devices.

H-BLAST / Heterogeneous BLAST

Provides a fast parallel search tool for a heterogeneous computer that couples CPUs (Central Processing Units) and GPUs (Graphics Processing Units) to accelerate BLASTX and BLASTP, basic modules of NCBI-BLAST. H-BLAST employs a locally decoupled seed-extension algorithm to take advantage of GPUs, and offers a performance tuning mechanism for better efficiency across various CPU and GPU combinations. H-BLAST produces the same alignment results as NCBI-BLAST, and its computational speed is much faster than that of NCBI-BLAST.

MMseqs / Many-against-Many sequence searching

Allows clustering and searching of large protein datasets, such as UniProt, or 6-frame translated metagenomics sequencing reads. MMseqs is a software suite which contains three core modules: a pre-filtering module, an SSE2- and multi-core-parallelized local alignment module, and a clustering module. In addition to the modules, three workflows for sequence searching, clustering, and updating a clustering facilitate the most common tasks for the non-expert.

SANSparallel

Provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. SANSparallel can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest.

PALADIN / Protein ALignment And Detection INterface

A modification of Burrows-Wheeler Aligner that provides more accurate alignment and orders-of-magnitude improved efficiency by directly mapping in protein space. In brief, PALADIN identifies and translates six possible open reading frames within each read, and maps these translated DNA sequence reads to a protein reference allowing for rapid identification of functional metagenomic profiles. By mapping in protein space, this method takes advantage of the general conservation of amino acid sequences compared to the underlying DNA sequences.

SWORD / Smith Waterman On Reduced Database

Combines a heuristic and an exact approach. It aims to replace BLAST, as it is between 8 and 16 times faster and has better or comparable sensitivity. The second main advantage is that it produces guaranteed optimal alignments. We also presented a faster version of SWORD whose sensitivity drops significantly at higher e-values, depending on the dataset used. Its primary use is to retrieve the most similar sequences, those with small e-values, up to 68 times faster than BLAST.

PSI-BLAST / Position-Specific Iterated Basic Local Alignment Search Tool

Finds regions of sequence similarity. PSI-BLAST is a protein database search program. The software is able to assess the probable substitutions at each sequence position using the results of a previous Gapped BLAST search, an algorithm that scores alignments with an amino acid substitution matrix. It can combine search results with robust statistics to build and apply profiles, also known as position-specific scoring matrices. A modified version of PSI-BLAST, PSI-BLASTexB, which addresses limitations of the sequence weighting scheme, was also developed.

AlignBucket

A software tool for splitting a FASTA file into smaller pieces suitable for alignment with BLAST. AlignBucket optimizes the partition of a large volume of sequences (the whole database) into sets where sequence length values (in residues) are constrained depending on a bounded minimal and expected alignment coverage. The idea is to optimally group protein sequences according to their length, and then compute the all-against-all sequence alignments among sequences that fall in a selected length range.
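The length-grouping idea can be sketched as follows. This is a simplified greedy partition, not AlignBucket's optimized one: it assumes that two sequences belong in the same set when the shorter is at least `min_coverage` times as long as the longer. The function name `bucket_by_length` and the `min_coverage` parameter are hypothetical, and pairs that straddle a bucket boundary are not handled.

```python
def bucket_by_length(seqs, min_coverage=0.8):
    """Greedily group sequence names into length-constrained buckets.

    seqs maps a name to its sequence. Within each bucket, the shortest
    sequence is at least min_coverage times the length of the longest.
    """
    ordered = sorted(seqs, key=lambda name: len(seqs[name]))
    buckets = []
    shortest = None  # length of the first (shortest) member of the open bucket
    for name in ordered:
        length = len(seqs[name])
        # Open a new bucket when the current one can no longer satisfy
        # the coverage constraint for this longer sequence.
        if shortest is None or shortest < min_coverage * length:
            buckets.append([])
            shortest = length
        buckets[-1].append(name)
    return buckets


example = {
    "a": "M" * 100,
    "b": "M" * 110,
    "c": "M" * 130,
    "d": "M" * 150,
    "e": "M" * 300,
}
buckets = bucket_by_length(example, min_coverage=0.8)
print(buckets)
```

All-against-all alignment would then be run only within each bucket, which avoids comparing sequences whose lengths make the required coverage impossible.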

SW#db

A parallelised version of exact database search algorithms optimised for multiple queries. Although the emphasis is on the Smith-Waterman algorithm, other exact algorithms such as global and semi-global alignment are provided as well. SW#db is parallelized on both GPU and CPU, and it can run on multiple GPUs or on a cluster. Its running times for large databases are comparable to those achieved by BLASTP, and it is at least four times faster than the state-of-the-art parallelized tools used for the same purposes, such as SSEARCH, CUDASW++ and SSW. Although it could be used for protein database search instead of BLASTP when high sensitivity is required, our main intention was to build a library that could provide fast and exact alignment between queries and a reduced database for various bioinformatics tools.
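For readers unfamiliar with exact local alignment, the following is a minimal, unparallelized Python sketch of Smith-Waterman scoring. It assumes a simple match/mismatch score and a linear gap penalty rather than a substitution matrix with affine gaps, and it does not reflect SW#db's GPU implementation or API; `smith_waterman_score` and `search` are hypothetical names.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def search(query, database):
    """Rank database entries (name -> sequence) by exact local score."""
    scored = [(smith_waterman_score(query, seq), name) for name, seq in database.items()]
    return sorted(scored, key=lambda item: -item[0])


hits = search("ACGT", {"s1": "TTACGTTT", "s2": "GGGGGGGG", "s3": "ACGA"})
print(hits)
```

Because every query is scored against every database entry, the cost grows with the product of their lengths, which is why parallel hardware or a reduced database matters in practice.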

Qudaich / Queries and unique database alignment inferred by clustering homologs

A sequence aligner which can efficiently process large volumes of data and is suited to de novo comparisons of next-generation read datasets. Qudaich performs local sequence alignments in two major steps: (i) identifying the candidate database sequence(s) and (ii) generating the optimal alignment with those candidate database sequences. Qudaich can handle both DNA and protein sequences and attempts to provide the best possible alignment for each query sequence. Qudaich can produce more useful alignments more quickly than other contemporary alignment algorithms.
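The first of the two steps, candidate identification, can be illustrated with a small Python sketch. Shared k-mer counting is used here purely as a stand-in; the listing does not describe how Qudaich selects candidates, and the names `build_kmer_index` and `candidate_sequences`, along with the `k` and `top` parameters, are hypothetical.

```python
from collections import Counter, defaultdict


def build_kmer_index(database, k=3):
    """Map each k-mer to the set of database sequence names containing it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def candidate_sequences(query, index, k=3, top=2):
    """Return up to `top` database names sharing the most distinct k-mers."""
    hits = Counter()
    query_kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    for kmer in query_kmers:
        for name in index.get(kmer, ()):
            hits[name] += 1
    return [name for name, _ in hits.most_common(top)]


database = {"s1": "ACGTACGT", "s2": "TTTTTTTT", "s3": "ACGTTTTT"}
index = build_kmer_index(database)
candidates = candidate_sequences("ACGTAC", index)
print(candidates)
```

Only the returned candidates would go on to the second step, where an optimal alignment is computed for each query against its candidate sequences.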

PAUDA / Protein Alignment Using a DNA Aligner

An approach toward the problem of comparing DNA reads against a database of protein reference sequences that is applicable to very large datasets consisting of hundreds of millions or billions of reads. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800,000 CPU hours, leading to the same clustering of samples by functional profiles.

MICA / Metagenomic Inquiry Compressive Acceleration

A framework for similarity search based on characterizing a data set’s entropy and fractal dimension. MICA is an accelerated version of standard tools with no loss in specificity and little loss in sensitivity, for use in three domains: (i) high-throughput drug screening, (ii) metagenomics, and (iii) protein structure search. MICA can search large omics datasets and continues to scale even as those datasets grow exponentially. The primary advance of this tool is that it bounds both time and space as a function of the dataset entropy.