Bowtie

Aligns short reads and is geared toward mammalian re-sequencing. Bowtie is built on a Burrows-Wheeler index based on the full-text minute-space (FM) index. It works in two stages: an initial, ungapped seed-finding stage that benefits from the speed and memory efficiency of the FM index, and a gapped extension stage that uses dynamic programming and benefits from the single-instruction multiple-data (SIMD) parallel processing available on modern processors.
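
The exact-match lookup that an FM index makes possible can be sketched as follows. This is a minimal illustration, assuming a naive suffix-array construction and uncompressed occurrence tables; the function names are hypothetical and this is not Bowtie's implementation.

```python
def bwt_index(text):
    """Build a naive suffix array and BWT for text (a '$' sentinel is appended)."""
    text += "$"
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # character preceding each sorted suffix
    return sa, bwt


def fm_tables(bwt):
    """Return C (count of smaller characters) and occ (prefix counts per character)."""
    alphabet = sorted(set(bwt))
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt.count(c)
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def backward_search(pattern, sa, bwt, C, occ):
    """Return sorted start positions of exact matches of pattern in the text."""
    lo, hi = 0, len(bwt)
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


sa, bwt = bwt_index("ACGTACGT")
C, occ = fm_tables(bwt)
hits = backward_search("ACG", sa, bwt, C, occ)
```

Each pattern character narrows a range of suffix-array rows, so the lookup cost depends on the pattern length rather than on the genome length. Production indexes sample the occurrence tables and the suffix array to save memory.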

BWA / Burrows-Wheeler Aligner

Maps low-divergence sequences against a large reference genome, such as the human genome. BWA consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first is designed for Illumina sequence reads up to 100bp, while the other two handle longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share features such as long-read support and split alignment, but BWA-MEM, the most recent of the three, is generally recommended for high-quality queries because it is faster and more accurate. BWA-MEM also performs better than BWA-backtrack on 70-100bp Illumina reads.

CORA / COmpressive Readmapping Accelerator

Achieves substantial runtime improvement, when plugged into existing mapping tools, through a compressive representation of the reads and a comprehensive homology map of the reference genome. CORA's compressive framework achieves speed gains inversely related to the sequencing error rate, so the acceleration it provides will improve substantially as sequencers generate higher-quality reads. CORA also constructs a reference homology table, a data structure with general utility beyond read mapping because it provides fast access to all pairs of homologous loci in the reference genome.

BLAST / Basic Local Alignment Search Tool

Aligns query sequences against those present in a selected target database. BLAST is a suite of programs, provided by NCBI, that can be used to quickly search a sequence database for matches to a query sequence. The software provides an access point for running these sequence alignment tools on the web. The BLAST command-line applications are organized so that similar types of searches are grouped together in one application.

BLASTX / Translated BLAST: blastx

Searches a protein database using a translated nucleotide query. BLASTX is a BLAST search application that compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database. This application can also work in Blast2Sequences mode and can send BLAST searches over the network to the public NCBI server if desired.
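
The six-frame conceptual translation mentioned above can be illustrated with a short sketch. It assumes the standard genetic code and an unambiguous A/C/G/T query; the function names are hypothetical and this is not NCBI's code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to their conceptual translations."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
```

Here `frames["+1"]` is the forward reading frame starting at the first base, and the three `-` frames are read from the reverse-complement strand. A translated search would then compare each of the six protein strings against the database.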

CLC Genomics Workbench

Analyzes, compares, and visualizes next generation sequencing (NGS) data. CLC Genomics Workbench offers a complete and customizable solution for genomics, transcriptomics, epigenomics, and metagenomics. The software lets users build custom workflows that combine quality control steps, adapter trimming, read mapping, variant detection, and multiple filtering and annotation steps into a pipeline.

BLAT / BLAST-Like Alignment Tool

Finds genomic sequences that match a protein or DNA sequence submitted by the user. BLAT is a very fast sequence alignment tool, similar to BLAST, that is typically used to search for similar sequences within the same or closely related species. It was developed to align millions of expressed sequence tags and mouse whole-genome random reads to the human genome at high speed. BLAT is commonly used to look up the location of a sequence in the genome or to determine the exon structure of an mRNA, but expert users can run large batch jobs and adjust internal sensitivity parameters by installing the command-line version on a Linux server.

MAQ / Mapping and Assembly with Quality

Builds mapping assemblies from short reads generated by next-generation sequencing machines. Maq is designed in particular for the Illumina-Solexa 1G Genetic Analyzer, and has preliminary functions to handle ABI SOLiD data. Maq first aligns reads to reference sequences and then calls the consensus. At the mapping stage, Maq performs ungapped alignment. For single-end reads, Maq is able to find all hits with up to 2 or 3 mismatches, depending on a command-line option; for paired-end reads, it always finds all paired hits with one of the two reads containing up to 1 mismatch. At the assembling stage, Maq calls the consensus based on a statistical model.

MIA / Mapping Iterative Assembler

The basic idea of this program is to align DNA sequencing fragments (shotgun or targeted resequencing) to a reference and then call a consensus. The consensus is then used as the new reference, and the process is repeated until convergence. Because it was originally designed for ancient DNA, it supports a position-specific substitution matrix, which improves both alignment and consensus calling on chemically damaged aDNA. MIA has been used to assemble a number of Neandertal and early modern human mitochondria.
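
The align, call consensus, repeat loop described above can be sketched with a toy example. It assumes ungapped placement by mismatch count and a simple majority vote, which stand in for MIA's actual alignment and its position-specific substitution matrix; all names are hypothetical.

```python
from collections import Counter


def best_offset(read, ref):
    """Ungapped placement: offset with the fewest mismatches (ties go leftmost)."""
    def mismatches(off):
        return sum(a != b for a, b in zip(read, ref[off:off + len(read)]))
    return min(range(len(ref) - len(read) + 1), key=mismatches)


def call_consensus(reads, ref):
    """Place every read on ref and take a per-column majority vote."""
    columns = [Counter() for _ in ref]
    for read in reads:
        off = best_offset(read, ref)
        for i, base in enumerate(read):
            columns[off + i][base] += 1
    # Keep the reference base where no read covers a position.
    return "".join(
        col.most_common(1)[0][0] if col else r for col, r in zip(columns, ref)
    )


def iterative_assembly(reads, ref, max_rounds=10):
    """Repeat align-and-call until the consensus stops changing."""
    for _ in range(max_rounds):
        new_ref = call_consensus(reads, ref)
        if new_ref == ref:
            break
        ref = new_ref
    return ref


reads = ["GTTCGT", "CGTTCG", "TTCGTA"]  # all carry a T where the reference has A
assembled = iterative_assembly(reads, "ACGTACGTAC")
```

In this example the first round replaces the reference base at index 4, and the second round reproduces the same consensus, so the loop stops.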

CUSHAW

Provides algorithms for parallel, sensitive, and accurate next-generation sequencing (NGS) read alignment to large genomes such as the human genome. CUSHAW is a software suite composed of three individual tools: CUSHAW, CUSHAW2, and CUSHAW3. The suite employs inexact k-mer seeds (CUSHAW), adopts MEM seeds (CUSHAW2), and introduces hybrid seeds that incorporate three different seed types (CUSHAW3), i.e., MEM seeds, exact-match k-mer seeds, and variable-length seeds derived from local alignments.

Nanopolish

Provides a nanopore consensus algorithm using a signal-level hidden Markov model (HMM). The main subprograms of Nanopolish are: (i) nanopolish extract, which extracts reads in FASTA or FASTQ format from a directory of FAST5 files; (ii) nanopolish eventalign, which aligns signal-level events to k-mers of a reference genome; (iii) nanopolish variants, which detects single nucleotide polymorphisms (SNPs) and indels with respect to a reference genome; and (iv) nanopolish variants --consensus, which calculates an improved consensus sequence for a draft genome assembly. Furthermore, Nanopolish contains an experimental option that uses event durations to improve the consensus accuracy around homopolymers.

NextGenMap

A read mapper that is more than twice as fast as BWA, while achieving a mapping sensitivity similar to Stampy or Bowtie2. NextGenMap aligns reads reliably to a reference genome even when the sequence difference between target and reference genome is large, as in highly polymorphic genomes. The software outperforms current mapping methods with respect to runtime and the number of correctly mapped reads. NextGenMap automatically handles any read data, independent of read length and sequencing technology. It may be used to map reads from a nonstandard organism to a phylogenetically close reference genome, or applied to metagenomics data.

PerM / Periodic Seed Mapping

Provides highly efficient mapping solutions for genome-scale mapping projects involving Illumina or SOLiD data. The data structure in PerM requires only 4.5 bytes per base to index the human genome, allowing entire genomes to be loaded into memory while multiple processors simultaneously map reads to the reference. PerM owes its performance primarily to the use of single periodic spaced seeds, which provide sufficient weight and sensitivity to significantly increase genome-scale mapping performance in comparison with other mapping programs.

HiLive

Implements a k-mer based alignment strategy. HiLive continuously reads the intermediate BCL files produced by Illumina sequencers and extends initial k-mer matches as the sequencer produces more data. HiLive is a live mapper, which means that it maps Illumina sequencing reads to target reference genomes while the sequencer is running. It allows a strong reduction in total sample analysis time by starting read mapping while sequencing is still in progress. With regard to read alignment quality, HiLive is comparable to existing approaches and follows common file format conventions.

GraphMap

Analyses nanopore sequencing reads. GraphMap progressively refines candidate alignments to robustly handle potentially high error rates, and uses a fast graph traversal to align long reads with speed and high precision (>95%). Its alignments enabled single-nucleotide variant calling on the human genome with increased sensitivity (15%) over the next best mapper, precise detection of structural variants from 100 bp to 4 kbp in length, and species- and strain-specific identification of pathogens using MinION reads.

SAPAS

A pipeline for an RNA-seq method used to study polyA. SAPAS performs a systematic search and evaluation of protocols for typical steps to investigate to what extent these can indeed facilitate RNA-seq data analysis. 29 open-source interfaces and 6 of the more widely used interfaces were evaluated in detail. SAPAS processes the sequencing result using the SAPAS method, including quality control, mapping to the genome using Bowtie, generating cleavage sites, internal priming, and clustering cleavage sites.

MOSAIK

Maps next generation sequencing (NGS) reads to a reference genome. MOSAIK is a reference-guided aligner that uses a neural-network based training scheme and supports most existing sequencing technologies. The software uses a Smith-Waterman algorithm and can align reads to a genome using the International Union of Pure and Applied Chemistry (IUPAC) ambiguity codes, ensuring that alignments against known single nucleotide polymorphisms (SNPs) are not penalized. It additionally provides explicit support for structural variant (SV) detection.
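
How IUPAC ambiguity codes keep known SNPs from being penalized can be shown with a small mismatch counter. This is an illustrative sketch for ungapped, equal-length comparison only, not MOSAIK's Smith-Waterman scoring; the function name is hypothetical.

```python
# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(read, ref):
    """Count read bases that are not allowed by the reference code at that position."""
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    return sum(base not in IUPAC[code] for base, code in zip(read, ref))


# The reference encodes a known A/G SNP as 'R', so either allele matches.
with_g = count_mismatches("ACGT", "ACRT")
with_a = count_mismatches("ACAT", "ACRT")
```

Both calls return zero mismatches because `R` admits A and G, whereas a read base outside the allowed set would still count as a mismatch.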

mrsFAST / Micro-read substitution-only Fast Alignment Search Tool

A fast, cache-oblivious, SNP-aware aligner that handles the multi-mapping of high-throughput sequencing reads very efficiently. mrsFAST-Ultra improves on mrsFAST, the authors' first cache-oblivious read aligner capable of handling multi-mapping reads, through new and compact index structures that reduce not only the overall memory usage but also the number of CPU operations per alignment. Just as importantly, mrsFAST-Ultra introduces new features such as being able to (i) obtain the best mapping loci for each read, and (ii) return all reads that have at most n mapping loci (within an error threshold), together with these loci, for any user-specified n. Furthermore, mrsFAST-Ultra is SNP-aware, i.e. it can map reads to the reference genome while discounting the mismatches that occur at common SNP locations provided by dbSNP; this significantly increases the number of reads that can be mapped to the reference genome. In comparison to newly enhanced popular tools such as Bowtie2, it is more sensitive (it can report 10 times or more mappings per read) and much faster (six times or more) in the multi-mapping mode.

SparkBWA

Exploits the capabilities of a big data technology such as Spark to boost the performance of one of the most widely adopted aligners, the Burrows-Wheeler Aligner (BWA). The design of SparkBWA uses two independent software layers so that no modifications to the original BWA source code are required, which ensures its compatibility with any BWA version (future or legacy). SparkBWA was evaluated in different scenarios and showed noticeable gains in performance and scalability. A comparison with other parallel BWA-based aligners validates the benefits of this approach. Finally, an intuitive and flexible API is provided to NGS professionals to facilitate the acceptance and adoption of the new tool.

ISAS / Imagenix Sequence Alignment System

Generates uniqueome data for human, mouse, worm and fly genomes in both color-space and nucleotide-space. ISAS makes it possible to perform leading-edge research without million-dollar compute farms. It is based on two algorithms and can determine by itself which of the two should be used for optimal speed on each sequence. The tool makes use of direct machine instruction commands, often bypassing the inefficiencies of compiled language implementations.

ST Pipeline

Processes and analyzes the raw files generated with the Spatial Transcriptomics (ST) method. ST Pipeline enables demultiplexing of spatially-resolved RNA-seq data, robust quality filtering, and identification of unique molecules. It is highly customizable, with numerous parameter settings. The tool is robust and efficient, and scales to arrays with higher density. It filters data, aligns it to a genome, annotates it to a reference, demultiplexes by array coordinates, and then aggregates by counts that are not duplicates using the Unique Molecular Identifiers.

LAMSA / Long Approximate Matches-based Split Aligner

Takes advantage of the rarity of structural variants (SVs) to implement a specifically designed two-step read-splitting and alignment strategy. LAMSA initially splits the read into relatively long fragments and aligns them co-linearly to resolve small local variations and/or sequencing errors and to decrease the effect of repeats. The alignments of the fragments are then used in a sparse dynamic programming (SDP)-based split alignment approach to handle large or non-co-linear variants.

FBB / Fast Bayesian Bound

Compares RNA-seq alignment results across different algorithms. FBB uses the quality scores of the reads to align them to a reference genome. Two theorems are provided to efficiently calculate the Bayesian bound, which under some conditions becomes an equality. The algorithm reads the SAM files generated by the alignment algorithms using multiple command option values. The program options are mapped onto the FBB reference values, so that all the aligners can be compared with respect to the same accuracy values provided by FBB.

PHYLUCE

A package for phylogenomic analyses of data collected from conserved genomic loci using targeted enrichment. PHYLUCE supports the assembly of raw read data into contigs, the identification of ultra-conserved element (UCE) contigs, parallel alignment generation, alignment trimming, and alignment data summary methods in preparation for analysis, as well as alignment and SNP calling using UCE or other types of raw-read data. As it stands, the PHYLUCE package is useful for analyzing both data collected from UCE loci and data collected from other types of loci for phylogenomic studies at the species, population, and individual levels.

GASSST / Global Alignment Short Sequence Search Tool

Maps reads with mismatch and indel errors at high speed. GASSST is a short read aligner that uses the seed-and-extend strategy. The software's algorithm has three stages: (i) searching for exact matching seeds between the reference genome and the query sequences; (ii) eliminating hits that have more than a user-specified number of errors; and (iii) computing the full gapped alignment with the Needleman–Wunsch (NW) algorithm. Users can specify the number of hits reported per read.
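
The final stage named above, full gapped global alignment with Needleman–Wunsch, can be sketched as follows. The scoring values (match +1, mismatch -1, linear gap -1) and the function name are assumptions chosen for illustration, not GASSST's parameters or code.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a prefix aligned entirely against gaps
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("GATTACA", "GATACA")
```

With these scores, aligning `GATTACA` against `GATACA` yields six matches and one gap. In a seed-and-extend mapper this quadratic step runs only on the candidate regions that survive the seed and error filters.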