Read alignment software tools | High-throughput sequencing data analysis
A ubiquitous and fundamental step in high-throughput sequencing analysis is the alignment (mapping) of the generated reads to a reference sequence. Numerous software tools have been proposed to accomplish this task.
Aligns high-throughput long and short RNA-seq data to a reference genome using uncompressed suffix arrays. STAR is standalone software capable of aligning reads in a continuous streaming mode. The application first runs a seed search, then performs seed clustering and stitching. It can detect canonical junctions, non-canonical splices and chimeric transcripts, and it can map full-length RNA sequences.
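The idea of looking up a seed in an uncompressed suffix array can be illustrated with a minimal Python sketch. This is a toy example and not STAR's implementation: the function names `build_suffix_array` and `find_exact` are hypothetical, the construction is naive, and only exact matching of a whole seed is shown (no clustering or stitching).

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction that compares whole suffixes; suitable for toy
    inputs only, not for genome-sized references.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_exact(text, sa, seed):
    """Return the sorted reference positions where seed occurs exactly.

    Two binary searches locate the block of suffixes whose first
    len(seed) characters equal the seed.
    """
    m = len(seed)

    # Lower bound: first suffix whose prefix is >= seed.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < seed:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose prefix is > seed.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= seed:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


reference = "ACGTACGTTACG"          # hypothetical toy reference
sa = build_suffix_array(reference)
print(find_exact(reference, sa, "ACG"))
```

Because the array is uncompressed, every lookup is a plain binary search over stored positions, which trades memory for speed.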
Assists users in mapping reads to a reference genome. Subread offers a suite of programs for processing next-generation sequencing read data. This package includes Subread (an aligner), Subjunc (an aligner), Sublong (a long-read aligner), Subindel (a long indel detection program), featureCounts (a read quantification program), exactSNP (an SNP calling program) and other utility programs.
Finds genomic sequences that match a protein or DNA sequence submitted by the user. BLAT is a very fast sequence alignment tool, similar to BLAST, that is typically used to search for similar sequences within the same or closely related species. It was developed to align millions of expressed sequence tags and mouse whole-genome random reads to the human genome at high speed. BLAT is commonly used to look up the location of a sequence in the genome or to determine the exon structure of an mRNA, but expert users can run large batch jobs and adjust internal sensitivity parameters by installing the command-line version on a Linux server.
Serves as a splice-aware aligner for short and long reads. BBTools is reported to be a fast and accurate aligner, capable of correctly handling a wider variety of references, reads, and mutations than other aligners. It is reported to perform particularly well with deletions, especially long ones that other aligners cannot handle at all. It can output many different statistics files, such as an empirical read quality histogram, insert-size distribution, and genome coverage, with or without generating a SAM file.
Aligns short reads and is geared toward mammalian re-sequencing. Bowtie uses a Burrows-Wheeler index based on the full-text minute-space (FM) index. Alignment follows two steps: an initial, ungapped seed-finding stage that takes advantage of the speed and memory efficiency of the FM index, and a gapped extension stage that employs dynamic programming and benefits from the single-instruction multiple-data (SIMD) parallel processing available on modern processors.
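To make the ungapped seed-finding idea concrete, here is a minimal Python sketch of exact matching by backward search over an FM index. It is an illustration under stated assumptions, not Bowtie's code: all function names are hypothetical, the index is built naively from a sorted suffix list, the occurrence table is stored uncompressed, and the gapped extension stage is not shown.

```python
def bwt_from_text(text):
    """Return (bwt, suffix_array) for text with a '$' terminator appended."""
    text = text + "$"                      # '$' sorts before A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps around to '$'
    return bwt, sa


def build_fm(bwt):
    """Build the two FM-index tables from a BWT string.

    C[c]      = number of characters in the text strictly smaller than c
    occ[c][i] = number of occurrences of c in bwt[:i]
    """
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1

    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ


def backward_search(pattern, C, occ, n):
    """Return the half-open suffix-array interval [lo, hi) matching pattern."""
    lo, hi = 0, n
    for ch in reversed(pattern):           # extend the match right to left
        if ch not in C:
            return 0, 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


def locate(pattern, text):
    """Return sorted positions in text where pattern occurs exactly."""
    bwt, sa = bwt_from_text(text)
    C, occ = build_fm(bwt)
    lo, hi = backward_search(pattern, C, occ, len(bwt))
    return sorted(sa[i] for i in range(lo, hi))


print(locate("ACG", "ACGTACGTTACG"))
```

Each pattern character costs two table lookups regardless of reference length, which is the property that makes the seed stage fast. Real implementations sample the occurrence table and suffix array to save memory.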
Builds mapping assemblies from short reads generated by next-generation sequencing machines. Maq is designed primarily for the Illumina-Solexa 1G Genetic Analyzer and has preliminary functions to handle ABI SOLiD data. Maq first aligns reads to reference sequences and then calls the consensus. At the mapping stage, Maq performs ungapped alignment. For single-end reads, Maq can find all hits with up to 2 or 3 mismatches, depending on a command-line option; for paired-end reads, it always finds all paired hits in which one of the two reads contains up to 1 mismatch. At the assembling stage, Maq calls the consensus based on a statistical model.
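What "ungapped alignment with a bounded number of mismatches" means can be shown with a short Python sketch. This is a brute-force scan written for clarity and is not how Maq indexes or searches reads; the function names and the `max_mismatches` parameter are hypothetical.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def ungapped_hits(reference, read, max_mismatches=2):
    """Return (position, mismatches) for every ungapped placement of read
    on reference that has at most max_mismatches mismatches.

    No insertions or deletions are considered: the read is compared
    base by base against each window of the same length.
    """
    hits = []
    m = len(read)
    for pos in range(len(reference) - m + 1):
        d = hamming(reference[pos:pos + m], read)
        if d <= max_mismatches:
            hits.append((pos, d))
    return hits


reference = "ACGTACGTTACG"   # hypothetical toy reference
print(ungapped_hits(reference, "ACGA", max_mismatches=1))
```

Raising the mismatch limit increases sensitivity but also the number of spurious hits, which is why a limit such as 2 or 3 is exposed as an option.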
Efficiently aligns DNA sequencing reads with a reference genome. SMALT employs a hash index of short words sampled at equidistant steps along the genomic reference sequences. It reports the best gapped alignments of each read and a score for the reliability of the best mapping. The tool lets users customize the trade-off between sensitivity and speed. It is also useful for discovering split (chimeric) reads.
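The hash index of short words sampled at equidistant steps can be sketched in Python as follows. This is a simplified illustration, not SMALT's implementation: the names `build_index` and `candidate_offsets`, the word length `k`, and the sampling `step` are hypothetical, and the gapped alignment and reliability scoring are omitted.

```python
from collections import defaultdict


def build_index(reference, k=3, step=2):
    """Map each k-letter word, sampled every `step` bases, to its positions."""
    index = defaultdict(list)
    for pos in range(0, len(reference) - k + 1, step):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_offsets(read, index, k=3):
    """Return (reference_offset, votes) pairs, best supported first.

    Every word of the read is looked up; each hit votes for the reference
    offset at which the read would start. Offsets with many votes are the
    candidates a real aligner would then align in full.
    """
    votes = defaultdict(int)
    for i in range(len(read) - k + 1):
        for ref_pos in index.get(read[i:i + k], ()):
            votes[ref_pos - i] += 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


reference = "TTGACCATGCAAGT"   # hypothetical toy reference
index = build_index(reference, k=3, step=2)
print(candidate_offsets("CCATGC", index, k=3))
```

A larger step yields a smaller index and faster lookups but fewer words per read can hit, which is one way to picture the sensitivity versus speed trade-off.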