Meaningful analysis of next-generation sequencing (NGS) data, which are produced extensively by genetics and genomics studies, relies crucially on the accurate calling of SNPs and genotypes. Recently developed statistical methods both improve and quantify the considerable uncertainty associated with genotype calling, and will especially benefit the growing number of studies using low- to medium-coverage data. Source text: Nielsen et al., 2011.
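As a rough illustration of how genotype-calling uncertainty can be quantified, the sketch below computes genotype likelihoods and posterior probabilities at a single biallelic site from base calls and Phred quality scores. It is a minimal textbook-style model, not the method of any particular tool listed here: the function names, the uniform e/3 miscall model, and the prior values are assumptions chosen for illustration.

```python
import math

def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)

def genotype_log_likelihoods(bases, quals, ref, alt):
    """Return log P(reads | G) for G = ref/ref, ref/alt, alt/alt.

    Assumes independent reads, quality scores > 0, and that a miscalled
    base is equally likely to be any of the three other nucleotides.
    """
    logls = []
    for n_alt in (0, 1, 2):  # number of alt alleles in the genotype
        total = 0.0
        for b, q in zip(bases, quals):
            e = phred_to_error(q)
            p_ref = 1 - e if b == ref else e / 3  # P(b | true allele is ref)
            p_alt = 1 - e if b == alt else e / 3  # P(b | true allele is alt)
            # Each read samples one of the two chromosomes with equal probability.
            total += math.log(((2 - n_alt) * p_ref + n_alt * p_alt) / 2)
        logls.append(total)
    return logls

def genotype_posteriors(logls, priors):
    """Combine log-likelihoods with genotype priors via Bayes' theorem."""
    weighted = [l + math.log(p) for l, p in zip(logls, priors)]
    m = max(weighted)  # subtract the maximum for numerical stability
    unnorm = [math.exp(w - m) for w in weighted]
    s = sum(unnorm)
    return [u / s for u in unnorm]

# Hypothetical pileup: four reads support A (ref), four support G (alt), all Q30.
bases = "AAGAGGAG"
quals = [30] * len(bases)
logls = genotype_log_likelihoods(bases, quals, "A", "G")
post = genotype_posteriors(logls, (0.998, 0.001, 0.001))  # illustrative priors
```

With low coverage the three posteriors stay closer together, which is the uncertainty that probabilistic callers report instead of a hard genotype call.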
2kplus2
An algorithm that searches graphs produced by the de novo assembler Cortex. The 2k+2 algorithm uses concepts from graph theory to search for possible SNPs. The source code for a C++ implementation of our algorithm is available…
4Pipe4
An automated analysis process specifically designed for SNP detection from 454 pyrosequencing transcriptome reads. 4Pipe4 is the first program specifically built to automate the whole process of finding putative SNPs in NGS datasets that lack both…
Atlas2
A next-generation sequencing suite of variant analysis tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in Whole Exome Capture Sequencing (WECS) data. SNPs may be called using…
Bambino
A variant detector and graphical alignment viewer for next-generation sequencing data in the SAM/BAM format, which is capable of pooling data from multiple source files. The variant detector takes advantage of SAM-specific annotations, and produces…
Churchill
A highly scalable, ultra-fast and fully automated analysis pipeline for the discovery of genetic variation. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth…
ComB
A software package designed for the downstream analysis of short read mapping data produced by the ABI SOLiD and Illumina sequencing platforms.
CoNAn-SNV
A probabilistic framework for the discovery of single nucleotide variants in WGSS data.
CopySeq
A computational approach that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can integrate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. CopySeq can…
Cortex
A tool for genome assembly and variation analysis from sequence data. You can use it to discover and genotype variants on single or multiple haploid or diploid samples. If you have multiple samples, you can use Cortex to look specifically for…
De novo Identification of Alleles
DIAL
A computational pipeline for identifying single-base substitutions between two closely related genomes without the help of a reference genome. DIAL works even when the depth of coverage is insufficient for de novo assembly, and it can be extended to…
DISCOVAR
A variant caller and small genome assembler. The heart of DISCOVAR is a de novo genome assembler, one that is accurate enough to produce assemblies that can be used for variant calling given a reference sequence. DISCOVAR can also generate de novo…
Discovering Single Nucleotide Polymorphism
discoSnp++
Detects both heterozygous and homozygous isolated SNPs from any number of read datasets, without a reference genome, and with very low memory and time footprints (billions of reads can be analyzed with a standard desktop computer). To facilitate…
Family-based Sequencing program
FamSeq
A computational tool for calculating probability of variants in family-based sequencing data. It is still challenging to call rare variants. In family-based sequencing studies, information from all family members should be utilized to more…
FamLDCaller
A computationally efficient algorithm to infer genotypes by considering multiple offspring in family-based sequencing data. FamLDCaller outperforms existing programs such as TrioCaller, GATK, and Beagle in general families with multiple offspring at…
FermiKit
A variant calling pipeline for Illumina whole-genome germline data. It de novo assembles short reads and then maps the assembly against a reference genome to call SNPs, short insertions/deletions (INDELs) and structural variations (SVs). FermiKit…
FreeBayes
A Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment.
GATK UnifiedGenotyper
A multiple-sample, technology-aware SNP and indel caller. It uses a Bayesian genotype likelihood model to estimate simultaneously the most likely genotypes and allele frequency in a population of N samples, emitting an accurate posterior probability…
GeneticThesaurus
Detecting genetic variation is one of the main applications of high-throughput sequencing, but is still challenging wherever aligning short reads poses ambiguities. GeneticThesaurus is a method for quick and robust variant detection in…
Genome Analysis Toolkit
GATK
A software package developed to analyze high-throughput sequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust…
Genomic Analysis of Mutations Extracted by Sequencing
GAMES
A pipeline aiming to serve as an efficient middleman between data deluge and investigators. GAMES attains multiple levels of filtering and annotation, such as aligning the reads to a reference genome, performing quality control and mutational…
glfMultiples
A GLF-based variant caller for next-generation sequencing data.
glfSingle
A GLF-based variant caller for next-generation sequencing data.
Halvade
A framework that enables sequencing pipelines to be executed in parallel on a multi-node and/or multi-core compute infrastructure in a highly efficient manner. As an example, a DNA sequencing analysis pipeline for variant calling has been…
Illuminator
A sequence alignment program for the output from Illumina GA-II clonal sequencers.
In Silico Genotyper
ISG
An open-source tool that can be used for SNP and insertion/deletion (indel) discovery, annotation, and phylogenomics. Benchmark comparisons demonstrate that ISG is faster and more flexible than comparable tools. ISG represents an open source,…
Indelocator
A software tool for calling short indels in next generation sequencing data.
Isaac Variant Caller
IVC
An analysis package designed to detect SNVs and small indels from the aligned sequencing reads of a single diploid sample.
KvarQ
A tool that directly scans FASTQ files of bacterial genome sequences for known variants, such as single nucleotide polymorphisms (SNPs), bypassing the need for mapping all sequencing reads to a reference genome or for de novo assembly. It is available…
LoFreq
A sensitive and robust approach for calling single-nucleotide variants (SNVs) from high-coverage sequencing datasets, based on a formal model for biases in sequencing error rates. LoFreq adapts automatically to sequencing run and position-specific…
MAFsnp
A SNP caller using next-generation sequencing data from multiple samples. MAFsnp has several features. First, MAFsnp can provide p-values with or without FDR correction for calling SNPs. Second, an estimated likelihood function is adopted to greatly…
Mapping and Assembly with Quality
MAQ
Stands for Mapping and Assembly with Quality.
marginAlign
A package that can be used to align reads to a reference genome and call single nucleotide variations (SNVs). It is specifically tailored for Oxford Nanopore reads. The package comes with two programs: marginAlign, a short read aligner, and…
MoDIL
A novel method for finding medium sized indels from high throughput sequencing datasets.
MSIsensor
A C++ program for automatically detecting somatic and germline variants at microsatellite regions.
Multi-sample Genotype Model Selection
MultiGeMS
A multiple sample single nucleotide variant (SNV) caller that works with alignment files of high-throughput sequencing data. MultiGeMS calls SNVs based on a statistical model selection procedure and accounts for enzymatic substitution sequencing…
nanopore
A single-nucleotide-variant detection tool that uses maximum-likelihood parameter estimates and marginalization over many possible read alignments to achieve precision and recall of up to 99%. By pairing this high-confidence alignment strategy with…
Platypus
A tool designed for efficient and accurate variant-detection in high-throughput sequencing data. By using local realignment of reads and local assembly it achieves both high sensitivity and high specificity. Platypus can detect SNPs, MNPs, short…
PyroHMMsnp
A realignment-based SNP calling method for 454 and Ion Torrent sequencing data.
PyroHMMvar
A program to call short indels and SNPs for Ion Torrent and 454 data.
QuadGT
A software package for calling single-nucleotide variants in four sequenced genomes comprising a normal-tumor pair and the two parents.
QualitySNPng
A software tool for the detection and visualisation of single nucleotide polymorphisms (SNPs) from next generation sequencing data that uses a haplotype-based strategy.
RAre REference VAriant annotaTOR
RAREVATOR
A tool for the identification and annotation of germline and somatic variants in rare reference allele loci from second generation sequencing data. RAREVATOR is a Perl script that executes the UnifiedGenotyper module of GATK for genotyping all the…
realSFS
Software for estimating allele frequencies and calling SNPs.
Reveel
A method for single nucleotide variant calling and genotyping of large cohorts that have been sequenced at low coverage. Reveel introduces a novel technique for leveraging linkage disequilibrium that deviates from previous Markov-based models, and…
Revise Simple Tandem repeat Error Reads
ReviSTER
An automated pipeline using a ‘local mapping reference reconstruction method’ to revise mismapped or partially misaligned reads at simple tandem repeat loci. ReviSTER estimates alleles of repeat loci using a local alignment method and…
RVD
A command-line program for ultrasensitive rare single nucleotide variant detection using targeted next-generation DNA resequencing.
SAMtools
A suite of programs for interacting with high-throughput sequencing data. It consists of three separate repositories: Samtools (Reading/writing/editing/indexing/viewing SAM/BAM/CRAM format), BCFtools (Reading/writing BCF2/VCF/gVCF files and…
SeqEM
A genotype calling algorithm for next-generation sequence data.
SeqHBase
A reliable big data-based computational toolset for efficiently manipulating genome-wide variants, annotations and every-site coverage in NGS studies. SeqHBase uses a heuristic framework of inheritance information for detecting de novo, inherited…
Slider
An alignment approach that reduces the alignment problem space by utilizing each read base’s probabilities given in the prb files. Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, given that Slider…
Sniper
A Bayesian probabilistic model that enables SNP discovery in both unique and repetitive regions of a genome by utilizing the information from multiply-mapped sequence reads. Sniper fully accounts for sequencing error, template bias, and multi-locus…
Snippy
Find SNPs/indels in a bacterial genome from NGS reads.
SNPest
Models the genotyping and SNP calling from the raw read sequences in a fully probabilistic framework. There are many advantages in using a probabilistic model: The sampling and sequencing process is modeled explicitly which makes the approach…
SNPSVM
A support vector machine for calling variants from next-gen sequencing data.
SNPTools
A suite of tools that enables integrative SNP analysis in next generation sequencing data with large cohorts.
SNVerGUI
A fast and easy desktop GUI tool for the identification of genomic variants from pooled sequencing and individual sequencing data. Using SNVerGUI, users can perform sophisticated variant detection by simply configuring several parameters in a…
SOAPsnp
A method based on Bayes’ theorem (the reverse probability model) that calls the consensus genotype by carefully considering data quality, alignment, and recurring experimental errors.
SolSNP
A Java-based DNA variant calling tool for Next-Generation Sequencing alignment data.
Syzygy
SNP and indel calling for pooled and individual targeted resequencing studies.
The Northern Arizona SNP Pipeline
NASP
A reproducible pipeline that scales well with the large amount of whole-genome sequencing data typically used in comparative genomics applications. NASP produces results comparable to, and often better than, those of other pipelines, but is much more flexible in…
TrioCaller
A linkage-disequilibrium framework for genotype inference in parent-offspring trios. TrioCaller will facilitate genotype calling and haplotype inference for many ongoing sequencing projects.
Ultrafast SNP analysis using the Burrows-Wheeler transform of short-read…
In contrast to the conventional mapping-based approach, a dictionary-based approach to sequence analysis is proposed. It is expected to be efficient because the dictionary (BWT) of short-read data makes it possible to simultaneously process…
VAAL
A polymorphism discovery algorithm for short reads.
VariantMaster
Extract causative variants for monogenic and sporadic genetic diseases.
VARiD
A Hidden Markov Model for SNP and indel identification with AB-SOLiD color-space as well as regular letter-space reads.