Somatic single nucleotide variant identification software tools | High-throughput sequencing data analysis

With the advent of relatively affordable high-throughput technologies, DNA sequencing of cancers is now common practice in cancer research projects and will be increasingly used in clinical practice to inform diagnosis and treatment. Somatic (cancer-only) single nucleotide variants (SNVs) are the simplest class of mutation, yet their identification in DNA sequencing data is confounded by germline polymorphisms, tumour heterogeneity and sequencing and analysis errors.

Source text:
(Roberts et al., 2013) A comparative analysis of algorithms for somatic SNV detection in cancer. Bioinformatics.

1 - 50 of 95 results
GATK / Genome Analysis ToolKit
Focuses on variant discovery and genotyping. GATK is a toolkit, developed at the Broad Institute, composed of several tools and able to support projects of any size. The application bundles an assortment of command-line tools for analyzing high-throughput sequencing (HTS) data in various formats such as SAM, BAM, CRAM or VCF. The website includes extensive documentation to guide users.
Allows users to interact with high-throughput sequencing data. SAMtools permits the manipulation of alignments in the SAM/BAM/CRAM formats: reading, writing, editing, indexing, viewing and converting files in these formats. It limits the mapping quality of reads with excessive mismatches and applies base alignment quality to fix alignment errors. This tool can sort and merge alignments, remove polymerase chain reaction (PCR) duplicates or generate per-position information.
Provides a nanopore consensus algorithm using a signal-level hidden Markov model (HMM). The main subprograms of Nanopolish are: (i) nanopolish extract, which extracts reads in FASTA or FASTQ format from a directory of FAST5 files; (ii) nanopolish eventalign, which aligns signal-level events to k-mers of a reference genome; (iii) nanopolish variants, which detects single nucleotide polymorphisms (SNPs) and indels with respect to a reference genome; and (iv) nanopolish variants --consensus, which calculates an improved consensus sequence for a draft genome assembly. Furthermore, Nanopolish contains an experimental option that will use event durations to improve the consensus accuracy around homopolymers.
A statistical method for detecting and genotyping single-nucleotide variants in single-cell data. Monovar exhibited superior performance over standard algorithms on benchmarks and in identifying driver mutations and delineating clonal substructure in three different human tumor data sets. Monovar is capable of analyzing large-scale data sets and handling different whole-genome amplification (WGA) protocols, and thus it is well suited for addressing the growing need for accurate single-cell DNA variant detection.
A platform-independent mutation caller for targeted, exome, and whole-genome resequencing data generated on Illumina, SOLiD, Life/PGM, Roche/454, and similar instruments. The newest version, VarScan 2, is written in Java, so it runs on most operating systems. It can be used to detect different types of variation: 1) germline variants (SNPs and indels) in individual samples or pools of samples, 2) multi-sample variants (shared or private) in multi-sample datasets (with mpileup), 3) somatic mutations, LOH events, and germline variants in tumor-normal pairs and 4) somatic copy number alterations (CNAs) in tumor-normal exome data.
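To make the tumor-normal comparison concrete, here is a minimal sketch of one common way to test whether a variant allele is enriched in the tumor sample: a one-sided Fisher's exact test on read counts. This is an illustration under stated assumptions, not VarScan's implementation; the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(tumor_alt, tumor_ref, normal_alt, normal_ref):
    """One-sided Fisher's exact p-value for variant-read enrichment in the tumor.

    Hypothetical helper for illustration only; inputs are read counts
    supporting the variant (alt) and reference (ref) allele in each sample.
    """
    n_tumor = tumor_alt + tumor_ref
    n_normal = normal_alt + normal_ref
    total_alt = tumor_alt + normal_alt
    denom = comb(n_tumor + n_normal, total_alt)
    p_value = 0.0
    # Sum hypergeometric probabilities of all tables with at least
    # as many variant reads in the tumor as were observed.
    for k in range(tumor_alt, min(total_alt, n_tumor) + 1):
        p_value += comb(n_tumor, k) * comb(n_normal, total_alt - k) / denom
    return p_value


# Variant seen in half of the tumor reads and in none of the normal reads.
p_somatic_like = fisher_exact_greater(10, 10, 0, 20)
# Variant seen at the same fraction in both samples (germline-like).
p_germline_like = fisher_exact_greater(5, 15, 5, 15)
```

A small p-value suggests the allele is over-represented in the tumor, while similar allele fractions in both samples point towards a germline variant. Real callers combine such a test with coverage, quality and allele-frequency filters.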
A versatile machine learning approach that uses Random Forest classification models to accurately call somatic variants in low-depth sequencing data. SNooPer trains a data-specific model on a subset of variant positions from the sequencing output for which the class (true variation or sequencing error) is known. During the training phase, using a real dataset of 40 childhood acute lymphoblastic leukemia patients, it was shown that the SNooPer algorithm is not affected by low coverage or low variant allele frequencies, and can be used to reduce overall sequencing costs while maintaining high specificity and sensitivity in somatic variant calling.
Offers a platform for population-level analyses. dDocent is open-source software dedicated to processing individually barcoded restriction-site associated DNA sequencing (RADseq) data. The application employs data reduction techniques and interacts with other programs to provide features such as de novo assembly of RAD loci, single nucleotide polymorphism (SNP) and indel calling, as well as quality trimming and baseline data filtering.
Provides analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs. Strelka is a variant calling method building upon the innovative Strelka somatic variant caller to improve upon aspects of variant calling for both germline and somatic analysis. The germline caller employs an efficient tiered haplotype model to improve accuracy and provide read-backed phasing, adaptively selecting between assembly and a faster alignment-based haplotyping approach at each variant locus. The germline caller also analyzes input sequencing data using a mixture-model indel error estimation method to improve robustness to indel noise.
Extracts causative variants in familial and sporadic genetic diseases. VariantMaster implements a methodology to evaluate the status (presence or absence) of a variant in familial or case-control contexts. The software allows users to identify causative variants in familial, sporadic germline, and somatic genetic disorders, including cancers. It also allows for the search of causative variants in one or more recurrently mutated genes in a pool of unrelated individuals sharing the same phenotype.
An accurate somatic mutation detection pipeline implementing a stochastic boosting algorithm to produce highly accurate somatic mutation calls for both single nucleotide variants and small insertions and deletions. The workflow currently incorporates five state-of-the-art somatic mutation callers, and extracts over 70 individual genomic and sequencing features for each candidate site. A training set is provided to an adaptively boosted decision tree learner to create a classifier for predicting mutation statuses.
A versatile variant caller for both DNA- and RNA-sequencing data. VarDict contains many features that are distinct from other variant callers, including linear performance to depth, intrinsic local realignment, built-in capability of de-duplication, detection of polymerase chain reaction (PCR) artifacts, accepting both DNA- and RNA-seq, paired analysis to detect variant frequency shifts alongside somatic and loss of heterozygosity (LOH) variant detection and structural variant (SV) calling. VarDict facilitates application of next-generation sequencing in cancer research, enabling researchers to use one tool in place of an alternative computationally expensive ensemble of tools.
Examines epigenomic and transcriptomic next generation sequencing (NGS) data. Octopus-toolkit can be used for antibody- or enzyme-mediated experiments and studies for the quantification of gene expression. It can accelerate the data mining of public epigenomic and transcriptomic NGS data for basic biomedical research. This tool provides a private and a public mode: one to process the user’s own data, and the other to analyze public NGS data by retrieving raw files from the GEO database.
A software package for calling single nucleotide variants (SNVs) using NGS data from multiple same-patient samples. Instead of performing multiple pairwise analyses of a single tumour sample and a matched normal, multiSNV jointly considers all available samples under a Bayesian framework to increase sensitivity of calling shared SNVs. By leveraging information from all available samples, multiSNV is able to detect rare mutations with variant allele frequencies down to 3% from whole-exome sequencing experiments.
SWAN / Statistical Structural Variant Analysis for NGS
A statistical framework and algorithm for structural variant (SV) detection from whole genome sequencing data. SWAN integrates multiple features, including insert size, hanging read pairs and read coverage into one statistical framework and detects putative SVs through genome-wide likelihood ratio scans. SWAN remaps soft-clip/split read clusters to supplement the likelihood analysis, joins multiple sources of evidence and identifies break points whenever possible. SWAN has improved sensitivity for detecting structural variants smaller than 10 kilobases and is particularly successful at identifying deletions smaller than 500 base pairs.
CLC bio / CLC Genomics Workbench
Allows users to analyze, compare, and visualize next generation sequencing (NGS) data. CLC Genomics Workbench offers a complete and customizable solution for genomics, transcriptomics, epigenomics, and metagenomics. The software enables users to build custom workflows, which can combine quality control steps, adapter trimming, read mapping, variant detection, and multiple filtering and annotation steps into a pipeline.
A variant detector and graphical alignment viewer for next-generation sequencing data in the SAM/BAM format, which is capable of pooling data from multiple source files. The variant detector takes advantage of SAM-specific annotations, and produces detailed output suitable for genotyping and identification of somatic mutations. The assembly viewer can display reads in the context of either a user-provided or automatically generated reference sequence, retrieve genome annotation features from a UCSC genome annotation database, display histograms of non-reference allele frequencies, and predict protein-coding changes caused by SNPs.
A computational method for detecting somatic variants using high throughput sequencing data from unpaired tissue samples. We evaluate the performance of the method using genomic data from synthetic and real tumor samples. SomVarIUS identifies somatic variants in exome-seq data of ~150X coverage with at least 67.7% precision and 64.6% recall rates, when compared with paired-tissue somatic variant calls in real tumor samples. SomVarIUS can be useful in a variety of clinical or research settings, where matched normal samples are not yet routinely collected (e.g. precision medicine initiatives) or for archival samples (e.g. FFPE and fresh frozen tumor samples stored in the tissue banks).
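The precision and recall figures quoted above come from comparing one call set against a reference call set. A minimal sketch of that comparison follows, assuming each variant is represented as a (chromosome, position, ref, alt) tuple; the function name and the toy variants are hypothetical and are not taken from SomVarIUS.

```python
def precision_recall(called, truth):
    """Return (precision, recall) of a call set against a reference set.

    Hypothetical helper: variants are hashable keys such as
    (chromosome, position, ref_allele, alt_allele).
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data: calls from an unpaired analysis versus paired-tissue calls.
unpaired_calls = [("chr1", 100, "A", "T"), ("chr1", 250, "G", "C"), ("chr2", 75, "C", "T")]
paired_calls = [("chr1", 100, "A", "T"), ("chr1", 250, "G", "C"),
                ("chr3", 10, "T", "G"), ("chr4", 42, "G", "A")]
precision, recall = precision_recall(unpaired_calls, paired_calls)
```

Precision is the fraction of reported variants that are also in the reference set, and recall is the fraction of reference variants that were recovered.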
A probabilistic method for somatic structural variation (SV) prediction by jointly modeling discordant and concordant read counts. PSSV is specifically designed to predict somatic deletions, inversions, insertions and translocations by considering their different formation mechanisms. Simulation studies demonstrate that PSSV outperforms existing tools. PSSV has been successfully applied to breast cancer data to identify somatic SVs of key factors associated with breast cancer development.
DIGTYPER / Duplication and Inversion GenoTYPER
A method to genotype tandem duplications and inversions. DIGTYPER computes genotype likelihoods for a given inversion or duplication and reports the maximum likelihood genotype. In contrast to purely coverage-based approaches, DIGTYPER uses breakpoint-spanning read pairs as well as split alignments for genotyping, which also enables typing of small events. We tested our approach on simulated and on real data and compared the genotype predictions to those made by DELLY, which discovers SVs and computes genotypes. DIGTYPER compares favorably, especially for duplications (of all lengths) and for shorter inversions (up to 300 bp). In contrast to DELLY, our approach can genotype SVs from databases without having to rediscover them.
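The idea of computing genotype likelihoods and reporting the maximum likelihood genotype can be sketched with a simple binomial model over supporting reads. This is a generic illustration, not DIGTYPER's actual model (which uses breakpoint-spanning read pairs and split alignments); the function names, the error rate and the read counts are assumptions made for the example.

```python
from math import comb, log


def genotype_log_likelihoods(alt_reads, ref_reads, error_rate=0.01):
    """Binomial log-likelihood of the read counts under each diploid genotype.

    Hypothetical sketch: error_rate is an assumed probability that a read
    supports the wrong allele.
    """
    n = alt_reads + ref_reads
    # Expected fraction of variant-supporting reads for each genotype.
    expected_alt_fraction = {"0/0": error_rate, "0/1": 0.5, "1/1": 1.0 - error_rate}
    return {
        genotype: log(comb(n, alt_reads)) + alt_reads * log(p) + ref_reads * log(1.0 - p)
        for genotype, p in expected_alt_fraction.items()
    }


def call_genotype(alt_reads, ref_reads, error_rate=0.01):
    """Report the genotype with the highest likelihood."""
    likelihoods = genotype_log_likelihoods(alt_reads, ref_reads, error_rate)
    return max(likelihoods, key=likelihoods.get)


heterozygous_call = call_genotype(alt_reads=10, ref_reads=10)
```

With balanced support for both alleles the heterozygous genotype has the highest likelihood, whereas reads that almost exclusively support one allele favour the corresponding homozygous genotype.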
ISOWN / Identification of SOmatic mutations Without matching Normal tissues
Predicts somatic mutations from tumor-only samples. ISOWN is an algorithm that uses supervised machine learning to distinguish simple substitution somatic mutations in coding regions from germline variants in the absence of matching normal DNA. The software can help researchers accelerate the sequencing process, reduce the cost of sample sequencing and storage, or increase the power of an analysis by sequencing more tumor samples with the same resources.