Integrates prior knowledge about the characteristics of structural variants (SVs). forestSV is a statistical learning approach, based on Random Forests (RFs), that improves SV discovery in high-throughput sequencing (HTS) data. This application offers high sensitivity and specificity coupled with the flexibility of a data-driven approach. It is particularly well suited to the detection of rare variants because it does not rely on finding variant support in multiple individuals.


A computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. The package is composed of three modules: PEMer workflow, SV-Simulation and BreakDB. PEMer workflow is a sensitive tool for detecting SVs from paired-end sequence reads. SV-Simulation randomly introduces SVs into a given genome and generates simulated paired-end reads from the ‘novel’ genome. Subsequent analysis of the simulated reads with PEMer workflow can help parameterize PEMer workflow. BreakDB is a web-accessible database developed to store, annotate and display SV breakpoint events identified by PEMer and from other sources.
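The SV-Simulation idea described above (introduce an SV into a genome, then sample paired-end reads from the altered sequence) can be illustrated with a minimal sketch. This is not PEMer's code: the function names `introduce_deletion` and `simulate_pairs`, the fixed insert size, and the error-free reads are all assumptions made for illustration.

```python
import random

# Hypothetical helpers for illustration only; not part of PEMer or SV-Simulation.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def introduce_deletion(genome, start, length):
    """Return a copy of `genome` with `length` bases removed at 0-based `start`."""
    return genome[:start] + genome[start + length:]


def simulate_pairs(genome, n_pairs, read_len, insert_size, rng):
    """Sample error-free paired-end reads from fragments of a fixed size.

    Each pair is (fragment_start, read1, read2), where read2 is the reverse
    complement of the fragment's last `read_len` bases.
    """
    pairs = []
    for _ in range(n_pairs):
        frag_start = rng.randrange(len(genome) - insert_size + 1)
        fragment = genome[frag_start:frag_start + insert_size]
        read1 = fragment[:read_len]
        read2 = reverse_complement(fragment[-read_len:])
        pairs.append((frag_start, read1, read2))
    return pairs


rng = random.Random(0)  # seeded so the sketch is reproducible
reference = "".join(rng.choice("ACGT") for _ in range(1000))
novel = introduce_deletion(reference, start=400, length=100)
pairs = simulate_pairs(novel, n_pairs=5, read_len=20, insert_size=200, rng=rng)
```

Mapping such pairs back to the original reference would make pairs that span the deleted region appear to have an insert size about 100 bp larger than expected, which is the kind of signal a paired-end SV caller looks for.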

GRIDSS / Genomic Rearrangement IDentification Software Suite

Allows identification of genomic rearrangements. GRIDSS is a modular software suite containing tools that perform genome-wide break-end assembly prior to variant calling, using a positional de Bruijn graph assembler. The GRIDSS pipeline comprises three distinct stages: extraction, assembly, and variant calling. The software identifies non-template sequence insertions, microhomologies and large imperfect homologies, and supports multi-sample analysis.

GenomeVIP / Genome Variant Investigation Platform

Performs variant discovery on the Amazon Web Services (AWS) cloud or on local high-performance computing clusters. GenomeVIP is a genomics analysis pipeline for cloud computing with germline and somatic calling on Amazon's cloud. It provides a collection of analysis tools and computational frameworks for streamlined discovery and interpretation of genetic variants. The server and runtime environments can be customized, updated, or extended.


Identifies regions of the genome suspected to harbor a complex event. SVelter then resolves the structure by iteratively rearranging the local genome structure, in a randomized fashion, with each structure scored against characteristics of the observed sequencing data. SVelter is able to accurately reconstruct complex chromosomal rearrangements when compared to well-characterized genomes that have been deeply sequenced with both short and long reads. SVelter is able to interrogate many different types of rearrangements, including multi-deletion and duplication-inversion-deletion events as well as distinct overlapping variants on homologous chromosomes.

Wham / WHole-genome Alignment Metrics

A structural variant (SV) caller that integrates several sources of mapping information to identify SVs. Wham classifies SVs using a flexible and extendable machine-learning algorithm (random forest). Wham is not only accurate at identifying SVs, but its association test can identify shared SVs enriched in a cohort of diseased individuals compared to a background of healthy individuals. Wham is designed for paired-end Illumina libraries with standard insert sizes (~300bp-500bp). It integrates mate-pair mapping, split read mapping, soft-clipping, alternative alignment and consensus sequence based evidence to predict SV breakpoints with single-nucleotide accuracy. Wham can be easily run as a stand-alone tool or as part of gkno or bcbio-nextgen pipelines.
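One of the evidence types listed above, soft-clipping, can be read directly from the CIGAR string of a SAM alignment. The sketch below is not Wham's implementation; the function name `soft_clip_breakpoints` and the "left"/"right" labels are assumptions for illustration. It only uses the SAM convention that `POS` is the 1-based leftmost mapped base and that the operations M, D, N, = and X consume reference bases.

```python
import re

# Hypothetical helper for illustration; not taken from Wham.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def soft_clip_breakpoints(pos, cigar):
    """Return candidate breakpoints implied by soft clips in one alignment.

    `pos` is the 1-based leftmost mapped position (SAM POS field).
    Each result is (reference_position, side), where side is "left" if the
    clip precedes the aligned bases and "right" if it follows them.
    """
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    core = [(n, op) for n, op in ops if op != "H"]  # ignore hard clips at the ends
    breakpoints = []
    if core and core[0][1] == "S":
        breakpoints.append((pos, "left"))
    if len(core) > 1 and core[-1][1] == "S":
        ref_len = sum(n for n, op in core if op in _REF_CONSUMING)
        breakpoints.append((pos + ref_len - 1, "right"))
    return breakpoints
```

For example, `soft_clip_breakpoints(100, "70M30S")` reports a right-side candidate at position 169, the last aligned reference base before the clipped tail. Many reads clipped at the same position are what gives a caller base-level breakpoint resolution.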


Employs a two-stage process to evaluate and refine structural variation (SV) predictions. SVmine is an algorithm for further mining of SV predictions from multiple algorithms to improve the sensitivity, specificity and breakpoint resolution of SV detection. It first performs quality evaluation and filters out low-quality SV predictions. Then, it refines breakpoint positions of the high-quality SVs by performing precise “sandwich” realignments of soft-clipped reads. The realignment strategy used by SVmine can also be generalized to PacBio long-read data.


Provides a structural variation (SV) caller for long reads. Sniffles is mainly designed for PacBio reads, but also works on Oxford Nanopore reads. SVs are larger events on the genome (e.g. deletions, duplications, insertions, inversions and translocations). Sniffles can detect all of these types and more, such as nested SVs (e.g. an inversion flanked by deletions or an inverted duplication). Furthermore, Sniffles incorporates multiple auto-tuning functions that determine dataset-dependent parameters to reduce the overall risk of falsely inferring SVs.

SV2 / support-vector structural-variant genotyper

Implements a machine-learning algorithm for genotyping deletions and tandem duplications from paired-end whole genome sequencing (WGS) data. SV2 can rapidly integrate variant calls from multiple structural variant discovery algorithms into a unified callset with low rates of false discoveries and Mendelian errors, and with accurate de novo detection that shows no transmission bias in families. SV2 is open-source software written in Python that exploits read depth, discordant paired-ends, and split-reads in a supervised support vector machine classifier. Required inputs include a BAM file with supplementary alignment tags (SA), a Single-Nucleotide Variant (SNV) VCF file with allelic depth, and either a BED or VCF file of deletions and tandem duplications to be genotyped. The final product is a VCF file with genotypes and annotations for genes, repeats, and other relevant statistics for structural variant (SV) analysis.


Automates the discovery of structural variations (SVs). Tardis is a toolkit that integrates read pair, read depth, and split read (using soft-clipped mappings) sequence signatures to discover several types of SV, while resolving ambiguities among different putative SVs. This application is suitable for cloud use because its memory footprint is low. It is capable of characterizing deletions, small novel insertions, tandem duplications, inversions, and mobile element retrotransposition.

NAIBR / Novel Adjacency Identification with Barcoded Reads

Recognizes novel adjacencies resulting from structural variants in an individual genome from linked-read sequencing data. NAIBR combines a split-read type signal from linked-reads with signals of structural variants in the underlying paired-reads in the data. It was tested on the detection of somatic structural variants in tumor cell line HCC1954T. This tool enables the identification of novel adjacencies arising from small structural variants.

SVachra / Structural Variation Assesment of CHRomosomal Aberrations

Detects chromosomal aberrations with high specificity across several variant types and lengths in next-generation mate pair sequencing data. SVachra calculates the distributions of the inward and outward facing mate pair types and applies independent clustering of the inward and outward facing discordantly mapped reads to call chromosomal structural variants. It then produces highly specific breakpoint calls, with the aim of a less biased detection methodology.
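The idea of clustering inward and outward facing discordant reads independently can be sketched as follows. This is not SVachra's algorithm: the sketch swaps in simple gap-based (single-linkage) clustering on one coordinate, and the function name `cluster_by_orientation`, the `max_gap` parameter, and the toy positions are all assumptions for illustration.

```python
from collections import defaultdict


def cluster_by_orientation(reads, max_gap):
    """Cluster discordant read positions separately for each orientation.

    `reads` is an iterable of (position, orientation) tuples, where
    orientation is a label such as "inward" or "outward". Within one
    orientation, sorted positions closer than or equal to `max_gap`
    to their predecessor join the same cluster.
    """
    by_orientation = defaultdict(list)
    for position, orientation in reads:
        by_orientation[orientation].append(position)

    clusters = {}
    for orientation, positions in by_orientation.items():
        positions.sort()
        groups = [[positions[0]]]
        for p in positions[1:]:
            if p - groups[-1][-1] <= max_gap:
                groups[-1].append(p)  # extend the current cluster
            else:
                groups.append([p])    # gap too large: start a new cluster
        clusters[orientation] = groups
    return clusters


# Toy input: positions of discordantly mapped reads (made-up values).
discordant = [
    (1000, "inward"), (1050, "inward"), (1020, "outward"),
    (5000, "inward"), (1100, "outward"), (9000, "outward"),
]
clusters = cluster_by_orientation(discordant, max_gap=200)
```

Keeping the two orientations apart means an outward facing read at position 1020 never merges into the inward facing cluster around 1000 to 1050, even though the positions overlap.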


Detects structural variants in cancer using whole genome sequencing data with or without a matched normal control sample. SV-Bay not only uses information about abnormal read mappings but also assesses changes in the copy number profile and tries to associate these changes with candidate SVs. The likelihood of each novel genomic adjacency is evaluated using a Bayesian model. In its final step, SV-Bay annotates genomic adjacencies according to their type and, where possible, groups detected genomic adjacencies into complex SVs such as balanced translocations, co-amplifications, and so on. A comparison of SV-Bay with BreakDancer, Lumpy, DELLY and GASVPro demonstrated its superior performance on both simulated and experimental datasets.

iSVP / integrated Structural Variant calling Pipeline

Allows detection of structural variants (SV) from next-generation sequencing (NGS) data. iSVP is a pipeline that combines existing SV detection methods. The software was applied to human whole genome sequence data from a HapMap NA12878 sample and detected numerous SVs that were biologically explainable. It is applicable to high-coverage whole genome sequencing (WGS) data with reasonable computational resources, and thus can enhance the genome-wide detection of SVs for the identification of disease-causing variants.


Calculates annotations from one or more aligned BAM files from many high-throughput sequencing technologies, and then builds a one-class model using these annotations to classify candidate structural variants (SVs) as likely true or false positives. The SVClassify method gives the highest scores to SVs that are insertions or large homozygous deletions and that have accurate breakpoints. Deletions smaller than 100 bp often receive low scores with this method, so other methods such as svviz are likely to give better results for very small SVs.