Identifies structural variation (SV) by whole-genome de novo assembly. SOAPsv aims to show that SVs account for a greater fraction of the diversity between individuals than do single nucleotide polymorphisms (SNPs). This software also demonstrates that de novo assembly can detect SVs across a wide range of lengths. The resulting SV maps of human genomes allow an initial description of the genomic patterns of SVs and their relationship with a variety of genomic features.
Identifies somatic variation in tumor genomes. SMuFin directly compares a tumor sample with the corresponding normal sample to detect, in a single run, somatic single-nucleotide variants (SNVs) and structural variants such as insertions, deletions, inversions and translocations of any size. This software can describe, at base-pair resolution, complex scenarios of chromosomal rearrangement such as chromoplexy and chromothripsis.
To characterize the mutational spectrum of somatic SVs in cancer, it is important to identify both simple (e.g., deletion, insertion, and inversion) and complex SVs at base-pair resolution. Meerkat predicts both germline and somatic SVs directly from short read data, focusing on complex events.
A tool designed for efficient and accurate variant detection in high-throughput sequencing data. By using local realignment of reads and local assembly, it achieves both high sensitivity and high specificity. Platypus can detect SNPs, MNPs, short indels, replacements and (using the assembly option) deletions up to several kb. It has been extensively tested on whole-genome, exon-capture, and targeted capture data.
A Perl/C++ package that provides genome-wide detection of structural variants from next-generation paired-end sequencing reads. BreakDancer sensitively and accurately detects indels ranging from 10 base pairs to 1 megabase pair, which are difficult to detect with a single conventional approach.
Helps users infer the underlying genotype at each structural variant (SV). SVTyper is a Bayesian likelihood algorithm that can operate on copy-neutral events such as inversions and translocations as well as on copy number variants (CNVs). It produces SV genotypes, which are useful for meaningful variant interpretation, as well as quantitative estimates of breakpoint allele frequencies that allow inference of the fraction of tumor cells carrying a particular variant.
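The Bayesian genotyping idea can be illustrated with a minimal sketch. It assumes a binomial model over reference-supporting and variant-supporting read counts; the expected variant fractions (0.1, 0.5, 0.9), the flat priors and the function names are illustrative assumptions, not SVTyper's actual parameters or interface.

```python
from math import comb

GENOTYPES = ("0/0", "0/1", "1/1")

def genotype_posteriors(ref_reads, alt_reads,
                        alt_fractions=(0.1, 0.5, 0.9),   # assumed values
                        priors=(1 / 3, 1 / 3, 1 / 3)):   # assumed flat priors
    """Posterior probability of hom-ref, het and hom-alt under a binomial model."""
    n = ref_reads + alt_reads
    likelihoods = [
        comb(n, alt_reads) * f ** alt_reads * (1 - f) ** ref_reads
        for f in alt_fractions
    ]
    weighted = [lik * p for lik, p in zip(likelihoods, priors)]
    total = sum(weighted)
    return [w / total for w in weighted]

def call_genotype(ref_reads, alt_reads):
    """Return the most probable genotype and its posterior probability."""
    post = genotype_posteriors(ref_reads, alt_reads)
    best = max(range(len(post)), key=lambda i: post[i])
    return GENOTYPES[best], post[best]

def allele_balance(ref_reads, alt_reads):
    """Fraction of breakpoint-spanning reads that support the variant allele."""
    return alt_reads / (ref_reads + alt_reads)
```

In a tumor sample, the allele balance is the kind of quantitative estimate from which the fraction of cells carrying a variant could be inferred.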
Detects and genotypes insertions and deletions from paired-end reads. CTK is a suite of tools for next-generation sequencing (NGS) data analysis and is based on an internal segment size approach to discover indel variation from paired-end read data. Among other components, it also contains a long-indel-aware read mapper (LASER), a converter from BAM to a list of alignment pairs with prior probabilities, and a feature for splitting data by chromosome.
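The internal segment size signal can be sketched as follows. This is a simplified per-pair illustration, assuming a known library mean and standard deviation and a hypothetical z-score cutoff; it does not reproduce how CTK itself groups alignments before calling an indel.

```python
def flag_indel_pairs(internal_sizes, lib_mean, lib_sd, z_cutoff=3.0):
    """Flag read pairs whose internal segment size deviates from the library.

    A pair that maps farther apart than expected suggests a deletion in the
    sample; a pair that maps closer together suggests an insertion.
    Returns (pair_index, kind, estimated_length) tuples.
    """
    calls = []
    for idx, size in enumerate(internal_sizes):
        z = (size - lib_mean) / lib_sd
        if abs(z) >= z_cutoff:
            kind = "deletion" if z > 0 else "insertion"
            calls.append((idx, kind, round(abs(size - lib_mean))))
    return calls
```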
A computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. The package is composed of three modules: PEMer workflow, SV-Simulation and BreakDB. PEMer workflow is sensitive software for detecting SVs from paired-end sequence reads. SV-Simulation randomly introduces SVs into a given genome and generates simulated paired-end reads from the ‘novel’ genome. Subsequent analysis of the simulated reads with PEMer workflow can help parameterize PEMer workflow. BreakDB is a web-accessible database developed to store, annotate and display SV breakpoint events identified by PEMer and from other sources.
Identifies structural variant (SV) breakpoint junctions by clustering split reads. NanoSV first orders all mapped segments of each split read by their positions within the originally sequenced read. This tool uses split-read mapping to discover all defined types of SVs. It finishes by gathering evidence from different reads supporting the same candidate breakpoint junction. NanoSV is suited to Nanopore and Pacific Biosciences data.
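The ordering and clustering steps described for NanoSV can be sketched roughly as below. The `Segment` record, the 50 bp clustering window and the greedy clustering rule are assumptions made for illustration; they are not taken from NanoSV's code.

```python
from collections import defaultdict, namedtuple

# Hypothetical record for one aligned piece of a split read.
Segment = namedtuple("Segment", "read_id read_start chrom ref_start ref_end strand")

def candidate_junctions(segments):
    """Order each read's segments by position in the read and join neighbours."""
    by_read = defaultdict(list)
    for seg in segments:
        by_read[seg.read_id].append(seg)
    junctions = []
    for read_id, segs in by_read.items():
        segs.sort(key=lambda s: s.read_start)
        for left, right in zip(segs, segs[1:]):
            # Reference coordinate where the read leaves / enters each segment.
            left_pos = left.ref_end if left.strand == "+" else left.ref_start
            right_pos = right.ref_start if right.strand == "+" else right.ref_end
            junctions.append((read_id, left.chrom, left_pos, right.chrom, right_pos))
    return junctions

def cluster_junctions(junctions, window=50):
    """Greedily group junctions from different reads that lie close together."""
    clusters = []
    for j in sorted(junctions, key=lambda j: (j[1], j[3], j[2], j[4])):
        for cluster in clusters:
            rep = cluster[0]
            same_chroms = (rep[1], rep[3]) == (j[1], j[3])
            if same_chroms and abs(rep[2] - j[2]) <= window and abs(rep[4] - j[4]) <= window:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    return clusters
```

The number of junctions in a cluster then serves as a simple count of supporting reads for that candidate breakpoint.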
Provides computational tools and methods for high-quality insertion sequence (IS) annotation. ISsaga uses established ISfinder annotation standards and permits rapid processing of single or multiple prokaryote genomes. ISsaga provides general prediction and annotation tools, information on genome context of individual ISs and a graphical overview of IS distribution around the genome of interest.
Allows identification of genomic rearrangements. GRIDSS is a modular software suite whose tools perform genome-wide break-end assembly prior to variant calling, using a positional de Bruijn graph assembler. The GRIDSS pipeline comprises three distinct stages: extraction, assembly, and variant calling. The software identifies non-template sequence insertions, microhomologies and large imperfect homologies, and supports multi-sample analysis.
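To illustrate what makes a de Bruijn graph "positional", the sketch below keys each node on both a k-mer and its expected genomic position, so identical k-mers from different loci stay separate. The input format (sequence plus anchor position) and the function name are assumptions for illustration and do not reflect GRIDSS's implementation.

```python
from collections import defaultdict

def positional_de_bruijn(reads, k=4):
    """Build a toy positional de Bruijn graph.

    reads: iterable of (sequence, anchor_position) pairs, where the anchor is
    the expected genomic position of the first base of the sequence.
    Returns (weights, edges): read support per node and successor sets.
    """
    weights = defaultdict(int)
    edges = defaultdict(set)
    for seq, anchor in reads:
        nodes = [(seq[i:i + k], anchor + i) for i in range(len(seq) - k + 1)]
        for node in nodes:
            weights[node] += 1
        for a, b in zip(nodes, nodes[1:]):
            edges[a].add(b)
    return weights, edges
```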
Integrates sequencing reads from next-generation sequencing (NGS) and single-molecule sequencing (SMS) technologies to accurately assemble and detect structural variations (SVs) in the human genome. By identifying homologous SV-containing reads from different technologies through a bipartite-graph-based clustering algorithm, this approach turns a whole-genome assembly problem into a set of independent SV assembly problems, each of which can be solved effectively to improve assembly of structurally altered regions of the human genome.
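One simple way to picture the bipartite clustering step is to link each NGS read to each SMS read it is homologous to and treat every connected component as one independent assembly problem. The sketch below does this with a union-find structure; using connected components is an assumption made for illustration, and the tool's actual clustering rule may differ.

```python
def cluster_bipartite(edges):
    """Group reads into connected components of a bipartite homology graph.

    edges: iterable of (ngs_read_id, sms_read_id) pairs.
    Returns a sorted list of components, each a sorted list of
    ("ngs" | "sms", read_id) nodes.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for ngs, sms in edges:
        a, b = find(("ngs", ngs)), find(("sms", sms))
        if a != b:
            parent[a] = b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())
```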
Identifies regions of the genome suspected to harbor a complex event. SVelter then resolves the structure by iteratively rearranging the local genome structure, in a randomized fashion, with each structure scored against characteristics of the observed sequencing data. SVelter is able to accurately reconstruct complex chromosomal rearrangements when compared to well-characterized genomes that have been deeply sequenced with both short and long reads. SVelter is able to interrogate many different types of rearrangements, including multi-deletion and duplication-inversion-deletion events as well as distinct overlapping variants on homologous chromosomes.
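The iterative, randomized search described above can be sketched as a toy hill climb. Here a local structure is a list of block names (a trailing `^` marks an inverted block), and each proposal is scored against observed per-block copy numbers and an observed set of inverted blocks. The block encoding, the scoring function and the acceptance rule are all illustrative assumptions, not SVelter's actual model.

```python
import random

def score(structure, observed_copies, observed_inverted):
    """Higher is better; 0 means the structure explains the observations."""
    copies = {block: 0 for block in observed_copies}
    inverted = set()
    for item in structure:
        name = item.rstrip("^")
        copies[name] += 1
        if item.endswith("^"):
            inverted.add(name)
    copy_penalty = sum((copies[b] - observed_copies[b]) ** 2 for b in observed_copies)
    orientation_penalty = len(inverted ^ observed_inverted)
    return -(copy_penalty + orientation_penalty)

def propose(structure, rng):
    """Apply one random delete, duplicate or invert move to a copy."""
    s = list(structure)
    i = rng.randrange(len(s))
    move = rng.choice(("delete", "duplicate", "invert"))
    if move == "delete" and len(s) > 1:
        del s[i]
    elif move == "duplicate":
        s.insert(i, s[i])
    else:
        # Invert (also used when a delete would empty the structure).
        s[i] = s[i][:-1] if s[i].endswith("^") else s[i] + "^"
    return s

def resolve(reference, observed_copies, observed_inverted, iterations=2000, seed=0):
    """Keep any proposal that scores at least as well as the current best."""
    rng = random.Random(seed)
    best = list(reference)
    best_score = score(best, observed_copies, observed_inverted)
    for _ in range(iterations):
        candidate = propose(best, rng)
        candidate_score = score(candidate, observed_copies, observed_inverted)
        if candidate_score >= best_score:
            best, best_score = candidate, candidate_score
    return best, best_score
```

For example, starting from the reference structure `["a", "b", "c"]` with block `b` absent, block `c` present twice and `c` observed in inverted orientation, the search should converge on a deletion of `b` combined with a duplication and inversion of `c`.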
An approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SVs, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants.
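A minimal sketch of a k-mer strategy of this kind: collect the k-mers that occur in the reads but not in the reference, then keep the reads that carry them as candidates for assembly. The function names and the choice of k are illustrative assumptions, and the assembly and realignment steps are not shown.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def sample_only_kmers(reference, reads, k=5):
    """K-mers seen in the reads that do not occur in the reference."""
    ref_kmers = kmers(reference, k)
    novel = set()
    for read in reads:
        novel |= kmers(read, k) - ref_kmers
    return novel

def reads_with_novel_kmers(reads, novel, k=5):
    """Reads containing at least one sample-only k-mer."""
    return [read for read in reads if kmers(read, k) & novel]
```

With the reference `AAACCCGGGTTT`, a read `AAACCTTT` that spans a deletion of `CGGG` yields k-mers across the junction that the reference lacks, while a read matching the reference yields none.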
A computational tool for automated annotation of insertion sequences (ISs). OASIS takes advantage of widely available transposase annotations to identify candidate ISs and then uses a computationally efficient maximum likelihood method of multiple sequence alignment to identify the edges of each element. Thanks to its speed and flexibility, OASIS is capable not only of providing detailed IS information for a single genome but also of annotating thousands of genomes within hours, making it a valuable high-throughput tool for a global investigation of IS distribution across diverse taxa.
Detects and visualizes structural variation from paired-end mapping data. Under this scheme, abnormally mapped read pairs are clustered based on the location of a gap signature. Several important features, including local depth of coverage, mapping quality and associated tandem repeats, are used to evaluate the quality of predicted structural variation. Compared with other approaches, it can detect many more large insertions and complex variants with a lower false discovery rate. Moreover, inGAP-sv, written in Java, provides a user-friendly interface and runs on multiple operating systems.
Identifies transposase sequences, inverted repeats and candidate target direct repeats of insertion sequences (ISs) in complete genomes. IScan is able to identify ISs with an arbitrary number of ORFs, including ISs with ORFs encoded on both strands. IS annotation in existing genomes may be highly heterogeneous, because different researchers may use different annotation methods. A tool like IScan thus allows the user to create consistent IS annotation with multiple user-specified parameters (repeat length, sequence similarity to a reference family member, etc.) across multiple genomes. This consistency and flexibility are essential for detailed analyses of IS evolution across multiple genomes.
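As an illustration of what detecting a terminal inverted repeat involves, the sketch below checks how far the left end of a candidate element matches the reverse complement of its right end. It requires exact matches and uses a hypothetical minimum repeat length; IScan's own search and its parameters are not reproduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def terminal_inverted_repeat_length(element, min_len=4):
    """Length of the exact terminal inverted repeat, or 0 if shorter than min_len.

    The first n bases of the reverse complement of the whole element equal the
    reverse complement of its last n bases, so comparing the two strings from
    the left compares the two ends of the element.
    """
    rc = reverse_complement(element)
    n = 0
    while n < len(element) // 2 and element[n] == rc[n]:
        n += 1
    return n if n >= min_len else 0
```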
Identifies hypervariable regions from structural variants (SVs). SVM2 is an SV mapping tool that uses several features based on statistics and resequencing coverage measures for windows around a given genomic coordinate. This software has been trained to discriminate genomic loci flanking four classes of event (deletions, insertions shorter or longer than the library insert size, and hypervariable regions) from normal genomic regions.
A statistical framework and algorithm for structural variant (SV) detection from whole-genome sequencing data. SWAN integrates multiple features, including insert size, hanging read pairs and read coverage, into one statistical framework and detects putative SVs through genome-wide likelihood ratio scans. SWAN remaps soft-clip/split read clusters to supplement the likelihood analysis, joins multiple sources of evidence and identifies breakpoints whenever possible. SWAN has improved sensitivity for detecting structural variants smaller than 10 kilobases and is particularly successful at identifying deletions smaller than 500 base pairs.
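A likelihood ratio scan can be illustrated on read coverage alone. The sketch below assumes Poisson-distributed depth per position and compares a null rate (the expected depth) with an alternative rate for a heterozygous deletion (half the expected depth), summing the per-position log-likelihood ratio over a sliding window. The Poisson model, the window size and the function names are assumptions for illustration; SWAN's framework also uses insert size and hanging read pairs, which are not modeled here.

```python
from math import log

def deletion_llr_scan(depths, expected, window=5, alt_ratio=0.5):
    """Sliding-window Poisson log-likelihood ratio for reduced coverage.

    Per position: x * log(lam1 / lam0) - (lam1 - lam0), with lam0 = expected
    and lam1 = expected * alt_ratio. Returns one score per window start.
    """
    lam0, lam1 = expected, expected * alt_ratio
    per_site = [d * log(lam1 / lam0) - (lam1 - lam0) for d in depths]
    running = sum(per_site[:window])
    scores = [running]
    for i in range(window, len(per_site)):
        running += per_site[i] - per_site[i - window]
        scores.append(running)
    return scores

def call_windows(scores, threshold=0.0):
    """Start indices of windows whose score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]
```

Positive scores mark windows where the reduced-coverage model explains the data better than the normal-coverage model.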