Structural variant identification software tools | High-throughput sequencing data analysis
From prokaryotes to eukaryotes, phenotypic variation, adaptation and speciation have been associated with structural variation between the genomes of individuals within the same species. Many computer algorithms that detect such variations (callers) have recently been developed, spurred by the advent of next-generation sequencing technology. Such callers mainly exploit split-read mapping or paired-end read mapping.
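The paired-end signal mentioned above can be illustrated with a minimal sketch. It assumes the library insert-size mean and standard deviation are already known; the function name `classify_pair` and the 3-SD cutoff are hypothetical choices for illustration, not taken from any specific caller.

```python
def classify_pair(insert_size, mean, sd, n_sd=3.0):
    """Label a read pair by comparing its mapped insert size
    to the expected library distribution (assumed known)."""
    if insert_size > mean + n_sd * sd:
        # Mates map farther apart on the reference than expected:
        # consistent with sequence deleted in the sample.
        return "deletion_candidate"
    if insert_size < mean - n_sd * sd:
        # Mates map closer together than expected:
        # consistent with sequence inserted in the sample.
        return "insertion_candidate"
    return "concordant"


if __name__ == "__main__":
    for size in (300, 1500, 100):
        print(size, classify_pair(size, mean=300, sd=30))
```

Real callers typically cluster many discordant pairs before reporting an event; a single pair is weak evidence.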
Identifies structural variation (SV) by whole-genome de novo assembly. SOAPsv aims to show that SVs account for a greater fraction of the diversity between individuals than single nucleotide polymorphisms (SNPs) do. This software also demonstrates that de novo assembly can detect SVs across a large range of lengths. The SV maps of human genomes allow an initial description of the genomic patterns of SVs and their relationship with a variety of genomic features.
Leverages a breakpoint junction library for structural variant (SV) detection. BreakSeq identifies SVs by aligning raw reads directly onto the SV breakpoint junctions of the alternative, non-reference alleles contained in a library. The software can be used to identify specific SV alleles in personal genomics data. It is a step towards overcoming reference bias.
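As a rough illustration of what a breakpoint junction for a non-reference allele looks like, the sketch below builds the junction sequence of a deletion allele by joining the reference flanks on either side of the deleted interval. The function name, the 0-based half-open coordinates and the exact-substring check are assumptions made for this example; they are not BreakSeq's implementation, which aligns reads rather than matching them exactly.

```python
def deletion_junction(reference, start, end, flank=30):
    """Return the junction sequence of an allele in which
    reference[start:end] is deleted (0-based, half-open)."""
    left = reference[max(0, start - flank):start]
    right = reference[end:end + flank]
    return left + right


def supports_alt_allele(read, reference, junction):
    """Toy test: the read matches the junction but not the reference."""
    return read in junction and read not in reference


if __name__ == "__main__":
    ref = "AAAAACCCCCGGGGG"
    junction = deletion_junction(ref, 5, 10, flank=3)
    print(junction)
    print(supports_alt_allele("AAGG", ref, junction))
```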
Identifies somatic variation in tumor genomes. SMuFin uses direct comparison with the corresponding normal samples to detect, in a single run, somatic single-nucleotide variants (SNVs) and structural variants such as insertions, deletions, inversions and translocations of any size. This software can describe, at base-pair resolution, complex scenarios of chromosomal rearrangement such as chromoplexy and chromothripsis.
Enables discovering and genotyping structural variations using sequencing data. Genome STRiP performs discovery and genotyping of copy number variations (CNVs) by analyzing the data from many samples simultaneously in a population-based framework. The software can discover polymorphisms and produce genotypes. It can be used to find novel structural variations or to genotype known variants in new samples.
Determines complex sets of DNA rearrangements and deletions in cancer genomes. ChainFinder uses a statistical search rooted in graph theory to highlight rearrangements that may reflect coordinated chromosomal alterations. The application starts from a pre-computed model and performs comparisons to detect genomic rearrangements and their associated deletions. The algorithm was tested on prostate tumors.
To characterize the mutational spectrum of somatic SVs in cancer, it is important to identify both simple (e.g., deletion, insertion, and inversion) and complex SVs at base-pair resolution. Meerkat predicts both germline and somatic SVs directly from short read data, focusing on complex events.
Performs read alignments and classifies structural variants (SVs), including complex SVs enriched in repetitive DNA elements and short tandem duplications. The pipeline includes three steps: first, it aligns nanopore reads to the reference genome using LAST; then it selects the best alignments; and finally it classifies the SVs.
A tool designed for efficient and accurate variant-detection in high-throughput sequencing data. By using local realignment of reads and local assembly it achieves both high sensitivity and high specificity. Platypus can detect SNPs, MNPs, short indels, replacements and (using the assembly option) deletions up to several kb. It has been extensively tested on whole-genome, exon-capture, and targeted capture data.
A Perl/C++ package that provides genome-wide detection of structural variants from next generation paired-end sequencing reads. BreakDancer sensitively and accurately detected indels ranging from 10 base pairs to 1 megabase pair that are difficult to detect via a single conventional approach.
Automatically detects copy number alterations (CNAs) and loss of heterozygosity (LOH) regions using next-generation sequencing (NGS) data. Control-FREEC consists of three steps: (i) calculation and segmentation of copy number profiles; (ii) calculation and segmentation of smoothed BAF profiles; and (iii) prediction of final genotype status. The software can call genotype status even when no control experiment is available and/or the genome is polyploid. It also corrects for GC-content and mappability biases.
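The two raw quantities behind steps (i) and (ii) can be sketched as follows. This is only an illustration of a per-window depth ratio and a per-site B-allele frequency (BAF); segmentation, smoothing, and GC or mappability correction are omitted, and the function names are hypothetical rather than part of Control-FREEC.

```python
import math


def log2_depth_ratio(sample_depth, control_depth):
    """Log2 ratio of sample to control read depth in one window.
    Values near 0 suggest no copy number change."""
    return math.log2(sample_depth / control_depth)


def b_allele_frequency(ref_count, alt_count):
    """Fraction of reads carrying the non-reference (B) allele.
    Returns None when the site has no coverage."""
    total = ref_count + alt_count
    if total == 0:
        return None
    return alt_count / total


if __name__ == "__main__":
    print(log2_depth_ratio(30, 20))
    print(b_allele_frequency(10, 10))
```

A heterozygous site in a balanced diploid region is expected to have a BAF near 0.5; sustained departures from that across a segment are one indication of LOH or allelic imbalance.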
An algorithm using NGS reads with partial alignments to a reference genome to directly map structural variations at the nucleotide level. Application of CREST to whole-genome sequencing data from five pediatric T-lineage acute lymphoblastic leukemias (T-ALLs) and a human melanoma cell line, COLO-829, identified 160 somatic structural variations. Experimental validation exceeded 80%, demonstrating that CREST had a high predictive accuracy.
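Reads with partial alignments show up in SAM/BAM records as soft-clipped segments in the CIGAR string. The sketch below extracts leading and trailing soft-clip lengths from a CIGAR string. It is an illustrative helper under that assumption, not CREST's code, and the name `soft_clips` is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar):
    """Return (leading, trailing) soft-clip lengths of a CIGAR string.
    Hard clips (H) are ignored because they sit outside soft clips."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return (0, 0)
    leading = ops[0][0] if ops[0][1] == "S" else 0
    trailing = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (leading, trailing)


if __name__ == "__main__":
    for cigar in ("30S70M", "70M30S", "100M", "5H10S80M5S"):
        print(cigar, soft_clips(cigar))
```

Positions where many reads are clipped at the same reference coordinate are natural breakpoint candidates.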
Finds genomic rearrangements, including translocations, inversions and deletions. FACTERA achieves high specificity without compromising sensitivity. It can define fusion genes and breakpoints in targeted sequencing data. The tool works with paired-end and soft-clipped reads and is also useful for whole-genome shotgun sequencing studies. It aligns all soft-clipped and unmapped reads against each candidate fusion sequence.
A tool to generate local assemblies of breakpoints genome-wide. novoBreak is an algorithm used in cancer genomic studies to discover the breakpoints of structural variants (both somatic and germline) in whole-genome sequencing data. The assemblies produced by novoBreak are based on clusters of reads that share a set of short nucleotide stretches of length K (K-mers) present in the subject genome but not in the reference genome or control data.
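The idea of K-mers present in the subject but absent from the reference can be sketched with plain set operations. This is a simplified illustration: it ignores reverse complements, sequencing errors and control data, and the function names are hypothetical rather than novoBreak's.

```python
def kmers(seq, k):
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(subject_reads, reference, k):
    """K-mers seen in any subject read but not in the reference."""
    reference_kmers = kmers(reference, k)
    novel = set()
    for read in subject_reads:
        novel |= kmers(read, k) - reference_kmers
    return novel


if __name__ == "__main__":
    reference = "ACGTACGT"
    reads = ["ACGTTCGT"]  # differs from the reference at one base
    print(sorted(novel_kmers(reads, reference, k=4)))
```

Reads sharing such novel K-mers could then be grouped and assembled locally, which is the step the description above refers to.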
Identifies structural variants, performs sequence assembly at the breakpoints, and reconstructs complex structural variants using the long-fragment information from the 10x Genomics platform. GROCSVs is implemented as a multi-sample analysis pipeline, allowing the simultaneous analysis of multiple tumor and matched normal samples, or of multiple related individuals.
Integrates prior knowledge about the characteristics of structural variants (SVs). forestSV is a statistical learning approach, based on Random Forests (RFs) that leads to improved discovery in high throughput sequencing (HTS) data. This application offers high sensitivity and specificity coupled with the flexibility of a data-driven approach. It is particularly well suited to the detection of rare variants because it is not reliant on finding variant support in multiple individuals.
Combines pairwise distance and rearrangement algorithms for unichromosomal and multichromosomal genomes. GRIMM can be run with signed or unsigned gene data. The application calculates the minimum number of rearrangement steps and derives a hypothesis from the result. Scenarios can be set up in several ways according to the size of the genomes or the researcher's needs.
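To make "signed gene data" concrete, the sketch below represents a genome as a signed permutation of gene numbers and counts its breakpoints relative to the identity order. Note the assumption: the breakpoint count is a simpler, related quantity used here for illustration only. It is not the minimum rearrangement distance and not GRIMM's algorithm, and `count_breakpoints` is a hypothetical name.

```python
def count_breakpoints(perm):
    """Count breakpoints of a signed permutation of 1..n against the
    identity. The permutation is framed with 0 and n+1; a neighbouring
    pair (a, b) is an adjacency when b - a == 1, otherwise a breakpoint."""
    n = len(perm)
    extended = [0] + list(perm) + [n + 1]
    return sum(1 for a, b in zip(extended, extended[1:]) if b - a != 1)


if __name__ == "__main__":
    for genome in ([1, 2, 3], [1, -2, 3], [-3, -2, -1], [3, 1, 2]):
        print(genome, count_breakpoints(genome))
```

A negative sign marks a gene on the opposite strand, so `[1, -2, 3]` describes a genome in which gene 2 has been inverted.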
Allows users to detect structural mosaic abnormalities in targeted or whole-genome sequencing (WGS) data. MrMosaic measures deviations in copy number and allele frequency by comparing depth and B-allele fraction between the studied files and randomly selected chromosomes from the same data. The software was tested on 4,911 whole-exome sequencing (WES) datasets.
Allows structural variant (SV) discovery. LUMPY is a general probabilistic SV discovery framework that integrates multiple SV detection signals, including those generated from read alignments or prior evidence. The software is based upon a general probabilistic representation of an SV breakpoint that allows any number of alignment signals to be integrated into a single discovery process. It can detect SVs from multiple alignment signals in files from one or more samples. A simplified wrapper for standard analyses, LUMPY Express, can also be executed.
Assists users in handling structural variant (SV) breakpoints. Hydra-sv compares discordant mappings to enable the detection, assembly, and interpretation of the mechanisms behind these breakpoints. The software can be applied to a test genome to highlight novel DNA junctions or, in theory, to detect the genetic events that trigger a breakpoint.
A computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. The package is composed of three modules: PEMer workflow, SV-Simulation and BreakDB. PEMer workflow is sensitive software for detecting SVs from paired-end sequence reads. SV-Simulation randomly introduces SVs into a given genome and generates simulated paired-end reads from the 'novel' genome. Subsequent analysis of the simulated reads with PEMer workflow can help parameterize it. BreakDB is a web-accessible database developed to store, annotate and display SV breakpoint events identified by PEMer and from other sources.
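The simulation idea described for SV-Simulation can be sketched in a few lines: introduce a random deletion into a sequence, then sample read pairs from the altered sequence. This is a toy illustration under simplifying assumptions (a single deletion, fixed insert size, no sequencing errors, second mate not reverse-complemented); the function names are hypothetical and do not come from the PEMer package.

```python
import random


def simulate_deletion(genome, del_len, rng):
    """Remove del_len bases at a random position.
    Returns the altered genome and the 0-based deletion start."""
    start = rng.randrange(0, len(genome) - del_len + 1)
    return genome[:start] + genome[start + del_len:], start


def simulate_pairs(genome, read_len, insert_size, n_pairs, rng):
    """Sample n_pairs fragments of length insert_size and return the
    two ends of each fragment as a (mate1, mate2) tuple."""
    pairs = []
    for _ in range(n_pairs):
        pos = rng.randrange(0, len(genome) - insert_size + 1)
        fragment = genome[pos:pos + insert_size]
        pairs.append((fragment[:read_len], fragment[-read_len:]))
    return pairs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    genome = "ACGT" * 50
    altered, start = simulate_deletion(genome, 20, rng)
    pairs = simulate_pairs(altered, read_len=10, insert_size=50, n_pairs=5, rng=rng)
    print(len(genome), len(altered), start, len(pairs))
```

Because the true deletion position is known, the output of a caller run on such reads can be scored directly, which is how simulated data helps with parameter tuning.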