Variant recalibration software tools | Whole-genome sequencing data analysis
Accurate genomic variant detection is an essential step in gleaning medically useful information from genome data. However, low concordance among variant-calling methods reduces confidence in the clinical validity of whole genome and exome sequence data, and confounds downstream analysis for applications in genome medicine.
Focuses on variant discovery and genotyping. GATK is a toolkit developed at the Broad Institute that comprises several tools and can support projects of any size. It bundles an assortment of command-line tools for analyzing high-throughput sequencing (HTS) data in formats such as SAM, BAM, CRAM, and VCF. The website provides extensive documentation to guide users.
Performs genomic data analysis. HiGene is a platform that addresses computing resource allocation and skew handling in genomic data analysis. It uses Apache Spark to parallelize the GATK pipeline and improves pipeline performance by combining fine-grained optimization of resource usage with a resolution of the task skew problem on both the data side and the computation side. It was evaluated on a whole-genome sequencing (WGS) dataset.
Provides a method for combining sets of SNP variant calls produced by different variant calling programs. The integrated set of SNP variant calls produced by BAYSIC™ has better sensitivity and specificity than the individual call sets used as input. In addition to combining sets of germline variants, BAYSIC can also combine sets of somatic mutations detected in tumor/normal sequencing experiments.
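To make the idea of combining call sets concrete, here is a minimal Python sketch. It is not BAYSIC's statistical model: a simple "at least N callers agree" vote is swapped in for illustration. The function name `combine_calls`, the `min_support` parameter, and the example calls are all hypothetical.

```python
from collections import Counter

# A SNP call is keyed by (chromosome, 1-based position, ref allele, alt allele).
# Assumption: a plain consensus vote stands in for BAYSIC's own method.
def combine_calls(call_sets, min_support=2):
    """Return the calls reported by at least `min_support` input call sets."""
    support = Counter()
    for calls in call_sets:
        # set() ensures a duplicate within one caller's output counts once.
        support.update(set(calls))
    return sorted(call for call, n in support.items() if n >= min_support)

# Hypothetical output of three variant callers on the same sample.
caller_a = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")]
caller_b = [("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")]
caller_c = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")]

consensus = combine_calls([caller_a, caller_b, caller_c])
unanimous = combine_calls([caller_a, caller_b, caller_c], min_support=3)
print(consensus)
print(unanimous)
```

A fixed vote threshold trades sensitivity against specificity by hand; a probabilistic integration would instead weigh each caller by how reliable it appears to be.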
Allows harmonization of genotype data stored in different file formats with different and potentially unknown strands. The Genotype Harmonizer (GH) is an easy-to-use command-line tool that solves the unknown-strand problem by aligning ambiguous A/T and G/C single nucleotide polymorphisms (SNPs) to a specified reference, using linkage disequilibrium (LD) patterns without prior knowledge of the strands used. GH supports many common genome-wide association study/next-generation sequencing (GWAS/NGS) genotype formats, including PLINK, binary PLINK, VCF, SHAPEIT2, and Oxford GEN.
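A/T and G/C SNPs are ambiguous because complementing both alleles yields the same allele pair, so the alleles alone cannot reveal the strand. The Python sketch below illustrates only that allele-level logic; it does not implement GH's LD-based resolution, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def is_strand_ambiguous(allele1, allele2):
    """True for A/T and G/C SNPs, whose alleles are each other's complement."""
    return COMPLEMENT[allele1] == allele2

def align_to_reference(alleles, ref_alleles):
    """Return the study alleles expressed on the reference strand.

    Returns None for ambiguous SNPs, which need extra evidence (such as
    LD patterns with nearby SNPs) that this sketch does not implement.
    """
    if is_strand_ambiguous(*alleles):
        return None
    if set(alleles) == set(ref_alleles):
        return tuple(alleles)  # already on the reference strand
    flipped = tuple(COMPLEMENT[a] for a in alleles)
    if set(flipped) == set(ref_alleles):
        return flipped  # study data was on the opposite strand
    raise ValueError("alleles do not match the reference on either strand")

print(align_to_reference(("A", "G"), ("T", "C")))  # flipped to the reference strand
print(align_to_reference(("A", "T"), ("A", "T")))  # ambiguous: None
```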
Filters candidate variants according to the given criteria. FMFilter properly handles compound heterozygous and de novo inheritance models, and can therefore identify compound heterozygous mutations. It offers options to filter by genotype quality, read depth, gene name, mutation type, and custom annotated population frequency. It provides an alternative for analyzing next-generation sequencing data collected in Mendelian disease research.
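The kind of filtering described above can be sketched in Python as follows. This is an illustration under stated assumptions, not FMFilter's implementation: the record fields (`gq`, `dp`, `pop_freq`, trio genotypes), the thresholds, and the gene names are hypothetical, and the compound heterozygous rule is the simple trio-based one (one heterozygous variant carried only by the mother and one carried only by the father, in the same gene).

```python
from collections import defaultdict

def passes_filters(variant, min_gq=20, min_dp=10, max_pop_freq=0.01):
    """Keep variants with adequate genotype quality, depth, and rarity."""
    return (variant["gq"] >= min_gq
            and variant["dp"] >= min_dp
            and variant["pop_freq"] <= max_pop_freq)

def compound_het_genes(variants):
    """Genes where the child is heterozygous for at least one variant carried
    only by the mother and at least one carried only by the father."""
    origins = defaultdict(set)
    for v in variants:
        if v["child"] != "0/1":
            continue
        mother_carries = v["mother"] != "0/0"
        father_carries = v["father"] != "0/0"
        if mother_carries and not father_carries:
            origins[v["gene"]].add("maternal")
        elif father_carries and not mother_carries:
            origins[v["gene"]].add("paternal")
    return sorted(g for g, o in origins.items() if o == {"maternal", "paternal"})

def rec(gene, gq, dp, freq, child, mother, father):
    # Helper to build a hypothetical trio variant record.
    return {"gene": gene, "gq": gq, "dp": dp, "pop_freq": freq,
            "child": child, "mother": mother, "father": father}

variants = [
    rec("GENE1", 60, 35, 0.001, "0/1", "0/1", "0/0"),
    rec("GENE1", 55, 28, 0.0005, "0/1", "0/0", "0/1"),
    rec("GENE2", 70, 40, 0.002, "0/1", "0/1", "0/0"),
    rec("GENE3", 65, 30, 0.001, "0/1", "0/1", "0/0"),
    rec("GENE3", 10, 5, 0.001, "0/1", "0/0", "0/1"),  # low quality, filtered out
]

kept = [v for v in variants if passes_filters(v)]
candidates = compound_het_genes(kept)
print(candidates)
```

Filtering on quality first matters here: the low-quality paternal call in GENE3 would otherwise make that gene look like a compound heterozygous candidate.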
Automatically integrates variant calling pipelines into an overall model that predicts accurate variant probabilities. VariantMetaCaller uses support vector machines (SVMs) to merge multiple information sources generated by variant calling pipelines and to estimate the probability of each variant. It can produce probabilistic scores for calls even in settings such as targeted gene panels or organisms.
Analyzes individual Variant Call Format (VCF) files with the goal of identifying variants associated with alignable scaffold-discrepant positions (ASDPs). ASDPex is an implementation of an ASDP extraction algorithm. It also calls the most likely combination of haplotypes for each of the 178 genomic regions that have alternate loci. To do so, it scans each of the 178 regions in turn and compares all of the associated alternate loci.