Variation data sharing software tools | Genome annotation
Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement.
A sequence variation nomenclature checker for automated analysis and correction of sequence variant descriptions using reference sequences from any organism. Mutalyzer handles most variation types (substitution, deletion, duplication, insertion, indel, and splice-site changes), following the current recommendations of the Human Genome Variation Society (HGVS). Input consists of a reference sequence (a GenBank accession number or an uploaded GenBank-format file with user-modified annotation), an HGNC gene symbol, and the variant description, submitted singly or in a batch file.
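To make the idea of checking a variant description against a reference sequence concrete, here is a minimal sketch in Python. It is not Mutalyzer's code or API: the function name `check_substitution`, the pattern `SUBSTITUTION`, and the restriction to simple single-base substitutions on a plain DNA string are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: covers only simple DNA substitutions such as "g.3G>T".
# Real HGVS syntax is far richer (ranges, offsets, deletions, duplications, ...).
SUBSTITUTION = re.compile(
    r"^(?P<kind>[gcmn])\.(?P<pos>[1-9]\d*)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def check_substitution(reference: str, description: str) -> str:
    """Return 'ok', or a message saying why the description is inconsistent."""
    match = SUBSTITUTION.match(description)
    if match is None:
        return "not a simple substitution description"

    pos = int(match["pos"])  # HGVS positions are 1-based
    if pos > len(reference):
        return "position lies outside the reference sequence"

    found = reference[pos - 1].upper()
    if found != match["ref"]:
        return f"reference has {found} at position {pos}, not {match['ref']}"
    if match["ref"] == match["alt"]:
        return "reference and variant base are identical"
    return "ok"


if __name__ == "__main__":
    print(check_substitution("ATGCA", "g.3G>T"))
    print(check_substitution("ATGCA", "g.3A>T"))
```

The sketch only reports a mismatch. A full checker would also propose a corrected description and handle the other variation types listed above.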
An efficient algorithm for extracting HGVS descriptions from two sequences, designed with three main requirements in mind: minimizing the length of the resulting descriptions, minimizing the computation time, and keeping the unambiguous descriptions biologically meaningful. Description Extractor can compute the HGVS descriptions of complete chromosomes or other large DNA strings in a reasonable amount of computation time, and the resulting descriptions are relatively short. Additional applications include updating the contents of gene variant databases and performing reference sequence liftovers.
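The basic task of deriving a description from a reference sequence and an observed sequence can be sketched as follows. This is a deliberately simplified illustration, not the Description Extractor algorithm: it assumes at most one contiguous change, trims the shared prefix and suffix, and names whatever remains. The function name `describe` is hypothetical, and duplications, inversions, multiple variants, and HGVS 3' normalization are all ignored.

```python
def describe(reference: str, observed: str) -> str:
    """Describe a single contiguous change in an HGVS-like notation (1-based)."""
    if reference == observed:
        return "="

    limit = min(len(reference), len(observed))

    # Length of the longest common prefix.
    start = 0
    while start < limit and reference[start] == observed[start]:
        start += 1

    # Length of the longest common suffix that does not overlap the prefix.
    end = 0
    while end < limit - start and reference[-1 - end] == observed[-1 - end]:
        end += 1

    deleted = reference[start:len(reference) - end]
    inserted = observed[start:len(observed) - end]
    first, last = start + 1, start + len(deleted)

    if len(deleted) == 1 and len(inserted) == 1:
        return f"{first}{deleted}>{inserted}"
    if not deleted:
        # Pure insertion between two flanking positions. An insertion at the
        # very start yields position 0, which real HGVS does not allow.
        return f"{start}_{start + 1}ins{inserted}"

    span = str(first) if first == last else f"{first}_{last}"
    if not inserted:
        return f"{span}del"
    return f"{span}delins{inserted}"


if __name__ == "__main__":
    print(describe("ATGCA", "ATTCA"))
    print(describe("ATGGCA", "ATCCCCA"))
```

Trimming the common flanks keeps the description short and runs in linear time, but it collapses several separate changes into one `delins`. Handling that case well is where a real extraction algorithm has to do more work.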
A data integration solution that creates a layer of meaning above the many variation standards and formats, so that they can be integrated and interpreted as a whole. VarioML is not 'Yet Another Data Standard', but a way to make sense of the different formats already used in the lab and clinic, so there is no need to learn another standard. Whatever mutation data vocabularies you work with, VarioML helps resolve the difficulties of merging them with data in other formats.