ANNOVAR is a widely used Perl package for annotating genetic variants detected by NGS. Users can annotate VCF files (or other compatible formats) through three modules:
- Gene-based annotations: identify whether SNPs or CNVs cause protein-coding changes, using gene definitions from RefSeq, UCSC, Ensembl, GENCODE or AceView. Example: “Given a list of variants in VCF format, which genes are affected?”
- Region-based annotations: identify variants that fall within specific genomic regions. Example: “Given a list of differentially methylated regions, which genes lie in these loci?”
- Filter-based annotations: identify variants that are reported in specific databases such as dbNSFP (pathogenicity prediction algorithms), dbSNP, 1000 Genomes, ClinVar or COSMIC. Custom databases can also be supplied: ANNOVAR's generic format or VCF for filter-based annotation, and GFF3 for region-based annotation. Example: “Given a list of variants, how can I find those previously reported as cancer-relevant mutations?”
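
The three modules can be combined in a single run of ANNOVAR's `table_annovar.pl` wrapper, where each database ("protocol") is paired with an operation code: `g` for gene-based, `r` for region-based and `f` for filter-based. Below is a minimal Python sketch that only assembles such a command line. The helper name `build_table_annovar_cmd`, the file paths and the choice of databases (`refGene`, `cytoBand`, `avsnp150`) are assumptions for illustration; the protocols must match databases already downloaded for the chosen genome build.

```python
from typing import List, Sequence, Tuple

# Operation codes accepted here: g = gene-based, r = region-based, f = filter-based.
VALID_OPERATIONS = {"g", "r", "f"}


def build_table_annovar_cmd(
    vcf_path: str,
    db_dir: str,
    out_prefix: str,
    protocols: Sequence[Tuple[str, str]],
    build: str = "hg19",
) -> List[str]:
    """Hypothetical helper: return a table_annovar.pl command as an argument list."""
    for name, op in protocols:
        if op not in VALID_OPERATIONS:
            raise ValueError(f"unknown operation {op!r} for protocol {name!r}")
    return [
        "table_annovar.pl", vcf_path, db_dir,
        "-buildver", build,
        "-out", out_prefix,
        "-remove",                      # delete intermediate files
        "-protocol", ",".join(name for name, _ in protocols),
        "-operation", ",".join(op for _, op in protocols),
        "-nastring", ".",               # placeholder for missing annotations
        "-vcfinput",                    # the input file is a VCF
    ]


# Example database selection (an assumption, one protocol per module).
cmd = build_table_annovar_cmd(
    "sample.vcf",
    "humandb/",
    "sample_anno",
    [("refGene", "g"), ("cytoBand", "r"), ("avsnp150", "f")],
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where ANNOVAR and the listed databases are installed.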
Given a list of VCF files, ANNOVAR produces tab-separated value (TSV) files that can be easily integrated into bioinformatics pipelines or opened directly in a spreadsheet.
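
As an illustration of pipeline integration, the tab-separated output can be read with Python's standard `csv` module. This is a minimal sketch that assumes the run included the `refGene` gene-based annotation, so that columns such as `Func.refGene` and `Gene.refGene` are present. The helper name `exonic_variants` is hypothetical, and the two rows below are invented placeholders, not real variants.

```python
import csv
import io
from typing import Dict, Iterable, List


def exonic_variants(handle: Iterable[str],
                    func_column: str = "Func.refGene") -> List[Dict[str, str]]:
    """Hypothetical helper: keep rows whose gene-based function is 'exonic'."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row.get(func_column) == "exonic"]


# Toy stand-in for an ANNOVAR output file (placeholder values only).
example = io.StringIO(
    "Chr\tStart\tEnd\tRef\tAlt\tFunc.refGene\tGene.refGene\n"
    "1\t1000\t1000\tA\tG\texonic\tGENE_A\n"
    "2\t2000\t2000\tC\tT\tintronic\tGENE_B\n"
)

hits = exonic_variants(example)
for row in hits:
    print(row["Chr"], row["Start"], row["Gene.refGene"])
```

For a real file, replace the `io.StringIO` object with `open(path, newline="")`, where `path` points to the annotated output.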