|Interface|Command line interface|
|Restrictions to use|None|
|Programming languages|C++, Perl|
|License|GNU Lesser General Public License version 3.0|
- Richard Durbin
Comparison of two VCF files, chr order issues
#1 opened on 2017-02-10 by cmonat • 3 answers
Hi, I'm using VCFtools for the first time. I'm trying to compare two VCF files with the following command line:
vcftools --vcf Sample_All.vcf --diff Sample_indA.vcf --diff-site --out Multiple_vs_indA
and I get the following error message:
Error: Both files must be sorted in the same chromosomal order. chr1 in file 2 appears to be out of order.
I know that chr1 is included in both VCF files, so why do I get this message? I have also tried adding the "--not-chr chr1" parameter to my command line, as I understood was advised in this case, but then I get the same error message for chr2. Is this normal? What should I do to resolve this issue? Or is it just a warning message? Thank you, C.
It seems you have an ordering issue between your two files. Check that both files are sorted in the same way. You can sort them with "SortVcf" from Picard tools (https://omictools.com/picard-tool) or with vcf-sort from VCFtools ("vcf-sort file.vcf.gz"). If both files are sorted identically, check whether the same number of chromosomes is present in both files.
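To see where the two files disagree before re-sorting them, a small diagnostic script can help. The sketch below is only an illustration, not how VCFtools performs its own check: the function names `contig_order` and `compare_orders` are hypothetical, and it assumes plain-text or gzip-compressed VCF files with tab-separated records. It lists the chromosomes of each file in order of first appearance, flags chromosomes whose records are split into non-adjacent blocks, and reports whether the chromosomes shared by both files appear in the same order.

```python
import gzip


def contig_order(path):
    """Return (order, split_contigs) for a VCF file.

    order: chromosome names in order of first appearance.
    split_contigs: chromosomes that reappear after another chromosome,
    i.e. whose records are not stored in one contiguous block.
    """
    opener = gzip.open if path.endswith(".gz") else open
    order, seen, split_contigs = [], set(), []
    last = None
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom = line.split("\t", 1)[0]
            if chrom == last:
                continue
            if chrom in seen:
                if chrom not in split_contigs:
                    split_contigs.append(chrom)
            else:
                seen.add(chrom)
                order.append(chrom)
            last = chrom
    return order, split_contigs


def compare_orders(order_a, order_b):
    """Compare two chromosome orderings, restricted to shared chromosomes."""
    shared = set(order_a) & set(order_b)
    return {
        "only_in_a": [c for c in order_a if c not in shared],
        "only_in_b": [c for c in order_b if c not in shared],
        "same_order": [c for c in order_a if c in shared]
        == [c for c in order_b if c in shared],
    }
```

For the files in the question, one would call `contig_order("Sample_All.vcf")` and `contig_order("Sample_indA.vcf")`, then pass the two `order` lists to `compare_orders`. A `same_order` value of `False`, or a non-empty `split_contigs` list, suggests that at least one file needs re-sorting.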
Thank you! It also seems that bcftools (https://omictools.com/bcftools-tool) has replaced vcftools... C.
I don't think one has replaced the other. They are just two different toolkits. Both were published in 2011.
Publication for VCFtools
VCFtools in pipelines (53)
[…] 2013) by BWA (Li and Durbin, 2009). Then SNP/indel variants were called using GATK (McKenna et al., 2010; DePristo et al., 2011). Finally, SNP/indel density was calculated from the variation data by VCFtools (Danecek et al., 2011). To verify the indel variants derived from resequencing analysis, we designed specific PCR primers for randomly selected indel variants with 3 bp or more difference, […]
[…] the genome-wide genetic differentiation between broiler and layer lines. Chromosomes W/Z, unplaced, random, and mitochondrial were not considered in this study. This method was performed using VCFtools v. 0.1.12 software with SNP (n = 12,806,643) and indel (n = 1,273,210) datasets considered separately and using overlapping windows of 20 kb and a step size of 10 kb. Weighted Fst […]
[…] to find the populations wise genetic differentiation with respect to cardiovascular diseases, pair-wise Weir and Cockerham Fst values were calculated for the 1000 Genomes data, using the VCFtools. For this purpose, two approaches were employed, i.e. Fst calculation for all the genes which harbored the predicted deleterious SNVs in this analysis, and for deleterious SNVs […]
[…] Lankan Tamil (STU), and Telugu (ITU). All known SNPs annotated by the 1000 Genomes Project were retrieved as VCF files and filtered for the amplicon regions covered in our AmpliSeq approach using VCFtools (52) and for a MAF > 0.05 using the GenomeAnalysisToolkit (GATK v. 3.2) (53, 54). The filtered SNP list was then pruned using PLINK (55) in order to exclude less informative SNPs […]
[…] depth based on TruSight One sequencing panel target region list. The raw sequencing data have been processed with a custom pipeline based on popular open-source bioinformatics tools BWA, SAMtools, VCFtools, as well as in-house Perl scripts, using hg19 assembly as a reference sequence. In total 49,772 nucleotide variants were found. Variant annotations were added by SnpEff/SnpSift software […]