Protocol publication

[…] me. Each original genomic region entry was in Browser Extensible Data (BED) format. We then removed redundant entries and merged overlapping entries for each genomic feature. For each entry, we computed the coverage percentage from the merged NGS alignment files, and then calculated the average coverage for each genomic feature.

To test whether coverage differed significantly between genomic features, we first performed 1000 bootstrap rounds to obtain 1000 sets of coverage entries for each genomic feature (each round sampling 80% of the total number of entries in the feature category). We then applied a two-sided t-test to each pair of features to obtain the P-value.

For SOAPsnp and MAQ, we used the Phred-scaled likelihood that the genotype is identical to the reference, also called the ‘SNP quality’, as the predictor. For the response, we assigned genotypes 1 and 2 on the Affymetrix array as SNP cases and genotype 0 as SNP controls. When the minor allele was the reference allele, we also converted 0 to 2 and 2 to 0 before ROC display and AUC calculation. SNVMix outputs three probabilities: homozygous for the reference allele, heterozygous, and homozygous for the non-reference allele. We summed the latter two (AB and BB) to obtain the ‘SNP probability’ as the predictor, and again assigned genotypes 1 and 2 on the Affymetrix array as cases and genotype 0 as controls for the response. To assess the statistical significance of differences between classifiers, we first identified the genomic locations covered by both the SNP array and the NGS alignment method (total 583891 in ), then performed 1000 bootstrap rounds to obtain 1000 AUC values for each classifier (each round with 80% of the locations), and finally applied a two-sided t-test to obtain the P-value. To compare the pe […]
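The classifier comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, resampling is assumed to be with replacement (the text does not say), and the t statistic uses Welch's unequal-variance form because the text does not specify which two-sided t-test was used.

```python
import random
from statistics import mean, variance


def snvmix_predictor(p_aa, p_ab, p_bb):
    """Sum of the AB and BB outputs of SNVMix; p_aa is kept only for clarity."""
    return p_ab + p_bb


def array_label(genotype, minor_is_reference=False):
    """Map an array genotype (0, 1, 2) to case (1) or control (0).

    When the minor allele is the reference allele, 0 and 2 are swapped first.
    """
    if minor_is_reference:
        genotype = {0: 2, 1: 1, 2: 0}[genotype]
    return 1 if genotype in (1, 2) else 0


def auc(scores, labels):
    """AUC from the rank-sum identity, using average ranks for tied scores."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc(scores, labels, n_rounds=1000, fraction=0.8, seed=0):
    """Return n_rounds AUC values, each from a resample of `fraction` of the sites.

    Assumption: sampling is with replacement. Draws that contain only cases
    or only controls are skipped and redrawn.
    """
    if len(set(labels)) < 2:
        raise ValueError("need at least one case and one control")
    rng = random.Random(seed)
    n = len(scores)
    size = max(2, int(fraction * n))
    values = []
    while len(values) < n_rounds:
        idx = [rng.randrange(n) for _ in range(size)]
        sub_labels = [labels[i] for i in idx]
        if 0 < sum(sub_labels) < size:
            values.append(auc([scores[i] for i in idx], sub_labels))
    return values


def welch_t(a, b):
    """Welch's t statistic for two samples (assumes non-zero variance)."""
    return (mean(a) - mean(b)) / (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
```

In use, one would call `bootstrap_auc` once per classifier on the shared set of sites and pass the two resulting lists to `welch_t`. Turning the statistic into a two-sided P-value needs the t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.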

Pipeline specifications

Software tools: SOAPsnp, MAQ, SNVMix