Computational protocol: Asthma and genes encoding components of the vitamin D pathway

Protocol publication

[…] SNPs were selected using the CEPH genotype dataset from phase 1 of the International HapMap project. Genotype data were downloaded for the genomic region spanning each gene plus ten kilobases upstream and downstream. A maximally informative set of SNPs was selected using the pairwise tagging algorithm described by Carlson et al., as implemented in the Perl program ldSelect, which was run on each gene. Briefly, this program analyzes the pattern of linkage disequilibrium (LD) between SNPs and groups SNPs in LD into bins based on an r² threshold. The algorithm ensures that all pairwise LD values between SNPs in the same bin exceed the r² threshold. Accordingly, any SNP in a bin can serve as a proxy (tagSNP) for all other SNPs in the same bin, and only one tagSNP needs to be typed per bin. At this stage, nonsynonymous SNPs genotyped in the HapMap dataset were prioritized using the "-required" option. Similarly, some SNPs were prioritized based on the type of variation (A/T, C/T, etc.) to meet the requirements of the genotyping technology. The minor allele frequency and r² thresholds were set at 0.05 and 0.8, respectively, using the "-freq" and "-r2" options. Known nonsynonymous SNPs or functional variants not genotyped in the HapMap dataset were also selected for genotyping. Selected SNPs and their characteristics are listed in an additional file, and the location of the SNPs relative to the gene structure is illustrated in a further additional file. […]
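The binning step described above can be illustrated with a minimal Python sketch. This is not the ldSelect source code, and it is not guaranteed to reproduce ldSelect's output; it is a simplified greedy procedure that follows the description in the text (frequency filter, r² threshold, bins in which every pair of SNPs meets the threshold, one tagSNP per bin, with preferred SNPs chosen first). The function name `select_tag_bins`, its parameters, the choice of `>=` for the threshold comparison, and the rs identifiers and r² values in the example are all hypothetical and chosen for illustration only.

```python
def select_tag_bins(maf, r2, min_maf=0.05, r2_threshold=0.8, required=()):
    """Greedy LD binning sketch (simplified; not the ldSelect implementation).

    maf      : dict mapping SNP id -> minor allele frequency
    r2       : dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2
    required : SNP ids to prefer as tagSNPs (e.g. nonsynonymous SNPs)
    Returns a list of bins, each {"tag": snp_id, "members": [snp_ids]}.
    """
    def linked(a, b):
        # Missing pairs are treated as unlinked (assumption of this sketch).
        return r2.get(frozenset((a, b)), 0.0) >= r2_threshold

    # Step 1: drop SNPs below the minor allele frequency threshold.
    remaining = {snp for snp, freq in maf.items() if freq >= min_maf}
    bins = []

    while remaining:
        # Step 2: seed the bin with the SNP linked to the most remaining SNPs
        # (ties are broken alphabetically so the result is deterministic).
        seed = max(
            sorted(remaining),
            key=lambda s: sum(linked(s, t) for t in remaining if t != s),
        )

        # Step 3: add a candidate only if it is linked to every current
        # member, so all pairs inside the bin meet the r^2 threshold.
        members = [seed]
        for cand in sorted(remaining - {seed}):
            if all(linked(cand, m) for m in members):
                members.append(cand)

        # Step 4: any member can act as the tagSNP; prefer a required SNP.
        tag = next((s for s in members if s in required), seed)
        bins.append({"tag": tag, "members": sorted(members)})
        remaining -= set(members)

    return bins


if __name__ == "__main__":
    # Hypothetical data: five SNPs, one of them below the frequency cutoff.
    example_maf = {"rs1": 0.30, "rs2": 0.28, "rs3": 0.25, "rs4": 0.10, "rs5": 0.02}
    example_r2 = {
        frozenset(("rs1", "rs2")): 0.95,
        frozenset(("rs1", "rs3")): 0.85,
        frozenset(("rs2", "rs3")): 0.90,
        frozenset(("rs3", "rs4")): 0.40,
    }
    for b in select_tag_bins(example_maf, example_r2, required=("rs3",)):
        print(b["tag"], b["members"])
```

With the hypothetical input shown, rs1, rs2 and rs3 fall into one bin tagged by the preferred SNP rs3, rs4 forms a bin of its own, and rs5 is excluded by the frequency filter. Requiring every pair in a bin to meet the threshold is what allows any member to stand in for the others, which is the property the text relies on when it says only one tagSNP needs to be typed per bin.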

Pipeline specifications

Software tools: ldSelect, SNPinfo
Databases: International HapMap Project
Application: GWAS
Diseases: Asthma
Chemicals: Vitamin D