Computational protocol: Genetic Loci Involved in Antibody Response to Mycobacterium avium ssp. paratuberculosis in Cattle

Protocol publication

[…] Genotype quality assurance was performed within the R statistical environment using the “check.marker” function of the GenABEL package. Data were quality controlled for marker call rate, minor allele frequency (MAF) and Hardy-Weinberg equilibrium (HWE): markers missing 5% of data or with a MAF of less than 2% were removed, as were markers that were significantly out of HWE. Genotyping efficiency for samples was also verified, and samples with more than 5% missing data were removed. The duplicated samples showed 99.9% concordance of genotype calls.
Classical multidimensional scaling (MDS) was used to explore population substructure and to verify the genetic homogeneity of the sample set prior to analysis. Pairwise identities by state (IBS) were calculated for all 966 samples based on autosomal SNPs using identity matrices as implemented in the GenABEL library. […]
Genome-wide association analysis was performed using the GenABEL package in R with a three-step GRAMMAR-GC approach (Genome-wide Association using Mixed Model and Regression - Genomic Control), with the extension of using the genomic kinship matrix, estimated from genomic marker data, instead of the pedigree. First, an additive polygenic model was fitted with the “polygenic” function of the GenABEL library to obtain individual environmental residuals, disentangling the cryptic population structure caused by the presence of closely related animals in the sample set. To account for relatedness, the variance/covariance matrix was estimated from the genomic kinship matrix, as pedigree information was not available. The relationship matrix used in the analysis was estimated from genomic data with the “ibs” function of GenABEL (option weight = “freq”). Secondly, association was tested using a simple least squares method on the residuals, which are corrected for cryptic relatedness and familial correlation and are independent of pedigree structure.
Thirdly, the Genomic Control (GC) approach was used to correct for the conservativeness of the GRAMMAR test, based on estimation of the lambda factor: the median of all genome-wide observed test statistics divided by the expected median of the test statistic under the null hypothesis of no association, assuming that the number of true associations is very small compared to the number of tests performed.
Cases were defined as animals serologically positive for MAP by ELISA with a sample-to-positive ratio (S/P) >0.7, and MAP-negative controls as animals with an S/P <0.6. Cases were coded as 1 and MAP-negative controls as 0. Uncorrected p-values <5×10⁻⁷ were taken as very strong evidence of genome-wide association, while p-values between 5×10⁻⁷ and 5×10⁻⁵ were considered moderately significant associations.
SNP effects were then estimated using the formula V = 2pqa², where p and q are the frequencies of the minor and major alleles and a is the allelic substitution effect. Further to the initial genome-wide association study (GWAS), a confirmatory association study (CMAS) was carried out using a smaller, randomly selected subset of animals from the same cohort of Holstein samples used in the initial GWAS. These data were analysed with the same statistical approach as described above. The threshold for confirming significant results in the smaller Holstein cohort was a p-value of less than 0.05 divided by the number of SNPs tested (n = 6).
SNP locations and gene names were based on the Btau_4.0 assembly, released on 4 October 2009. All analyses were carried out within the R statistical environment. […]
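
The arithmetic behind the phenotype coding, the lambda factor, the significance thresholds and the formula V = 2pqa² can be sketched in a few lines. The following is a minimal illustration in plain Python, not the GenABEL code used in the study: all function names are hypothetical, it assumes the test statistics are 1-df chi-square values, and it assumes that animals with an S/P between the two cut-offs are left uncoded, which the text does not state.

```python
from statistics import median

# Expected median of a chi-square distribution with 1 degree of freedom,
# i.e. the median of the test statistic under the null hypothesis.
CHI2_1DF_MEDIAN = 0.4549364


def code_phenotype(sp_ratio):
    """Code an animal as case (1) or control (0) from its ELISA S/P ratio.

    Returning None for ratios from 0.6 to 0.7 is an assumption; the protocol
    does not say how such animals were handled.
    """
    if sp_ratio > 0.7:
        return 1
    if sp_ratio < 0.6:
        return 0
    return None


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) using the median-based estimate.

    lambda = observed median / expected median; each statistic is divided by
    lambda, so lambda < 1 (a conservative test) scales the statistics up.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    return lam, [s / lam for s in chi2_stats]


def classify_p(p_value):
    """Apply the two significance tiers quoted in the protocol.

    Treating the boundaries as half-open intervals is an assumption.
    """
    if p_value < 5e-7:
        return "strong"
    if p_value < 5e-5:
        return "moderate"
    return "none"


def snp_variance(minor_allele_freq, substitution_effect):
    """Compute V = 2pqa^2 with q = 1 - p."""
    p = minor_allele_freq
    q = 1.0 - p
    return 2.0 * p * q * substitution_effect ** 2


def confirmation_threshold(n_snps, alpha=0.05):
    """Threshold for the confirmatory study: alpha divided by SNPs tested."""
    return alpha / n_snps
```

With the six SNPs tested in the confirmatory cohort, `confirmation_threshold(6)` gives a threshold of roughly 0.0083.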

Pipeline specifications

Software tools: GenABEL, kinship
Applications: Population genetic analysis, GWAS
Organisms: Mycobacterium avium, Bos taurus
Diseases: Paratuberculosis, Mycobacterium avium-intracellulare Infection