Computational protocol: A genome-wide association study in a large F2-cross of laying hens reveals novel genomic regions associated with feather pecking and aggressive pecking behavior

Protocol publication

[…] To investigate the mapping resolution of the design, the linkage disequilibrium (LD) structure was examined for the first nine chromosomes, i.e. GGA1 to GGA9 (GGA for Gallus gallus chromosome). The Beagle Genetic Software Analysis [, ], which is included in the synbreed R package [], was used to phase haplotypes, and the common LD measure $r^2$ was then estimated using PLINK [] for pairs of SNPs that were <5 Mb apart across the autosomes.

GWAS are frequently conducted using mixed linear models (e.g., []). In their simplest form, such models include a general mean, a fixed SNP effect and a random family effect. The latter is important to capture population stratification effects and, hence, to prevent inflation of type I errors (e.g., []). Previous studies showed that FPD, APD and APR are not normally distributed and that Poisson models should be used for the statistical analyses [, ]. Poisson models with fixed and random effects belong to the class of generalized linear mixed models (GLMM). Because the likelihood of these models has no closed-form expression, approximate likelihood techniques are often used, for example in the software ASReml []. However, the behavior of these techniques for hypothesis testing has not been sufficiently well investigated, and Collins [] recommended that GLMM should not be used for this purpose. Therefore, we used the following generalized linear model, based on the Poisson distribution and without random effects, for single-marker association analysis:

$$\eta_{ijm} = H_j + S_i + D_i + b_m x_{im}, \tag{1}$$

where $\eta_{ijm}$ is the linear predictor for hen $i$ and SNP $m$, $H_j$ is the fixed hatch effect, $S_i$ and $D_i$ are the fixed sire and dam effects, respectively, $x_{im}$ denotes the number of copies of the minor allele of SNP $m$ ($x$ = 0, 1, or 2), and $b_m$ is the regression coefficient for SNP $m$.
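As an illustration of the single-marker test in Model (1), the following is a minimal sketch of a Poisson regression fitted by Newton-Raphson using only the Python standard library. Several things here are assumptions rather than details from the protocol: the log link (the canonical choice for Poisson models), the Wald test on the SNP coefficient, the function names, and the toy data. For brevity, the hatch, sire and dam effects are replaced by a single intercept; in a real analysis they would enter the design matrix as dummy-coded columns.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def information(X, mu):
    """Fisher information X' diag(mu) X of a Poisson model with log link."""
    k = len(X[0])
    return [[sum(row[a] * row[c] * m for row, m in zip(X, mu))
             for c in range(k)] for a in range(k)]

def fitted_means(X, beta):
    """Expected counts exp(eta), with eta the linear predictor X beta."""
    return [math.exp(sum(x * b for x, b in zip(row, beta))) for row in X]

def fit_poisson_glm(X, y, max_iter=100, tol=1e-10):
    """Return (coefficients, standard errors) from Newton-Raphson iterations."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        mu = fitted_means(X, beta)
        score = [sum(row[j] * (yi - m) for row, yi, m in zip(X, y, mu))
                 for j in range(k)]
        step = solve(information(X, mu), score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    info = information(X, fitted_means(X, beta))
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))  # j-th diagonal of info^-1
    return beta, se

def wald_p(b, se):
    """Two-sided p value of the Wald statistic b / se (normal approximation)."""
    return math.erfc(abs(b / se) / math.sqrt(2.0))

# Hypothetical toy data: pecking counts and minor-allele copies for six hens.
genotypes = [0, 0, 1, 1, 2, 2]
counts = [0, 2, 1, 3, 3, 5]
X = [[1.0, float(g)] for g in genotypes]  # intercept + SNP column x_im
beta, se = fit_poisson_glm(X, counts)
print("b_m =", beta[1], "SE =", se[1], "p =", wald_p(beta[1], se[1]))
```

A plain Newton-Raphson loop is enough here because the Poisson log-likelihood is concave. With many sire and dam levels, a dedicated GLM routine would be the more practical choice.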
Thus, instead of fitting a random family effect, we included fixed sire and dam effects in the model to account for population stratification effects.

In a previous study, we detected substantial permanent environmental effects for FPD, APD and APR [], which could also be caused by dominant gene effects. Because dominance and additive gene effects tend to be correlated, such that larger dominance deviations are observed for genes with larger additive effects [], we tested only genome-wide significant SNPs from Model (1) or from the meta-analysis (described below) for dominance effects, using the following Poisson model:

$$\eta_{ijm} = H_j + S_i + D_i + b_m x_{im} + \tilde{b}_m z_{im}, \tag{2}$$

where $z_{im}$ is an indicator variable that is 1 if the individual is heterozygous at SNP $m$ and 0 if it is homozygous, and $\tilde{b}_m$ is a fixed regression coefficient that estimates the dominance effect. The other terms are defined as in Model (1).

To correct for multiple testing, we applied a Bonferroni-type correction:

$$p_{\text{genome-wide}} = 1 - (1 - p)^{\#\text{SNP}},$$

where the number (#) of SNPs was equal to 29,376. The genome-wide significance level was set at $p_{\text{genome-wide}} \le 0.05$. Because Bonferroni's correction is very conservative, we considered an additional nominal significance level, i.e. $p < 5 \times 10^{-5}$. To estimate the number of false positives among the significant SNPs, we calculated a false discovery rate (FDR) q value for each association test using the software QVALUE []. The FDR q value of the significant SNP with the largest p value provided an estimate of the proportion of false positives among the significant SNPs.

A meta-analysis was performed using the data from the selection experiment and the F2-cross experiment. We combined the p values from both studies using the inverse Chi square method of Fisher [], known as Fisher's combined probability test, as follows:

$$\chi^2_{2k} \sim -2 \sum_{i=1}^{k} \ln p_i,$$

where $p_i$ is the p value for the $i$th hypothesis test and $k$ is the number of studies being combined (i.e., $k = 2$ in our study).
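Fisher's combined probability test described above can be sketched in a few lines of standard-library Python. The function name and the example p values are hypothetical. Because the reference distribution has an even number of degrees of freedom (2k), the chi-square upper tail has a closed form and no statistics library is needed.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    Returns (statistic, combined_p), where the statistic -2 * sum(ln p_i)
    is compared with a chi-square distribution on 2k degrees of freedom.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    # Upper tail for even df = 2k: exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return stat, math.exp(-half) * total

# Hypothetical p values of one SNP in the two studies (k = 2).
stat, p_meta = fisher_combined_p([0.05, 0.05])
print("chi-square statistic:", stat, "combined p:", p_meta)
```

For k = 2 the combined p value reduces to P(1 − ln P), where P is the product of the two p values, which gives a quick way to check results by hand.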
The significance levels used for the p values obtained from the meta-analysis were the same as those for the GWAS (Model 1). […]
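The genome-wide adjustment used for both the GWAS and the meta-analysis can be illustrated with the short sketch below. The function names are hypothetical, and the formula is implemented as 1 − (1 − p) raised to the number of SNPs, using `math.log1p` and `math.expm1` so that very small p values do not lose precision.

```python
import math

N_SNPS = 29376  # number of SNPs stated in the protocol

def genome_wide_p(p, n_tests=N_SNPS):
    """Adjusted p value 1 - (1 - p)**n_tests, computed stably for small p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 1.0:
        return 1.0
    return -math.expm1(n_tests * math.log1p(-p))

def per_test_threshold(alpha=0.05, n_tests=N_SNPS):
    """Largest single-test p value whose adjusted value is still <= alpha."""
    return -math.expm1(math.log1p(-alpha) / n_tests)

print("single-test threshold for genome-wide 0.05:", per_test_threshold())
print("adjusted value of the nominal level 5e-5:", genome_wide_p(5e-5))
```

Under this formula a SNP needs a single-test p value of roughly 1.7 × 10−6 to reach genome-wide significance, whereas the nominal level of 5 × 10−5 corresponds to an adjusted p value of roughly 0.77.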

Pipeline specifications

Software tools: BEAGLE, synbreed, PLINK
Databases: FPD
Application: GWAS
Organisms: Gallus gallus