Computational protocol: A Genome-Wide Linkage Study for Chronic Obstructive Pulmonary Disease in a Dutch Genetic Isolate Identifies Novel Rare Candidate Variants

Protocol publication

[…] For the genome-wide linkage analysis, 142 related COPD cases from ERF were used. The cases were linked in a single large pedigree of 23 generations. However, because of constraints of the linkage software, the cases were clustered into 27 smaller (≤24 bits) families using PEDCUT software. We used HaploPainter to illustrate all 27 pedigrees (Supplementary Figure). We then performed affected-only parametric linkage analysis in MERLIN, using incomplete penetrance and no phenocopies for both the dominant (0, 0.5, 0.5) and recessive (0, 0, 0.5) models. The likelihood of linkage is measured by the LOD score, and we considered LOD ≥ 3.3 to be statistically significant. We further performed per-family analysis of the significant regions to identify the families with COPD cases contributing most to the LOD score. [...]

Next, we used exome-sequence data in ERF to identify rare variants that may explain the identified linkage peaks. Among all variants in this region, we selected only those with predicted damaging effects on the protein (missense and stop-coding), based on the FunctionGVS column of the SeattleSeq Annotation database from the National Heart, Lung and Blood Institute (NHLBI), and with MAF < 0.05 in the general population (1000 Genomes). Because frequencies in a genetically isolated population may be inflated or deflated by genetic drift, we used the MAF from the general population for filtering. We selected variants shared among most (>50%) of the affected family members as candidate variants.

A formal test of association was performed for the identified candidate variants in each study: ERF, in samples with exome-sequence (N = 636) and exome-chip (N = 572) data; the three RS cohorts (RS-I, RS-II, and RS-III), using the HRC-imputed data (N = 11,372); the LLS (N = 11,177); the VlaVla cohort (N = 1,394); and the results (N = 11,797).
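The candidate-variant selection described above (damaging annotation, general-population MAF < 0.05, shared by more than 50% of affected family members) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, the `select_candidates` function, and the annotation labels in `DAMAGING_FUNCTIONS` are hypothetical and should be checked against the actual SeattleSeq FunctionGVS output.

```python
from dataclasses import dataclass, field

# Assumed FunctionGVS labels standing in for "missense and stop-coding";
# verify against the real annotation file before use.
DAMAGING_FUNCTIONS = frozenset({"missense", "stop-gained", "stop-lost"})


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one annotated variant in the linkage region."""
    variant_id: str
    function_gvs: str            # functional annotation label
    maf_general: float           # MAF in the general-population reference
    carriers: frozenset = field(default_factory=frozenset)  # carrier sample IDs


def select_candidates(variants, affected_ids,
                      damaging=DAMAGING_FUNCTIONS,
                      maf_threshold=0.05, min_share=0.5):
    """Return IDs of variants passing all three filters, in input order."""
    affected = set(affected_ids)
    if not affected:
        return []
    selected = []
    for v in variants:
        if v.function_gvs not in damaging:
            continue  # not predicted damaging
        if not v.maf_general < maf_threshold:
            continue  # too common in the general population
        # Fraction of affected family members who carry the variant.
        share = len(affected & set(v.carriers)) / len(affected)
        if share > min_share:  # strictly more than half, as in the text
            selected.append(v.variant_id)
    return selected
```

The MAF passed in is the general-population value rather than the in-cohort frequency, mirroring the rationale in the text that drift can distort frequencies within an isolate.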
For this analysis, in ERF we used the “seqMeta” package in R to perform single-variant analysis, adjusted for age, sex, and smoking status (current/past/never smoking). Logistic regression was used to test the variants for association in the RS and VlaVla cohorts, using SPSS, and in LLS, using PLINK, applying the same models as in ERF. Variants were excluded from the analysis if the minor allele count was less than five in either the case or the control category. Summary statistics for the identified variants were extracted from the results of . A fixed-effects meta-analysis was performed with the summary statistics from all studies using the “rmeta” package in R. […]
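To make the last two steps concrete, the sketch below shows the minor-allele-count exclusion rule and a conventional inverse-variance fixed-effects pooling of per-study effect estimates. It is written in plain Python for illustration and does not reproduce the rmeta or seqMeta interfaces; the function names and the assumption that each study supplies a log-odds-ratio estimate with its standard error are ours.

```python
import math


def passes_mac_filter(mac_cases, mac_controls, min_count=5):
    """Keep a variant only if the minor allele count reaches min_count
    in both the case and the control group."""
    return mac_cases >= min_count and mac_controls >= min_count


def fixed_effects_meta(betas, ses):
    """Inverse-variance fixed-effects pooling of per-study estimates.

    betas: per-study effect estimates (assumed here to be log odds ratios)
    ses:   matching standard errors (all > 0)
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in ses]      # w_i = 1 / se_i^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p
```

With two studies of equal precision, for instance, the pooled estimate is simply the mean of the two effects, and the pooled standard error shrinks by a factor of √2 relative to either study alone.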

Pipeline specifications

Software tools: PedCut, HaploPainter, Merlin, SeattleSeq Annotation, seqMeta, PLINK, rmeta
Applications: Population genetic analysis, GWAS
Chemicals: Niacin, Niacinamide