Computational protocol: Determining the prevalence of McArdle disease from gene frequency by analysis of next generation sequencing data

Protocol publication

[…] We evaluated variant call data from the ClinSeq® cohort (n=951) and the NHLBI GO Exome Sequencing Project (ESP) (n=4,297 European American [EA] and 2,201 African American [AA]). The ClinSeq® cohort is composed of 951 participants, predominantly of Caucasian descent, ascertained for their family history of cardiovascular disease; they are otherwise healthy and were not selected for known muscular conditions or symptoms. The ESP cohort is composed of several groups of patients. Most have a personal or family history of cardiovascular or pulmonary disease; some are healthy controls, while others are affected with hyperlipidemia, cardiovascular disease, or other associated conditions. Neither cohort was selected for primary muscle disease.
We first analyzed variant calls for the PYGM gene in the ClinSeq® database. Materials and methods for the ClinSeq® study are described elsewhere; DNA isolation, library preparation, capture, sequencing, alignment, and base calling were performed as described in previous reports. PYGM variant analysis was performed in VarSifter v1.6. Variants were filtered for mutation type and population frequency. Variants that met the population frequency filter (MAF <0.5% in ClinSeq® and ESP) and the quality filters were further classified by cross-referencing them with mutations in the Human Gene Mutation Database (HGMD). The pathogenicity of these variants was evaluated by reviewing publications with clinical, functional, and/or genetic data. To be considered pathogenic, a variant had to be reported in the literature in a patient with classical manifestations of the disease, compatible ancillary testing (e.g., characteristic muscle biopsy, absent muscle phosphorylase levels, or second-wind phenomenon on treadmill testing), and biallelic variants identified in PYGM. The phase of the variants had to be known and appropriate Mendelian segregation confirmed.
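The frequency and quality filtering followed by the HGMD cross-reference can be sketched roughly as follows. This is a minimal illustration, not the authors' VarSifter workflow: the `VariantCall` record, its field names, and the `hgmd_changes` set are hypothetical, the MAF cutoff is assumed to apply to both cohorts at once, and the mutation-type filter is left out because the excerpt does not say which types were retained.

```python
from dataclasses import dataclass

MAF_THRESHOLD = 0.005  # 0.5 %, the cutoff stated in the protocol


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one PYGM variant call."""
    protein_change: str    # e.g. an HGVS protein-level name
    maf_clinseq: float     # minor allele frequency in ClinSeq
    maf_esp: float         # minor allele frequency in ESP
    passes_quality: bool   # outcome of the (unspecified) quality filters


def is_candidate(v: VariantCall) -> bool:
    """Keep rare, good-quality calls (assumes the cutoff applies to both cohorts)."""
    return (
        v.passes_quality
        and v.maf_clinseq < MAF_THRESHOLD
        and v.maf_esp < MAF_THRESHOLD
    )


def cross_reference(variants, hgmd_changes):
    """Split candidates into those listed in HGMD and those that are not."""
    candidates = [v for v in variants if is_candidate(v)]
    listed = [v for v in candidates if v.protein_change in hgmd_changes]
    unlisted = [v for v in candidates if v.protein_change not in hgmd_changes]
    return listed, unlisted
```

Variants in the `listed` group would then go through the literature review described above, which is a manual step and is not modeled here.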
For variants not described in the literature, further classification was limited to allele frequency in the general population and in silico model predictions: PolyPhen-2, SIFT, and CADD (Combined Annotation Dependent Depletion) score. Variants that did not meet our criteria for classification as pathogenic, but were predicted to be deleterious by all four models and had a MAF <0.5%, were considered variants of uncertain significance (VOUS). Variants with a MAF >0.5%, and unpublished variants predicted to be benign by one or more in silico models, were considered likely benign.
Statistical analysis for the 95% confidence intervals was performed using the exact binomial method based on the beta distribution, as described by Clopper and Pearson. Variants p.Arg50* and p.Gly205Ser were Sanger verified for the ClinSeq® cohort; Sanger validation is not possible for variants in the ESP cohort. […]
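The tiering rules and the exact binomial interval described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The function names and arguments are hypothetical. `classify` takes the outcome of the manual literature review as a boolean and one deleterious/benign call per in silico model. `clopper_pearson` finds the Clopper-Pearson limits by bisection on the binomial tail probabilities, which yields the same limits as the beta-distribution quantiles named in the text but needs only the standard library.

```python
import math

MAF_THRESHOLD = 0.005  # 0.5 %, the cutoff stated in the protocol


def classify(meets_literature_criteria, maf, predicted_deleterious):
    """Assign a tier; predicted_deleterious holds one bool per in silico model."""
    if meets_literature_criteria:
        return "pathogenic"
    if maf < MAF_THRESHOLD and predicted_deleterious and all(predicted_deleterious):
        return "VOUS"
    # Simplification: everything else, including a MAF of exactly 0.5 %, lands here.
    return "likely benign"


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0 or p >= 1.0:
        return 0.0 if k < n else 1.0
    if k >= n or p <= 0.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        total += math.exp(log_choose + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""

    def solve(cdf_at, target):
        # cdf_at(p) decreases as p grows, so bisect for cdf_at(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper
```

For example, `clopper_pearson(k, 2 * 951)` would give an interval for an allele frequency if `k` pathogenic alleles were counted among the ClinSeq® chromosomes. Whether the authors computed the interval on alleles or on carriers is not stated in the excerpt, so that choice is an assumption.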

Pipeline specifications

Software tools: VarSifter, PolyPhen, SIFT, CADD
Databases: HGMD
Applications: WGS analysis, WES analysis
Organisms: Homo sapiens
Diseases: Glycogen Storage Disease Type V