Computational protocol: Identification of GAA variants through whole exome sequencing targeted to a cohort of 606 patients with unexplained limb-girdle muscle weakness

Protocol publication

[…] Ethical approval was granted by the Newcastle and North Tyneside research ethics committee (REC reference number 09/H0906/28) and by the local ethical committees of the participating centres. A standardised form for collecting detailed phenotypic information was created using the PhenoTips online software tool; it was completed by the referring clinician for each patient enrolled in the project. Informed written consent was given by the patients, who were anonymised by the collaborating centres using unique MYO-SEQ patient identification codes. The fundamental requirement for inclusion in the project was unexplained limb-girdle muscle weakness and/or elevated serum creatine kinase activity. [...]
Whole exome sequencing and data processing were performed by the Genomics Platform at the Broad Institute of Harvard and MIT (Broad Institute, Cambridge, MA, USA). Briefly, whole exome sequencing was performed on DNA samples (>250 ng at >2 ng/μl) using Illumina exome capture (38 Mb target). Our exome sequencing pipeline included sample plating, library preparation (2-plexing of samples per hybridisation), hybrid capture, sequencing (76 bp paired reads), sample identification, quality control checks, and data storage. Our hybrid selection libraries cover >80% of targets at 20× and have an overall mean target coverage of >80×, while GAA had a mean coverage of 87.1×. The exome sequencing data were de-multiplexed and each sample’s sequence data were aggregated into a single Picard BAM file. The data were processed through a pipeline based on Picard using base quality score recalibration and local realignment around known insertions/deletions. The reads were mapped to the human genome build 37 (hg19) using the Burrows-Wheeler Aligner. Single nucleotide polymorphisms and insertions/deletions were jointly called using the Genome Analysis Toolkit HaplotypeCaller package v3.1.
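The coverage figures quoted above (>80% of targets at 20×, mean target coverage >80×) can be checked with a few lines of Python. This is only an illustrative sketch, not the Broad Institute's quality control code: the function name `coverage_summary`, the input format (one mean depth per capture target), and the example depths are all assumptions made here. Averaging per-target means also ignores target length, so it only approximates a true mean target coverage.

```python
from statistics import mean

# Thresholds taken from the text; everything else below is hypothetical.
MIN_DEPTH = 20          # "20x"
MIN_FRACTION = 0.80     # ">80% of targets at 20x"
MIN_MEAN_DEPTH = 80.0   # "overall mean target coverage of >80x"


def coverage_summary(target_depths):
    """Summarise a list of per-target mean depths (assumed input format)."""
    if not target_depths:
        raise ValueError("no targets supplied")
    # Fraction of targets whose mean depth reaches the 20x threshold.
    fraction_at_min = sum(d >= MIN_DEPTH for d in target_depths) / len(target_depths)
    # Unweighted mean across targets (a simplification: ignores target length).
    mean_depth = mean(target_depths)
    return {
        "fraction_at_20x": fraction_at_min,
        "mean_depth": mean_depth,
        "passes": fraction_at_min > MIN_FRACTION and mean_depth > MIN_MEAN_DEPTH,
    }


# Invented depths for ten imaginary targets, for illustration only.
depths = [95.0, 110.0, 87.1, 15.0, 120.0, 90.0, 101.0, 99.0, 130.0, 88.0]
summary = coverage_summary(depths)
print(summary)
```

In practice these metrics would normally come from a dedicated tool run on the BAM file rather than from a hand-built list; the sketch only makes the two thresholds concrete.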
Default filters were applied to the variant calls using the Genome Analysis Toolkit Variant Quality Score Recalibration approach, and the variants were annotated using Variant Effect Predictor. [...] The variant call set was uploaded to the Broad Institute of Harvard and MIT’s seqr platform. The biological relevance of the variants identified within GAA was determined by considering (i) the population frequency reported by the Exome Aggregation Consortium (ExAC) of the Broad Institute of Harvard and MIT, (ii) the deleteriousness of the variant as predicted by PolyPhen-2, SIFT, MutationTaster2, and FATHMM, and (iii) ClinVar reports of pathogenicity and the published literature. […]
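The three criteria for biological relevance can be combined into a simple triage step, sketched below in Python. The excerpt does not state any cut-offs, so the allele-frequency threshold (`max_af`), the number of predictors required to agree (`min_damaging`), the `Variant` record, and the example variants are all hypothetical. Predictor outputs are reduced to a boolean "predicted damaging" flag here, whereas each real tool reports its own scores and labels.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# The four predictors named in the text.
PREDICTORS = ("PolyPhen-2", "SIFT", "MutationTaster2", "FATHMM")
# ClinVar labels treated as supportive in this sketch (an assumption).
PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic", "pathogenic/likely pathogenic"}


@dataclass
class Variant:
    name: str
    exac_af: Optional[float]                 # None means absent from ExAC
    damaging: Dict[str, bool] = field(default_factory=dict)
    clinvar: Optional[str] = None            # ClinVar clinical significance, if any


def triage(variant, max_af=0.01, min_damaging=3):
    """Flag a variant for manual review; thresholds are illustrative only."""
    # (i) population frequency
    rare = variant.exac_af is None or variant.exac_af <= max_af
    # (ii) in silico deleteriousness: count predictors calling it damaging
    n_damaging = sum(variant.damaging.get(p, False) for p in PREDICTORS)
    # (iii) ClinVar report, matched exactly to avoid "conflicting" labels
    reported = variant.clinvar is not None and variant.clinvar.lower() in PATHOGENIC_LABELS
    return {
        "rare": rare,
        "n_damaging": n_damaging,
        "clinvar_pathogenic": reported,
        "candidate": rare and (reported or n_damaging >= min_damaging),
    }


# Invented variants for illustration; not taken from the study.
all_damaging = {p: True for p in PREDICTORS}
variant_a = Variant("hypothetical_A", 0.0001, all_damaging)
variant_b = Variant("hypothetical_B", 0.2, all_damaging, "Pathogenic")
variant_c = Variant("hypothetical_C", None, {}, "Likely pathogenic")

for v in (variant_a, variant_b, variant_c):
    print(v.name, triage(v))
```

A filter like this only shortlists variants; review against the published literature, as the protocol describes, remains a manual step.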

Pipeline specifications

Software tools: PhenoTips, Picard, BWA, GATK, VEP, PolyPhen, SIFT, MutationTaster, FATHMM
Databases: ClinVar
Application: WES analysis
Organisms: Homo sapiens
Diseases: Glycogen Storage Disease Type II; Neuromuscular Diseases; Respiratory Insufficiency; Genetic Diseases, Inborn
Chemicals: Creatine