Computational protocol: Diagnostic outcomes of exome sequencing in patients with syndromic or non-syndromic hearing loss

Protocol publication

[…] Clinical exome sequencing (cES) was performed using in-solution capture of exonic sequences with the Nextera Rapid Capture Enrichment kit (Illumina, USA), targeting the exons of 4813 genes associated with human genetic disease (TruSight One Panel, Illumina, USA). Sequencing was performed on the Illumina MiSeq platform with 2 x 100 paired-end reads. Raw sequence files were processed using a custom in-house exome analysis pipeline based on a GATK best practices backbone. Reads were aligned to the human reference assembly (hg19) using the Burrows-Wheeler Aligner (BWA), and duplicate sequences were removed using Picard MarkDuplicates, followed by base quality score recalibration, variant calling, variant quality score recalibration and variant filtering using elements of the GATK toolset []. [...]

Variants were stored and annotated in a variant collection and annotation system based on vtools and ANNOVAR software. RefSeq gene models were used for transcript positioning of variants, and annotations from dbSNP v138 were used for single nucleotide polymorphism (SNP) annotation. The Slovene genomic variation database, based on a compilation of 1500 exomes, was considered the primary source for assessing the prevalence of variants in the population. Furthermore, the datasets of the Exome Aggregation Consortium (ExAC, exac.broadinstitute.org), UK10K control population (www.uk10k.org) and GoNL (www.nlgenome.nl) projects were employed as sources of variant frequencies in other worldwide populations. Consensus calls of dbNSFP v2 precomputed pathogenicity predictions were used to evaluate the pathogenicity of missense variants. Additionally, SnpEff predictors were used for parallel annotation of variant effects. GERP++ rejected substitution (RS) scores were used as the primary source of information on evolutionary sequence conservation.
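
The text names four population frequency sources but does not say how their values were combined for a single variant. As a minimal sketch, assuming the most conservative choice (the highest frequency observed in any source is the one compared against a threshold), the lookup might look like the following. The function name, the dictionary keys and the treatment of a missing entry as "not observed" are all assumptions made for illustration, not details from the protocol.

```python
from typing import Dict, Optional


def max_population_frequency(freqs: Dict[str, Optional[float]]) -> float:
    """Return the highest allele frequency reported by any source.

    `freqs` maps a source label (hypothetical keys such as "slovene",
    "exac", "uk10k", "gonl") to an allele frequency in the range 0..1,
    or to None when the variant is absent from that source.
    Assumption: a variant absent from every source is treated as 0.0.
    """
    observed = [f for f in freqs.values() if f is not None]
    return max(observed) if observed else 0.0


# Illustrative values only, not data from the study.
example = {"slovene": 0.001, "exac": None, "uk10k": 0.004, "gonl": 0.0}
print(max_population_frequency(example))
```

Taking the maximum means a variant common in any one population is treated as common overall, which errs toward excluding it; a pipeline that weights the local (Slovene) database more heavily would need a different rule.
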
Our pipeline included ClinVar, HGMD (http://www.hgmd.cf.ac.uk/ac/index.php), LOVD (http://www.lovd.nl/3.0/home) and the Hereditary Hearing Loss Homepage databases as sources of known disease associations for identified variants.

The search for causative variants first focused on genes already associated with HL (). In syndromic HL patients, we surveyed variants in genes associated with the syndromic features that accompanied the hearing impairment. The associations were tracked using the Human Phenotype Ontology database (http://human-phenotype-ontology.github.io/). We supplemented this list of genes with genes in deafness gene panels [].

A minimum median coverage of 60x was required to proceed with the interpretation of exome sequencing data. Variants were taken into consideration if they were covered by at least 5 reads and if the GATK variant call quality score exceeded 100.0.

We filtered the variants according to the mode of inheritance and variant functional effect (we considered missense, nonsense, splice site, in-frame indel and frameshift indel variants in our analysis), and by masking the variant set with phenotype gene panels. Considering the relatively high frequency of the more prevalent deafness-associated variants in the general population, we used relaxed frequency threshold criteria for variant selection. For variants in genes associated with dominant inheritance, we filtered out variants with a frequency above 0.01% in controls. Conversely, for variants in genes associated with recessive inheritance, we excluded variants with a minor allele frequency above 2% in the general population. […]
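
The quality and frequency criteria above can be sketched as a pair of predicates. This is an illustration under stated assumptions, not the authors' in-house code: the `Variant` record, its field names, the effect labels and the gene names in the usage example are hypothetical, while the numeric thresholds are the ones given in the text.

```python
from dataclasses import dataclass
from statistics import median
from typing import Iterable, Set

# Thresholds taken from the protocol text.
MIN_MEDIAN_COVERAGE = 60      # sample-level gate (60x)
MIN_VARIANT_DEPTH = 5         # reads covering the variant
MIN_CALL_QUALITY = 100.0      # GATK call quality must exceed this
MAX_FREQUENCY = {
    "dominant": 0.0001,       # 0.01 %
    "recessive": 0.02,        # 2 %
}

# Hypothetical labels for the functional effects listed in the text.
CONSIDERED_EFFECTS = {
    "missense", "nonsense", "splice_site",
    "inframe_indel", "frameshift_indel",
}


@dataclass
class Variant:
    """Hypothetical minimal variant record used only for this sketch."""
    gene: str
    effect: str
    depth: int
    qual: float
    population_freq: float    # allele frequency in the range 0..1
    inheritance: str          # "dominant" or "recessive"


def sample_passes_qc(coverage_values: Iterable[float]) -> bool:
    """A sample is interpreted only if its median coverage is at least 60x."""
    return median(coverage_values) >= MIN_MEDIAN_COVERAGE


def variant_passes(v: Variant, gene_panel: Set[str]) -> bool:
    """Apply depth, quality, effect, gene panel and frequency filters."""
    if v.depth < MIN_VARIANT_DEPTH or v.qual <= MIN_CALL_QUALITY:
        return False
    if v.effect not in CONSIDERED_EFFECTS:
        return False
    if v.gene not in gene_panel:
        return False
    # Raises KeyError for an unknown inheritance mode rather than guessing.
    return v.population_freq <= MAX_FREQUENCY[v.inheritance]


# Illustrative usage with placeholder gene names and made-up values.
panel = {"GENE_A", "GENE_B"}
recessive_hit = Variant("GENE_A", "missense", 40, 850.0, 0.012, "recessive")
dominant_hit = Variant("GENE_B", "missense", 40, 850.0, 0.012, "dominant")
print(variant_passes(recessive_hit, panel), variant_passes(dominant_hit, panel))
```

The same 1.2% frequency is kept under the recessive threshold and rejected under the dominant one, which is the asymmetry the text describes.
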

Pipeline specifications

Software tools: GATK, BWA, Picard, vtools, ANNOVAR, SnpEff, GERP
Databases: dbNSFP, ClinVar, dbSNP, LOVD, HGMD, UK10K, Hereditary Hearing Loss Homepage
Application: WES analysis
Organisms: Homo sapiens
Diseases: Genetic Diseases, Inborn