Computational protocol: Revisiting the morbid genome of Mendelian disorders

Protocol publication

[…] The SHGP database is built from individuals with various genetic diseases who were tested with the Mendeliome assay and/or exome sequencing. As described before, the Mendeliome assay comprises 13 gene panels that cover the spectrum of “pediatric and adult” clinical genetic medicine []. Within each panel, genes were sorted by the most prominent sign or symptom with which they are most likely to be associated at presentation to clinical care. A total of 3070 genes covering over 4000 Mendelian disorders, as annotated by OMIM up to August 2013, were used as the basis for the design and synthesis of highly multiplexed gene panels using Ion AmpliSeq Designer software (Life Technologies, Carlsbad, CA, USA). For both the Mendeliome assay and exome sequencing, DNA samples were treated to obtain Ion Proton AmpliSeq libraries as appropriate. The template-positive Ion PI Ion Sphere particles were processed for sequencing on the Ion Proton instrument (Thermo Fisher, Carlsbad, CA, USA).

After running several quality checks as described before, the reads were aligned to the hg19 reference sequence using the tmap program (Ion Torrent Suite, Thermo Fisher, Carlsbad, CA, USA, https://github.com/iontorrent/TS). The variants within the aligned reads were called using the Torrent Suite Variant Caller (TVC) program.

A total of 5849 non-overlapping individuals were assayed using the Mendeliome assay and 2000 were assayed using whole-exome sequencing as of August 2016. These comprise the content of the SHGP database up to that date.

In addition, we sequenced a further 350 exomes from the sample pool using Illumina. That is, 350 samples had their exomes sequenced on both the Ion Proton and Illumina platforms. The Illumina exomes were sequenced as follows: the exome target regions were captured using the TruSeq Exome Enrichment kit (Illumina) according to the manufacturer’s recommended protocol. Then, the individual samples were processed to produce Illumina sequencing libraries and, in the subsequent step, the sequencing libraries were enriched for the desired target using the Illumina Exome Enrichment protocol. The final libraries were then sequenced on an Illumina HiSeq 2500 sequencer to an average read depth over target regions of 81.8X. The reads were mapped against UCSC hg19 with BWA. SNVs and indels were called using the GATK package.

For both Ion and Illumina exomes, the variants were annotated using public knowledge databases as well as in-house variant databases, as described in []. The pathogenicity predictions were computed using SIFT, PolyPhen2, MutationTaster, MetaSVM, and CADD. […]
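
The protocol names five pathogenicity predictors but does not say how their outputs were combined. As a minimal sketch only, the Python below counts how many of the five tools flag a variant as damaging. Everything specific in it is an assumption for illustration and not taken from the protocol: the `Predictions` record, its field names, the single-letter codes for MutationTaster and MetaSVM (modelled on dbNSFP-style annotations), and the numeric cutoffs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative cutoffs (assumptions, NOT values stated in the protocol).
SIFT_CUTOFF = 0.05        # SIFT: scores below this are treated as damaging
POLYPHEN2_CUTOFF = 0.85   # PolyPhen2: scores above this are treated as damaging
CADD_PHRED_CUTOFF = 20.0  # CADD: scaled scores at or above this are treated as damaging

# Assumed dbNSFP-style categorical codes.
MUTATIONTASTER_DAMAGING = {"A", "D"}  # disease_causing_automatic, disease_causing
METASVM_DAMAGING = {"D"}              # damaging ("T" would be tolerated)


@dataclass
class Predictions:
    """Hypothetical per-variant record; None means the tool gave no prediction."""
    sift: Optional[float] = None
    polyphen2: Optional[float] = None
    mutationtaster: Optional[str] = None
    metasvm: Optional[str] = None
    cadd_phred: Optional[float] = None


def damaging_votes(p: Predictions) -> Tuple[int, int]:
    """Return (number of tools calling the variant damaging, number of tools with a prediction)."""
    calls = []
    if p.sift is not None:
        calls.append(p.sift < SIFT_CUTOFF)
    if p.polyphen2 is not None:
        calls.append(p.polyphen2 > POLYPHEN2_CUTOFF)
    if p.mutationtaster is not None:
        calls.append(p.mutationtaster in MUTATIONTASTER_DAMAGING)
    if p.metasvm is not None:
        calls.append(p.metasvm in METASVM_DAMAGING)
    if p.cadd_phred is not None:
        calls.append(p.cadd_phred >= CADD_PHRED_CUTOFF)
    return sum(calls), len(calls)


if __name__ == "__main__":
    variant = Predictions(sift=0.01, polyphen2=0.99, mutationtaster="D",
                          metasvm="T", cadd_phred=25.0)
    damaging, available = damaging_votes(variant)
    print(f"{damaging} of {available} tools call this variant damaging")
```

Returning the number of available predictions alongside the vote count keeps missing scores from being read as benign calls, which matters because not every tool scores every variant type.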

Pipeline specifications

Software tools: TMAP, BWA, GATK, PolyPhen, MutationTaster, TESSA
Databases: OMIM
Application: WES analysis
Organisms: Homo sapiens
Diseases: Fanconi Syndrome; Fanconi Anemia; Genetic Diseases, Inborn