Computational protocol: Mutations of epigenetic regulatory genes are common in thymic carcinomas

Protocol publication

[…] Data processing and variant calling mainly followed the Best Practices workflow recommended by the Broad Institute (http://www.broadinstitute.org/gatk/guide/best-practices). Briefly, the raw sequencing reads were mapped to human genome version 19 (hg19) with the Burrows-Wheeler Aligner (BWA), followed by local realignment using the GATK suite from the Broad Institute; duplicate reads were marked with Picard tools (http://picard.sourceforge.net). Somatic variant calling was performed on sequencing reads of matched tumor-normal samples with the Strelka somatic variant caller, and germline variant calling was done with the GATK UnifiedGenotyper from the Broad Institute. Multiple annotation databases and open-source packages were used to annotate variants and predict their effects, including SnpEff, dbNSFP, dbSNP 137 (NCBI), ESP6500 (NHLBI Exome Sequencing Project), and the COSMIC database. SIFT scores range from 0 to 1, and scores < 0.05 suggest that the amino acid change is damaging (sift.jcvi.org/www/SIFT_help.html). PolyPhen-2 scores > 0.85 are interpreted as probably damaging (genetics.bwh.harvard.edu/pph2/dokuwiki/overview#prediction).

The following filtering criteria were used for somatic variant calls:
(1) SNVs and indels in tumors were considered somatic if they were completely absent in the paired blood samples;
(2) Only SNVs and indels with >15 reads and an allelic fraction of >15% were reported;
(3) Reads with a MAPQ score of <20 were excluded from variant counts;
(4) Somatic SNVs and indels were reported only when they were within open reading frames or splice sites;
(5) Synonymous variations, non-coding region mutations outside splice sites, and single nucleotide polymorphisms (SNPs) that have not been reported to be disease-related were filtered out;
(6) All identified somatic SNVs and indels were validated visually using the Integrative Genomics Viewer (IGV, Broad Institute). […]
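The score cutoffs and the numeric filtering criteria above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Variant` record, its field names, and the effect labels are hypothetical, and ">15 reads" is assumed here to mean total tumor read depth. Read counts are assumed to already exclude reads with MAPQ < 20 (criterion 3). The disease-related SNP check in criterion 5 and the visual IGV review in criterion 6 are not modeled.

```python
from dataclasses import dataclass

# Thresholds quoted in the protocol text.
MIN_READS = 15                 # criterion 2: more than 15 reads
MIN_ALLELIC_FRACTION = 0.15    # criterion 2: allelic fraction above 15%
SIFT_DAMAGING_BELOW = 0.05     # SIFT: scores below 0.05 suggest damaging
POLYPHEN_DAMAGING_ABOVE = 0.85 # PolyPhen-2: scores above 0.85, probably damaging

# Hypothetical effect labels standing in for "within open reading frames
# or splice sites, and not synonymous" (criteria 4 and 5). Real annotators
# such as SnpEff use their own vocabularies.
REPORTABLE_EFFECTS = {"missense", "nonsense", "frameshift",
                      "inframe_indel", "splice_site"}


@dataclass
class Variant:
    """Hypothetical variant record; field names are illustrative only."""
    tumor_depth: int       # tumor reads at the site (MAPQ >= 20 assumed)
    tumor_alt_reads: int   # tumor reads supporting the variant allele
    normal_alt_reads: int  # paired blood reads supporting the variant allele
    effect: str            # one of the hypothetical labels above, or other


def allelic_fraction(v: Variant) -> float:
    """Fraction of tumor reads carrying the variant allele."""
    return v.tumor_alt_reads / v.tumor_depth if v.tumor_depth else 0.0


def passes_somatic_filters(v: Variant) -> bool:
    """Apply criteria 1, 2, 4, and the synonymous/non-coding part of 5."""
    if v.normal_alt_reads != 0:                      # criterion 1
        return False
    if v.tumor_depth <= MIN_READS:                   # criterion 2 (assumed: depth)
        return False
    if allelic_fraction(v) <= MIN_ALLELIC_FRACTION:  # criterion 2
        return False
    return v.effect in REPORTABLE_EFFECTS            # criteria 4 and 5


def sift_is_damaging(score: float) -> bool:
    """SIFT scores range from 0 to 1; below 0.05 suggests a damaging change."""
    return score < SIFT_DAMAGING_BELOW


def polyphen_is_probably_damaging(score: float) -> bool:
    """PolyPhen-2 scores above 0.85 are read as probably damaging."""
    return score > POLYPHEN_DAMAGING_ABOVE
```

For example, a missense variant with 10 of 40 tumor reads supporting it and none in the paired blood sample would pass under these assumptions, while the same variant with a single supporting read in blood would not. All comparisons are strict, matching the ">" and "<" in the text.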

Pipeline specifications

Software tools: GATK, BWA, Picard, Strelka, SnpEff, SIFT, PolyPhen, IGV
Databases: dbNSFP, dbSNP
Application: WES analysis
Organisms: Homo sapiens
Diseases: Neoplasms, Thymoma