Computational protocol: Comprehensive genetic exploration of selective tooth agenesis of mandibular incisors by exome sequencing

Protocol publication

[…] Exome sequencing was performed for 5 individuals in Family A, 4 individuals in Family B and 83 sporadic patients (51 Japanese and 32 Koreans). DNA samples (3 μg) were subjected to exome capture using the SureSelect Human All Exon Kit (Agilent Technologies, Santa Clara, CA, USA) according to the manufacturer’s instructions. In brief, genomic DNA was randomly fragmented by sonication under standard conditions (Covaris, Woburn, MA, USA), followed by end repair, the addition of a single A base, adaptor ligation, gel electrophoresis to isolate 300-bp fragments and PCR amplification. The captured DNA was sequenced on the HiSeq2500 system (Illumina, San Diego, CA, USA): the size-selected libraries were used for cluster generation on the flow cells, and all prepared flow cells were run with paired-end 100-bp reads.
Reads were mapped to the reference genome (UCSC hg19) using Burrows-Wheeler Aligner v.0.7.9. The SAM files generated by Burrows-Wheeler Aligner were converted to BAM format, then sorted and indexed using SAMtools v.0.1.18. Duplicate reads were marked with Picard v.1.102 (https://github.com/broadinstitute/picard). The resulting BAM files were analyzed using GATK v.2.7 following the GATK best practice guidelines. In brief, BAM files were subjected to insertion or deletion (indel) realignment and base quality score recalibration, and variants were then called with the UnifiedGenotyper walker to obtain potential variants in a Variant Call Format file. These variants were annotated using the table_annovar.pl script in ANNOVAR (version 2013jul21). For gene annotation, we used the RefSeq gene database (build hg19), while variant annotation was based on dbSNP (dbSNP 137), the 1000 Genomes Project database and data from 1208 Japanese individuals in the Human Genetic Variation database (http://www.genome.med.kyoto-u.ac.jp/SnpDB/index.html). [...]
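The mapping and variant-calling steps described above can be sketched as an ordered list of shell commands. This is only an illustrative outline, not the authors' actual invocation: the excerpt does not give command lines, so the choice of `bwa mem`, the read-group string, every file name (`hg19.fa`, `dbsnp_137.hg19.vcf`, the jar paths, the per-sample names) and the `-glm BOTH` option are assumptions. The samtools calls follow the 0.1.x syntax (`sort` takes an output prefix), and the Picard and GATK calls follow the per-tool jar and `-T <walker>` conventions of those releases. Annotation with ANNOVAR is left out of the sketch.

```python
def build_commands(sample, ref="hg19.fa", dbsnp="dbsnp_137.hg19.vcf"):
    """Return shell commands for one sample, in execution order.

    All file names are hypothetical placeholders; nothing is executed here.
    """
    gatk = "java -jar GenomeAnalysisTK.jar"
    # GATK requires read groups; this tag layout is an assumption.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Map paired-end reads to hg19 (bwa mem is assumed, not stated).
        f"bwa mem -R '{read_group}' {ref} "
        f"{sample}_R1.fastq.gz {sample}_R2.fastq.gz > {sample}.sam",
        # 2. SAM -> BAM, then sort and index (samtools 0.1.x syntax).
        f"samtools view -bS -o {sample}.bam {sample}.sam",
        f"samtools sort {sample}.bam {sample}.sorted",
        f"samtools index {sample}.sorted.bam",
        # 3. Mark duplicate reads with Picard.
        f"java -jar MarkDuplicates.jar INPUT={sample}.sorted.bam "
        f"OUTPUT={sample}.dedup.bam METRICS_FILE={sample}.metrics.txt "
        f"CREATE_INDEX=true",
        # 4. Indel realignment (two GATK walkers).
        f"{gatk} -T RealignerTargetCreator -R {ref} "
        f"-I {sample}.dedup.bam -o {sample}.intervals",
        f"{gatk} -T IndelRealigner -R {ref} -I {sample}.dedup.bam "
        f"-targetIntervals {sample}.intervals -o {sample}.realigned.bam",
        # 5. Base quality score recalibration.
        f"{gatk} -T BaseRecalibrator -R {ref} -I {sample}.realigned.bam "
        f"-knownSites {dbsnp} -o {sample}.recal.table",
        f"{gatk} -T PrintReads -R {ref} -I {sample}.realigned.bam "
        f"-BQSR {sample}.recal.table -o {sample}.recal.bam",
        # 6. Variant calling with UnifiedGenotyper (SNVs and indels).
        f"{gatk} -T UnifiedGenotyper -R {ref} -I {sample}.recal.bam "
        f"-glm BOTH -o {sample}.vcf",
    ]


if __name__ == "__main__":
    for command in build_commands("SH1"):
        print(command)
```

Building the commands as strings rather than running them keeps the sketch inspectable without any of the tools installed; in practice each command would be run in order, for example through `subprocess.run(command, shell=True, check=True)`.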
In Family A and Family B, variants detected from exome sequencing data were further analyzed in three filtering steps based on different criteria. In the first filtering step, we selected missense and nonsense variants, splice-site single-nucleotide variants and coding indels. The second filtering step was based on frequency in the Human Genetic Variation database: variants with a frequency <5% in that database were retained as SMIA candidates. Finally, heterozygous variants that co-segregated in the family were selected. After these filtering steps, candidate variants were confirmed for all family members by Sanger sequencing on the 3730xl DNA Analyzer (Life Technologies, Carlsbad, CA, USA). Functional estimation and the conservation score of the variants were evaluated with the prediction tools Polymorphism Phenotyping v2 and Genomic Evolutionary Rate Profiling, respectively.
In the 83 sporadic Japanese and Korean patients (SH1-SH83), variants of PAX9, AXIN2, EDA, EDAR, WNT10A, BMP2 and GREM2 previously reported in selective tooth agenesis were identified using the exome data. Functional estimation and the conservation score of these variants were evaluated as before by Polymorphism Phenotyping v2 and Genomic Evolutionary Rate Profiling. […]
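The three family-based filtering steps described above can be illustrated with a minimal sketch over already-annotated variants. The record layout (`consequence`, `hgvd_freq`, `genotypes`), the consequence labels and the sample names are hypothetical and do not correspond to ANNOVAR's actual output columns. Two further assumptions are made here: a variant absent from the Human Genetic Variation database is treated as rare, and "co-segregated" is interpreted as heterozygous in every affected member and homozygous reference in every unaffected member.

```python
# Step 1 categories: missense, nonsense, splice-site SNVs and coding indels.
# These labels are a simplified, hypothetical vocabulary.
FUNCTIONAL = {"missense", "nonsense", "splice_site", "coding_indel"}


def filter_candidates(variants, affected, unaffected, max_freq=0.05):
    """Apply the three filters in order and return the surviving variants."""
    candidates = []
    for variant in variants:
        # Step 1: keep only potentially protein-altering variants.
        if variant["consequence"] not in FUNCTIONAL:
            continue
        # Step 2: keep variants with database frequency below 5%.
        # A missing frequency (None) is assumed to mean "not observed".
        freq = variant.get("hgvd_freq")
        if freq is not None and freq >= max_freq:
            continue
        # Step 3: heterozygous in all affected, absent in all unaffected
        # (an assumed reading of co-segregation).
        genotypes = variant["genotypes"]
        het_in_affected = all(genotypes.get(s) == "0/1" for s in affected)
        ref_in_unaffected = all(genotypes.get(s) == "0/0" for s in unaffected)
        if het_in_affected and ref_in_unaffected:
            candidates.append(variant)
    return candidates


# Toy data: two affected (A1, A2) and one unaffected (U1) family member.
demo_variants = [
    {"id": "var1", "consequence": "missense", "hgvd_freq": 0.001,
     "genotypes": {"A1": "0/1", "A2": "0/1", "U1": "0/0"}},
    {"id": "var2", "consequence": "synonymous", "hgvd_freq": 0.001,
     "genotypes": {"A1": "0/1", "A2": "0/1", "U1": "0/0"}},  # fails step 1
    {"id": "var3", "consequence": "nonsense", "hgvd_freq": 0.20,
     "genotypes": {"A1": "0/1", "A2": "0/1", "U1": "0/0"}},  # fails step 2
    {"id": "var4", "consequence": "coding_indel", "hgvd_freq": None,
     "genotypes": {"A1": "0/1", "A2": "0/0", "U1": "0/0"}},  # fails step 3
]

kept = filter_candidates(demo_variants, affected=["A1", "A2"], unaffected=["U1"])
print([v["id"] for v in kept])  # prints ['var1']
```

Variants that pass all three filters would then go forward to Sanger confirmation, as the text describes; the sketch stops at the candidate list.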

Pipeline specifications

Software tools: BWA, SAMtools, Picard, GATK, ANNOVAR, PolyPhen, GERP
Databases: dbSNP, HGVD
Application: WES analysis
Organisms: Homo sapiens
Diseases: Abnormalities, Drug-Induced