Computational protocol: Candidate Gene Resequencing in a Large Bicuspid Aortic Valve-Associated Thoracic Aortic Aneurysm Cohort: SMAD6 as an Important Contributor

[…] The raw data were processed using an in-house-developed Galaxy-based pipeline, followed by variant calling with the Genome Analysis Toolkit Unified Genotyper (DePristo et al., ). Variants were subsequently annotated and filtered with the in-house-developed database VariantDB (Vandeweyer et al., ), which uses ANNOVAR. Heterozygous coding or splice site variants (±2 bp from exon-intron boundaries for nucleotide substitutions, and ±5 bp for multi-bp deletions or insertions) with an allelic balance between 0.25 and 0.85 (FLNA in males: 0.75–1) and a minimum coverage of 10 reads were selected. Finally, we included variants that fitted at least one of the following three categories: unique variants [absent from the Exome Aggregation Consortium (ExAC) database (Lek et al., )], variants with an ExAC Minor Allele Frequency (MAF) lower than 0.01%, or variants with an ExAC MAF between 0.01% and 0.1% that had a Combined Annotation Dependent Depletion (CADD) (Kircher et al., ) score above 20. All splice region variants underwent splice site effect prediction using ALAMUT (Interactive Biosoftware, France). Synonymous variants outside of splicing regions were not taken into account.

The ExAC database was used as an independent control dataset. The raw data of variants (~all ExAC datasets) fulfilling ExAC's quality control parameters (“PASS”) were extracted from the offline version of ExAC v0.3.1. Since the ExAC variants were annotated using VEP, whereas our patient variant annotation was ANNOVAR-based, we re-annotated the ExAC variants with ANNOVAR. The same variant filtering strategy as described for the patient cohort was subsequently applied. For each selected ExAC variant, the allele frequency was determined by computing the ratio of the Mutant Allele Count (mAC) to the Total Allele Count (tAC). Next, we re-scaled each variant's mAC by multiplying its computed allele frequency by its respective tAC_Adj, i.e., the average tAC of all variants in that specific gene.
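The re-scaling of the ExAC allele counts, together with the per-gene summation, can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `rescale_gene_counts` and the `(gene, mAC, tAC)` tuple layout are hypothetical, and tAC_Adj is assumed to be the plain arithmetic mean of the tAC values of the variants supplied for each gene.

```python
from collections import defaultdict


def rescale_gene_counts(variants):
    """Return {gene: summed re-scaled mAC}.

    `variants` is an iterable of (gene, mAC, tAC) tuples (hypothetical layout).
    Assumption: tAC_Adj is the mean tAC over the variants given for that gene.
    """
    by_gene = defaultdict(list)
    for gene, mac, tac in variants:
        if tac > 0:  # assumption: variants without called alleles are ignored
            by_gene[gene].append((mac, tac))

    totals = {}
    for gene, rows in by_gene.items():
        tac_adj = sum(tac for _, tac in rows) / len(rows)
        # allele frequency (mAC / tAC) multiplied by the gene-level tAC_Adj
        totals[gene] = sum((mac / tac) * tac_adj for mac, tac in rows)
    return totals


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    example = [("SMAD6", 2, 100000), ("SMAD6", 1, 50000)]
    print(rescale_gene_counts(example))
```

Re-scaling to a common per-gene denominator puts variants genotyped in different numbers of ExAC individuals on the same footing before their counts are added up.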
Finally, the variant counts for each panel gene were obtained by summing the re-scaled mACs.

[…] Variants discussed in the results section were confirmed with Sanger sequencing. Primers were designed using Primer3 v4.0.0 (Untergasser et al., ), and polymerase chain reaction (PCR) products were purified with Calf Intestinal Alkaline Phosphatase (Sigma-Aldrich, USA). Sequencing reactions were performed using the BigDye Terminator Cycle Sequencing kit (Applied Biosystems, Life Technologies, USA), followed by capillary electrophoresis on an ABI3130XL (Applied Biosystems, Life Technologies, USA). The obtained sequences were analyzed with CLC DNA Workbench v5.0.2 (CLC bio, Denmark). […]
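The variant selection criteria described earlier for the patient cohort (coverage, allelic balance, ExAC MAF and CADD thresholds) can be sketched in Python as follows. This is an illustration under stated assumptions, not the VariantDB implementation: the `Variant` record and its field names are hypothetical, MAF is stored as a fraction (0.01% = 1e-4), an ExAC MAF of `None` means the variant is absent from ExAC, and the range boundaries are treated as inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:  # hypothetical record; field names are assumptions
    gene: str
    allelic_balance: float        # fraction of reads carrying the variant allele
    coverage: int                 # read depth at the variant position
    exac_maf: Optional[float]     # fraction, not percent; None = absent from ExAC
    cadd: Optional[float]         # CADD score, None if unavailable
    coding_or_splice: bool        # coding, or within the splice-site window
    synonymous_non_splice: bool = False


def passes_quality(v: Variant, male: bool = False) -> bool:
    """Coverage >= 10 reads and allelic balance in the accepted range."""
    low, high = (0.75, 1.0) if (v.gene == "FLNA" and male) else (0.25, 0.85)
    return v.coverage >= 10 and low <= v.allelic_balance <= high


def passes_frequency(v: Variant) -> bool:
    """Unique, or MAF < 0.01%, or MAF within 0.01%-0.1% with CADD > 20."""
    if v.exac_maf is None or v.exac_maf < 1e-4:
        return True
    return v.exac_maf <= 1e-3 and v.cadd is not None and v.cadd > 20


def is_selected(v: Variant, male: bool = False) -> bool:
    return (v.coding_or_splice
            and not v.synonymous_non_splice
            and passes_quality(v, male)
            and passes_frequency(v))
```

Keeping the quality and frequency checks in separate functions mirrors the text, where the same filtering strategy is later reapplied to the ExAC variants.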

Pipeline specifications

Software tools: GATK; ANNOVAR; CADD; Primer3
Databases: VariantDB
Applications: WGS analysis; qPCR
Organisms: Homo sapiens; Mus musculus
Diseases: Heart Defects, Congenital; Heart Valve Diseases; Aortic Aneurysm, Thoracic