Computational protocol: Mutations in STAMBP, encoding a deubiquitinating enzyme, cause Microcephaly-Capillary Malformation syndrome

Protocol publication

[…] Using target capture with the Agilent SureSelect 50 Mb All Exon kit (Agilent Technologies, Santa Clara, CA) and sequencing of 100 bp paired-end reads on the Illumina HiSeq, we generated over 15 Gb of sequence for each sample, such that approximately 90% of the coding bases of the exome defined by the Consensus Coding Sequence (CCDS) project were covered by at least 20 reads. Reads were first quality trimmed from the 3′ end using the FASTX-Toolkit and were then aligned to hg19 with BWA. Duplicate reads were marked using Picard and excluded from downstream analyses. For each sample, single nucleotide variants (SNVs) and short insertions and deletions (indels) were called using samtools pileup and varFilter with the base alignment quality (BAQ) adjustment disabled, and were then quality filtered to require at least 20% of reads supporting the variant call. Coverage of the exome was determined using the Genome Analysis Toolkit (GATK). Variants were annotated using both ANNOVAR and custom scripts to identify whether they affected protein coding sequence, and whether they had previously been seen in dbSNP131 or in the 1000 Genomes pilot release (Nov. 2010). […]
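
The two numeric thresholds in the methods text above (at least 20% of reads supporting a variant call, and the share of coding bases covered by at least 20 reads) can be illustrated with a minimal Python sketch. This is not the authors' code: the `VariantCall` record, the function names, and the idea of passing per-base depths as a plain list are all assumptions made for illustration. In the published pipeline these steps were carried out with samtools, GATK, and custom scripts.

```python
from typing import Iterable, List, NamedTuple


class VariantCall(NamedTuple):
    """Hypothetical record for one called variant (not a samtools format)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position


def variant_read_fraction(call: VariantCall) -> float:
    """Fraction of covering reads that support the variant allele."""
    if call.total_reads <= 0:
        return 0.0  # no coverage: treat as unsupported
    return call.alt_reads / call.total_reads


def filter_by_read_support(
    calls: Iterable[VariantCall], min_fraction: float = 0.20
) -> List[VariantCall]:
    """Keep calls where at least min_fraction of reads support the variant."""
    return [c for c in calls if variant_read_fraction(c) >= min_fraction]


def fraction_covered(depths: Iterable[int], min_depth: int = 20) -> float:
    """Share of target bases whose read depth is at least min_depth."""
    depths = list(depths)
    if not depths:
        return 0.0
    return sum(1 for d in depths if d >= min_depth) / len(depths)
```

As a usage example with made-up numbers, a call with 12 of 40 supporting reads (30%) passes the filter and one with 3 of 40 (7.5%) is dropped. `fraction_covered` applied to a per-base depth list for the CCDS coding bases would give the kind of figure the text reports as approximately 90%.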

Pipeline specifications

Software tools: FASTX-Toolkit, BWA, Picard, SAMtools, GATK, ANNOVAR
Databases: dbSNP, CCDS
Application: WES analysis
Organisms: Homo sapiens
Chemicals: Potassium