Computational protocol: WISExome: a within-sample comparison approach to detect copy number variations in whole exome sequencing data

[…] Whole-exome sequencing was performed as previously described []: genomic DNA was isolated from blood for 336 samples, prepared using the SeqCap EZ Human Exome Library v3 kit (Roche, Basel, Switzerland), and sequenced on an Illumina HiSeq 2500. Reads were mapped to hg19 with BWA (0.7.10) []. We removed duplicate reads (as marked by Picard Tools 1.111), reads with a mapping quality below 30, and reads that were not part of a read pair. Samples were split into a training set of 319 samples and a test set of 17 samples. [...] Array analysis was carried out on the high-resolution CytoScan HD array platform (Affymetrix, a part of Thermo Fisher Scientific, Santa Clara, CA, USA) according to the manufacturer’s protocols. This array contains over 2.6 million copy number markers. Analysis was done using Nexus software (BioDiscovery, El Segundo, CA, USA), using SNPRank segmentation with a minimum of 20 probes per segment and a significance threshold of 1e-5. [...] We ran XHMM (downloaded from GitHub @ 18 June 2015), CoNIFER (version 0.2.2; released 17 September 2012), CODEX (GitHub, commit 3d40ac9 @ 7 April 2017), and CLAMMS (GitHub, commit 3e19892 @ 10 April 2017) with their default settings. All tools were run on the same samples as WISExome, as described in the “Sample preparation” section. XHMM and CoNIFER do not distinguish between training and test samples; CODEX and CLAMMS used the same division into training and test samples as WISExome. Additional information on decisions and settings can be found in the Supplementary Section . […]
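
The read-filtering step described above (dropping duplicate-marked reads, reads with mapping quality below 30, and reads that are not part of a read pair) can be illustrated with a minimal sketch that works on plain SAM text lines. This is not the authors' pipeline: the function names `keep_read` and `filter_sam_lines` are hypothetical, and "not part of a read pair" is assumed here to mean that the SAM "paired" flag bit (0x1) is unset. A stricter reading, such as requiring a properly paired read (0x2), would need a different check.

```python
from typing import Iterable, Iterator

# SAM flag bits as defined in the SAM format specification.
FLAG_PAIRED = 0x1       # read is part of a pair (template has multiple segments)
FLAG_DUPLICATE = 0x400  # read is marked as a PCR or optical duplicate

MIN_MAPQ = 30  # threshold stated in the protocol: drop reads with MAPQ below 30


def keep_read(flag: int, mapq: int, min_mapq: int = MIN_MAPQ) -> bool:
    """Return True if a read passes all three filters (hypothetical helper)."""
    if flag & FLAG_DUPLICATE:
        return False  # duplicate, e.g. as marked by a duplicate-marking tool
    if mapq < min_mapq:
        return False  # mapping quality below the threshold
    if not flag & FLAG_PAIRED:
        return False  # assumption: "not part of a read pair" means bit 0x1 unset
    return True


def filter_sam_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged and alignment lines that pass keep_read."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines are passed through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record: FLAG
        mapq = int(fields[4])  # column 5 of a SAM record: MAPQ
        if keep_read(flag, mapq):
            yield line
```

Under the same assumption, a comparable filter on a BAM file can be expressed with `samtools view -q 30 -f 1 -F 1024`, where `-q` sets the minimum mapping quality, `-f` requires flag bits, and `-F` excludes them.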

