Computational protocol: The mutational landscape of MYCN, Lin28b and ALKF1174L driven murine neuroblastoma mimics human disease

Protocol publication

[…] Murine tumor and constitutional DNA was captured using the Agilent SureSelect XT target enrichment system for Illumina paired-end sequencing. A total of 200 ng of input DNA was used per sample. DNA was sheared using a Covaris S-series Single Tube Sample Preparation System to a target length of 200 bp (duty cycle 10%, intensity 5, 200 cycles per burst, 200 seconds). Subsequently, the DNA was purified using Agencourt AMPure beads and analyzed on an Agilent Bioanalyzer using the high sensitivity DNA assay. After end repair and adapter ligation, performed as specified by the manufacturer’s protocol, the samples were purified with AMPure beads and then amplified for 6 cycles using Phusion High-Fidelity PCR reagents. Libraries were sequenced on an Illumina HiSeq 2000 in 2x 100 bp mode.

Raw sequencing data was demultiplexed on the HiSeq instrument using the manufacturer’s software. Reads were mapped to build 37 of the murine reference genome (Genome Reference Consortium, MGSCm37) using BWA (v. 0.5.9). Base qualities were recalibrated using the Genome Analysis Toolkit (GATK, v. 1.6-13-g91f02df) and duplicate reads were removed using Picard tools (v. 1.59). Variants were called using the GATK UnifiedGenotyper (v. 1.6-13-g91f02df). Variants were annotated, and sample calls between tumors and controls were compared, using our custom cloud-based analysis platform seqplorer (De Wilde et al., in preparation). Mutations were identified by comparing the raw read counts in the tumor and the matching normal sample. For each variant called by the GATK in the tumor sample, Fisher’s exact test was computed on the raw read counts, and the resulting p-values were corrected for multiple testing (according to Benjamini-Hochberg).
A variant was retained as a mutation if its p-value was significant at the 0.05 level, the percentage of variant reads in the normal sample was below 5%, and the percentage of variant reads in the tumor sample was at least 10% higher than in the normal sample.

Coverage data was extracted for each sample using the samtools depth command. To evaluate capture efficiency, we defined the target region as the coding parts of the canonical transcript of all coding genes in the murine genome, according to the Ensembl database (release 68). The total target region comprised 196,710 coding genomic elements, totaling 35,131,573 base pairs.

[...] To find the genomic insert of the mutant ALK transgene in the murine genome, we sequenced tail-derived DNA from mouse 5 included in this analysis. DNA was sequenced as described above, omitting the exome capture step. The reference genome was constructed from the MGSCm37 genome by adding the FASTA sequence of the ALK transgene vector as a separate chromosome. We generated a total of 167 million reads which, after mapping with Stampy (version 1.0.13), resulted in a genome-wide coverage of 8.65-fold and a local coverage of the transgene of 18.34-fold. Subsequently, SVDetect (version 0.7) was used in interchromosomal rearrangement mode to identify the integration site of the ALK transgene vector.

[...] DNA was isolated using the DNeasy Blood & Tissue Kit (Qiagen) according to the manufacturer’s instructions. ArrayCGH was performed using 180K (AMADID 027411) mouse whole-genome arrays (Agilent Technologies). Random primed labeling (BioPrime ArrayCGH Genomic Labeling System, Invitrogen) was used to label 400 ng of tumor DNA and matched control DNA with Cy3 and Cy5 dyes (Perkin Elmer), respectively. Hybridization and washing were performed according to the manufacturer (Agilent Technologies). Fluorescence intensities were measured on an Agilent G2505C scanner.
Data were extracted using the Feature Extraction v10.1.1.1 software (Agilent Technologies), and further processed with ViVar. Gains and losses were determined using the circular binary segmentation algorithm. […]
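
The tumor-versus-normal filter described in the exome analysis above can be illustrated with a short Python sketch. It is not the seqplorer implementation. The function names, the input dictionary keys, the use of a two-sided test, the application of the 0.05 level to the Benjamini-Hochberg adjusted p-value, and the reading of "at least 10% higher" as 10 percentage points are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of every table with the same margins.
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum the tables that are at most as likely as the observed one.
    p = sum(q for q in probs.values() if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_mutations(variants, alpha=0.05, max_normal_frac=0.05,
                     min_excess=0.10):
    """Keep variants passing the three thresholds stated in the text.

    Each variant is a dict with the hypothetical keys tumor_alt, tumor_ref,
    normal_alt and normal_ref (raw read counts). Variants without coverage
    in either sample are skipped.
    """
    usable = [v for v in variants
              if v["tumor_alt"] + v["tumor_ref"] > 0
              and v["normal_alt"] + v["normal_ref"] > 0]
    pvals = [fisher_exact_two_sided(v["tumor_alt"], v["tumor_ref"],
                                    v["normal_alt"], v["normal_ref"])
             for v in usable]
    kept = []
    for v, q in zip(usable, benjamini_hochberg(pvals)):
        tumor_frac = v["tumor_alt"] / (v["tumor_alt"] + v["tumor_ref"])
        normal_frac = v["normal_alt"] / (v["normal_alt"] + v["normal_ref"])
        # Assumption: "10% higher" is read as 10 percentage points.
        if (q < alpha and normal_frac < max_normal_frac
                and tumor_frac - normal_frac >= min_excess):
            kept.append({**v, "adjusted_p": q})
    return kept


if __name__ == "__main__":
    example = [
        {"id": "somatic", "tumor_alt": 30, "tumor_ref": 70,
         "normal_alt": 0, "normal_ref": 100},
        {"id": "germline", "tumor_alt": 50, "tumor_ref": 50,
         "normal_alt": 48, "normal_ref": 52},
    ]
    print([v["id"] for v in filter_mutations(example)])
```

In this toy input only the first variant passes: the second has nearly equal variant fractions in tumor and normal, so it fails both the significance test and the 5% ceiling on variant reads in the normal sample.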
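
The capture-efficiency evaluation described above can be sketched in a similar way, starting from the three-column, tab-separated output of samtools depth (chromosome, 1-based position, depth). The metric names, the min_depth threshold and the representation of target regions as (chromosome, start, end) tuples with 1-based inclusive coordinates are hypothetical; the original text does not state which summary statistics were computed.

```python
import bisect
from collections import defaultdict


def build_target_index(regions):
    """Merge overlapping (chrom, start, end) regions, 1-based inclusive."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1] + 1:  # overlapping or adjacent
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = merged
    return index


def in_target(index, chrom, pos):
    """Return True if the position falls inside a merged target interval."""
    intervals = index.get(chrom)
    if not intervals:
        return False
    # Locate the last interval whose start is <= pos.
    i = bisect.bisect_right(intervals, [pos, float("inf")]) - 1
    return i >= 0 and intervals[i][0] <= pos <= intervals[i][1]


def capture_summary(depth_lines, index, min_depth=10):
    """Summarize samtools depth lines against the target index.

    Positions absent from the depth output count as zero coverage, which
    matches the default samtools depth behaviour of skipping them.
    """
    target_size = sum(end - start + 1
                      for intervals in index.values()
                      for start, end in intervals)
    total_bases = on_target_bases = covered_positions = 0
    for line in depth_lines:
        if not line.strip():
            continue
        chrom, pos, depth = line.rstrip("\n").split("\t")[:3]
        pos, depth = int(pos), int(depth)
        total_bases += depth
        if in_target(index, chrom, pos):
            on_target_bases += depth
            if depth >= min_depth:
                covered_positions += 1
    return {
        "target_size": target_size,
        "mean_target_coverage": on_target_bases / target_size,
        "fraction_target_at_min_depth": covered_positions / target_size,
        "on_target_base_fraction":
            on_target_bases / total_bases if total_bases else 0.0,
    }
```

A file written by samtools depth can be passed directly as depth_lines by opening it in text mode, since the function only iterates over lines.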

Pipeline specifications

Software tools: BWA, GATK, Picard, SAMtools, Stampy, SVDetect
Databases: GRC
Applications: WGS analysis, WES analysis
Organisms: Mus musculus, Homo sapiens
Diseases: Neoplasms, Neuroblastoma, Genetic Diseases, Inborn