Computational protocol: Novel therapeutic strategy for cervical cancer harboring FGFR3-TACC3 fusions

Protocol publication

[…] Extracted DNA was quantified with a Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific). Two hundred nanograms of DNA were sheared on a Covaris S2 (Covaris, Woburn, MA, USA) into fragments with a modal length of 350–500 bp. Sequencing libraries with different indexed adapters were constructed with SureSelect XT Reagent Kits (Agilent). Target enrichment was conducted with the SureSelect Human All Exon V5+lncRNA Kit (Agilent) according to the manufacturer’s protocol. The libraries were sequenced on an Illumina HiSeq 2500 platform in rapid run mode with a 2 × 100-bp paired-end module (Illumina).

As a quality control step, Illumina adapter and low-quality sequences were trimmed using Trimmomatic, version 0.32. When a DNA insert was shorter than the read length, the non-adapter portions of the forward and reverse reads were reverse complements of each other, i.e., the reverse read contained the same sequence information as the forward read. In such cases, only the forward read was retained, to avoid double-counting bases from overlapping alignments of paired-end reads. The paired-end and single-end read datasets were aligned separately to a human reference genome (hg19) with BWA, version 0.7.12, and merged for subsequent analyses with SAMtools, version 0.1.19. PCR duplicates were removed from the aligned reads with Picard Tools, version 1.111 (broadinstitute.github.io/picard). Local realignment and base quality recalibration were performed with GATK, version 3.2.2. Average depth and coverage over the target regions captured by the SureSelect Human All Exon V5+lncRNA Kit were calculated with the DepthOfCoverage and CallableLoci tools in GATK, respectively.

[...] We used an analytical pipeline in which putative somatic single-nucleotide variants (SNVs) and short insertions and deletions (indels) were called based solely on whole-exome sequencing data from tumor samples, without matched normal samples.
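The rule for inserts shorter than the read length, described in the quality control step above, can be illustrated with a minimal Python sketch. It is not the authors' code and not part of Trimmomatic. The function names `reverse_complement` and `select_reads` are hypothetical, and the exact-match comparison is a simplifying assumption (real reads would need to tolerate sequencing errors).

```python
# Illustrative sketch only: hypothetical helpers, not taken from the protocol
# or from any of the tools it names.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def select_reads(forward: str, reverse: str) -> list[str]:
    """Decide which adapter-trimmed reads of one pair to keep.

    If the reverse read is the exact reverse complement of the forward read,
    both reads cover the same insert, so only the forward read is kept and
    would be aligned as a single-end read. Otherwise both reads are kept
    as a pair. Exact matching is an assumption made for simplicity.
    """
    if reverse_complement(reverse).upper() == forward.upper():
        return [forward]
    return [forward, reverse]


# A fully overlapping pair collapses to the forward read alone.
print(select_reads("AACCGT", "ACGGTT"))
# A pair carrying distinct sequence is kept as two reads.
print(select_reads("AACCGT", "TTTTTT"))
```

Keeping a single read for fully overlapping pairs means each sequenced base contributes once to depth, which is the stated reason for the separate single-end dataset.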
SNVs and indels were identified from whole-exome sequencing data of tumor-derived DNA by using HaplotypeCaller and VariantRecalibrator of GATK. Functional annotation of the identified variants was performed with ANNOVAR. We hypothesized that somatic mutations would not be observed as germline genetic variations in the general population. We therefore defined putative somatic SNVs and indels as variants whose allele frequencies were less than 0.1% in all populations, based on publicly available databases provided by the following whole-genome and -exome sequencing projects: the 1000 Genomes Project, the Exome Aggregation Consortium, and the Human Genetic Variation Database. Prevalence of putative somatic mutations from previous genome-wide screenings in various cancer types was retrieved from COSMIC, release v79.

[...] We sought putative somatic copy number alterations by using Control-FREEC software. The read counts per region covered by consecutive capture probes were normalized by GC content. We excluded regions with low mappability scores calculated for a read length of 100 bp, allowing up to two mismatches. Since matched normal DNA samples were not available in this study, the read counts of each tumor sample were compared with “reference read counts” obtained by pooling blood-derived sequencing data generated on the same exome sequencing platform from 21 women without any history of gynecological cancers. […]
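The allele-frequency criterion for putative somatic variants (below 0.1% in all populations) can be sketched as a small Python filter. This is an illustration under stated assumptions, not the authors' pipeline. The function name `is_putative_somatic` and the population labels are hypothetical, and treating a variant absent from a database (`None`) as passing is an assumption the protocol does not spell out.

```python
from typing import Mapping, Optional

# 0.1%, the threshold stated in the protocol.
MAX_POPULATION_AF = 0.001


def is_putative_somatic(
    population_afs: Mapping[str, Optional[float]],
    max_af: float = MAX_POPULATION_AF,
) -> bool:
    """Return True if the variant is rare in every population examined.

    `population_afs` maps a population label to the annotated allele
    frequency, or None when the variant is absent from that database
    (assumed here to count as "not observed").
    """
    return all(af is None or af < max_af for af in population_afs.values())


# Hypothetical annotation records; the keys are illustrative labels only.
rare_variant = {"1000G_ALL": None, "ExAC_ALL": 0.0002, "HGVD": None}
common_in_one_population = {"1000G_EAS": 0.02, "ExAC_ALL": 0.0002}

print(is_putative_somatic(rare_variant))
print(is_putative_somatic(common_in_one_population))
```

Requiring the threshold to hold in every population, rather than on a pooled average, removes variants that are common in only one ancestry group.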

Pipeline specifications

Software tools: Trimmomatic, BWA, SAMtools, Picard, GATK, ANNOVAR, Control-FREEC
Databases: HGVD
Application: WES analysis
Organisms: Mus musculus
Diseases: Carcinoma, Squamous Cell; Neoplasms