Computational protocol: Distinct patterns of somatic genome alterations in lung adenocarcinomas and squamous cell carcinomas

Protocol publication

[…] Exome capture was performed using the Agilent SureSelect Human All Exon 50MB kit, followed by Illumina paired-end sequencing. Reads were processed using the Picard pipeline, which uses BWA for read alignment, Picard tools for marking duplicates, and the Genome Analysis Toolkit (GATK) for realignment around small insertions and deletions (indels) and for base quality recalibration. Contamination in tumor exomes was estimated using ContEst. Only tumors with <5% contamination, an available SNP6.0 array for copy number analysis, and a valid ABSOLUTE solution were considered in the final analysis. The final sample set included 227 previously described lung ADCs from TCGA, 274 newly reported lung ADCs from TCGA, and 159 lung ADCs from the Imielinski et al. cohort, together with 176 previously described lung SqCCs from TCGA and 308 newly reported lung SqCCs from TCGA. Somatic single nucleotide variants (SSNVs) and indels were called using MuTect and Indelocator, respectively. These algorithms compare the tumor to the matched normal in order to exclude germline variants. Somatic calls were excluded if found in a panel of over 2,900 normal exomes, as previously described. Coding mutation patterns can be viewed for individual genes at [...]

DNA was hybridized onto Affymetrix SNP 6.0 arrays and normalized as previously described. Segmentation was performed using the Circular Binary Segmentation algorithm, followed by Ziggurat Deconstruction to infer the length and amplitude of each segment. Recurrent focal SCNA peaks were identified using GISTIC2.0. A peak was considered focally amplified or deleted within a tumor if the GISTIC2.0-estimated focal copy number ratio was greater than 0.1 or less than −0.1, respectively. Purity and ploidy were estimated using ABSOLUTE.
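
The sample inclusion criteria and the per-tumor focal call rule described above are simple threshold checks, and a minimal Python sketch may make them concrete. This is not the authors' code: the `Tumor` record, its field names, and the "neutral" label are hypothetical, contamination is assumed to be stored as a fraction, and the focal ratio is taken in whatever units GISTIC2.0 reports. As a simplification, the call is made from the value alone, without tracking whether the peak came from the amplification or the deletion analysis.

```python
from dataclasses import dataclass

# Thresholds quoted in the protocol text.
MAX_CONTAMINATION = 0.05  # tumors need < 5% estimated contamination
AMP_THRESHOLD = 0.1       # focal copy number ratio above this: amplified
DEL_THRESHOLD = -0.1      # focal copy number ratio below this: deleted


@dataclass
class Tumor:
    """Hypothetical per-tumor record; field names are illustrative only."""
    sample_id: str
    contamination: float         # ContEst estimate as a fraction (assumed)
    has_snp6_array: bool         # SNP6.0 array available for copy number
    has_absolute_solution: bool  # valid ABSOLUTE purity/ploidy solution


def passes_inclusion(t: Tumor) -> bool:
    """Apply the three inclusion criteria stated in the text."""
    return (
        t.contamination < MAX_CONTAMINATION
        and t.has_snp6_array
        and t.has_absolute_solution
    )


def focal_call(focal_ratio: float) -> str:
    """Classify one peak in one tumor from its focal copy number ratio."""
    if focal_ratio > AMP_THRESHOLD:
        return "amplified"
    if focal_ratio < DEL_THRESHOLD:
        return "deleted"
    return "neutral"  # label assumed here; the text names only the two calls
```

Both comparisons are strict, matching the "greater than" and "less than" wording, so a ratio of exactly 0.1 or −0.1 is not called.
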
Two peaks were considered the same across tumor types if (1) the known target gene of each peak was the same, or (2) the genomic locations of the peaks overlapped within ±1 Mb and each of the overlapping peaks contained fewer than 25 genes and was smaller than 10 Mb. […]
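
The peak-matching rule can be written as a small predicate. The sketch below is one possible reading, not the published implementation: the `Peak` record and its fields are hypothetical, "overlapped ±1 Mb" is assumed to mean that the two intervals lie on the same chromosome with a gap of at most 1 Mb between them, and peak size is taken as `end - start` in base pairs.

```python
from dataclasses import dataclass
from typing import Optional

MB = 1_000_000
PAD = 1 * MB         # tolerance around peak boundaries (assumed reading)
MAX_GENES = 25       # each peak must contain fewer genes than this
MAX_WIDTH = 10 * MB  # each peak must be smaller than this


@dataclass
class Peak:
    """Hypothetical GISTIC2.0 peak record; field names are illustrative."""
    chrom: str
    start: int                         # base pairs
    end: int                           # base pairs, end >= start
    n_genes: int
    target_gene: Optional[str] = None  # known target gene, if any


def is_small(p: Peak) -> bool:
    """Fewer than 25 genes and smaller than 10 Mb."""
    return p.n_genes < MAX_GENES and (p.end - p.start) < MAX_WIDTH


def overlaps_within_pad(a: Peak, b: Peak, pad: int = PAD) -> bool:
    """True if the peaks overlap once one of them is padded on both sides."""
    return (
        a.chrom == b.chrom
        and a.start - pad <= b.end
        and b.start <= a.end + pad
    )


def same_peak(a: Peak, b: Peak) -> bool:
    """Criterion 1: same known target gene. Criterion 2: nearby small peaks."""
    if a.target_gene is not None and a.target_gene == b.target_gene:
        return True
    return overlaps_within_pad(a, b) and is_small(a) and is_small(b)
```

Under this reading, criterion 1 matches two peaks with the same known target gene regardless of their coordinates, while peaks without a known target must satisfy both the proximity and the size conditions.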

Pipeline specifications

Software tools: DNA copy, GISTIC
Application: aCGH data analysis
Diseases: Adenocarcinoma; Carcinoma, Squamous Cell; Neoplasms
Chemicals: Tyrosine