Computational protocol: Using the MCF10A/MCF10CA1a Breast Cancer Progression Cell Line Model to Investigate the Effect of Active, Mutant Forms of EGFR in Breast Cancer Development and Treatment Using Gefitinib

[…] DNA (200 ng) from the MCF10A and MCF10CA1a cell lines was analyzed for copy number alterations using Illumina Human Omni 2.5M SNP arrays (2.5–8 v1.1). Copy number was quantified and summarized using GAP, and copy number changes across all chromosomes for a given sample were visualized as Circos plots. Copy number was classified as follows: 0 = deletion, 1 = loss, 3–5 = medium gain, and 6–8 = high level gain. The raw data were submitted to the Gene Expression Omnibus (GEO) database (accession number: GSE59800). [...]

Sequencing data were aligned to the human genome (hg19) using BWA. Cell line-specific variants were identified using qSNP (a heuristic-driven somatic/germline caller) and the Genome Analysis Toolkit (GATK) (a Bayesian caller). Only variants called by both qSNP and GATK were used in subsequent analyses. The Pindel tool was used to identify insertion-deletion events. Cell line-specific variants were subsequently annotated for gene consequence using ENSEMBL v70.

Several online tools were used to determine the effects of the rare variants discovered by exome sequencing. The nucleotide changes, the amino acid residue changes, or both were entered into the PolyPhen-2 and PROVEAN online software tools. The PROVEAN online tool uses both the PROVEAN and SIFT algorithms to determine the pathogenicity of the mutations. The Ingenuity Pathway Analysis tool was used to assess the gene lists for potentially damaging rare variants in each line (IPA, QIAGEN Redwood City, […]
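The copy number categories and the both-callers rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Variant` tuple, the function names, and the "unclassified" label (used for any value the protocol does not assign a category, including copy number 2) are hypothetical. Matching variants by exact chromosome, position, reference allele, and alternate allele is also an assumption, since the protocol does not say how calls from qSNP and GATK were compared.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key; not a structure from the protocol."""
    chrom: str
    pos: int
    ref: str
    alt: str


def classify_copy_number(copy_number: int) -> str:
    """Map an integer copy number to the categories listed in the protocol."""
    if copy_number == 0:
        return "deletion"
    if copy_number == 1:
        return "loss"
    if 3 <= copy_number <= 5:
        return "medium gain"
    if 6 <= copy_number <= 8:
        return "high level gain"
    # Assumption: the protocol names no category for 2 or for values above 8,
    # so those are reported with a placeholder label.
    return "unclassified"


def consensus_variants(
    qsnp_calls: Iterable[Variant], gatk_calls: Iterable[Variant]
) -> List[Variant]:
    """Keep only variants present in both call sets (exact-match assumption)."""
    shared = set(qsnp_calls) & set(gatk_calls)
    # Sorting is plain tuple order (chromosome names compare as strings),
    # used here only to make the output deterministic.
    return sorted(shared)


if __name__ == "__main__":
    # Made-up example calls, for illustration only.
    qsnp = [Variant("chr7", 100, "A", "T"), Variant("chr7", 200, "G", "C")]
    gatk = [Variant("chr7", 200, "G", "C"), Variant("chr1", 50, "C", "G")]
    print(consensus_variants(qsnp, gatk))
    print([classify_copy_number(n) for n in range(9)])
```

In practice the call sets would be read from the callers' output files, and a real comparison may need to normalize allele representation before intersecting.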

Pipeline specifications

Software tools: Circos, BWA, qSNP, GATK, Pindel, PolyPhen, PROVEAN, SIFT, IPA
Databases: GEO
Applications: WES analysis, Genome data visualization
Organisms: Homo sapiens
Diseases: Breast Neoplasms; Carcinoma, Non-Small-Cell Lung; Neoplasms
Chemicals: Tyrosine