Computational protocol: Meta-analysis of cancer gene expression signatures reveals new cancer genes, SAGE tags and tumor associated regions of co-regulation

Protocol publication

[…] The Python scipy (scientific Python) and matplotlib libraries were used for the tests and graphs presented. We used the geneRIF database, which contains at least one geneRIF for ∼10 000 genes (approximately half of the genes covered in this study), to scan gene-related articles. To define a cancer geneRIF, we scanned the geneRIFs for any of the keywords ‘tumor’, ‘carcinoma’, ‘cancer’ or ‘neopla’.

When assigning a rank to a gene to quantify the amount of change in a single cancer-normal comparison, we used the lowest rank (highest change) in the case of multiple probesets/tags. We used the Entrez Homologene database to assign ortholog numbers.

To account for multiple hypothesis testing over genes in the microarray data, we performed a Benjamini–Hochberg correction as explained earlier. For SAGE data, we chose a relatively stringent P-value (0.02) for the analysis presented. To calculate the false discovery rate (FDR) of multiple hypothesis testing over samples, we calculated the expected number of false positives with a randomization-based approach, which randomly selects X genes from Y cancer types, where X represents the number of genes altered in a certain cancer type. […]
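Three of the steps described above (keyword screening of geneRIFs, taking the lowest rank across probesets/tags, and the Benjamini–Hochberg correction) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the input shapes, and the case-insensitive keyword matching are assumptions made for illustration, and the Benjamini–Hochberg routine is the textbook step-up adjustment written in plain Python rather than any particular scipy call.

```python
# Illustrative sketch only; names and data layouts are hypothetical.

# Keywords listed in the protocol; 'neopla' matches neoplasm, neoplastic, etc.
CANCER_KEYWORDS = ("tumor", "carcinoma", "cancer", "neopla")


def is_cancer_generif(text):
    """Return True if a geneRIF text contains any cancer keyword.

    Case-insensitive substring matching is an assumption; the protocol
    does not state how case was handled.
    """
    lowered = text.lower()
    return any(keyword in lowered for keyword in CANCER_KEYWORDS)


def best_rank_per_gene(probe_ranks):
    """Collapse (gene_id, rank) pairs to one rank per gene.

    With several probesets/tags for one gene, the lowest rank
    (highest change) is kept.
    """
    best = {}
    for gene, rank in probe_ranks:
        if gene not in best or rank < best[gene]:
            best[gene] = rank
    return best


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # enforcing monotonicity with a running minimum (capped at 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called significant when its adjusted value falls below the chosen FDR threshold; the threshold itself is not given in this excerpt.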

Pipeline specifications

Software tools: SciPy, matplotlib
Databases: HomoloGene
Applications: Miscellaneous, WGS analysis
Organisms: Homo sapiens
Diseases: Neoplasms