TCGA Data Portal / The Cancer Genome Atlas Data Portal

Generates, analyzes, and makes available genomic sequence, expression, methylation, and copy number variation (CNV) data on over 11,000 individuals who represent over 30 different types of cancer. The information generated by TCGA is centrally managed and entered into databases as it becomes available, making the data rapidly accessible to the entire research community. TCGA is a collaborative effort led by the National Cancer Institute and the National Human Genome Research Institute to map the genomic and epigenomic changes that occur in these cancer types, including nine rare tumors. Its goal is to support new discoveries by generating a catalog of the somatic aberrations that occur in the different neoplasms, and to accelerate the pace of research aimed at improving the diagnosis, treatment, and prevention of cancer.


A comprehensive, curated oncogenomic database that provides copy number aberration data to the human cancer research community. In recent years, the database has undergone extensive expansion and significant qualitative enhancements. In particular, it has made the transition from a ‘cytogenetic’ resource based on cancer cytogenetic data to an integrated resource incorporating cancer genome data from an increasing variety of genome analysis techniques. Likewise, many user interface improvements and data analysis tools have been implemented based on suggestions from users.


A curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. The arrayMap database provides a platform for meta-analysis and systems-level data integration of high-resolution oncogenomic CNA data. The 2014 release of arrayMap contains more than 64,000 genomic array data sets, representing about 250 tumor diagnoses. The large amount of tumor CNA data in arrayMap can be freely downloaded by users to support data mining projects and to explore special events such as chromothripsis-like genome patterns.

CanGEM / Cancer GEnome Mine

A public, web-based database for storing quantitative microarray data and relevant metadata about the measurements and samples. CanGEM supports the MIAME standard and, in addition, stores clinical information using standardized controlled vocabularies whenever possible. Microarray probes are re-annotated with their physical coordinates in the human genome, and aCGH data is analyzed to yield gene-specific copy numbers. Users can build custom datasets by querying for specific clinical sample characteristics or copy number changes of individual genes. Aberration frequencies can be calculated for these datasets, and the data can be visualized on the human genome map with gene annotations.
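As a rough illustration of what an aberration frequency is, the sketch below counts, for each gene, the fraction of samples in a custom dataset that carry a gain or a loss. It is not CanGEM's implementation: the sample names, the gene calls, and the function `aberration_frequencies` are hypothetical, and the coding of calls as -1 (loss), 0 (normal), and 1 (gain) is an assumption made for the example.

```python
from collections import Counter

# Hypothetical gene-level copy number calls per sample (assumed coding):
# -1 = loss, 0 = normal, 1 = gain.
calls = {
    "sample_a": {"MYC": 1, "TP53": -1, "EGFR": 0},
    "sample_b": {"MYC": 1, "TP53": 0, "EGFR": 0},
    "sample_c": {"MYC": 0, "TP53": -1, "EGFR": 1},
    "sample_d": {"MYC": 1, "TP53": -1, "EGFR": 0},
}


def aberration_frequencies(calls):
    """Return {gene: {"gain": fraction, "loss": fraction}} over all samples."""
    n_samples = len(calls)
    if n_samples == 0:
        return {}
    gains, losses = Counter(), Counter()
    genes = set()
    for per_gene in calls.values():
        for gene, call in per_gene.items():
            genes.add(gene)
            if call > 0:
                gains[gene] += 1
            elif call < 0:
                losses[gene] += 1
    # Counter returns 0 for genes that were never gained or lost.
    return {
        gene: {"gain": gains[gene] / n_samples, "loss": losses[gene] / n_samples}
        for gene in sorted(genes)
    }


freqs = aberration_frequencies(calls)
for gene, freq in freqs.items():
    print(gene, freq)
```

A gene missing from a sample is treated here as not aberrant in that sample; a real analysis might instead exclude such samples from the denominator.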


A database for identifying and visualizing CNAs in cancers at any specific region within the human genome. CaSNP stores pre-computed raw copy numbers, and dynamically generates viewable and downloadable summaries of CNA status in response to user queries. A schema for uniformly processing, storing, annotating, and presenting data across different data sets and platforms was successfully implemented, making CaSNP a useful tool for cancer genomic meta-studies. The query results contain numerical values of cancer copy numbers and the frequencies of CNA events, which are well suited for more detailed analysis by other software or methods. Besides the tabular display, a heatmap view displays SNP copy numbers in colors, enabling users to visualize the results intuitively and comprehensively, and helping them find novel CNA regions in subsets of samples.
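The kind of region query described above can be sketched as follows: given stored per-SNP copy numbers, select the SNPs that fall inside a requested region, average them per sample, and report how often the region is gained or lost. This is only a sketch under stated assumptions, not CaSNP's schema or algorithm. The data layout, the function `summarize_region`, and the thresholds 2.5 and 1.5 are all hypothetical.

```python
from statistics import mean

# Hypothetical stored raw copy numbers:
# sample -> list of (chromosome, position, copy_number).
snp_copy_numbers = {
    "tumor_1": [("chr8", 100, 3.1), ("chr8", 200, 2.9), ("chr8", 900, 2.0)],
    "tumor_2": [("chr8", 100, 2.0), ("chr8", 200, 2.1), ("chr8", 900, 1.1)],
    "tumor_3": [("chr8", 100, 1.2), ("chr8", 200, 1.4), ("chr8", 900, 2.0)],
}

# Assumed cutoffs for calling a gain or a loss from a mean copy number.
GAIN_THRESHOLD = 2.5
LOSS_THRESHOLD = 1.5


def summarize_region(data, chrom, start, end):
    """Summarize copy number status for chrom:start-end (inclusive)."""
    per_sample = {}
    for sample, snps in data.items():
        values = [cn for c, pos, cn in snps if c == chrom and start <= pos <= end]
        if values:  # skip samples with no SNP in the region
            per_sample[sample] = mean(values)
    n = len(per_sample)
    n_gain = sum(v > GAIN_THRESHOLD for v in per_sample.values())
    n_loss = sum(v < LOSS_THRESHOLD for v in per_sample.values())
    return {
        "mean_copy_number": per_sample,
        "gain_frequency": n_gain / n if n else 0.0,
        "loss_frequency": n_loss / n if n else 0.0,
    }


summary = summarize_region(snp_copy_numbers, "chr8", 50, 250)
print(summary)
```

The per-sample means correspond to the numerical copy number values in a query result, and the two frequencies to the reported CNA event frequencies.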

BCIP / Breast Cancer Integrative Platform

Maintains multi-omics data selected with strict quality control and processed with uniform normalization methods. BCIP provides a user-friendly interface that integrates comprehensive and flexible analysis tools for differential gene expression from 9005 tumor and 376 normal tissue samples, copy number variation (CNV) from 3035 tumor samples, microRNA-target interactions, co-expressed genes, KEGG pathways, mammary tissue-specific gene functional networks, and survival analysis.