TCGA_dbGaP specifications


Unique identifier: OMICS_18627
Name: TCGA_dbGaP
Software type: Pipeline/Workflow
Interface: Command line interface
Restrictions to use: None
Input data: A TCGA project ID, file ID, case ID, disease type, or experimental strategy.
Input format: PY, CSV
Output data: Reads from the genomic region of interest.
Output format: SAM, CSV
Operating system: Unix/Linux
Programming languages: Python
Computer skills: Advanced
Version: 1.0
Stability: Stable
Maintained: Yes
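
The output listed above (reads from a genomic region of interest, in SAM format) can be illustrated with a minimal sketch. This is not the tool's own code: the function names `reference_end` and `reads_in_region` are hypothetical, and the only assumption is the standard SAM layout (tab-separated fields, RNAME in column 3, 1-based POS in column 4, CIGAR in column 6).

```python
import re

# CIGAR operations as defined in the SAM format; only M, D, N, = and X
# advance the position on the reference.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 1-based inclusive reference end of an alignment."""
    if cigar == "*":
        # No CIGAR available: treat the read as covering a single base.
        return pos
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + max(ref_len, 1) - 1


def reads_in_region(sam_lines, chrom, start, end):
    """Yield names of reads overlapping chrom:start-end (1-based, inclusive)."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        rname, pos, cigar = fields[2], int(fields[3]), fields[5]
        if rname != chrom or pos == 0:
            continue  # other chromosome, or unmapped read
        if pos <= end and reference_end(pos, cigar) >= start:
            yield fields[0]
```

For indexed BAM files a dedicated library would normally do this lookup; the plain-text version is only meant to show the overlap logic.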




TCGA_dbGaP citations


Diffuse gliomas classified by 1p/19q co-deletion, TERT promoter and IDH mutation status are associated with specific genetic risk loci

Acta Neuropathol
PMCID: 5904227
PMID: 29460007
DOI: 10.1007/s00401-018-1825-z

[…] Raw genotyping files (.CEL) for the Affymetrix Genome-wide version 6 array were downloaded for germline (i.e. normal blood) glioma samples from The Cancer Genome Atlas (TCGA, dbGaP study accession: phs000178.v1.p1). Controls were from publicly accessible genotype data generated by the Wellcome Trust Case–Control Consortium 2 (WTCCC2) analysis of 2699 individuals from […]


Common, germline genetic variations in the novel tumor suppressor BAP1 and risk of developing different types of cancer

PMCID: 5650391
PMID: 29088836
DOI: 10.18632/oncotarget.20465

[…] 8581 and rs390802 to elucidate potential mechanism underlying the association of these SNPs with renal cell carcinoma. Data on renal clear cell carcinoma were downloaded from The Cancer Genome Atlas (TCGA, dbGaP Study Accession: phs000178.v9.p8, data portal: […]). Expression data (RNA-seq) and methylation (HumanMethylation450 chip) were measured in tumo […]


Extending TCGA queries to automatically identify analogous genomic data from dbGaP

PMCID: 5538035
PMID: 28794857
DOI: 10.5256/f1000research.10605.r21260

[…] Latest source code: […]; source code as at the time of publication: DOI 10.5281/zenodo.160551 (Kurata, 2016) (CC0 1.0 Universal […]


Current Developments in Machine Learning Techniques in Biological Data Mining

Bioinform Biol Insights
PMCID: 5390918
PMID: 28469415
DOI: 10.1177/1177932216687545

[…] ional epigenomic data from the ENCODE and Roadmap Epigenomics projects to understand the biology of complex disorders, such as cancer and autoimmune diseases. In his research, he works with data from TCGA, dbGaP, GEO, and PGC2 databases and collaborates with scientists interested in an in-depth understanding of the molecular mechanisms from a global perspective. He has published more than 40 peer- […]


Association of breast cancer risk with genetic variants showing differential allelic expression: Identification of a novel breast cancer susceptibility locus at 4q21

PMCID: 5340257
PMID: 27792995
DOI: 10.18632/oncotarget.12818

[…] data were available, n = 93 for the data normalized per gene, and n = 94 for the data normalized per isoform. Birdseed processed germline genotype data from the Affy6 SNP array were obtained from the TCGA dbGaP data portal. Gene expression levels were assayed by RNA sequencing, RSEM (RNAseq by Expectation-Maximization) normalized both per gene and per isoform, as obtained from the TCGA conso […]


Fine scale mapping of the 17q22 breast cancer locus using dense SNPs, genotyped within the Collaborative Oncological Gene Environment Study (COGs)

Sci Rep
PMCID: 5013272
PMID: 27600471
DOI: 10.1038/srep32512

[…] were measured with Agilent 44 K. (3) NB93 consists of 93 Caucasian adjacent normal breast samples from TCGA. Birdseed processed germline genotype data from the Affy6 SNP array were obtained from the TCGA dbGaP data portal. (4) BC765 consists of 765 Caucasian breast tumour samples from TCGA. Gene expression levels were assayed by RNA sequencing, RSEM (RNAseq by Expectation-Maximization) normalized […]
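
Two of the excerpts above quote dbGaP study accessions (phs000178.v1.p1 and phs000178.v9.p8). These follow the usual dbGaP pattern of a study number, a study version (v) and a participant-set version (p). As a minimal sketch, assuming that pattern, such strings can be split into their parts so that citations of different releases of the same study can be compared; the names `DbGapAccession` and `parse_accession` are hypothetical and not part of TCGA_dbGaP.

```python
import re
from typing import NamedTuple

# phs<6 digits>.v<study version>.p<participant-set version>
ACCESSION = re.compile(r"^phs(\d{6})\.v(\d+)\.p(\d+)$")


class DbGapAccession(NamedTuple):
    study: str            # e.g. "phs000178"
    version: int          # study version number
    participant_set: int  # participant-set version number


def parse_accession(text):
    """Split a versioned dbGaP study accession into its three parts."""
    match = ACCESSION.match(text.strip())
    if match is None:
        raise ValueError(f"not a versioned dbGaP study accession: {text!r}")
    return DbGapAccession(
        study="phs" + match.group(1),
        version=int(match.group(2)),
        participant_set=int(match.group(3)),
    )
```

With this, `parse_accession("phs000178.v1.p1").study == parse_accession("phs000178.v9.p8").study` confirms that both excerpts refer to the same TCGA study at different versions.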

