Computational protocol: DNA methylation age of human tissues and cell types

Protocol publication

[…] All data are publicly available (Additional file ). Many data sets involve normal adjacent tissue from TCGA. Details on the individual data sets can be found in Additional file . To give credit to the many researchers who generated the data, I briefly mention relevant citations. Data sets 1 and 2 (whole blood samples from a Dutch population) were generated by Roel Ophoff and colleagues []. Data set 3 consists of whole blood samples from a recent large-scale study of healthy individuals []. The authors used these and other data to estimate human aging rates and developed a highly accurate predictor of age based on blood data. Data set 4 consists of leukocyte samples from healthy male children from Children’s Hospital Boston []. Data set 5 consists of peripheral blood leukocyte samples []. Data set 6 consists of cord blood samples from newborns []. Data set 7 consists of cerebellum samples, which were provided by C Liu and C Chen (Gene Expression Omnibus (GEO) identifier GSE38873). Data sets 8, 9, 10, and 13 consist of cerebellum, frontal cortex, pons, and temporal cortex samples, respectively, obtained from the same subjects []. Data set 11 consists of prefrontal cortex samples from healthy controls []. Data set 12 consists of neuron and glial cell samples from []. Data set 14 consists of normal breast tissue samples []. Data set 15 consists of buccal cells from 109 15-year-old adolescents from a longitudinal study of child development []. Data set 16 consists of buccal cells from eight different subjects []. Data set 17 consists of buccal cells from monozygotic (MZ) and dizygotic (DZ) twin pairs from the Peri/postnatal Epigenetic Twins Study (PETS) cohort []. Data set 18 consists of cartilage (chondrocyte) samples from []. Data set 19 consists of normal adjacent colon tissue from TCGA. Data set 20 consists of colon mucosa samples from []. Data set 21 consists of dermal fibroblast samples from []. Data set 22 consists of epidermis samples from [].
Data set 23 consists of gastric tissue samples from []. Data set 24 consists of head/neck normal adjacent tissue samples from TCGA (HNSC data). Data set 25 consists of heart tissue samples from []. Data set 26 consists of normal adjacent renal papillary tissue from TCGA (KIRP data). Data set 27 consists of normal adjacent tissue from TCGA (KIRC data). Data set 28 consists of normal adjacent liver samples from []. Data set 29 consists of normal adjacent lung tissue from TCGA (LUSC data). Data set 30 consists of normal adjacent lung tissue samples from TCGA (LUAD data). Data set 31 is from TCGA (LUSC). Data set 32 consists of mesenchymal stromal cells isolated from bone marrow []. Data set 33 consists of placenta samples from mothers of monozygotic and dizygotic twins []. Data set 34 consists of prostate samples from []. Data set 35 consists of normal adjacent prostate tissue from TCGA (PRAD data). Data set 36 consists of male saliva samples from []. Data set 37 consists of male saliva samples from []. Data set 38 consists of stomach tissue from TCGA (STAD data). Data set 39 consists of thyroid tissue from TCGA (THCA data). Data set 40 consists of whole blood from type 1 diabetics [,]. Data set 41 consists of whole blood from []. Data sets 42 and 43 involve whole blood samples from women with ovarian cancer and healthy controls, respectively; these are the samples from the United Kingdom Ovarian Cancer Population Study [,]. Data set 44 consists of whole blood from []. Data set 45 consists of leukocytes from healthy children of the Simons Simplex Collection []. Data set 46 consists of peripheral blood mononuclear cells from []. Data set 47 consists of peripheral blood mononuclear cells from []. Data set 48 consists of cord blood samples from newborns provided by N Turan and C Sapienza (GEO GSE36812). Data set 49 consists of cord blood mononuclear cells from []. Data set 50 consists of cord blood mononuclear cells from []. Data set 51 consists of CD4 T cells from infants [].
Data set 52 consists of CD4+ T cells and CD14+ monocytes from []. Data set 53 consists of immortalized B cells and other cells from progeria, Werner syndrome patients, and controls []. Data sets 54 and 55 are brain samples from []. Data sets 56 and 57 consist of breast tissue from TCGA (27K and 450K platforms, respectively). Data set 58 consists of buccal cells from []. Data set 59 consists of colon tissue from TCGA (COAD data). Data set 60 consists of fat (adipose) tissue from []. Data set 61 consists of human heart tissue from []. Data set 62 consists of kidney (normal adjacent) tissue from TCGA (KIRC). Data set 63 consists of liver (normal adjacent) tissue from TCGA (LIHC data). Data set 64 consists of lung tissue from TCGA. Data set 65 consists of muscle tissue from []. Data set 66 consists of muscle tissue from []. Data set 67 consists of placenta samples from []. Data set 68 consists of female saliva samples []. Data set 69 consists of uterine cervix samples from [,]. Data set 70 consists of uterine endometrium (normal adjacent) tissue from TCGA (UCEC data). Data set 71 consists of various human tissues from the ENCODE/HAIB Project (GEO GSE40700). Data set 72 consists of chimpanzee and human tissues from []. Data set 73 consists of great ape blood samples from []. Data set 74 consists of sperm samples from []. Data set 75 consists of sperm samples from []. Data set 76 consists of vascular endothelial cells from human umbilical cords from []. Data sets 77 and 78 (special cell types) involve human embryonic stem cells, iPS cells, and somatic cell samples measured on the Illumina 27K array and Illumina 450K array, respectively []. Data set 79 consists of reprogrammed mesenchymal stromal cells from human bone marrow (iP-MSC), initial mesenchymal stromal cells, and embryonic stem cells []. Data set 80 consists of human ES cells and normal primary tissue from []. Data set 81 consists of human ES cells from []. Data set 82 consists of blood cell type data from []. […]
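
Three of the data sets above are identified directly by a GEO series accession rather than by a citation. As a minimal sketch of how those identifiers might be looked up, the Python snippet below maps each data set number to its accession and builds the corresponding GEO record URL. The names `GEO_ACCESSIONS`, `GEO_BASE_URL`, and `geo_url` are hypothetical helpers introduced for illustration only, and the URL pattern is assumed to be the standard public GEO accession query page; none of this is part of the original protocol.

```python
import re

# Data set number -> GEO series accession, as named in the text above.
# Only the three data sets with an explicit GEO identifier are listed.
GEO_ACCESSIONS = {
    7: "GSE38873",   # cerebellum samples
    48: "GSE36812",  # cord blood samples from newborns
    71: "GSE40700",  # various human tissues, ENCODE/HAIB Project
}

# Assumption: the public GEO accession viewer accepts an `acc` query parameter.
GEO_BASE_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_url(accession: str) -> str:
    """Return the GEO record URL for a series accession such as 'GSE38873'."""
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return f"{GEO_BASE_URL}?acc={accession}"


if __name__ == "__main__":
    # Print one line per data set in numeric order.
    for number, accession in sorted(GEO_ACCESSIONS.items()):
        print(f"Data set {number}: {geo_url(accession)}")
```

Data sets that are described only by a citation or by a TCGA project code are deliberately left out of the mapping, since the text does not give an accession for them.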

Pipeline specifications

Software tools: MUSCLE, APE
Databases: GEO, TCGA Data Portal
Applications: Phylogenetics, Nucleotide sequence alignment
Organisms: Pan troglodytes, Homo sapiens
Diseases: Breast Neoplasms, Neoplasms
Chemicals: Steroids