Dataset features

Specifications


Dataset type: Other
Number of samples: 10
Release date: Jan 28 2012
Last update date: Mar 23 2012
Access: Public
Diseases: Huntington Disease
Dataset link: Molecular mechanism underlying the regulatory specificity of a Drosophila homeodomain protein that specifies myoblast identity

Experimental Protocol


Ten protein binding microarray (PBM) experiments were performed on Drosophila transcription factors. Briefly, GST-tagged fly transcription factors were bound to double-stranded 44K Agilent microarrays to determine their sequence preferences. The method is described in Berger et al., Nature Biotechnology 2006 (PMID: 16998473). A key feature is that the microarrays are composed of de Bruijn sequences that contain each 10-base sequence exactly once, providing an evenly balanced sequence distribution. Individual de Bruijn sequences differ in their properties, including how well they represent gapped patterns. The probe sequences of the custom array design used in this study were reported previously in Berger et al., Cell 2008 (PMID: 18585359) and are available under an academic research use license.

Here we provide the data transformed into median signal intensities (after normalization and detrending of the original array data) for all 32,896 distinct 8-base sequences (each sequence and its reverse complement counted once), together with Z-scores for these intensities and E-scores. E-scores are a modified version of the AUC and describe how well each 8-mer ranks the spot intensities. A 'keep fraction' (kf) setting of 0.9 was used to calculate the E-scores. In general, E-scores are slightly more reproducible than Z-scores but carry less information about relative binding affinity. Additional experimental details are given in Berger et al., Nature Biotechnology 2006 (PMID: 16998473) and the accompanying Supplementary Information.
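Two of the numbers in the protocol can be checked with a short sketch. The Python below is illustrative only and is not part of the dataset or of the published pipeline. It first counts the 8-base sequences when a sequence and its reverse complement are treated as the same double-stranded site, which gives the 32,896 figure. It then builds a small de Bruijn sequence to show the "each k-mer exactly once" property. The function names are hypothetical. The de Bruijn construction is the textbook Lyndon-word algorithm, used here as an assumption for illustration and not necessarily the one used to design the array, and the order is reduced from 10 to 4 to keep the example small.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_canonical_kmers(k: int) -> int:
    """Count k-mers, treating a k-mer and its reverse complement as one."""
    canonical = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        # Keep the lexicographically smaller strand as the representative.
        canonical.add(min(kmer, reverse_complement(kmer)))
    return len(canonical)


def de_bruijn(alphabet: str, n: int) -> str:
    """Cyclic de Bruijn sequence of order n (textbook Lyndon-word method).

    Illustrative assumption: not necessarily how the PBM array was designed.
    """
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1 : p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


def linear_kmers(cyclic_seq: str, n: int) -> list:
    """List all length-n windows of a cyclic sequence, wrapping at the end."""
    linear = cyclic_seq + cyclic_seq[: n - 1]
    return [linear[i : i + n] for i in range(len(cyclic_seq))]


if __name__ == "__main__":
    print(count_canonical_kmers(8))  # 32896

    order = 4  # the arrays use order 10; 4 keeps the demo fast
    seq = de_bruijn("ACGT", order)
    windows = linear_kmers(seq, order)
    # Every 4-mer should appear exactly once among the 4**4 windows.
    print(len(seq), len(set(windows)))  # 256 256
```

The count follows from (4^8 + 4^4) / 2 = 32,896: of the 65,536 possible 8-mers, the 256 that equal their own reverse complement are counted once, and the remaining sequences pair up.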

Repositories


GEO: GSE35380
ArrayExpress: E-GEOD-35380
BioProject: PRJNA152617

Contact


Leila Shokri
