Dataset features


Application: ChIP-seq analysis
Number of samples: 30
Release date: Apr 13 2012
Last update date: Jul 20 2018
Access: Public
Chemicals: Formaldehyde
Dataset link: Transcription Factor Binding Sites by ChIP-seq from ENCODE/Caltech

Experimental Protocol

Cells were grown according to the approved ENCODE cell culture protocols. Chromatin immunoprecipitation followed published methods (Johnson & Mortazavi et al., 2007), except that glutaraldehyde was added to the crosslinking reaction in certain experiments. Information on the antibodies used is available in the metadata for each subtrack.
Libraries were constructed using the Illumina ChIP-seq Sample Preparation Kit or a modified protocol that adds multiplexing tags to the fragments. DNA fragments were repaired to generate blunt ends, and a single A nucleotide was added to each end. Double-stranded Illumina adaptors, with or without multiplexing tags, were ligated to the fragments. Ligation products were amplified by 18 cycles of PCR, and DNA between 150 and 250 bp was gel purified. Completed libraries were quantified with the Quant-iT dsDNA HS Assay Kit.
Libraries were sequenced on the Illumina GAII and GAIIx sequencing systems; more recently, multiplexed libraries were pooled and sequenced on the HiSeq platform. Cluster generation, linearization, blocking and sequencing primer reagents were provided in the Illumina Cluster Amplification kits. Older libraries were generated using 2 rounds of PCR. Matched input samples were sequenced for each variation of fixation conditions and number of PCR rounds. Reads of 32 bp, 36 bp or 50 bp were generated.
Sequencing reads (fastq files) from pooled libraries were assigned to the corresponding libraries based on the multiplexing tag (all tags have been removed from the reads in the fastq files available for download); reads from all other libraries were processed directly. Bowtie (Langmead et al., 2009) was used to map reads to the male or female version of the mouse genome, depending on the sex of the cell line and excluding the _random chromosomes in the assembly. The following parameters were used: "-v 2 -k 11 -m 10 -t --best --strata".
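The alignment options quoted above can be assembled into a Bowtie command line as in the following minimal sketch. The index name and file names are hypothetical placeholders, not values from this dataset, and the positional order (index, reads, output) assumes the Bowtie 1 usage `bowtie [options] <ebwt> <reads> [<hit>]`. The comments summarize the options as described in the Bowtie 1 manual.

```python
import shlex

# Options quoted in the protocol text (Bowtie 1 manual meanings):
#   -v 2            allow at most 2 mismatches per alignment
#   -k 11           report up to 11 valid alignments per read
#   -m 10           suppress reads with more than 10 reportable alignments
#   -t              print timing information
#   --best --strata report only alignments in the best mismatch stratum
BOWTIE_OPTIONS = "-v 2 -k 11 -m 10 -t --best --strata"


def build_bowtie_command(index, reads_fastq, output_path):
    """Return the argv list for a Bowtie 1 run with the protocol's options.

    All three arguments are hypothetical placeholders supplied by the caller.
    """
    return ["bowtie", *shlex.split(BOWTIE_OPTIONS), index, reads_fastq, output_path]


if __name__ == "__main__":
    cmd = build_bowtie_command("mouse_male_index", "sample.fastq", "sample.bowtie")
    # Print the command instead of running it, so the sketch works without Bowtie installed.
    print(shlex.join(cmd))
```

With Bowtie installed, the returned list could be passed to `subprocess.run(cmd, check=True)`; choosing the male or female index per cell line would be the caller's responsibility.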
Aligned reads were converted into rds files using the ERANGE package (Johnson & Mortazavi et al., 2007) and the program in ERANGE was used to identify enriched regions against the matching input sample. The following settings were used for point-source transcription factors: "--shift learn --ratio 3 --minimum 2 --listPeak --revbackground". For histone modifications, the settings were changed to "--notrim --nodirectionality --spacing 100 --ratio 3 --minimum 2 --listPeak --revbackground".
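To see exactly how the two peak-calling profiles differ, the option strings quoted above can be compared flag by flag. The sketch below treats the strings purely as text and makes no claim about what each ERANGE option does; the helper names `parse_options` and `diff_profiles` are illustrative and not part of ERANGE.

```python
import shlex

# Option strings copied verbatim from the protocol text.
POINT_SOURCE = "--shift learn --ratio 3 --minimum 2 --listPeak --revbackground"
HISTONE = ("--notrim --nodirectionality --spacing 100 --ratio 3 "
           "--minimum 2 --listPeak --revbackground")


def parse_options(option_string):
    """Map each --flag to its value, or to None for a bare switch."""
    tokens = shlex.split(option_string)
    options = {}
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        # A following token that does not start with "--" is taken as the value.
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
        options[flag] = tokens[i + 1] if has_value else None
        i += 2 if has_value else 1
    return options


def diff_profiles(a, b):
    """Return (flags only in a, flags only in b, shared flags), each sorted."""
    pa, pb = parse_options(a), parse_options(b)
    return (sorted(pa.keys() - pb.keys()),
            sorted(pb.keys() - pa.keys()),
            sorted(pa.keys() & pb.keys()))


if __name__ == "__main__":
    only_tf, only_histone, shared = diff_profiles(POINT_SOURCE, HISTONE)
    print("point-source only:", only_tf)
    print("histone only:", only_histone)
    print("shared:", shared)
```

The comparison shows that both profiles share the ratio, minimum, listPeak and revbackground settings, while the histone profile drops `--shift` and adds `--notrim`, `--nodirectionality` and `--spacing`.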