[…] A archive. The RNA-Seq and ChIP-Seq data were also downloaded from ENCODE. The lincRNA binding peaks from the CHART and ChIRP data were extracted from the relevant publications. Detailed data sources are listed in .

The raw sequences of the ChIA-PET data were re-processed with the updated ChIA-PET Tool Pipeline. Replicate data sets were merged before processing. Statistics for the chromatin interaction clusters from the ChIA-PET data in the two human and four mouse cell lines are shown in .

For the K562 and mouse ESC lines, mapped files (in bigWig format) were downloaded from ENCODE. For the MCF7 cell line, adapters were removed from the raw data and low-quality reads were trimmed using Trimmomatic. The clean reads were mapped to the human hg19 genome using BWA, and the resulting BAM files were converted to bedGraph format and normalized by sequencing depth. The read coverage around the TSSs was calculated using BEDTools.

ChIP-Seq and RNA-Seq data were processed with a uniform pipeline. Of the 108 RNA-Seq data sets used in our study, 64 were obtained with the Illumina G2Ax platform and 44 with the Illumina HiSeq 2000 platform. Because the RNA-Seq data came from different sequencing platforms, we checked for batch effects between platforms. The first two principal components of the expression matrix showed that there were indeed platform-related batch effects (). We then used the ComBat function in the R package sva to correct for the batch effects. The subsequent analyses were performed on the corrected expression data.

Chromatin interaction anchor regions from the ChIA-PET data were used as raw anchors, and the overlapping, neighboring anchors were merged into larger regions that were treated as nodes in the chromatin interaction network.
If two regions (nodes) had chromatin interactions, an edge was added between them. Using this method, we constructed the genome-wide chromatin interaction network.

We used the GENCODE V19 and M4 annotation files to define the promoter regions for human and mouse samples, respectively. Genomic regions that were 2. […]
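
The network construction described above (merging overlapping or neighboring anchors into nodes, then adding an edge for each interaction) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study. The function names, the `gap` parameter that sets how close two anchors must be to count as neighbors, the half-open coordinate convention, and the choice to drop interactions whose two anchors fall in the same merged node are all assumptions made for illustration.

```python
import bisect
from collections import defaultdict


def merge_anchors(anchors, gap=0):
    """Merge anchors (chrom, start, end) that overlap or lie within
    `gap` bp of each other into larger regions (network nodes).
    `gap` is a hypothetical parameter; the source does not state one."""
    by_chrom = defaultdict(list)
    for chrom, start, end in anchors:
        by_chrom[chrom].append((start, end))

    nodes = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1] + gap:
                # Extend the current node to cover this anchor.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        nodes.extend((chrom, s, e) for s, e in merged)
    return nodes


def build_network(interactions, gap=0):
    """Build nodes and undirected edges from anchor pairs.
    `interactions` is a list of ((chrom, start, end), (chrom, start, end))."""
    anchors = [anchor for pair in interactions for anchor in pair]
    nodes = merge_anchors(anchors, gap)

    # Per-chromosome node lists, already sorted by start coordinate.
    per_chrom = defaultdict(list)
    starts = defaultdict(list)
    for node in nodes:
        per_chrom[node[0]].append(node)
        starts[node[0]].append(node[1])

    def node_of(anchor):
        # Every anchor lies inside exactly one merged node: the last
        # node on its chromosome whose start is <= the anchor start.
        chrom, start, _ = anchor
        i = bisect.bisect_right(starts[chrom], start) - 1
        return per_chrom[chrom][i]

    edges = set()
    for a, b in interactions:
        u, v = node_of(a), node_of(b)
        if u != v:  # assumption: skip self-loops within one node
            edges.add((min(u, v), max(u, v)))
    return nodes, edges
```

Calling `build_network` on a list of anchor pairs returns the merged regions and a set of undirected edges, which could then be loaded into a graph library for downstream analysis.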

Software tools: Trimmomatic, BWA, BEDTools, sva