A variety of NGS-based techniques have been developed. For example, chromatin immunoprecipitation coupled with parallel sequencing (ChIP-seq) is widely used to assess the binding of proteins to the genome (Barski et al., 2007). RNA sequencing (RNA-seq) can estimate the abundance of whole transcripts and their isoforms (Mortazavi et al., 2008). Genome-wide nucleosome positioning and open chromatin can be captured by MNase-seq (Schones et al., 2008) and DNase-seq (Boyle et al., 2008), respectively. As the demand for NGS has increased, several thousand NGS-based data sets have been deposited in public data repositories such as the Gene Expression Omnibus (GEO) (Barrett et al., 2013). Notably, novel findings frequently emerge from reanalyzing available NGS-based data sets (Hnisz et al., 2013; Kang et al., 2014). However, there is still no easy way to access, download, and process a large set of original (raw) NGS-based data for comparative and integrative analysis, although some web-based applications have been developed to address this issue.
(Barski et al., 2007) High-resolution profiling of histone methylations in the human genome. Cell.
(Mortazavi et al., 2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods.
(Schones et al., 2008) Dynamic regulation of nucleosome positioning in the human genome. Cell.
(Boyle et al., 2008) High-resolution mapping and characterization of open chromatin across the genome. Cell.
(Barrett et al., 2013) NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res.
(Hnisz et al., 2013) Super-enhancers in the control of cell identity and disease. Cell.
(Kang et al., 2014) Mammary-specific gene activation is defined by progressive recruitment of STAT5 during pregnancy and the establishment of H3K4me3 marks. Mol Cell Biol.
(Kim et al., 2018) Octopus-toolkit: a workflow to automate mining of public epigenomic and transcriptomic next-generation sequencing data. Nucleic Acids Res.