Normalization software tools | ChIP sequencing data analysis
ChIP-seq experiments are becoming a standard approach for genome-wide profiling of protein-DNA interactions, such as detecting transcription factor binding sites, histone modification marks and RNA Polymerase II occupancy. However, when comparing a ChIP sample with a control sample, such as input DNA, normalization procedures have to be applied to remove experimental sources of bias.
Identifies peak regions in ChIP-Seq datasets that correspond to sites of transcription factor binding. PeakSeq scores the results of ChIP-Seq experiments by compensating for the mappability map and comparing against a normalized matching control dataset. This method was developed for use with tag sequence data from the Illumina Genome Analyzer platform. It can also be used to identify broader regions of binding that show significant enrichment relative to control.
Assists in ChIP-seq quality control and protocol optimization. CHANCE assesses the strength of immunoprecipitation (IP) enrichment to identify potentially failed experiments. It can detect insufficient sequencing depth, polymerase chain reaction (PCR) amplification bias in library preparation, and batch effects. It also identifies biases in sequence content and quality, as well as cell-type and laboratory-dependent biases in read density.
Uses convolutional neural networks to learn a mapping from suboptimal to high-quality histone ChIP-seq data. Coda uses a high-dimensional discriminative model to encode a generative noise process. The tool transfers information from generative noise processes to a flexible discriminative model that can be used to denoise new data. It has the potential to improve data quality at reduced cost. Coda's performance depends on how similar the noise distributions and underlying data distributions are between the test and training sets.
Permits users to compare ChIP-Seq data sets. MAnorm is an application designed for quantitative comparison of ChIP-Seq data sets that have a substantial number of peak regions in common. This method is useful for both epigenetic modifications and transcription factors. The application can help characterize cell type-specific and cell state-specific regulation during organism development and disease onset.
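The role of the shared peaks can be illustrated with a minimal sketch. It assumes read counts per peak are already available for both samples, computes M (log2 ratio) and A (mean log2 intensity) values, fits a line on the common peaks only, and subtracts the fitted trend from every peak. The function names, the pseudocount and the use of ordinary least squares in place of a robust fit are assumptions made for illustration; this is not MAnorm's own code.

```python
import math


def ma_values(count_a, count_b, pseudo=1.0):
    """Return (M, A): log2 ratio and mean log2 intensity of two read counts."""
    log_a = math.log2(count_a + pseudo)
    log_b = math.log2(count_b + pseudo)
    return log_a - log_b, 0.5 * (log_a + log_b)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all A values are identical; cannot fit a line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalize_m(peaks, common_peaks):
    """Subtract the M-versus-A trend learned on common peaks from all peaks."""
    pairs = [ma_values(a, b) for a, b in common_peaks]
    slope, intercept = fit_line([a_val for _, a_val in pairs],
                                [m_val for m_val, _ in pairs])
    normalized = []
    for count_a, count_b in peaks:
        m_val, a_val = ma_values(count_a, count_b)
        normalized.append(m_val - (slope * a_val + intercept))
    return normalized


# Hypothetical (sample 1, sample 2) read counts per peak.
common = [(20, 10), (40, 20), (80, 40), (160, 80)]
unique = [(200, 5), (6, 90)]
normalized_m = normalize_m(common + unique, common)
```

After this step, normalized M values near zero suggest similar binding in both samples, while large positive or negative values point to sample-specific enrichment.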
Allows comparison and integration of multiple ChIP-seq datasets and extraction of qualitative as well as quantitative information. seqMINER can handle the biological complexity of most experimental situations and proposes methods to the user for data classification according to the analysed features. In addition, through multiple graphical representations, seqMINER allows visualization and modelling of general as well as specific patterns in a given dataset.
Allows correction of nucleotide composition bias, mappability variations and differential local DNA structural effects in deep sequencing data. BEADS is a three-step normalization scheme that estimates and corrects for data set-specific biases. The software can unmask real binding patterns in ChIP-seq data and remove systematic biases present in high-throughput sequencing data. BEADS normalization can assist in detecting genome copy number variations, where bias in the distribution of mapped sequence reads could mask or enlarge differences in copy number.
Allows users to discretize ChIP-seq data. Zerone combines an arbitrary number of ChIP-seq replicates in a single discretized profile, where conflicts are resolved by maximizing the likelihood of the underlying statistical model. This program controls the quality of its output to detect potential anomalies. This tool produces congruent window-based outputs, and it can process hundreds of experiments per day on average hardware.
Converts raw FASTQ files into a gene/isoform expression matrix and a list of differentially expressed genes or isoforms. hppRNA is an all-in-one solution composed of four scenarios: pre-mapping, core-workflow, post-mapping and sequence variation detection. It also supports the identification of fusion genes, single nucleotide polymorphisms (SNPs), long noncoding RNAs and circular RNAs. Finally, this pipeline is specifically designed for systematic analysis of a very large set of samples in one go, and suits researchers who intend to deploy the pipeline on their local servers.
Designed for mRNA, miRNA and circRNA identification and differential expression analysis, applicable to any sequenced organism. miARma-Seq is presented as a stand-alone tool that bundles several well-established software packages behind an easy installation process. It can analyse a large number of samples thanks to its multithreaded design. Throughout the analysis, miARma-Seq performs several quality control analyses and creates quality reports for easy evaluation of the data.
A quantitative method for comparing two biological ChIP-seq samples. The method employs a new global normalization method, nonparametric empirical Bayes (NEB) correction normalization; utilizes pre-defined enriched regions identified by single-sample peak calling programs; uses statistical methods to define differentially enriched regions; and then defines binding (histone modification) pattern information for those differentially enriched regions. QChIPat was tested on a benchmark dataset: the histone modification data used by ChIPDiffs. It was then applied to two case studies: one to identify differential histone modification sites for ChIP-seq of H3K27me3 and H3K9me2 data in AKT1-transfected MCF10A cells; the other to identify differential binding sites for ChIP-seq of TCF7L2 data in MCF7 and PANC1 cells.
An R package for the statistical analysis of ChIP-seq experiments. CSAR calculates single-nucleotide read-enrichment values, taking the average size of DNA fragments subjected to sequencing into account. After normalization, sample and control are compared using either a ratio-based score or a test based on the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutations. Computational efficiency is achieved by implementing the most time-consuming functions in C++ and integrating these in the R package.
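A Poisson-based comparison of sample and control can be sketched as follows. The sketch assumes the control coverage at a position has already been normalized and serves as the expected rate, and it returns the probability of observing at least the sample coverage under that rate. The function name and the example values are hypothetical, and this is not CSAR's implementation (CSAR is an R package with C++ internals).

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), with k a non-negative integer."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term            # accumulate P(X = i)
        term *= lam / (i + 1)  # move on to P(X = i + 1)
    return max(0.0, 1.0 - cdf)


# Hypothetical coverage values at one position after normalization.
sample_coverage = 12
control_coverage = 3.0
p_value = poisson_upper_tail(sample_coverage, control_coverage)
```

Subtracting the cumulative sum from 1 loses precision for very small tail probabilities, so a production implementation would typically work on a log scale. In practice the significance threshold would also come from permutations, as described above, rather than from a fixed cutoff.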
Serves as a two-stage statistical method to normalize ChIP-seq data. ChIPnorm finds differential regions in the genome, given two libraries of histone modifications from different cell types. The ChIPnorm method removes most of the noise and bias in the data and offers a normalization that allows direct comparison of values. It also focuses on the discovery of differentially enriched histone-modification sites.
A user-friendly program for analysis of ChIP-Seq experiments. PAPST allows users to interact with the significant peaks called from ChIP-seq experiments. It supports powerful co-localization analysis of multiple peak sets. Written in pure Java, PAPST facilitates fast and interactive exploratory research of DNA binding by transcription factors and epigenetic modifications.
Estimates the normalization factor between the ChIP and the control samples. NCIS can accommodate both low and high sequencing depth datasets. The software was developed by evaluating available ChIP-seq normalization factor estimators through data-based simulations. The method is data-adaptive and extends CisGenome's estimator by choosing the optimal bin width and the threshold of total read counts in a data-adaptive manner.
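The idea of estimating the factor from background bins can be sketched as below. The sketch assumes reads have already been counted in fixed-width bins and treats every bin whose combined ChIP and control count is at or below a threshold as background. The function name, the fixed threshold and the toy counts are assumptions for illustration; NCIS itself selects the bin width and threshold adaptively, which is not reproduced here.

```python
def background_ratio(chip_counts, control_counts, max_total):
    """Ratio of ChIP to control reads over low-count (presumed background) bins."""
    if len(chip_counts) != len(control_counts):
        raise ValueError("both samples must be binned identically")
    chip_sum = 0
    control_sum = 0
    for chip, control in zip(chip_counts, control_counts):
        # Bins with few reads in total are unlikely to contain true binding sites.
        if chip + control <= max_total:
            chip_sum += chip
            control_sum += control
    if control_sum == 0:
        raise ValueError("no control reads in the selected background bins")
    return chip_sum / control_sum


# Hypothetical per-bin read counts; bins 3 and 5 look enriched in the ChIP sample.
chip = [2, 3, 40, 1, 55, 2]
control = [4, 6, 5, 2, 6, 4]

factor = background_ratio(chip, control, max_total=10)
naive = sum(chip) / sum(control)  # sequencing-depth ratio, inflated by enriched bins
```

In this toy example the background-based factor is 0.5, whereas the plain depth ratio is well above 1 because the enriched bins dominate the ChIP total. Scaling the control by the depth ratio would therefore hide real enrichment.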
Measures next-generation sequencing (NGS)/ChIP-seq experiment quality through global peak alignment comparison. COPAR can extract genomic features based on a spectrum method for in-depth analysis of ChIP-sequencing profiles. It is able to process mapped read files in BED format and output statistically sound results for diverse high-throughput sequencing (HTS) experiments.
A diagnostic tool to examine the appropriateness of the estimated normalization procedure. By plotting the empirical densities of log relative risks in bins of equal read count, along with the estimated normalization constant, after logarithmic transformation, the researcher is able to assess the appropriateness of the estimated normalization constant.
Enables optimal processing of datasets with different enrichment patterns. Epimetheus is a quantile-based multi-profile normalization tool. Users can exclude specific genomic regions, such as repetitive elements or any other genomic locations where artefactual enrichment might be expected. The Epimetheus pipeline involves four main steps: (i) processing of the raw alignment data, (ii) generation of read count intensity (RCI) matrices, (iii) computation of two subsequent levels of normalization (quantile and Z-score) and (iv) generation of outputs and plots.
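The two normalization levels in step (iii) can be sketched on a small read count matrix. The sketch assumes each profile is a list of per-bin counts of equal length, breaks ties by position, and applies the Z-score per profile after quantile normalization. These choices, and all names below, are assumptions for illustration and are not taken from the Epimetheus code.

```python
from statistics import mean, pstdev


def quantile_normalize(profiles):
    """Give every profile the same distribution: the rank-wise mean of all profiles."""
    n = len(profiles[0])
    if any(len(profile) != n for profile in profiles):
        raise ValueError("all profiles must cover the same bins")
    sorted_profiles = [sorted(profile) for profile in profiles]
    # Reference value for each rank = mean of the values holding that rank.
    reference = [mean(values) for values in zip(*sorted_profiles)]
    normalized = []
    for profile in profiles:
        order = sorted(range(n), key=lambda i: profile[i])  # bin indices, low to high
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def zscore(values):
    """Centre values on their mean and scale by the population standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("profile is constant; Z-score is undefined")
    return [(v - mu) / sd for v in values]


# Two hypothetical profiles over four bins, sequenced at very different depths.
profiles = [[5, 2, 3, 10], [50, 10, 40, 90]]
quantile_level = quantile_normalize(profiles)
zscore_level = [zscore(profile) for profile in quantile_level]
```

Both toy profiles rank their bins in the same order, so they become identical after the quantile step even though their raw counts differ roughly tenfold.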
Assists users with ChIP-on-chip data analysis. ChIPMonk is an application that offers both per array and per probe normalization. It also includes a number of different filters allowing the identification of probes of interest. The program comes with a built-in viewer for genome annotations and all array data is visualized in its genomic context. It is supplied with access to a range of genome assemblies for eukaryotic model organisms.
Provides tools for ChIP-Seq spike-in normalization, assessment and analysis. ChIPSeqSpike is an application that includes visual assessment of the data with meta-profiles, heatmaps, boxplots, correlation, and heatscatter plots. It offers a fast way to process the data through different modes and provides publication quality figures through a well-defined set of functions.