Methylation scoring software tools | Bisulfite sequencing data analysis
High-throughput bisulfite sequencing is widely used to measure cytosine methylation at single-base resolution in eukaryotes. It permits systems-level analysis of genomic methylation patterns associated with gene expression and chromatin structure.
Provides a background correction method that uses a mixture of exponential and truncated normal distributions to flexibly model signal intensity and a truncated normal distribution to model background noise. Depending on data availability, one of three approaches is used to estimate the background normal distribution parameters: (i) internal chip negative controls, (ii) out-of-band Infinium I probe intensities or (iii) combined methylated and unmethylated intensities. Evaluations on both duplicate and experimental standard samples showed that ENmix outperformed commonly used background subtraction methods, improving replicability and accuracy and reducing probe design bias. After ENmix background correction, the resulting data can be used with other commonly used preprocessing methods, including quantile normalization for between-sample normalization and BMIQ for further correction of probe-design bias.
Maps bisulfite-treated short reads. BS Seeker is a bisulfite sequencing (BS) alignment tool that performs genome indexing, read alignment and methylation-level calling for each cytosine. The software was improved by supporting multiple short-read aligners, gapped mapping and local alignment, and by building special indexes for handling reduced representation bisulfite sequencing (RRBS) data.
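The per-cytosine methylation-level calling mentioned above can be illustrated with a minimal sketch. It assumes the usual definition of the level as the fraction of covering reads that still show a C after bisulfite treatment; the function name and the example counts are hypothetical and are not taken from BS Seeker.

```python
from typing import Dict, Optional, Tuple


def methylation_level(c_reads: int, t_reads: int) -> Optional[float]:
    """Return C / (C + T) for one cytosine, or None when there is no coverage."""
    total = c_reads + t_reads
    if total == 0:
        return None  # level is undefined without any covering read
    return c_reads / total


# Hypothetical counts: (chromosome, position) -> (reads showing C, reads showing T)
counts: Dict[Tuple[str, int], Tuple[int, int]] = {
    ("chr1", 1001): (8, 2),
    ("chr1", 1010): (0, 5),
    ("chr1", 1042): (0, 0),
}

levels = {site: methylation_level(c, t) for site, (c, t) in counts.items()}
```

Returning `None` for uncovered sites keeps them distinguishable from sites that are covered but fully unmethylated.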
An R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. methylKit is designed to deal with sequencing data from RRBS and its variants, as well as target-capture methods such as Agilent SureSelect methyl-seq. In addition, methylKit can deal with base-pair resolution data for 5hmC obtained from Tab-seq or oxBS-seq. It can also handle whole-genome bisulfite sequencing data if the proper input format is provided.
A software package to map and determine the methylation state of BS-Seq reads. Bismark is easy to use, very flexible and is the first published BS-Seq aligner to seamlessly handle single- and paired-end mapping of both directional and non-directional bisulfite libraries. The output of Bismark is easy to interpret and is intended to be analysed directly by the researcher performing the experiment.
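As a rough illustration of how bisulfite reads can be mapped and then scored, the sketch below shows the general in-silico conversion idea used by three-letter bisulfite aligners: convert every C to T in both read and reference so that conversion does not count as a mismatch, then compare the original sequences to call each cytosine. This is a simplified toy for an ungapped forward-strand alignment, with hypothetical function names; it is not Bismark's code.

```python
from typing import List, Tuple


def c_to_t(seq: str) -> str:
    """In-silico conversion used for reads from the forward strand."""
    return seq.upper().replace("C", "T")


def g_to_a(seq: str) -> str:
    """In-silico conversion used for reads from the reverse strand."""
    return seq.upper().replace("G", "A")


def call_cytosines(read: str, ref: str) -> List[Tuple[int, str]]:
    """Compare an aligned read with the unconverted reference, position by position."""
    calls = []
    for i, (r, g) in enumerate(zip(read.upper(), ref.upper())):
        if g != "C":
            continue  # only reference cytosines are informative here
        if r == "C":
            calls.append((i, "methylated"))    # C was protected from conversion
        elif r == "T":
            calls.append((i, "unmethylated"))  # C was converted to T
    return calls


reference = "ACGTCCGA"
read = "ACGTTTGA"  # hypothetical bisulfite-treated read

# After conversion the read matches the reference despite the C-to-T changes.
matches = c_to_t(read) == c_to_t(reference)
calls = call_cytosines(read, reference)
```

Non-directional libraries additionally require trying the G-to-A converted versions, which is why `g_to_a` is included.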
An open-source software tool capable of analysing whole-genome bisulfite sequencing data with either a gene-centric or gene-independent focus. GBSA’s output can be easily integrated with other high-throughput sequencing data, such as RNA-Seq or ChIP-seq, to elucidate the role of methylated intergenic regions in gene regulation. In essence, GBSA allows an investigator to explore not only known loci but also all the genomic regions, for which methylation studies could lead to the discovery of new regulatory mechanisms.
Provides a standard for bisulfite sequencing data related manipulation. CGmapTools is a command-line bisulfite sequencing analysis toolkit with enhanced features on single-nucleotide variant (SNV) calling and allele specific methylations and visualizations. It includes modules for better data storage, extraction, visualization and improved performance in single-nucleotide polymorphism (SNP) calling.
Aligns Bi-Seq reads obtained either from SOLiD or Illumina. An accompanying methylation-caller program creates a genomic view of methylated and unmethylated Cs on both DNA strands. The output of PASS-bis is a SAM file that besides the standard mapping information includes an additional field indicating on which of the four possible bisulfite-modified strands the alignment occurred.
A package based on the Genome Analysis Toolkit (GATK) map-reduce framework for genotyping and accurate DNA methylation calling in bisulfite-treated massively parallel sequencing data. At an average 30× genomic coverage, Bis-SNP correctly identified 96% of SNPs using the default high-stringency settings.
An efficient and accurate general-purpose bisulfite sequence mapping program. BatMeth can be deployed for the analysis of genome-wide bisulfite sequencing using either base reads or color reads. It allows asymmetric bisulfite conversion to be detected by labeling the corresponding reference genome with the hit. It integrates four novel components (mismatch counting, list filtering, mismatch stage filtering and fast mapping onto two indexes) to improve unique mapping rate, speed and precision. Experimental results show that BatMeth is faster and more accurate than existing tools.
A generative model to quantify DNA methylation modifications from any combination of bisulfite sequencing approaches, including reduced, oxidative, TET-assisted, chemical-modification assisted, and methylase-assisted bisulfite sequencing data. Lux models all cytosine modifications (C, 5mC, 5hmC, 5fC, and 5caC) simultaneously together with experimental parameters, including bisulfite conversion and oxidation efficiencies, as well as various chemical labeling and protection steps. Lux improves the quantification and comparison of cytosine modification levels and can process any oxidized methylcytosine sequencing data set to quantify all cytosine modifications.
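For contrast with a full generative model, the sketch below shows the naive subtraction estimate that is often applied to paired BS-seq and oxBS-seq data. It assumes that BS-seq reads both 5mC and 5hmC as C while oxBS-seq reads only 5mC as C, and it ignores conversion and oxidation efficiencies, which are exactly the experimental parameters a model such as Lux accounts for. The function name and input values are hypothetical.

```python
from typing import Dict


def naive_modification_levels(bs_level: float, oxbs_level: float) -> Dict[str, float]:
    """Estimate 5mC and 5hmC by subtraction, assuming perfect chemistry.

    bs_level:   fraction of C calls in BS-seq   (5mC + 5hmC)
    oxbs_level: fraction of C calls in oxBS-seq (5mC only)
    """
    five_mc = oxbs_level
    # Sampling noise can make the difference negative, so clip at zero.
    five_hmc = max(bs_level - oxbs_level, 0.0)
    # Everything read as T in BS-seq: unmodified C plus any 5fC / 5caC.
    other = 1.0 - bs_level
    return {"5mC": five_mc, "5hmC": five_hmc, "C_or_other": other}


estimate = naive_modification_levels(bs_level=0.75, oxbs_level=0.5)
noisy = naive_modification_levels(bs_level=0.25, oxbs_level=0.5)
```

The clipping in the second call shows the weakness of subtraction: noisy inputs give estimates that no longer sum to one, which motivates modelling all modifications jointly.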
A comprehensive tool for identification and analysis of the methylation patterns of genomic regions from bisulfite sequencing data. CpG_MPs first normalizes bisulfite sequencing reads into methylation levels of CpGs. It then identifies unmethylated and methylated regions from the methylation status of neighboring CpGs using a hotspot extension algorithm, without requiring pre-defined regions. Finally, conservatively and differentially methylated regions across paired or multiple samples (cells or tissues) are identified by combining a combinatorial algorithm with Shannon entropy.
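The role Shannon entropy can play in separating conserved from differential regions can be sketched as follows. The example normalizes one region's methylation levels across samples into a probability distribution and computes its entropy: near-uniform levels give high entropy, while methylation concentrated in one sample gives low entropy. This is a generic illustration under that assumption, not the exact formulation used by CpG_MPs.

```python
import math
from typing import Sequence


def shannon_entropy(levels: Sequence[float]) -> float:
    """Entropy (in bits) of one region's methylation levels across samples."""
    if not levels:
        raise ValueError("at least one sample is required")
    total = sum(levels)
    if total == 0:
        # All samples unmethylated: treat as perfectly uniform.
        return math.log2(len(levels))
    probs = [x / total for x in levels]
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical region-level methylation in four samples.
conserved = shannon_entropy([0.5, 0.5, 0.5, 0.5])     # same level everywhere
differential = shannon_entropy([0.9, 0.0, 0.0, 0.0])  # methylated in one sample only
```

With four samples the entropy ranges from 0 bits (sample-specific) to 2 bits (uniform), so a cutoff on this scale could flag candidate differential regions.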
A computational tool to accurately identify footprints from bisulfite-sequencing data. MethylSeekR incorporates several methodological improvements and extensions that make it robust and generally applicable. The method is based on a cutoff approach that identifies hypomethylated regions as stretches of consecutive CpGs with methylation levels below a fixed threshold. To achieve high accuracy and sensitivity, MethylSeekR incorporates important preprocessing and filtering steps, and controls segmentation parameters via false discovery rate (FDR) calculations. MethylSeekR is an easy-to-use package that describes in detail each step of the analysis and produces several control plots to facilitate the interpretation of the results and to avoid potential pitfalls in the analysis.
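The cutoff approach described above, hypomethylated regions as stretches of consecutive CpGs below a fixed threshold, can be sketched in a few lines. The threshold and minimum stretch length used here are placeholder assumptions (MethylSeekR tunes segmentation parameters through FDR calculations, which this sketch omits), and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def hypomethylated_segments(
    levels: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, for illustration only
    min_cpgs: int = 3,       # assumed minimum stretch length
) -> List[Tuple[int, int]]:
    """Return (first, last) CpG indices of runs whose levels are all below threshold."""
    segments = []
    start = None
    for i, level in enumerate(levels):
        if level < threshold:
            if start is None:
                start = i  # a new run begins
        else:
            if start is not None and i - start >= min_cpgs:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the last CpG.
    if start is not None and len(levels) - start >= min_cpgs:
        segments.append((start, len(levels) - 1))
    return segments


# Hypothetical methylation levels of consecutive CpGs along one chromosome.
cpg_levels = [0.9, 0.1, 0.2, 0.05, 0.8, 0.3, 0.9, 0.1, 0.0, 0.2, 0.1]
segments = hypomethylated_segments(cpg_levels)
```

The single low CpG at index 5 is discarded because it is shorter than `min_cpgs`, which is the kind of filtering that keeps isolated noisy sites from being reported as regions.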
Provides tools for analysing single-nucleotide resolution methylation data. COHCAP is a pipeline that covers most user needs for differential methylation and integration with gene expression data. The software includes quality control metrics, identification of differentially methylated CpG sites and CpG islands, and visualization of methylation data. It contains two different methods of CpG island analysis. COHCAP has been shown to scale to high-quality integrative analysis of cell line data as well as large, heterogeneous patient samples.
Analyses small RNA sequencing data from multiple biological sources, taking into account replicate information, to identify robust sets of siRNA precursors. The segmentSeq R package has been extended to identify methylation loci from high-throughput sequencing data from multiple conditions. A statistical model is then developed that accounts for biological replication and variable rates of non-conversion of cytosines in each sample to compute posterior likelihoods of methylation at each locus within an empirical Bayesian framework.
Generates per-base methylation data given a set of bisulfite-treated reads. MethylCoder provides the option to use either of two existing short-read aligners, each with different strengths. It accounts for soft-masked alignments and overlapping paired-end reads. MethylCoder outputs data in text and binary formats in addition to the final alignment in SAM format, so that common high-throughput sequencing tools can be used on the resulting output. It is more flexible than existing software tools and competitive in terms of speed and memory use.
A fragment-based approach for investigating DNA methylation patterns for reduced representation bisulphite sequencing data. DMAP can directly import the output from any bisulphite aligner in sequence alignment/map (SAM) format and identify differential methylation.
Integrates read quality assessment/clean-up, alignment, methylation data extraction, annotation, reporting and visualization. SAAP-RRBS facilitates a rapid transition from sequencing reads to a fully annotated CpG methylation report to biological interpretation.