Computational protocol: Charting a dynamic DNA methylation landscape of the human genome

Similar protocols

Protocol publication

[…] In house generated WGBS libraries were aligned using MAQ in bisulfite mode to the hg19/GRCh37 reference assembly. Subsequently, CpG methylation calls were made using custom software, excluding duplicate, low quality reads as well as reads with more than 10% mismatches. Methylation ratios of individual CpGs were modeled using a beta-binomial model estimating parameters from the number of methylated and total reads overlapping a given CpG, incorporating replicates. Only CpGs covered by ≥5× reads were considered for further analysis. Differential methylation values of individual CpGs were estimated using the beta-difference distributions. CpG cluster differential methylation was determined by pooling CpG level methylation differences using a random effects model. CpG cluster methylation specificity was determined using the Jensen-Shannon divergence of a CpG cluster’s methylation level distribution across all samples and a reference distribution representing either of the two extremes: completely unmethylated or fully methylated. In silico identified CpG islands were defined by genomic regions of at least 700bp length, an CpG observed vs. expected ratio of greater than 0.6 and a GC content greater or equal than 0.5. For the SNP analysis, we obtained the CEPH SNP set from USCS. GWAS SNPs were retrieved from the GWAS catalog, while most of the GWAS SNP grouping was taken from Maurano et al.. For TFBS analysis, we retrieved peak files from the ENCODE projects and collapsed replicates. Motif analysis was carried out using FIMO. […]

Pipeline specifications

Software tools MAQ, MEME Suite
Databases GWAS Catalog
Application BS-seq analysis
Organisms Homo sapiens, Caenorhabditis elegans