Computational protocol: High Resolution Methylome Map of Rat Indicates Role of Intragenic DNA Methylation in Identification of Coding Region

Protocol publication

[…] We downloaded the rat genome sequence and mapping information (rn4) from the University of California Santa Cruz Genome Bioinformatics Site (http://genome.ucsc.edu). The reads were mapped onto the rat genome reference sequence using the high-performance alignment software ‘maq’ version 0.7.1 (http://maq.sf.net), and reads with a maq mapping quality below 10 were excluded from further analysis. We used MACS (version 1.4.0 beta) for peak detection and analysis of the immunoprecipitated single-end sequencing data, to find genomic regions enriched in the pool of specifically precipitated DNA fragments.
The Browser Extensible Data (BED) files of the human brain MeDIP-seq data were downloaded from SRA012488. These BED files were then merged and analyzed with MACS to generate peak summit coordinates. The summit files were used for downstream analysis. The data for the analysis of alternative splicing events were downloaded from the EBI ASTD database version 1.1 (http://www.ebi.ac.uk/astd/main.html).
The IPI IDs of the identified liver proteins were searched in the Ensembl genome browser to obtain their gene IDs, and then in UCSC Genome Bioinformatics to obtain gene coordinates. Of the 524 proteins, we obtained gene IDs for 494, and further analysis was done using these proteins.
To analyze the methylation pattern of highly versus lowly expressed genes, we downloaded microarray gene expression data for control rat liver from Gene Expression Omnibus (GSE19830). Data analysis was done using the Bioconductor package affy in the R programming language. The normalized intensities of the three replicates were averaged and converted to log base 2, and then statistically highly and lowly expressed genes (mean ± standard deviation) were used to check the methylation pattern across their TSS in a 100 kb sliding window.
The RefSeq gene, repeat element and CGI coordinates for human and rat were downloaded from UCSC Genome Bioinformatics.
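The split into highly and lowly expressed genes described above can be sketched as follows. This is a minimal illustration, not the authors' R/affy code. It assumes that "mean ± standard deviation" means genes whose log2 average intensity lies above mean + SD are called high and those below mean − SD are called low, and that the sample standard deviation is used. The function name classify_expression and the input layout are hypothetical.

```python
import math
from statistics import mean, stdev


def classify_expression(replicates):
    """Split genes into highly and lowly expressed sets.

    replicates: dict mapping a gene ID to its normalized intensities,
    one value per replicate (hypothetical input layout).
    """
    # Average the replicates first, then convert the average to log2.
    log_means = {gene: math.log2(mean(values))
                 for gene, values in replicates.items()}

    values = list(log_means.values())
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)

    # Assumed thresholds: above mean + SD is "high", below mean - SD is "low".
    high = sorted(g for g, x in log_means.items() if x > mu + sd)
    low = sorted(g for g, x in log_means.items() if x < mu - sd)
    return high, low
```

Genes falling between the two thresholds are left unclassified, so the two groups being compared are well separated in expression.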
The CGIs in our study meet three basic criteria: a) length greater than 200 bp, b) GC content >50% and c) CpG Observed/Expected ratio >0.6. The methylation status of the CpG islands was determined by mapping the methylation peak summits (from MeDIP-Seq data) onto the CpG islands. Islands containing methylation peak summits were designated methylated islands, while the rest were termed unmethylated.
To describe the methylation of any event, we use the term “methylation density”. In all bar plots, it is the ratio of the methylation peak summit count in a given region to the length of that region in base pairs. In line plots, methylation density refers to the ratio of the methylation peak count to the number of data points.
[...] The .raw spectral files containing MS and MS/MS data were submitted to Proteome Discoverer 6.0 (Thermo Scientific, San Jose, CA) and searched using the Sequest algorithm against the IPI rat database (IPI.rat.v3.67.). The search was performed against IPI database V3.74 with a precursor ion mass tolerance of 10 ppm and a fragment ion mass tolerance of 0.8 Da, allowing up to 2 missed tryptic cleavages. Oxidation of methionine was set as a dynamic modification, while carbamidomethylation of cysteine was set as a static modification. To control false discoveries, the spectra were searched against a decoy database with FDR thresholds of 1% (target) and 5% (relaxed). The results of all five fractions were combined into a multi-consensus report. […]
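The three CpG island criteria and the bar-plot definition of methylation density given above can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline (they downloaded CGI coordinates from UCSC). The Observed/Expected ratio is computed with the commonly used Gardiner-Garden and Frommer formula, (number of CpG × length) / (number of C × number of G), which the text does not spell out. The function names and the 0-based, half-open coordinate convention (as in BED files) are likewise assumptions.

```python
def is_cpg_island(seq):
    """Check a DNA sequence against the three CGI criteria."""
    seq = seq.upper()
    n = len(seq)
    if n <= 200:  # criterion a: length greater than 200 bp
        return False

    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # observed CpG dinucleotides

    gc_content = (c + g) / n
    expected = c * g / n  # expected CpG count (assumed formula)
    if expected == 0:
        return False

    # criterion b: GC content > 50%; criterion c: Observed/Expected > 0.6
    return gc_content > 0.5 and cpg / expected > 0.6


def methylation_density(summits, start, end):
    """Peak summit count in [start, end) divided by region length in bp.

    summits: iterable of summit positions on the same chromosome.
    """
    count = sum(1 for s in summits if start <= s < end)
    return count / (end - start)


def is_methylated(summits, start, end):
    """A region is called methylated if at least one summit falls in it."""
    return any(start <= s < end for s in summits)
```

For example, two summits inside a 100 bp region give a density of 2 / 100 = 0.02 summits per base pair.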

Pipeline specifications

Software tools: Proteome Discoverer, Comet
Application: MS-based untargeted proteomics
Organisms: Rattus norvegicus, Homo sapiens