Computational protocol: Misregulation of Alternative Splicing in a Mouse Model of Rett Syndrome

Protocol publication

[…] Total RNA was extracted from cortices of 6-week-old WT and Mecp2 KO mice using the Qiagen RNeasy Mini Plus kit. Genomic DNA was removed with a gDNA Eliminator column. 150 ng of total RNA was used to prepare each sequencing library according to the manufacturer's instructions (Nugen Encore Complete). Each library was sequenced on one lane of an Illumina HiSeq 2000 (100 bp, single-end). Reads were mapped to the mouse genome (mm9) using TopHat (2.0.8). Read counts for each gene were calculated using the htseq-count function in the HTSeq package. Differential gene expression analysis was done using edgeR in R. Splicing analysis was performed using the Mixture of Isoforms pipeline (MISO 0.4.7). Given the high similarity of the two replicates for each genotype (correlation = 0.98 for each), reads from the two replicates were combined for each genotype and processed with MISO. A stringent filter (total reads for the event ≥ 1000, reads supporting inclusion or exclusion isoform ≥ 50, total reads supporting inclusion and exclusion isoform ≥ 100, |ΔPSI| ≥ 0.20 and Bayes factor ≥ 20) was used to generate a list of differential splicing events. Read density plots were generated using the sashimi plot tool built into MISO. RNA-Seq data from Chen et al[] and Gabel et al[] were processed as above for splicing analysis. A less stringent filter (total reads for the event ≥ 20, |ΔPSI| ≥ 0.05 and Bayes factor ≥ 1) was applied to these data to yield more events for the subsequent overlap analysis. [...]
Gene Ontology (GO) analysis was done using DAVID[]. Briefly, official gene symbols were submitted to DAVID. We used our own RNA-seq data and applied a cutoff of RPKM ≥ 0.5 to generate a list of genes expressed in the mouse cortex (13846 genes). This set of expressed genes was used as the background for all GO analyses in this manuscript. Terms with a Benjamini-adjusted P-value ≤ 0.05 were considered significant. [...]
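The two splicing filters described above can be written down as a small threshold check. The following is a minimal sketch and not the authors' code: the `SplicingEvent` fields, the `FilterThresholds` class and the function names are hypothetical, and the sketch assumes the read counts have already been extracted from the MISO comparison output and aggregated in whatever way the analysis requires (the text does not specify this). It also assumes that "reads supporting inclusion or exclusion isoform ≥ 50" means at least one of the two counts reaches 50; under the alternative reading (both counts ≥ 50) the separate sum threshold of 100 would be redundant.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SplicingEvent:
    """Hypothetical per-event summary taken from a MISO comparison."""
    name: str
    total_reads: int
    inclusion_reads: int
    exclusion_reads: int
    delta_psi: float
    bayes_factor: float


@dataclass(frozen=True)
class FilterThresholds:
    min_total_reads: int
    min_inc_or_exc: int      # assumed: at least one of inclusion/exclusion
    min_inc_plus_exc: int
    min_abs_delta_psi: float
    min_bayes_factor: float


# Thresholds as stated in the text. The relaxed filter lists no
# inclusion/exclusion cutoffs, so those are set to 0 here.
STRINGENT = FilterThresholds(1000, 50, 100, 0.20, 20.0)
RELAXED = FilterThresholds(20, 0, 0, 0.05, 1.0)


def passes(event: SplicingEvent, t: FilterThresholds) -> bool:
    """Return True if the event meets every threshold in t."""
    return (
        event.total_reads >= t.min_total_reads
        and max(event.inclusion_reads, event.exclusion_reads) >= t.min_inc_or_exc
        and event.inclusion_reads + event.exclusion_reads >= t.min_inc_plus_exc
        and abs(event.delta_psi) >= t.min_abs_delta_psi
        and event.bayes_factor >= t.min_bayes_factor
    )


def filter_events(events: Iterable[SplicingEvent],
                  t: FilterThresholds) -> List[SplicingEvent]:
    """Keep only the events that pass the given filter."""
    return [e for e in events if passes(e, t)]
```

Calling `filter_events(events, STRINGENT)` on the in-house data and `filter_events(events, RELAXED)` on the reprocessed public data would mirror the two-tier strategy described in the text.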
ChIP-Seq data were generated from two biological replicates (referred to as WT1 and WT2). Raw data were aligned to the mouse genome (mm9) with Bowtie (0.12.7). After excluding non-mapping reads, we had 72,221,924 reads for WT1 ChIP and 31,333,769 for its input, and 84,871,157 reads for WT2 ChIP and 22,412,408 for its input. We first evaluated the quality of these data against ENCODE's ChIP-seq quality control metrics[]. The Normalized Strand Cross Correlation (NSC) for WT1 ChIP and WT2 ChIP was 1.3 and 1.4, respectively. Another quality control measure is the PCR Bottleneck Coefficient (PBC), which estimates the complexity of the ChIP-seq library[]. PBC < 0.5 indicates that PCR bottlenecks are present in the sequenced libraries. The PBC ranged within [0.63, 0.83] for the WT1 ChIP sample and [0.85, 0.94] for the WT1 input sample. Similarly, the PBC ranged within [0.63, 0.83] for the WT2 ChIP sample and was 0.93 for the WT2 input sample. These numbers suggest that our libraries were of good quality.
We carried out peak calling using the MOSAiCS package in R[] with default parameters, except for fdrRelaxed = 0.1 for WT1 and WT2 and fdrRelaxed = 0.2 for the pooled replicates. Bin and fragment sizes were set to 200 bp for all runs. We followed a conservative strategy, obtaining peaks for the individual replicates at a false discovery rate of 0.1 and for the pooled sample at 0.2. We then identified the peaks in the intersection of the three peak lists and filtered them with the MOSAiCS parameters: logMinP >= -log10(0.05) & peakSize >= 150 & aveLog2Ratio >= log2(1.5). This resulted in a total of 20,652 peaks with a median size of 1731 bp. We performed location analysis using mm9 RefSeq genes and the nomenclature in Blahnik et al[]. […]
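Two of the numerical steps above lend themselves to a short illustration: the PBC and the final peak filter. The sketch below is an assumption-laden illustration rather than the authors' pipeline. It uses the ENCODE definition of PBC (genomic positions covered by exactly one uniquely mapped read, divided by positions covered by at least one), and it assumes the MOSAiCS peak attributes logMinP, peakSize and aveLog2Ratio have already been exported from R into plain records. The `Peak` class and the function names are hypothetical, and the interval intersection of the three peak lists is not shown.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Hashable, Iterable, List


def pcr_bottleneck_coefficient(read_positions: Iterable[Hashable]) -> float:
    """PBC = N1 / Nd for uniquely mapped reads.

    Each element identifies where one read maps, for example a
    (chromosome, start, strand) tuple.
    """
    counts = Counter(read_positions)
    if not counts:
        raise ValueError("no mapped reads supplied")
    n1 = sum(1 for c in counts.values() if c == 1)  # positions with exactly one read
    nd = len(counts)                                # positions with at least one read
    return n1 / nd


@dataclass(frozen=True)
class Peak:
    """Hypothetical record holding MOSAiCS peak attributes exported from R."""
    chrom: str
    start: int
    end: int
    log_min_p: float       # MOSAiCS logMinP
    peak_size: int         # MOSAiCS peakSize, in bp
    ave_log2_ratio: float  # MOSAiCS aveLog2Ratio


# Thresholds as stated in the text.
MIN_LOG_MIN_P = -math.log10(0.05)
MIN_PEAK_SIZE = 150
MIN_AVE_LOG2_RATIO = math.log2(1.5)


def keep_peak(p: Peak) -> bool:
    """Apply the three post-intersection filters from the text."""
    return (
        p.log_min_p >= MIN_LOG_MIN_P
        and p.peak_size >= MIN_PEAK_SIZE
        and p.ave_log2_ratio >= MIN_AVE_LOG2_RATIO
    )


def filter_peaks(peaks: Iterable[Peak]) -> List[Peak]:
    return [p for p in peaks if keep_peak(p)]
```

A PBC below 0.5 from `pcr_bottleneck_coefficient` would flag a library as bottlenecked under the criterion quoted above.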

Pipeline specifications

Software tools: TopHat, HTSeq, edgeR, MISO, DAVID
Application: RNA-seq analysis
Organisms: Mus musculus, Homo sapiens
Diseases: Rett Syndrome