Computational protocol: Intronic regulation of Aire expression by Jmjd6 for self-tolerance induction in the thymus

Protocol publication

[…] Total RNA was extracted from WT and Jmjd6−/− FTOC samples with (two samples for each category) or without (one sample for each category) RANKL stimulation. One microgram of total RNA was used for library construction with the TruSeq RNA Sample Prep Kit v2 (Illumina) according to the manufacturer's protocol. Briefly, poly-A-containing mRNAs were purified using poly-T oligo-attached magnetic beads. The purified mRNAs were fragmented using divalent cations under elevated temperature and then converted to dsDNA by two rounds of cDNA synthesis using reverse transcriptase and DNA polymerase I. After end repair, the DNA fragments were ligated with adaptor oligos. The ligated products were amplified by eight cycles of PCR to generate the RNA-seq library. Library integrity was verified with a Bioanalyzer DNA 1000 assay (Agilent Technologies). Sequencing was performed in 101-bp paired-end mode on an Illumina HiSeq (Illumina). A total of 177,060,020 reads were obtained for the six samples. Filtered reads were mapped to the UCSC mm10 genome assembly using the TopHat program (v2.0.10) with the default parameters. The Cufflinks program (v2.1.1) was then used to assemble 22,448 transcripts and to calculate fragments per kilobase of exon per million mapped fragments (FPKM) values, which are a normalized measure of gene expression levels, with the non-default parameters: -u --library-type fr-secondstrand. To identify differentially expressed genes, the ratio of the maximum FPKM to the minimum FPKM across the six samples was calculated for each gene. When the ratio was more than 3, the gene was regarded as being significantly altered in expression level. We added 0.1 to the FPKM values to avoid division by 0. This led us to identify 3,212 genes with differential expression. Among these, the expression levels of 2,536 genes were significantly associated with either RANKL treatment or Jmjd6 expression (P-value<0.05), and these genes were used for further analyses. Analysis of intron retention was performed as follows.
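As an illustration of the differential expression filter described above, the maximum-to-minimum FPKM ratio test can be sketched in Python. This is a minimal sketch, not the authors' code: the function names and the toy FPKM values are hypothetical, and it assumes that the 0.1 pseudocount is added to both the maximum and the minimum FPKM before the ratio is taken, which the text does not state explicitly.

```python
# Minimal sketch of the max/min FPKM ratio filter (hypothetical helper names).
# Assumption: the 0.1 pseudocount is added to every FPKM value before the
# ratio is computed; the source only says 0.1 was added "to avoid division by 0".

PSEUDOCOUNT = 0.1      # value stated in the text
RATIO_THRESHOLD = 3.0  # "more than 3" in the text


def max_min_ratio(fpkms, pseudocount=PSEUDOCOUNT):
    """Return (max FPKM + pseudocount) / (min FPKM + pseudocount)."""
    shifted = [value + pseudocount for value in fpkms]
    return max(shifted) / min(shifted)


def is_altered(fpkms, threshold=RATIO_THRESHOLD, pseudocount=PSEUDOCOUNT):
    """True when the max/min ratio across samples exceeds the threshold."""
    return max_min_ratio(fpkms, pseudocount) > threshold


if __name__ == "__main__":
    # Toy values for six samples; these are made up for illustration only.
    toy_expression = {
        "geneA": [0.0, 0.0, 1.2, 0.5, 0.3, 0.0],
        "geneB": [10.0, 12.0, 11.0, 9.5, 10.5, 13.0],
    }
    for gene, fpkms in toy_expression.items():
        print(gene, round(max_min_ratio(fpkms), 2), is_altered(fpkms))
```

The pseudocount also dampens ratios for genes with very low FPKM in all samples, so the threshold of 3 is effectively stricter for weakly expressed genes than for highly expressed ones.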
According to the current gene annotation (‘known genes’ in UCSC mm10), there are 188,208 introns in total. As intron retention events should be observed in genes with relatively high expression, we focused only on genes with an FPKM value of more than 10 in at least one of the six samples. As a result, we obtained 84,708 introns. The reads mapped to these intronic regions were counted by the intersectBed program in the BEDTools utilities (v2.17.0) with the -c option, and the counts were converted into FPKM values for each intron (intronic FPKM). There were 1,051 introns with an intronic FPKM of more than 10 in at least one of the six samples, and the degree of intron retention (IR value) was calculated by dividing the intronic FPKM value by the conventional FPKM value of the corresponding gene. By requiring the ratio of the IR value of the Jmjd6−/− sample to that of the WT sample to be more than 1.5, we finally selected 57 introns that are preferentially expressed in Jmjd6−/− samples under RANKL stimulation. The 3′ splice site score was calculated with the ‘Splice-Site analyser tool’ (http://ibis.tau.ac.il/ssat/SpliceSiteFrame.htm). Briefly, the score expresses the extent to which the splice-site sequence matches the following consensus sequence: TTTTTTTTTTTCAG/G (‘/’ indicates the intron/exon junction). […]
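The intron retention calculation above can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the conversion of read counts to intronic FPKM is assumed to follow the standard FPKM formula (count per kilobase of intron per million mapped fragments), and the handling of a zero WT IR value is an assumption, since the text does not describe either detail.

```python
# Minimal sketch of the IR value calculation (hypothetical helper names).

IR_RATIO_THRESHOLD = 1.5  # "more than 1.5" in the text


def intronic_fpkm(read_count, intron_length_bp, total_mapped_fragments):
    """Convert an intron read count to FPKM.

    Assumption: standard FPKM scaling, i.e. count divided by intron length
    in kilobases and by total mapped fragments in millions.
    """
    return read_count * 1e9 / (intron_length_bp * total_mapped_fragments)


def ir_value(intron_fpkm, gene_fpkm):
    """IR value = intronic FPKM / conventional (gene-level) FPKM."""
    return intron_fpkm / gene_fpkm


def preferentially_retained(ir_ko, ir_wt, threshold=IR_RATIO_THRESHOLD):
    """True when IR(Jmjd6-/-) / IR(WT) exceeds the threshold.

    Assumption: if the WT IR value is 0, any non-zero knockout IR value
    passes the filter; the source does not say how this case was handled.
    """
    if ir_wt == 0:
        return ir_ko > 0
    return ir_ko / ir_wt > threshold


if __name__ == "__main__":
    # Toy numbers for one intron; these are made up for illustration only.
    ko_intron = intronic_fpkm(500, 1000, 25_000_000)
    wt_intron = intronic_fpkm(150, 1000, 30_000_000)
    ir_ko = ir_value(ko_intron, gene_fpkm=40.0)
    ir_wt = ir_value(wt_intron, gene_fpkm=50.0)
    print(ir_ko, ir_wt, preferentially_retained(ir_ko, ir_wt))
```

Dividing by the gene-level FPKM normalizes intronic signal to overall expression of the host gene, so an intron is selected only when its retention rises relative to the gene rather than merely because the gene itself is more highly expressed in the knockout.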

Pipeline specifications

Software tools: TopHat, Cufflinks, BEDTools
Application: RNA-seq analysis
Organisms: Mus musculus
Diseases: Deficiency Diseases