Computational protocol: Targeted RNA-Seq profiling of splicing pattern in the DMD gene: exons are mostly constitutively spliced in human skeletal muscle

Protocol publication

[…] Raw reads were edited and filtered prior to analysis. A dedicated analysis pipeline was developed using the Galaxy framework (http://galaxyproject.org). First, relevant adapter sequences were removed with Cutadapt (v.1.3, default parameters), and quality-based trimming at the 3′ ends of reads was performed using the Qtrim tool (v.1.1, parameters: mean quality = 25, window size = 20, minimum read length = 40 nt). Cleaned reads were then mapped to the human X chromosome reference sequence (hg19, UCSC) with STAR (v.2.3, annotations from UCSC, parameters: --sjdbOverhang 29 --outFilterMismatchNoverLmax 0.05 --outSJfilterReads Unique --outFilterMultimapNmax 1). An in-house script, available upon request, was used to annotate the identified splice junctions. The output data were processed to obtain DMD mRNA coverage, as well as a list of identified splice junctions and their counts. Only new junctions covered by a minimum of 5 reads in at least two of the four biological replicates were considered for further analysis. The Percent-Spliced-In (PSI) was calculated for each splicing event using intron-centric metrics with the SJPIPE pipeline from the ipsa package (parameters: margin = 0, deltaSS = 0, mincount = 0; https://github.com/pervouchine/ipsa). For the comparison of the DMD-targeted data with publicly available total mRNA sequencing data from the Illumina Human Body Map project 2.0 (skeletal muscle tissue, 2 × 50 bp, GEO sample GSM759515), raw reads were mapped to the human X chromosome sequence with Bowtie2 (v2.default parameters), and exon junction detection and counting were performed with TopHat2 (v2.0.2, Gene Model Annotations option: chrX_GTF downloaded from UCSC).
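The replicate filter and the intron-centric PSI described above can be illustrated with a minimal Python sketch. It is not the authors' in-house script or the ipsa code: the function names and the (donor, acceptor) junction representation are hypothetical, and the ψ5/ψ3 formulas follow the usual intron-centric definition (junction reads divided by all reads sharing the same donor, or the same acceptor), which may differ in detail from what SJPIPE computes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical representation: a junction is a (donor, acceptor) coordinate pair.
Junction = Tuple[int, int]


def filter_junctions(
    replicate_counts: List[Dict[Junction, int]],
    min_reads: int = 5,
    min_replicates: int = 2,
) -> List[Junction]:
    """Keep junctions with >= min_reads reads in >= min_replicates replicates."""
    support: Dict[Junction, int] = defaultdict(int)
    for counts in replicate_counts:
        for junction, n in counts.items():
            if n >= min_reads:
                support[junction] += 1
    return sorted(j for j, k in support.items() if k >= min_replicates)


def psi5_psi3(counts: Dict[Junction, int]) -> Dict[Junction, Tuple[float, float]]:
    """Assumed intron-centric PSI: (reads / donor total, reads / acceptor total)."""
    donor_total: Dict[int, int] = defaultdict(int)
    acceptor_total: Dict[int, int] = defaultdict(int)
    for (donor, acceptor), n in counts.items():
        donor_total[donor] += n
        acceptor_total[acceptor] += n
    result: Dict[Junction, Tuple[float, float]] = {}
    for (donor, acceptor), n in counts.items():
        if n == 0:
            continue  # unsupported junctions are skipped to avoid dividing by zero
        result[(donor, acceptor)] = (n / donor_total[donor], n / acceptor_total[acceptor])
    return result


# Toy counts for four replicates (illustrative values only).
replicates = [
    {(100, 200): 7, (100, 300): 2},
    {(100, 200): 5, (100, 300): 6},
    {(100, 200): 1},
    {(100, 300): 3},
]
kept = filter_junctions(replicates)

# Toy pooled counts for the PSI calculation (illustrative values only).
psi = psi5_psi3({(100, 200): 30, (100, 300): 10, (150, 300): 10})
```

In the toy data, junction (100, 200) reaches 5 reads in two replicates and is kept, whereas (100, 300) does so in only one and is dropped. The thresholds are exposed as arguments so that the same function could be reused with other replicate designs.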
Two different splice site prediction algorithms were used for computational scoring of 5′ and 3′ splice sites: the Human Splicing Finder tool (http://www.umd.be/HSF3/), which uses a weight matrix model, and MaxEntScan (MES) (http://genes.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq_acc.html), which is based on the maximum entropy principle. […]
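As a rough illustration of weight-matrix scoring, the sketch below sums per-position weights for a candidate site and rescales the result to a 0 to 100 range. The matrix values are toy numbers invented for the example, not the Human Splicing Finder matrices, and the min-max rescaling is an assumption about how such scores are commonly normalized rather than a description of HSF's exact procedure. MaxEntScan is not covered here because it relies on its own trained models.

```python
from typing import Dict, List

# Hypothetical toy weights for a 3-position motif; NOT the HSF matrices.
TOY_MATRIX: List[Dict[str, float]] = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.6, "C": 0.1, "G": 0.2, "T": 0.1},
]


def weight_matrix_score(seq: str, matrix: List[Dict[str, float]]) -> float:
    """Sum the per-position weights of seq and rescale to the 0-100 range."""
    if len(seq) != len(matrix):
        raise ValueError("sequence length must match the matrix length")
    raw = sum(column[base] for base, column in zip(seq.upper(), matrix))
    lowest = sum(min(column.values()) for column in matrix)
    highest = sum(max(column.values()) for column in matrix)
    return 100.0 * (raw - lowest) / (highest - lowest)


best = weight_matrix_score("GTA", TOY_MATRIX)   # matches the top weight at every position
worst = weight_matrix_score("CCC", TOY_MATRIX)  # matches the lowest weight at every position
```

A sequence that matches the highest-weight base at every position receives the maximum score, and one that matches the lowest-weight base everywhere receives the minimum, so scores from matrices of different lengths become comparable on the same scale.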

Pipeline specifications

Software tools: cutadapt, QTrim, STAR, Bowtie2, TopHat, HSF, MaxEntScan
Application: WGS analysis
Organisms: Homo sapiens
Diseases: Muscular Dystrophy, Duchenne