Computational protocol: Plakophilin-2 is required for transcription of genes that control calcium cycling and cardiac rhythm

[…] Total RNA from five control and four PKP2-cKO mice at 21 dpi was extracted using the RNeasy Mini Kit (Qiagen). RNA-Seq libraries were prepared with the Illumina TruSeq RNA Library Preparation Kit v2 from 500 ng of total RNA, amplified by 12 cycles of PCR, and sequenced on an Illumina 2500 (v4 chemistry) as 50-nucleotide single reads at the Genome Technology Center at NYUMC. Approximately 200 million reads per sample were generated. Sequencing results were demultiplexed and converted to FASTQ format using Illumina Bcl2FastQ software. The quality of the RNA-Seq reads was assessed using FastQC. Reads were then aligned to the mouse genome (build mm10/GRCm38) with Spliced Transcripts Alignment to a Reference (STAR). PCR duplicates were removed using the Picard toolkit (open-source, MIT license). The HTSeq package was used to generate read counts for each gene. The read counts of each transcript were normalized to the length of the individual transcript and to the total mapped read count in each sample, and expressed in counts per million. To validate intra-group homogeneity, we first performed a principal component analysis (PCA) using the first two principal components, PC1 and PC2. PC1 is the direction along which the samples show the largest variation. PC2 is the direction, uncorrelated with PC1, along which the samples show the largest remaining variation. Next, we visualized the genes with the largest variance using a hierarchical clustering heatmap. For each gene, we compared expression levels between Control and PKP2-cKO RNAs. Gene expression differences were evaluated using Fisher's exact test after normalizing by the total number of mapped reads in each lane. The resulting p-values were corrected using the Benjamini-Hochberg method.
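
The per-gene test and the multiple-testing correction described above can be illustrated with a minimal sketch. It is not the authors' code: the analysis was run in R, and the layout of the 2x2 table (reads for one gene versus all other mapped reads, in Control versus PKP2-cKO) is an assumption about how the test was set up. The function names and the example counts are hypothetical. The sketch uses only the Python standard library.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = log_comb(row1 + row2, col1)

    def log_prob(x):
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_obs = log_prob(a)
    eps = 1e-7  # tolerance for floating-point ties
    total = 0.0
    for x in range(lo, hi + 1):
        lp = log_prob(x)
        if lp <= log_obs + eps:
            total += math.exp(lp)
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (gene reads in Control, gene reads in PKP2-cKO).
    gene_counts = {"geneA": (400, 100), "geneB": (250, 240)}
    total_control, total_cko = 1_000_000, 1_000_000  # assumed mapped totals
    pvals = [
        fisher_exact_two_sided(ctrl, total_control - ctrl, cko, total_cko - cko)
        for ctrl, cko in gene_counts.values()
    ]
    for gene, fdr in zip(gene_counts, benjamini_hochberg(pvals)):
        print(gene, fdr)
```

Placing the library totals in the table margins is one way to account for sequencing depth, in the spirit of the normalization by total mapped reads mentioned in the text. For library sizes in the hundreds of millions, a production analysis would more likely rely on an established statistical package than on this log-gamma arithmetic.
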
Differentially expressed genes were defined as those with log2 changes of at least 1.5 fold between a pair of samples at an FDR of 0.001, for genes with a count above 70. The Supplementary Table lists the complete dataset. For differentially expressed genes, we carried out functional annotation analysis using DAVID. Differentially expressed genes were used as the input gene list, and all mouse genes expressed in the heart were used as the background. We looked for enrichment for genetic association with KEGG pathways. The dataset was analyzed using R software version 3 and ad hoc packages. […]
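
The selection criteria above can be sketched as a simple filter. Several details are assumptions, not statements from the protocol: the fold-change threshold is read as an absolute log2 fold change of at least 1.5, the count cutoff is applied to the larger of the two group values, the FDR cutoff is treated as strict, and a pseudocount of 1 guards against division by zero. `GeneResult`, the function names, and the gene labels are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    control: float  # expression value in Control
    cko: float      # expression value in PKP2-cKO
    fdr: float      # Benjamini-Hochberg adjusted p-value


def log2_fold_change(control, cko, pseudocount=1.0):
    """log2(PKP2-cKO / Control); the pseudocount is an assumption."""
    return math.log2((cko + pseudocount) / (control + pseudocount))


def is_differentially_expressed(result, min_abs_log2fc=1.5,
                                max_fdr=0.001, min_count=70):
    """Apply the count, FDR, and fold-change cutoffs in turn."""
    # Assumed reading: the gene must exceed the count cutoff in one group.
    if max(result.control, result.cko) <= min_count:
        return False
    if result.fdr >= max_fdr:
        return False
    lfc = log2_fold_change(result.control, result.cko)
    return abs(lfc) >= min_abs_log2fc


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    results = [
        GeneResult("geneA", 400, 100, 1e-5),
        GeneResult("geneB", 400, 300, 1e-5),
        GeneResult("geneC", 60, 10, 1e-5),
    ]
    selected = [r.gene for r in results if is_differentially_expressed(r)]
    print(selected)
```

If the threshold is instead meant as a 1.5-fold change, the equivalent setting would be `min_abs_log2fc=math.log2(1.5)`. Keeping the cutoffs as parameters makes either reading easy to test.
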

Pipeline specifications

Software tools: BCL2FASTQ Conversion Software, FastQC, STAR, Picard, HTSeq, DAVID
Databases: KEGG
Application: RNA-seq analysis
Organisms: Homo sapiens, Mus musculus
Diseases: Cardiomyopathies
Chemicals: Calcium, Flecainide, Isoproterenol, Ryanodine, Tamoxifen