Computational protocol: Pseudogymnoascus destructans transcriptome changes during white-nose syndrome infections

Protocol publication

[…] RNA sequencing was performed using Illumina sequencing as summarized in . Prior to analysis, all data sets were quality trimmed using Trimmomatic v0.35 with the parameters SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25. For samples with paired-end sequencing, only read pairs in which both mates remained after trimming were used for further analysis. Analysis of the reads using FastQC v0.11.5 and the results of STAR mapping indicated no significant differences in RNA quality between any of the cultured samples and the MyLu samples. […] The quality-trimmed reads were aligned using STAR v2.5.1b to the concatenated genomes of M. lucifugus and P. destructans. For M. lucifugus, we used genome assembly Myoluc2.0 and gene models from Ensembl release 84. For P. destructans, we used the genome assembly and gene models from Drees et al. RSEM v1.2.29 was then used to apply an expectation-maximization algorithm to estimate expression counts for each transcript. The expected count matrix for all samples is available in Data Set S1. To determine whether the number of reads mapped to P. destructans transcripts provided sufficient statistical power to detect differential expression of these genes, we used Scotty to analyze the expected counts generated by RSEM. We determined that 65% of P. destructans genes differentially expressed by at least 4-fold could be detected at a p-value cutoff of 0.05. Transcripts per million (TPM) values were calculated by normalizing read counts to the length of each transcript and adjusting for the library size of mapped reads in each sample. The M. lucifugus transcripts were then removed from the analysis, and differential expression was determined using only P. destructans transcripts. Differential expression between conditions was determined using either DESeq2 v1.10.1 or edgeR v3.12.1 after normalizing across samples using the trimmed mean of M-values (TMM) method and applying a minimum expression level of 2 TPM combined across all samples.
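The TPM normalization and the minimum-expression filter described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `tpm` and `passes_min_expression` are hypothetical, plain transcript lengths stand in for the effective lengths that RSEM uses internally, and reading "2 TPM combined across all samples" as a sum over samples is an assumption.

```python
def tpm(counts, lengths):
    """Convert per-transcript read counts for ONE sample to TPM.

    counts  -- expected read counts, one per transcript
    lengths -- transcript lengths in bases (assumption: plain lengths,
               not RSEM's effective lengths)
    """
    if len(counts) != len(lengths):
        raise ValueError("counts and lengths must have the same size")
    # Step 1: normalize each count by transcript length (reads per base).
    rates = [c / l for c, l in zip(counts, lengths)]
    # Step 2: scale so the values of the sample sum to one million.
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [r / total * 1e6 for r in rates]


def passes_min_expression(tpm_per_sample, threshold=2.0):
    """Keep a gene if its TPM summed over all samples reaches the threshold.

    Assumption: 'combined across all samples' is interpreted as a sum.
    """
    return sum(tpm_per_sample) >= threshold


if __name__ == "__main__":
    sample = tpm([100, 200, 300], [1000, 2000, 1500])
    print([round(v) for v in sample])
    print(passes_min_expression([0.5, 0.7, 1.1]))
```

Because TPM values within a sample always sum to one million, they are comparable as proportions within that sample; comparisons across samples still rely on the TMM step mentioned in the text.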
The false discovery rate (FDR) was controlled for multiple comparisons using the Benjamini-Hochberg procedure. Hierarchical clustering was performed using R stats package v3.3.1 with Pearson correlation complete-linkage clustering of Euclidean distances. Clustering was confirmed by bootstrap analysis using pvclust v2.0-0 at an α level of 99% and 100,000 iterations. Genes without expression (expected count < 1) in at least 2 MyLu samples were excluded from the final analysis. Annotations for each gene were determined using Trinotate v3.0, NCBI BLAST v2.2.29+ with the UniProtKB/SwissProt database (E-value cutoff of 1 × 10^-4), and InterProScan v5.20-59.0. Gene ontology annotations were extracted from the InterProScan results, and gene ontology enrichment analysis was performed using GOATOOLS v0.6.9, with enrichment or purification measured by Fisher's exact test after FDR correction. […]
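For readers unfamiliar with the Benjamini-Hochberg procedure named above, the following is a minimal, dependency-free Python sketch of how adjusted p-values are computed. The function name `benjamini_hochberg` is hypothetical. The analysis described here relied on the implementations in DESeq2, edgeR, and GOATOOLS, not on this code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p={p:.3f}  adjusted={q:.3f}")
```

A gene or GO term is then called significant when its adjusted value falls below the chosen FDR threshold.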

Pipeline specifications

Software tools: Trimmomatic, FastQC, STAR, RSEM, Scotty, DESeq2, edgeR, pvclust, Trinotate, NCBI BLAST, InterProScan, GOATOOLS
Databases: UniProtKB
Applications: RNA-seq analysis, Transcription analysis
Organisms: Pseudogymnoascus destructans, Fungi, Myotis lucifugus
Diseases: Infection, Mycoses, Leukoencephalopathies