[…] Average raw RNA sequencing yield was 36 million (M) reads per sample (range 12–172 M), of which an average of 16 M reads (44%; range 5–74 M) mapped uniquely to protein-coding exons (Supporting Information Fig. S1; Table S2). Reads were mapped to the human reference genome GRCh37 with GSNAP v2014-12-28 (“--maxsearch=100 --npaths=1 --max-mismatches=1 --novelsplicing=0”) and then assigned to Ensembl gene models (build 75) using HTSeq (“htseq-count -f bam -t exon -s no”). rRNA genes were removed from the Ensembl gene set before read counting and were thus excluded from all subsequent analyses. Known single nucleotide polymorphisms (SNPs) and splice sites for GSNAP were extracted from the Single Nucleotide Polymorphism database (dbSNP) build 138 and Ensembl GRCh37 build 75, respectively. After read mapping and counting, DESeq2 was used to call differentially expressed genes (nbinomWaldTest, minReplicates = 5, cooksCutoff = 0.7, trim = 0.4). In addition, we used DESeq2 to generate a normalized (function “fpm”, robust = TRUE) and variance-stabilized (function “vst”) gene expression matrix, which was imported into Qlucore v3.2 for further analysis. All samples used in this study passed internal quality control (QC) checks of base qualities, mapping rates, duplication rates, and 5′–3′ coverage, which were based on FastQC and RSeQC reports. Gene set enrichment analysis (GSEA) was performed with the xtools.gsea.GseaPreranked module implemented in Broad's javaGSEA stand-alone desktop application (v2.0.13). Gene sets for GSEA were downloaded from MSigDB 5.0. Input genes were ranked by DESeq2 p values from most to least significant, taking the direction of change into account. False discovery rates were assessed empirically by permutation analysis with 1,000 iterations.
GSEA command line options included “-collapse false -mode Max_probe -norm meandiv -scoring_scheme weighted -include_only_symbols true -make_sets true -rnd_seed 149 -gui false -nperm 1000 -set_max 5000 -set_min 5” (all other options default). The complete RNA-Seq analysis pipeline (including alignment, QC, differential gene expression analysis, and GSEA) was implemented and run on the workflow management platform Anduril ( […]
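
The ranking step for the preranked GSEA can be illustrated with a short Python sketch. The text states only that genes were ranked by DESeq2 p value with the direction of change taken into account; the exact score is not given. The sketch therefore assumes one common choice, sign(log2 fold change) × −log10(p), and should not be read as the authors' implementation. The function names, the p-value floor, and the example gene labels are all hypothetical. Rows with a missing p value are skipped, on the assumption that such genes (for example those DESeq2 flags as Cook's-distance outliers) would not be ranked.

```python
import math


def signed_rank_score(pvalue, log2_fold_change, floor=1e-300):
    """Assumed score: sign(log2FC) * -log10(p).

    `floor` is a hypothetical guard that avoids log10(0) for
    p values reported as exactly zero.
    """
    magnitude = -math.log10(max(pvalue, floor))
    if log2_fold_change > 0:
        return magnitude
    if log2_fold_change < 0:
        return -magnitude
    return 0.0


def build_ranked_list(results):
    """Rank (gene, pvalue, log2fc) rows from most up- to most down-regulated.

    Rows whose p value or fold change is missing (None or NaN) are skipped.
    """
    ranked = []
    for gene, pvalue, lfc in results:
        if pvalue is None or lfc is None:
            continue
        if math.isnan(pvalue) or math.isnan(lfc):
            continue
        ranked.append((gene, signed_rank_score(pvalue, lfc)))
    # Highest positive score first, most negative score last.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


def to_rnk(ranked):
    """Format the ranked list as tab-delimited gene/score lines (RNK style)."""
    return "".join(f"{gene}\t{score:.6g}\n" for gene, score in ranked)


# Hypothetical example rows: (gene label, p value, log2 fold change).
example = [
    ("GENE_A", 1e-10, 2.0),
    ("GENE_B", 0.5, -0.1),
    ("GENE_C", 1e-5, -3.0),
    ("GENE_D", None, 1.0),  # missing p value, skipped
]
ranked_example = build_ranked_list(example)
rnk_text = to_rnk(ranked_example)
```

With this scoring, strongly up-regulated genes land at the top of the list and strongly down-regulated genes at the bottom, so both ends carry the most significant genes, which is the ordering a preranked enrichment analysis expects.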

Pipeline specifications

Software tools: GSNAP, HTSeq, DESeq2, Qlucore, FastQC, RSeQC, GSEA, Anduril
Application: RNA-seq analysis
Organisms: Homo sapiens, Puma concolor
Diseases: Neoplasms, Neuroblastoma