Computational protocol: Cellular senescence mediates fibrotic pulmonary disease

Protocol publication

[…] GraphPad Prism 6.05 and R 3.2.0 were used for statistical analysis and generation of graphs. Data are expressed as the mean ± s.e.m. P≤0.05 was considered statistically significant. Unless otherwise indicated, the statistical method used for multiple comparisons was one-way analysis of variance with Tukey's post-hoc comparison. Binary variables were compared using t-tests. Pearson's correlation coefficients were used to summarize biomarker and functional data relationships. For human subject characteristics, continuous and categorical variables were compared using the analysis of variance F-test and the χ²-test, respectively.

For RNAseq, specimens were randomly assigned to assay processing, to balance batch preparation, flow cell and lane. The primary endpoint for gene expression was number of reads per gene, which were sequenced cDNA strands mapped back to the reference genome. The number of strands per region was used to evaluate the expression level of the region. Sample quality was evaluated using box plots to visualize gene counts per sample. In addition, minus versus average plots were used to assess global bias. The influence of GC content and gene length on gene expression was also examined. Transcripts quantified by RNAseq that had median counts of <32 in both control (n=19) and IPF (n=20) groups were considered 'non-expressed'. Normalized count data were evaluated in the same manner as the un-normalized count data, namely by utilizing minus versus average plots and visualizations of the GC content and gene length. Conditional quantile normalization using the cqn Bioconductor package was applied to the RNAseq data, to reduce variability introduced by GC content, gene size and total gene counts per sample. Gene expression was evaluated using empirical Bayes estimates obtained through the use of edgeR in R.
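The 'non-expressed' filter and the minus versus average diagnostic described above can be illustrated with a minimal Python sketch. The protocol itself used R, so this is not the authors' code. The data layout (a mapping from transcript to per-sample counts), the function names and the pseudocount of 1 are assumptions made for illustration; only the threshold of 32 and the "both groups" rule come from the text.

```python
from math import log2
from statistics import median

MIN_MEDIAN_COUNT = 32  # threshold stated in the protocol


def is_non_expressed(control_counts, ipf_counts, threshold=MIN_MEDIAN_COUNT):
    """True when the median count is below the threshold in BOTH groups."""
    return median(control_counts) < threshold and median(ipf_counts) < threshold


def filter_expressed(counts, control_ids, ipf_ids):
    """Keep transcripts that are not flagged as non-expressed.

    counts maps transcript -> {sample_id: read count} (hypothetical layout).
    """
    kept = {}
    for transcript, by_sample in counts.items():
        control = [by_sample[s] for s in control_ids]
        ipf = [by_sample[s] for s in ipf_ids]
        if not is_non_expressed(control, ipf):
            kept[transcript] = by_sample
    return kept


def ma_values(sample_a, sample_b, pseudocount=1.0):
    """Per-gene (A, M) pairs for two samples.

    A is the average of the log2 counts and M is their difference.
    The pseudocount (an assumption here) avoids taking log2 of zero.
    """
    points = []
    for a, b in zip(sample_a, sample_b):
        log_a, log_b = log2(a + pseudocount), log2(b + pseudocount)
        points.append(((log_a + log_b) / 2, log_a - log_b))
    return points
```

A transcript is dropped only when both group medians fall below 32, so a gene that is silent in controls but expressed in IPF tissue is retained. Plotting M against A for pairs of samples, before and after normalization, is one way to look for the global bias mentioned in the text.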
The results of the gene expression evaluation were ranked based on P-value and false discovery rate to account for multiple comparisons.

For human transcriptome analyses, we report expression changes that were identified in IPF versus control tissue by both microarray and RNAseq at a significance level of q<0.05. For microarray comparisons, subject data were stratified based on FVC scores as follows: least severe (FVC >80%), n=17; moderate (FVC 50–80%), n=60; most severe (FVC <50%), n=16; control, n=64. Significance of differential expression was determined using linear regression via the functions lmFit and eBayes from the limma package. Models were adjusted for age.

For mouse model data, linear regression was used to fit a model for each endpoint and included indicator variables adjusting for time period and group comparisons using Bleo-Vehicle as the comparison group. Model assumptions including identification of influential points and the distribution of the residuals were assessed and, when appropriate, transformations were used on the dependent variables to improve the model fit. The models were summarized using the Dunnett's test, which adjusts for multiple comparisons to a single control. […]
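The reporting rule for the human transcriptome analyses (a change is reported only when it reaches q<0.05 in both microarray and RNAseq) can be sketched as follows. The text does not name the false discovery rate procedure, so the Benjamini-Hochberg adjustment used here is an assumption. The per-gene P-values are taken as given inputs (in the protocol they come from limma and edgeR), and the function names and input layout are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def significant_in_both(microarray_p, rnaseq_p, q_cutoff=0.05):
    """Genes with q < q_cutoff on both platforms.

    Each argument maps gene -> P-value (hypothetical layout); the
    adjustment is applied separately within each platform.
    """
    def hits(p_by_gene):
        genes = list(p_by_gene)
        q = benjamini_hochberg([p_by_gene[g] for g in genes])
        return {g for g, q_value in zip(genes, q) if q_value < q_cutoff}

    return hits(microarray_p) & hits(rnaseq_p)
```

Requiring significance on both platforms is a conservative design: a gene that passes the cutoff in only one assay is not reported, which reduces platform-specific false positives at the cost of some sensitivity.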

Pipeline specifications

Software tools: CQN, edgeR, limma
Application: RNA-seq analysis
Organisms: Mus musculus
Diseases: Lung Diseases, Idiopathic Pulmonary Fibrosis
Chemicals: Bleomycin, Quercetin