Computational protocol: The Caenorhabditis elegans Female-Like State: Decoupling the Transcriptomic Effects of Aging and Sperm Status

Protocol publication

[…] RNA integrity was assessed using an RNA 6000 Pico Kit for Bioanalyzer (Agilent Technologies #5067–1513) and mRNA was isolated using a NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs, NEB, #E7490). RNA-seq libraries were constructed using the NEBNext Ultra RNA Library Prep Kit for Illumina (NEB #E7530), following the manufacturer’s instructions. Briefly, mRNA isolated from ∼1 μg of total RNA was fragmented to an average size of 200 nt by incubating at 94 °C for 15 min in first-strand buffer. cDNA was then synthesized using random primers and ProtoScript II Reverse Transcriptase, followed by second-strand synthesis using Second Strand Synthesis Enzyme Mix (NEB). The resulting DNA fragments were end-repaired, dA-tailed and ligated to NEBNext hairpin adaptors (NEB #E7335). After ligation, adaptors were converted to the ‘Y’ shape by treating with USER enzyme, and DNA fragments were size-selected using Agencourt AMPure XP beads (Beckman Coulter #A63880) to generate fragment sizes between 250 and 350 bp. Adaptor-ligated DNA was PCR amplified, followed by AMPure XP bead clean-up. Libraries were quantified with the Qubit dsDNA HS Kit (ThermoFisher Scientific #Q32854) and the size distribution was confirmed with the High Sensitivity DNA Kit for Bioanalyzer (Agilent Technologies #5067–4626). Libraries were sequenced on an Illumina HiSeq2500 in single-read mode with a read length of 50 nt, following the manufacturer’s instructions. Base calls were performed with RTA 1.13.48.0, followed by conversion to FASTQ with bcl2fastq 1.8.4. [...] RNA-seq alignment was performed using Kallisto with 200 bootstraps. Kallisto was run in single-end read mode, with an average fragment length of 200 bp and a standard deviation of 60 bp for all samples. Differential expression analysis was performed using Sleuth (Pimentel et al. 2016).
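The Kallisto settings described above (single-end mode, 200 bootstraps, mean fragment length 200 bp, standard deviation 60 bp) correspond to the `-b`, `--single`, `-l` and `-s` options of `kallisto quant`. The following is a minimal sketch that only assembles such a command line. The function name, index file, FASTQ file and output directory are hypothetical placeholders and are not taken from the protocol.

```python
def kallisto_quant_command(index, fastq, out_dir,
                           bootstraps=200, fragment_length=200, fragment_sd=60):
    """Build (but do not run) a single-end `kallisto quant` command line."""
    return [
        "kallisto", "quant",
        "-i", str(index),            # transcriptome index built with `kallisto index`
        "-o", str(out_dir),          # output directory for this sample
        "-b", str(bootstraps),       # number of bootstrap samples
        "--single",                  # single-end reads
        "-l", str(fragment_length),  # estimated mean fragment length (bp)
        "-s", str(fragment_sd),      # estimated fragment length standard deviation (bp)
        str(fastq),
    ]


# Hypothetical file names, for illustration only.
cmd = kallisto_quant_command("c_elegans.idx", "sample1.fastq.gz", "kallisto_out/sample1")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once per sample, provided Kallisto is installed and the index has been built.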
The following general linear model (GLM) was fitted:
$$\log(y_i) = \beta_{0,i} + \beta_{G,i} \cdot G + \beta_{A,i} \cdot A + \beta_{A::G,i} \cdot A \cdot G,$$
where $y_i$ is the TPM count for the $i$th gene; $\beta_{0,i}$ is the intercept for the $i$th gene; $\beta_{X,i}$ is the regression coefficient for variable $X$ for the $i$th gene; $A$ is a binary age variable indicating first-day adult (0) or sixth-day adult (1); $G$ is the genotype variable indicating wild type (0) or fog-2(lf) (1); and $\beta_{A::G,i}$ is the regression coefficient accounting for the interaction between the age and genotype variables in the $i$th gene. Genes were called significant if the FDR-adjusted q-value for any regression coefficient was <0.1. Our script for differential analysis is available on GitHub.
Regression coefficients and TPM counts were processed using Python 3.5 in a Jupyter Notebook. Data analysis was performed using the Pandas, NumPy and SciPy libraries. Graphics were created using the Matplotlib and Seaborn libraries. Interactive graphics were generated using Bokeh.
Tissue, phenotype, and gene ontology enrichment analyses (TEA, PEA, and GEA, respectively) were performed using the WormBase Enrichment Suite for Python. Briefly, the WormBase Enrichment Suite accepts a list of genes and identifies the terms with which these genes are annotated. Terms are annotated by frequency of occurrence, and the probability that a term appears at this frequency under random sampling is calculated using a hypergeometric probability distribution. The hypergeometric test is extremely sensitive to deviations from the null distribution, which allows it to detect even small enrichments. […]
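To illustrate how the four coefficients of the model above relate to the experimental groups, here is a minimal sketch for a single gene. It is not the Sleuth implementation: it uses ordinary least squares, which for this saturated two-by-two design reduces to differences of group means of log(TPM), and it ignores the bootstrap-based variance estimates that Sleuth uses. The function name and the TPM values are invented for illustration, and all TPM values are assumed to be positive.

```python
from math import exp, log
from statistics import mean


def fit_two_factor_model(samples):
    """Least-squares fit of log(y) = b0 + bG*G + bA*A + bAG*A*G for one gene.

    `samples` is a list of (tpm, A, G) tuples with A and G in {0, 1}.
    Every (A, G) combination must have at least one sample.
    """
    cell = {}
    for a in (0, 1):
        for g in (0, 1):
            logs = [log(tpm) for tpm, age, geno in samples if age == a and geno == g]
            cell[(a, g)] = mean(logs)  # mean log(TPM) in this group
    b0 = cell[(0, 0)]                  # young wild type (baseline)
    bG = cell[(0, 1)] - b0             # genotype effect in young animals
    bA = cell[(1, 0)] - b0             # age effect in wild type
    # Interaction: deviation of old fog-2(lf) from the additive expectation.
    bAG = cell[(1, 1)] - cell[(1, 0)] - cell[(0, 1)] + cell[(0, 0)]
    return {"b0": b0, "bG": bG, "bA": bA, "bAG": bAG}


# Invented TPM values: (tpm, A, G)
example = [
    (exp(1.0), 0, 0), (exp(1.0), 0, 0),
    (exp(2.0), 0, 1),
    (exp(3.0), 1, 0),
    (exp(5.0), 1, 1),
]
coefficients = fit_two_factor_model(example)
print(coefficients)
```

A non-zero interaction coefficient means that the effect of age differs between wild type and fog-2(lf), which is the quantity the A::G term is designed to capture.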

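The hypergeometric enrichment calculation described above can be sketched as follows. This is an illustration using only the Python standard library, not the WormBase Enrichment Suite code. The function name and the example numbers are invented, and the choice of an upper-tail probability (observing at least this many annotated genes) is an assumption about how enrichment is scored.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` carry the term of interest."""
    upper = min(annotated, drawn)  # largest possible number of annotated genes in the list
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Invented example: 10 genes in total, 5 annotated with a term,
# a gene list of 5 in which all 5 carry the term.
p_value = hypergeom_enrichment_pvalue(population=10, annotated=5, drawn=5, hits=5)
print(p_value)
```

When many terms are tested at once, the resulting p-values would normally be corrected for multiple testing before terms are called enriched.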
Pipeline specifications

Software tools: BCL2FASTQ Conversion Software, kallisto, sleuth, Jupyter Notebook, NumPy, Matplotlib, Seaborn, Bokeh
Databases: WormBase
Applications: Miscellaneous, RNA-seq analysis
Organisms: Caenorhabditis elegans