Computational protocol: Virus and dsRNA-triggered transcriptional responses reveal key components of honey bee antiviral defense

Similar protocols

Protocol publication

[…] Quantitative PCR was utilized to examine the relative abundance of virus and honey bee host gene expression in each sample using previously described methods that are in accordance with published guidelines. All qPCR reactions were performed in triplicate using 2 μL of cDNA as template. Each 20 μl reaction was composed of cDNA template, 1X SYBR Green (Invitrogen), 1X Choice Taq Master Mix (Denville Scientific Inc.), 3 mM MgCl2, and forward and reverse primers (600 nM each). A CFX Connect Real Time instrument (BioRad) was utilized for qPCR, the thermo-profile for virus (e.g., SINV-GFP and BQCV) and A. mellifera rpl8 analyses consisted of a single pre-incubation 95 °C (3 min), 40 cycles of 95 °C (5 s), 60 °C (20 s), 72 ºC (30 s), and a final elongation 72 ºC (4 min)(Supplementary Table ). Positive and negative controls, including the use of RNA templates from no RT enzyme cDNA reactions, were included for all qPCR analyses and exhibited the expected results.To quantify the viral RNA (i.e., genome and transcript) abundance in each sample target SINV-GFP qPCR amplicons were cloned into the pGEM-T (Promega) vector, as described in Flenniken and Andino et al.. Plasmid standards, containing 109 to 103 copies per reaction, were used as qPCR templates to assess primer efficiency and generate the standard curve used for viral genome copy quantification. The qPCR primers for RNAseq validation were designed using Primer3Plus and with 60 °C annealing temperatures (Supplementary Table ). Melt point analysis and 2% agarose gel electrophoresis ensured qPCR specificity. Primer efficiencies were evaluated using qPCR assays of cDNA and plasmid dilution series, and calculated by plotting log10 of the concentration versus the crossing point threshold (C(t)) values and using the primer efficiency equation, (10(1/Slope)−1) × 100) (Supplementary Table ).The ΔΔC(t) method was used to calculate relative abundance of SINV-GFP in individual bees (n = 10) because it was most accurate; the ΔΔC(t) method ensures that results are not skewed by inadvertent differences in RNA reverse-transcription efficiencies and starting cDNA template abundance, , . The ΔC(t) for each sample was calculated by subtracting the Am rpl8 C(t) from the SINV-GFP C(t). The honey bee gene encoding ribosomal protein 8, Am rpl8, was selected as an appropriate housekeeping gene for qPCR because it has been utilized in several other studies– and analysis of the RNASeq data presented herein confirmed that rpl8 expression levels were similar in all sequenced libraries. The ΔΔC(t) was calculated by subtracting the average virus-infected ΔC(t) values from the ΔCt values for each treatment group. For host gene expression analyses and RNAseq validation, the percent gene expression for each gene of interest (GOI) was calculated using the following formula: 2−ΔΔC(t) × 100 = % gene expression, in which ΔC(t) = GOI C(t)- rpl8 C(t), and ΔΔC(t) = sample ΔC(t) – mock-infected control ΔC(t). Based on previous work, , we hypothesized that bees (n = 10) co-injected with dsRNA or poly(I:C) would have decreased relative virus abundance as compared to the virus-only treated group. To examine relative virus abundance between treatment groups (e.g., virus-infected bees and dsRNA or poly(I:C) co-treated bees) that had equal variance and normal distribution we performed one-tailed Student’s t-tests. Analysis of honey bee host gene expression revealed unequal variance between treatments groups and thus Welch’s t-tests were used to identify statistical differences in host gene expression. [...] Individual bee cDNA was screened for pre-existing infections via PCR for several honey bee pathogens (Supplementary Table ) using the PCR thermocycler protocol: 95 °C (5 min); 35 cycles of 95 °C (30 s), 57 °C (30 s), and 72 °C (30 s), followed by final elongation at 72 °C for 4 minutes. If the sample was positive for a pathogen, the quantity was then assessed using qPCR. The RNA isolated from the abdomens of at least three representative bees with low (<2,000 DWV and/or BQCV virus genome copies versus 7 × 104–7 × 106 SINV-GFP copies) to no pre-existing infections for each treatment group and time point were selected for transcriptome sequencing for a total of 47 individual bees (Supplementary Table ).Prior to RNASeq library preparation, RNA from each sample was further purified and DNase treated using Qiagen RNeasy columns. RNA quality was assessed using an Agilent 2200 Bioanalyzer and quantified via spectrophotometer. RNA was sent to the Roy J. Carver Biotechnology Center at the University of Illinois for library preparation (Illumina TruSeq Stranded RNA Sample Prep kit). Libraries were prepared and pooled by experimental time point and quantitated using an Illumina Library quantification kit (Kapa). Each pool was paired-end sequenced (2 × 100 nt) on a HiSeq. 2500 using a TruSeq SBS sequencing kit version 4, yielding ~12 million reads per sample, corresponding to at least 9.7 fold coverage (Supplementary Table ), which is in the range of coverage reported in other honey bee transcriptome studies, , . Sequence data was deposited into the NCBI Sequence Read Archive under accession number SRP101337 and is linked with NCBI BioProject #PRJNA377749.FastQC and fastx-toolkit were used to remove low quality reads (Trimmomatic, and reads were aligned to the A. mellifera genome assembly 4.5 from NCBI with Tophat v2.0.14; on average, ~77% of reads from each sample were mapped (Supplementary Table ). The normalized number of Fragments Per Kilobase of transcript per Million mapped reads (FPKM) was determined using CuffDiff, using the default classic FPKM normalization method and the default pooled dispersion model (Benjamini-Hochberg correction; significantly differentially expressed genes (DEGs) had q-value ≤ 0.05). Venn diagrams were generated using Vennt. To further investigate the function of the DEGs, representative protein sequences (the longest sequence if there were splice variants) of every known honey bee gene were blasted against the D. melanogaster protein database via reciprocal BLAST+ to identify fruit fly orthologs and homologs of the honey bee genes because there is a greater amount of gene ontology information for D. melanogaster genes compared to Apis mellifera genes. The honey bee genome encodes approximately 15,000 genes of which 13,592 genes are mapped and provided in the Amel4.5 genome annotation. We annotated 8,944 genes (~66%) as homologs (of which 7,006 were reciprocal best hits or orthologs) to genes encoded by the fruit fly D. melanogaster genome, which encodes ~13,600 genes, . Biological processes (BP) functional enrichment analysis was performed with DAVID. Gene ontology and biological processes (BP-FAT) enrichment analysis was performed with DAVID. [...] RNASeq analysis determined that reads aligning to LOC725387 were more abundant in virus-infected bees. To identify the gene or genes encoded by these differentially expressed reads, the consensus nucleotide sequence was used to query the NCBI Nucleotide collection (nr/nt) and A. mellifera databases using blastn, Sanger sequencing was performed to verify transcript sequence and length, and the results were evaluated using Geneious. Together, these analyses revealed that we identified a previously unrecognized transcript, A. mellifera probable cyclin-dependent serine/threonine-protein kinase (MF116383, 5,158 nt), which is longer than the originally annotated A. mellifera probable serine/threonine-protein kinase clkA (LOC725387, XM_001121241.4, 1,403 nt).In brief, we utilized LOC725387 RNASeq consensus sequence to query the NCBI Nucleotide nr/nt data base and identified an A. cerana transcript annotated as a probable cyclin-dependent serine/threonine-protein kinase DDB_G0292550 (LOC107994302, XM_017051141.1) as the top blastn result, which contained 95% of the submitted sequence and shared 91% identity (E-value = 0, 95% query coverage, 91% identity, 1–6% gaps); additional top blastn hits included A. dorsata GATA zinc finger domain-containing protein 14-like. When the LOC725387 RNASeq consensus sequence was used to query the A. mellifera database, the top blastn result only covered 24% of the query sequence (i.e., A. mellifera probable serine/threonine-protein kinase clkA, XM_001121241.4; E-value = 0, 24% query coverage, 99% identity, 0% gaps). To further characterize the LOC725387 transcript, we Sanger sequenced 5,027 nts (~2–3× coverage) and obtained the most 5′ end of this transcript from RNASeq data (131 bp, >2,000× coverage) (Supplementary Table  and Fig. ). Together nucleotide and amino acid alignments indicate the RNAseq reads aligning to LOC725387 are most similar to a computationally predicted A. cerana cyclin-dependent serine/threonine-protein kinase DDB_G0292550 (Supplementary Fig. ). Therefore, we refer to the gene identified herein as A. mellifera probable cyclin-dependent serine/threonine-protein kinase (Supplementary Fig. ) and submitted the sequence of this transcript to NCBI (MF116383). […]

Pipeline specifications

Software tools Primer3, FastQC, FASTX-Toolkit, Trimmomatic, TopHat, Cufflinks, Vennt, DAVID, BLASTN, Geneious
Databases SRA
Applications RNA-seq analysis, qPCR
Organisms Human poliovirus 1 Mahoney, Apis mellifera
Diseases Infection