Computational protocol: Into the Wild: Parallel Transcriptomics of the Tsetse-Wigglesworthia Mutualism within Kenyan Populations

Protocol publication

[…] The bacteriomes of field-collected G. pallidipes flies were dissected and DNA was extracted following the Holmes–Bonner protocol. DNA samples were subjected to PCR amplification with the general eubacterial 16S rRNA primers 27F′ and 1492R′ (Ta = 50 °C; 28 cycles). Amplicons were ligated into the pGEM-T vector (Promega, WI) and transformed into Escherichia coli JM109 cells (Promega, WI). Colonies were verified for a 16S rRNA insertion and sequenced at West Virginia University’s Department of Biology Genomics Center on an ABI 3130xl analyzer (Applied Biosystems, CA) using a 3.1 BigDye protocol (Applied Biosystems). All sequences were quality trimmed and assembled into contigs using CLUSTALW (last accessed January 3, 2017). A consensus 16S rRNA nucleotide sequence for the G. pallidipes Wigglesworthia isolate was deposited in the NCBI GenBank database under accession number MF148851.
The evolutionary model used for Bayesian analyses of 16S rRNA sequences (general time-reversible plus invariant sites plus gamma; GTR + I + G) was determined using the Akaike Information Criterion in MRMODELTEST 2.3. Bayesian analyses were performed in MRBAYES 3.2.6 with the number of categories used to approximate the gamma distribution set at four. Additionally, six Markov chains were run for 3,000,000 generations. Posterior probability (PP) values were calculated, with stabilization of model parameters (i.e., burn-in) occurring at 2,800,000 generations (standard deviation of split frequencies < 0.01). Every 100th tree following burn-in was sampled to calculate a 50% majority-rule consensus tree. All trees were constructed using the program FIGTREE v1.4.3 (last accessed January 3, 2017).
[...] The bacteriomes of 12 mature tsetse flies were pooled for one biological sample, resulting in six biological samples analyzed from male and female adults.
Bacteriome samples were homogenized and total RNA was extracted using a MasterPure RNA purification kit (Epicentre, Madison, WI) according to the manufacturer’s protocol for tissue samples. DNA was removed from the RNA samples using a Turbo DNA-free kit (Ambion, Austin, TX) following the rigorous DNase treatment option. RNA of sufficient quality for cDNA synthesis was confirmed using the Agilent 2000 Bioanalyzer RNA Nano chip. RNA samples were subsequently processed with a Ribo-Zero magnetic kit for Gram-negative bacteria (Epicentre, Madison, WI) according to the manufacturer’s protocol. The resulting mRNA-enriched RNA was then purified using an RNeasy MinElute Cleanup kit (Qiagen, Valencia, CA) and eluted in RNase-free H2O. The eluted RNA (∼1 µg) was then processed by the WVU Genomics Core Facility using a Kapa stranded mRNA-seq kit (Kapa Biosystems, Wilmington, MA), with the poly(A) pulldown omitted. The resulting cDNA libraries were sequenced on the Illumina HiSeq 1000 platform (2 × 51 bp) at Marshall University.
Following sequencing, raw reads were postprocessed to remove Illumina adapter/primer sequences. FASTQC analysis was performed on the RNA-Seq data sets to validate read quality. To capture both Wigglesworthia and tsetse fly reads, the Kallisto–Sleuth pipeline was first used to identify and quantify G. pallidipes-specific read counts based on the genome available at VectorBase (Gpal1.2 version; last accessed February 16, 2015). After the tsetse-specific reads had been parsed out of the total pool of bacteriome reads, the remaining sequences were mapped to the Wigglesworthia morsitans genome (NC_016893.1) using the STAR alignment tool. TPM (Transcripts per Million) was used as the measure of gene expression. Relative fold differences in gene expression between samples were determined as the ratio of the respective TPM values. [...]
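The TPM normalization and fold-difference ratio described above can be illustrated with a minimal sketch. It assumes per-gene read counts and gene lengths are already available; the gene names, counts, and lengths below are hypothetical toy values, and the function names are illustrative rather than part of the authors' pipeline (Kallisto reports TPM directly).

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert per-gene read counts to TPM using gene lengths in base pairs."""
    # Step 1: length-normalize each count to reads per kilobase (RPK).
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    # Step 2: rescale so that all values in the sample sum to one million.
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: value / total_rpk * 1e6 for gene, value in rpk.items()}


def fold_difference(tpm_a, tpm_b, gene):
    """Ratio of TPM values for one gene between two samples (a over b)."""
    if tpm_b[gene] == 0:
        # Undefined ratio: report infinity if only sample a is expressed.
        return float("inf") if tpm_a[gene] > 0 else float("nan")
    return tpm_a[gene] / tpm_b[gene]


# Hypothetical toy data: gene names, counts, and lengths are made up.
lengths = {"geneA": 1000, "geneB": 1500, "geneC": 3000}
female_counts = {"geneA": 300, "geneB": 300, "geneC": 600}
male_counts = {"geneA": 100, "geneB": 300, "geneC": 600}

female_tpm = counts_to_tpm(female_counts, lengths)
male_tpm = counts_to_tpm(male_counts, lengths)
print(round(fold_difference(female_tpm, male_tpm, "geneA"), 2))
```

Because TPM values sum to one million within each sample, a change in one highly expressed gene shifts the TPM of every other gene, so ratios of this kind are best read as relative rather than absolute differences.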
Differential expression profiles of specific loci between male and female bacteriomes were identified using DESeq, Kallisto–Sleuth, and ANOVA with an internal multiple-testing correction in R using custom scripts (only performed for the Wigglesworthia data set, with scripts available upon request). Transcripts were considered differentially expressed if they showed an adjusted P value ≤ 0.05. A web-scraping Python script (last accessed October 26, 2016) merged biologically relevant information obtained from VectorBase with all tsetse genes showing significant differences in expression levels between the bacteriomes of different sexes. For loci that lacked any associated VectorBase annotation, NCBI blastx analyses against the nonredundant protein sequences (nr) database were performed, with results filtered to retain only hits with an E-value < 1e−10 and a bit score > 50. TRANSPORTDB 2.0 (last accessed June 15, 2017) was used to describe the predicted cytoplasmic transport protein complement within the Wigglesworthia (WGM) genome; the transporters identified were then used to assess transporter expression within field flies. Further, sequences were assigned to Gene Ontology (GO) terms within the biological process, molecular function, and cellular component categories according to the GO hierarchy using BLAST2GO (last accessed March 14, 2017). Differentially expressed GO terms were identified using Fisher’s exact test followed by a False Discovery Rate (FDR) correction. Raw RNA-Seq data were uploaded to the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (BioProject PRJNA387614).
[...] Tsetse flies (G. morsitans) were reared under standard conditions and mated as described previously. Tsetse flies of known age and female fecundity status were sacrificed and their bacteriomes dissected. Total RNA was isolated following the TRIzol protocol (Invitrogen, CA) and treated with DNase I (Ambion, TX).
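The two numeric filters in the differential-expression and annotation steps of this section (adjusted P value ≤ 0.05; blastx E-value < 1e−10 and bit score > 50) can be sketched as follows. This is an illustration under stated assumptions, not the authors' scripts: the source does not name the correction procedure, so Benjamini-Hochberg is assumed here, and the blastx results are assumed to be in the default 12-column tabular format (`-outfmt 6`), where the E-value and bit score are the last two columns. All locus and protein identifiers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest P first
    adjusted = [1.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def filter_blast_hits(tabular_text, max_evalue=1e-10, min_bitscore=50.0):
    """Keep hits with E-value < max_evalue and bit score > min_bitscore."""
    kept = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue < max_evalue and bitscore > min_bitscore:
            kept.append((fields[0], fields[1], evalue, bitscore))
    return kept


# Hypothetical raw P values for four loci.
raw_p = {"locus1": 0.001, "locus2": 0.02, "locus3": 0.03, "locus4": 0.5}
adjusted = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))
significant = [locus for locus, p in adjusted.items() if p <= 0.05]

# Hypothetical blastx hits in 12-column tabular layout.
rows = [
    ["locus5", "protX", "85.0", "200", "30", "0", "1", "600", "1", "200", "1e-50", "250"],
    ["locus6", "protY", "40.0", "90", "54", "2", "1", "270", "5", "94", "1e-5", "45"],
    ["locus7", "protZ", "55.0", "80", "36", "1", "1", "240", "3", "82", "1e-20", "48"],
]
hits_text = "\n".join("\t".join(row) for row in rows)
kept_hits = filter_blast_hits(hits_text)
print(significant, kept_hits)
```

Both thresholds are exposed as parameters so that the cutoffs can be changed without touching the parsing logic.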
First-strand cDNA synthesis was performed with 140 ng total RNA, a 2 μM primer cocktail of gene-specific 3′ end primers, and Superscript Reverse Transcriptase II (Invitrogen) following the manufacturer’s instructions. qRT-PCR was performed with SsoFast EvaGreen supermix (Bio-Rad), 0.4 mM gene-specific primers designed with the Primer3 server, and 2 μl of cDNA template in a Bio-Rad CFX96 real-time PCR detection system (Bio-Rad, CA) with 35 amplification cycles. Primers were checked with the BLAST tool at NCBI to exclude potential nonspecific amplification. Additionally, primer specificity was confirmed by a melting curve analysis in which the dwell temperature increased from 65 to 95 °C in 0.5 °C increments every 5 s. Primer efficiency was evaluated using the standard curve method and ranged from 90 to 110%. The threshold cycle (2^(−ΔΔCT)) method was used to calculate relative expression. The Wigglesworthia rpsC gene (30S ribosomal subunit) was used as the reference gene. At least five individual bacteriomes were processed per group, with each sample analyzed in triplicate and the average quantification cycle (CT) obtained. The fold change in gene expression, compared with the same-sex teneral stage (i.e., newly emerged, nonfed adult), was determined for each sample. Negative controls were included in all amplification reactions. Values are represented as the mean ± the standard error of the mean (SEM), with statistical significance determined by ANOVA followed by Tukey’s multiple comparisons post hoc analyses. […]
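As a worked illustration of the 2^(−ΔΔCT) calculation described above, the sketch below averages triplicate CT values, normalizes the target gene to the reference gene (rpsC in the text), and expresses the result relative to a calibrator condition (the same-sex teneral stage in the text). The CT values are hypothetical, the function name is illustrative, and the formula assumes amplification efficiency close to 100% for both primer pairs.

```python
from statistics import mean


def fold_change_ddct(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^(-ddCT) method from replicate CT values."""
    # dCT: target normalized to the reference gene, per condition.
    dct_sample = mean(ct_target) - mean(ct_reference)
    dct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    # ddCT: sample relative to the calibrator condition.
    ddct = dct_sample - dct_calibrator
    return 2.0 ** (-ddct)


# Hypothetical triplicate CT values (not taken from the study).
sample_target = [22.5, 21.5, 22.0]      # target gene, older fly
sample_reference = [18.0, 18.5, 17.5]   # reference gene, older fly
teneral_target = [24.0, 24.5, 23.5]     # target gene, teneral calibrator
teneral_reference = [18.0, 18.0, 18.0]  # reference gene, teneral calibrator

fold = fold_change_ddct(sample_target, sample_reference,
                        teneral_target, teneral_reference)
print(fold)
```

A lower CT means more template, so a sample whose reference-normalized CT is two cycles below the calibrator's corresponds to roughly a fourfold higher expression under the stated efficiency assumption.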

Pipeline specifications

Software tools: FastQC, kallisto, sleuth, DESeq, BLASTX, Blast2GO, Primer3
Databases: VectorBase, TransportDB
Applications: RNA-seq analysis, qPCR
Organisms: Drosophila melanogaster, Trypanosoma brucei
Diseases: Genetic Diseases, Inborn
Chemicals: Amino Acids