Computational protocol: Tissue-Specific Signatures in the Transcriptional Response to Anaplasma phagocytophilum Infection of Ixodes scapularis and Ixodes ricinus Tick Cell Lines

Protocol publication

[…] RNAseq was undertaken for both tick cell lines with duplicate RNA samples from uninfected and infected cells at each time point. For I. scapularis cell line ISE6, RNAseq was conducted as reported previously (Villar et al., ). For I. ricinus cell line IRE/CTVM20, RNAseq was undertaken using a similar experimental approach. Briefly, RNA (200 ng) was reverse transcribed to generate double-stranded cDNA using the cDNA Synthesis System (Roche, Basel, Switzerland) and random hexamers. Illumina sequencing libraries were prepared using the Nextera XT system (Illumina, San Diego, CA, USA) and sequenced on an Illumina GA IIx instrument. Sequence analysis was undertaken using multiplexed paired-end samples. Pre-analysis sequence quality checking was performed using the FastQC program (Babraham Institute, Babraham, Cambridgeshire, United Kingdom). Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) was used to align sequenced reads to the reference I. scapularis genome sequence (assembly JCVI_ISG_i3_1.0; http://www.ncbi.nlm.nih.gov/nuccore/NZ_ABJB000000000). TopHat2 was then used to analyze the mapping results and identify splice junctions between exons. The Cufflinks program was used to estimate gene abundance and differential gene expression, allowing for splice variants and gaps due to the genome reference. Within the Cufflinks suite, Cuffmerge was used to merge Cufflinks assemblies to provide normalization of biological replicates, and Cuffquant was used to estimate abundance across normalized samples. The Cuffdiff algorithm was used to account for biological variability between samples and identify differentially expressed genes; this included non-statistical analysis (log2 fold-change) and statistical analysis (test for variance), in order to identify statistically significant fold-changes in gene expression (p ≤ 0.05; q ≤ 0.05).
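The selection criteria described above (p ≤ 0.05 and q ≤ 0.05) can be illustrated with a minimal Python sketch. It assumes a tab-delimited table using the column names that Cuffdiff writes to its `gene_exp.diff` output (`gene_id`, `status`, `log2(fold_change)`, `p_value`, `q_value`). The function name, the sample rows, and the choice to skip rows whose status is not `OK` are assumptions made for illustration; they are not taken from the original protocol.

```python
import csv
import io

P_MAX = 0.05  # p-value threshold stated in the protocol
Q_MAX = 0.05  # q-value threshold stated in the protocol


def select_significant(rows, p_max=P_MAX, q_max=Q_MAX):
    """Return (gene_id, log2 fold-change) pairs that pass both thresholds.

    `rows` is any iterable of dicts keyed by Cuffdiff-style column names.
    Skipping rows whose status is not "OK" is an assumption of this sketch.
    """
    selected = []
    for row in rows:
        if row["status"] != "OK":
            continue
        if float(row["p_value"]) <= p_max and float(row["q_value"]) <= q_max:
            selected.append((row["gene_id"], float(row["log2(fold_change)"])))
    return selected


# Hypothetical rows for illustration only; these are not data from the study.
SAMPLE = (
    "gene_id\tstatus\tlog2(fold_change)\tp_value\tq_value\n"
    "geneA\tOK\t2.5\t0.001\t0.01\n"
    "geneB\tOK\t-1.2\t0.04\t0.20\n"
    "geneC\tNOTEST\t0\t1\t1\n"
    "geneD\tOK\t-3.0\t0.0005\t0.004\n"
)


def demo():
    """Parse the sample table and apply the p/q filter."""
    reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
    return select_significant(reader)
```

In this sketch `geneB` is dropped because its q-value exceeds 0.05 even though its p-value passes, which shows why both thresholds are applied together.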
The TopHat-Cufflinks-Cuffmerge-Cuffquant-Cuffdiff pipeline was also used to analyze RNAseq data for I. scapularis cell line ISE6 (Villar et al., ), and differentially expressed genes were selected for this study based on the same criteria used for the I. ricinus cell line IRE/CTVM20 (p ≤ 0.05; q ≤ 0.05). The RNAseq data for I. scapularis cell line ISE6 were deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number GSE68881 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE68881; Villar et al., ). For I. ricinus cell line IRE/CTVM20, the RNAseq data have been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series accession number GSE76906 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE76906). [...] Functional annotations for each gene were obtained from UniProt (www.uniprot.org) using GO annotations and Enzyme Commission (EC) numbers, and from InterPro (www.ebi.ac.uk/interpro) using the DAVID functional annotation tool (http://david.abcc.ncifcrf.gov/tools.jsp). Configuration for GO annotation included an E-value-Hit-filter of 1.0E-6, an annotation cut-off of 55, and a GO weight of 5. To visualize the GO annotations for molecular function (MF) and biological process (BP), the analysis tool of the Blast2GO software (version 2.6.6; www.blast2go.org) was used. The InterPro motifs obtained using DAVID were used to evaluate the fold enrichment of differentially expressed genes. Functional annotation provides a Chart Report containing an annotation-term-focused view, which lists annotation terms and their associated genes under study. To avoid overcounting duplicated genes, the Fisher Exact statistic was calculated based on the corresponding DAVID gene IDs, from which all redundancies in the original IDs were removed.
Chart Report results must pass the thresholds set in the Chart Option section (by default, Maximum Probability ≤ 0.1 and Minimum Count ≥ 2) so that only statistically significant terms are displayed. To evaluate the fold enrichment of differentially expressed genes, which corresponds to a set of genes highly associated with certain terms, the EASE Score Threshold (Maximum Probability) was used. The EASE Score, a modified Fisher Exact p-value, ranges from 0 to 1; Fisher Exact p = 0 represents perfect enrichment. We used Fisher Exact p ≤ 0.05 to consider enrichment in the annotation categories for differentially expressed genes. Finally, Panther (www.pantherdb.org) was used to calculate overrepresented and underrepresented GO categories. The fold enrichment (FE) is the number of genes observed in the uploaded list divided by the expected number. If FE is greater than 1, the category is overrepresented in the dataset; conversely, the category is underrepresented if FE is less than 1. For all BP and MF categories in upregulated and downregulated genes, values were compared between tick cell lines by chi-squared test (p = 0.01). […]
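The enrichment quantities described above can be sketched in plain Python. This is a hedged illustration, not the DAVID or Panther implementation. It assumes the 2x2 table layout given in DAVID's documentation (list hits and list non-hits in one column, population hits and population non-hits in the other) and the documented EASE adjustment of subtracting one from the list hits before running a one-sided Fisher Exact test. All function names and the example counts are hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher Exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of tables with the same margins
    whose top-left cell is at least as large as the observed value `a`.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)
    upper = min(row1, col1)
    return sum(
        comb(col1, x) * comb(total - col1, row1 - x)
        for x in range(a, upper + 1)
    ) / denom


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """EASE-style score: Fisher Exact test with one hit removed from the list.

    The table layout follows DAVID's documentation as understood here
    (an assumption of this sketch).
    """
    a = max(list_hits - 1, 0)
    return fisher_one_sided(
        a, pop_hits, list_total - list_hits, pop_total - pop_hits
    )


def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Observed hits divided by the number expected from the population."""
    expected = list_total * pop_hits / pop_total
    return list_hits / expected


def classify(fe):
    """FE > 1 means overrepresented, FE < 1 underrepresented."""
    if fe > 1:
        return "overrepresented"
    if fe < 1:
        return "underrepresented"
    return "as expected"


# Hypothetical counts for illustration only; these are not data from the study.
fe = fold_enrichment(3, 300, 40, 30000)
label = classify(fe)
plain_p = fisher_one_sided(3, 40, 297, 29960)
ease_p = ease_score(3, 300, 40, 30000)
```

Because the EASE adjustment removes one hit before testing, `ease_p` is always at least as large as `plain_p`, which makes it the more conservative of the two values for categories supported by only a few genes.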

Pipeline specifications

Software tools: InterPro, DAVID, Blast2GO, PANTHER
Application: Protein sequence analysis
Organisms: Anaplasma phagocytophilum, Ixodes scapularis, Ixodes ricinus, Homo sapiens
Diseases: Animal Diseases