Computational protocol: RNA-seq Profiles of Immune Related Genes in the Staghorn Coral Acropora cervicornis Infected with White Band Disease

[…] Total RNA was extracted from diseased and healthy Acropora cervicornis sampled from Crawl Cay reef in Bocas del Toro, Panama, under Autoridad Nacional del Ambiente (ANAM) collecting permit SE/A-71-08. For the diseased samples, corals with active, mobile WBD interfaces were identified by monitoring the mobility of disease interfaces for two days, and then sampling a 2 cm region of tissue at and above the disease interface. A comparably sized and located tissue sample was taken from healthy (i.e. asymptomatic) corals. The coral tissues were flash frozen in liquid nitrogen and stored at -80°C. Total RNA was extracted in TriReagent (Molecular Research Center, Inc.) following the manufacturer's protocol. Total RNA quality was assessed using RNA Pico Chips on an Agilent Bioanalyzer 2100, and only extractions showing distinctive 28S and 18S bands and RIN values of 6 or higher were prepped for RNA sequencing.

RNA sequencing was performed on five diseased and six healthy coral samples using a multiplexed Illumina mRNA-seq protocol [] with the following modifications. Instead of fragmenting the mRNA prior to cDNA synthesis, we had much better success fragmenting the double-stranded cDNA using DNA fragmentase (New England Biolabs) for 30 minutes at 37°C. RNA-seq libraries were then prepared using next-generation sequencing modules (New England Biolabs) and custom paired-end adapters with 4 bp barcodes. Multiplexed samples were run (2-3 samples per lane) on the Illumina GAII platform (Illumina, Inc., San Diego, California, USA) at the FAS Center for Systems Biology at Harvard University. Barcoded samples were de-multiplexed and raw sequencing reads were quality trimmed to remove regions with a Phred score of less than 30 and reads shorter than 15 bp, using custom Perl scripts in the FASTX-Toolkit (http://hannonlab. de novo transcriptome was assembled using Trinity [] from 463.5 million single-end Illumina RNA-Seq reads from 39 A. cervicornis and 6 A. palmata samples, including the 11 A. cervicornis samples included in this paper. The assembled transcriptome produced 95,389 transcripts with an N50 of 363 and N75 of 696. RNA-seq data were produced using whole coral tissue, which putatively contains sequences from the coral host, its algal symbiont Symbiodinium, and other members of the coral holobiont (e.g. fungi, bacteria, and viruses).

In order to resolve the holobiont and putatively classify the source of the assembled transcripts as either coral or non-coral, we utilized a multistep pipeline leveraging the existing genomes of two congener species, A. digitifera [] and A. millepora []. RNA-seq reads were mapped against both Acropora reference genomes using Bowtie [] to produce two exomes. Transcripts from our de novo assembly were aligned using BLAST [] against each exome. Transcripts were assigned as putatively coral if they matched either exome with an e-value of less than 10^-10. Transcripts without significant coral hits were assigned as non-coral and could potentially include novel coral and/or algal symbiont Symbiodinium transcripts, as well as transcripts from other associated eukaryotes, like endolithic fungi. Bacterial and viral transcripts are possible, but less likely given that poly(A)-tail selection to isolate eukaryotic mRNAs was performed prior to cDNA synthesis.

Putative gene identities for each transcript were identified by performing homology searches against the Swiss-Prot and TrEMBL protein databases [], using tBLASTx. Matches with an e-value of less than 10^-5 were considered homologous protein-coding genes. Subsequently, GenBank flat files corresponding to the hits' accession IDs were downloaded and used to extract taxonomic data for each hit, which served as a second method to identify the putative source of the transcripts. GO terms and gene functions were obtained for the annotated transcripts from UniProt.
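
The coral versus non-coral assignment described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the BLAST results against each exome were saved in the standard 12-column tabular layout (query ID in column 1, e-value in column 11); the function names and the demo transcript IDs and hits are hypothetical. Only the 10^-10 cutoff comes from the protocol.

```python
import csv
import io

CORAL_EVALUE_CUTOFF = 1e-10  # cutoff stated in the protocol


def best_evalues(tabular_text):
    """Return the smallest e-value seen per query in 12-column BLAST tabular text."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, evalue = row[0], float(row[10])
        if query not in best or evalue < best[query]:
            best[query] = evalue
    return best


def classify_transcripts(transcript_ids, hits_a, hits_b, cutoff=CORAL_EVALUE_CUTOFF):
    """Label a transcript 'coral' if it matches either exome below the cutoff."""
    labels = {}
    for tid in transcript_ids:
        # A transcript with no hit in an exome is treated as having an infinite e-value.
        evalue = min(hits_a.get(tid, float("inf")), hits_b.get(tid, float("inf")))
        labels[tid] = "coral" if evalue < cutoff else "non-coral"
    return labels


# Hypothetical demo hits against the two exomes (A. digitifera, A. millepora).
demo_digitifera = (
    "t1\tdigA\t98.0\t200\t2\t0\t1\t200\t1\t200\t1e-50\t350\n"
    "t2\tdigB\t80.0\t50\t10\t0\t1\t50\t1\t50\t1e-04\t40\n"
)
demo_millepora = "t2\tmilA\t95.0\t150\t5\t0\t1\t150\t1\t150\t3e-30\t220\n"

labels = classify_transcripts(
    ["t1", "t2", "t3"],
    best_evalues(demo_digitifera),
    best_evalues(demo_millepora),
)
print(labels)  # {'t1': 'coral', 't2': 'coral', 't3': 'non-coral'}
```

In this sketch, t2 is labelled coral because its A. millepora hit passes the cutoff even though its A. digitifera hit does not, and t3, which has no hit at all, falls into the non-coral set.
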
The reference transcriptome sequences are available on BioProject (accession number PRJNA222758).

Differences in gene expression between healthy and diseased A. cervicornis specimens were estimated using the R package DESeq []. First, all contigs were separated into two datasets (i.e. coral and non-coral) based on their matches to the Acropora genomes. Size factor estimation and normalization were then performed separately on each dataset using the functions estimateSizeFactors and estimateDispersions, respectively. Differentially expressed contigs were detected by running a negative binomial test using the function nbinomTest. Only differentially expressed transcripts (adjusted p-value < 0.05) that were also annotated (e-values < 10^-5) were used for this study. […]
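
To make the size factor step more concrete, the following Python sketch illustrates the median-of-ratios idea that underlies size factor estimation in DESeq. It is an illustration written for this protocol, not the package's R code and not the authors' script. The function name `estimate_size_factors` and the demo counts are hypothetical, and contigs with a zero count in any sample are simply skipped.

```python
import math
from statistics import median


def estimate_size_factors(counts):
    """Median-of-ratios size factors.

    counts maps each sample name to a list of raw read counts, with contigs
    in the same order for every sample.
    """
    samples = list(counts)
    n_contigs = len(counts[samples[0]])

    # Log geometric mean of each contig across samples; contigs with any
    # zero count have no finite geometric mean and are left out (None).
    log_geo_means = []
    for i in range(n_contigs):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    # Size factor of a sample = median ratio of its counts to the geometric means.
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][i]) - g
            for i, g in enumerate(log_geo_means)
            if g is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical counts: sample B was sequenced twice as deeply as sample A.
demo_counts = {"A": [10, 20, 30, 0], "B": [20, 40, 60, 5]}
factors = estimate_size_factors(demo_counts)
normalized = {s: [c / factors[s] for c in demo_counts[s]] for s in demo_counts}
print(round(factors["B"] / factors["A"], 6))  # 2.0
```

Because the protocol normalizes the coral and non-coral datasets separately, a function like this would be called once per dataset, each time on that dataset's own count table.
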

Pipeline specifications

Software tools: FASTX-Toolkit, Trinity, Bowtie, TBLASTX, DESeq
Databases: UniProt
Applications: RNA-seq analysis, Nucleotide sequence alignment
Diseases: Infection, Leukoencephalopathies
Chemicals: Calcium