Computational protocol: The chemosensory receptors of codling moth Cydia pomonella–expression in larvae and adults

Protocol publication

[…] Samples of first instar larval heads, adult male antennae, and adult female antennae were prepared for RNA sequencing (see Insect Rearing and RNA Extraction) at the National Genomics Infrastructure sequencing facility (Uppsala, Sweden). RNA libraries for sequencing were prepared using the TruSeq Stranded mRNA Sample prep kit with 96 dual indexes (Illumina, CA, USA) according to the manufacturer’s instructions, with the following change: the protocols were automated on an Agilent NGS workstation (Agilent, CA, USA) using purification steps as described. Samples were clustered using cBot and sequenced on a HiSeq2500 (HiSeq Control Software 2.2.38/RTA 1.18.61) with a 2 × 126 setup in RapidHighOutput mode. Bcl to Fastq conversion was performed using bcl2Fastq v1.8.3 from the CASAVA software suite. The quality scale is Sanger/phred33/Illumina 1.8+.

All sequence read files were delivered to our project account on the UPPMAX Computational Science server (Uppsala, Sweden). For each sample, two fq files were produced, one containing all left-pair reads (sampleX_1.fq) and one containing all right-pair reads (sampleX_2.fq). [...]

Initial quality control was performed prior to transcriptome assembly. Trimmomatic (version 0.32) was used to remove reads in which sequencing adapter sequence was present and to trim low-quality bases from the 3′ end of each read. Starting from the 3′ terminal nucleotide and moving in the 5′ direction, each base with a PHRED score lower than 20 was removed until a base with a PHRED score greater than or equal to 20 was encountered. Trimmomatic was executed with the ILLUMINACLIP (sequencing adapters file: TruSeq3-PE.fa:2:30:10) and TRAILING:20 commands.
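The 3′ trimming rule described above can be illustrated with a minimal Python sketch. It is not Trimmomatic's implementation; the function name `trim_trailing` and the example read are hypothetical, and the sketch assumes the phred33 encoding stated for these files (score = ASCII code minus 33).

```python
def trim_trailing(seq, qual, threshold=20, offset=33):
    """Remove bases from the 3' end while their PHRED score is below
    `threshold`; stop at the first base that meets the threshold.

    `qual` is the FASTQ quality string, assumed phred33 encoded.
    """
    end = len(seq)
    # Walk from the 3' terminal base towards the 5' end.
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # Hypothetical read. Scores: I=40, +=10, 5=20, #=2, !=0.
    read, quals = "ACGTACGT", "II+I5I#!"
    print(trim_trailing(read, quals))
```

Only the two trailing low-quality bases are removed in this example; the low-quality base in the middle of the read is kept, because trimming stops as soon as a base with a score of 20 or more is reached.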
The Trimmomatic output was two fq files for each input fq file, as described above, containing the trimmed paired and unpaired reads (sampleX_1_paired.fq and sampleX_1_unpaired.fq).

Trimmomatic-processed reads from all of the sample fq files were assembled de novo into one transcriptome with Trinity (release version r20140717). The Trinity perl script was executed with the following parameters: --seqType fq --JM 30G --CPU 16 --bflyCPU 3. The output transcriptome file from this process was Trinity.fasta. To facilitate deeper analysis of transcript expression in the neonate larvae, a secondary transcriptome was generated from the Trimmomatic-processed larval head sequence files only, using the same procedures described above.

To facilitate unambiguous mapping of individual sample reads back to unique locations on the assembled transcriptome sequences for downstream quantitative analyses, cd-hit-est (version 4.5.4-2011-03-07) was used to identify and remove redundant sequences that share 98% or greater identity with other sequences. The Trinity.fasta file was used as input, the program parameters -c 0.98 -n 10 were specified, and the output file was named Trinity98.fasta. Where sequences shared 98% or greater identity but differed in length, the longest sequence was retained in the fasta file.

To assess completeness of the transcriptome, a D. melanogaster CEGMA transcript file, consisting of transcripts of 457 genes that are highly conserved across all eukaryotes, was blasted against the Trinity98.fasta transcriptome. For this, a BLAST nucleotide database was generated from the Trinity98.fasta file, and a tblastx query was performed (blast version 2.2.29+) with an e-value threshold of 1e-5 required for reporting of blast hits. […]
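The protocol does not say how the tblastx hits were tallied. One plausible way to turn them into a completeness figure is sketched below. It assumes the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`, with the query ID in column 1 and the e-value in column 11) and that each CEGMA gene is represented by one query ID. The function name `recovered_queries` and the example rows are hypothetical.

```python
def recovered_queries(blast_lines, evalue_cutoff=1e-5):
    """Return the set of query IDs with at least one hit at or below
    `evalue_cutoff`.

    Assumes BLAST tabular output (-outfmt 6): tab-separated columns
    qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend,
    sstart, send, evalue, bitscore.
    """
    recovered = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= evalue_cutoff:
            recovered.add(query_id)
    return recovered


if __name__ == "__main__":
    # Hypothetical rows: two hits for gene_A, one weak hit for gene_B.
    rows = [
        "gene_A\tcontig_1\t90.0\t100\t10\t0\t1\t300\t1\t300\t1e-50\t200",
        "gene_A\tcontig_2\t80.0\t100\t20\t0\t1\t300\t1\t300\t1e-20\t100",
        "gene_B\tcontig_3\t40.0\t30\t18\t0\t1\t90\t1\t90\t0.5\t20",
    ]
    found = recovered_queries(rows)
    total_genes = 457  # size of the CEGMA set stated in the protocol
    print(f"{len(found)} of {total_genes} genes recovered")
```

Counting distinct query IDs rather than hit rows keeps a gene that matches several contigs from being counted more than once.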

Pipeline specifications

Software tools: HSC, BCL2FASTQ Conversion Software, BaseSpace, Trimmomatic, Trinity, CD-HIT, CEGMA, TBLASTX
Applications: RNA-seq analysis, Nucleotide sequence alignment
Organisms: Cydia pomonella, Ilex paraguariensis, Malus domestica
Diseases: Protein Deficiency