Computational protocol: Pyrosequencing the transcriptome of the greenhouse whitefly, Trialeurodes vaporariorum reveals multiple transcripts encoding insecticide targets and detoxifying enzymes

Protocol publication

[…] BLAST homology searches and sequence annotation followed a method previously used for a midgut transcriptome of the tomato hornworm, Manduca sexta Linnaeus (Lepidoptera: Sphingidae). BLAST2GO software v.2.3.1 was used to perform several analyses of the EST assembly (contigs). Homology searches were first performed remotely on the NCBI server through QBLAST in a sequential strategy. First, contig sequences were searched via BLASTx against the NCBI non-redundant (nr) database, using an E-value cut-off of 1E-3 and selecting predicted polypeptides with a minimum length of 10 amino acids. Second, sequences that received no BLASTx hit were searched via BLASTn against the NCBI nr nucleotide database using an E-value cut-off of 1E-10. In addition, BLASTx searches with an E-value cut-off of 1E-5 were performed against the D. melanogaster UniProt (100) database.

For gene ontology (GO) mapping, the program extracts the GO terms associated with the homologies identified by NCBI's QBLAST and returns a list of GO annotations represented as hierarchical categories of increasing specificity. BLAST2GO allows the selection of a significance level for the false discovery rate, here used at a 0.05% probability level cut-off. GO terms were modulated using the annotation augmentation tool ANNEX, followed by GOSlim. GOSlim consists of a subset of the GO vocabulary encompassing key ontological terms and a mapping function between the full GO and the GOSlim. Here, the 'generic' GOSlim mapping (goslim_generic.obo) available in BLAST2GO was used. Enzyme classification (EC) codes and KEGG (Kyoto Encyclopedia of Genes and Genomes) metabolic pathway annotations were generated from the direct mapping of GO terms to their enzyme code equivalents. Finally, InterPro (InterProScan, EBI) searches were performed remotely from BLAST2GO via the InterPro EBI web server.
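
The sequential search strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not the BLAST2GO implementation: the `Hit` record, the function names and the choice to treat the cut-offs as inclusive (E-value less than or equal to the threshold) are assumptions made for the example. Only the two E-value cut-offs (1E-3 for BLASTx, 1E-10 for BLASTn) come from the protocol text, and the hits are assumed to have been parsed already from the search results.

```python
from dataclasses import dataclass

BLASTX_EVALUE_CUTOFF = 1e-3   # stated in the protocol
BLASTN_EVALUE_CUTOFF = 1e-10  # stated in the protocol


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one parsed search hit."""
    contig_id: str
    subject_id: str
    evalue: float


def best_hits(hits, cutoff):
    """Return {contig_id: best Hit}, keeping only hits with evalue <= cutoff."""
    best = {}
    for hit in hits:
        if hit.evalue > cutoff:
            continue  # assumption: the cut-off is inclusive
        current = best.get(hit.contig_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.contig_id] = hit
    return best


def sequential_search(contig_ids, blastx_hits, blastn_hits):
    """Label each contig 'blastx', 'blastn' or 'no_hit' with its best subject."""
    blastx_best = best_hits(blastx_hits, BLASTX_EVALUE_CUTOFF)
    # Only contigs without a significant BLASTx hit enter the BLASTn round.
    unmatched = {c for c in contig_ids if c not in blastx_best}
    blastn_best = best_hits(
        (h for h in blastn_hits if h.contig_id in unmatched),
        BLASTN_EVALUE_CUTOFF,
    )
    result = {}
    for contig in contig_ids:
        if contig in blastx_best:
            result[contig] = ("blastx", blastx_best[contig].subject_id)
        elif contig in blastn_best:
            result[contig] = ("blastn", blastn_best[contig].subject_id)
        else:
            result[contig] = ("no_hit", None)
    return result
```

In this sketch a contig with a significant BLASTx hit is never re-labelled by a BLASTn hit, which mirrors the "firstly / secondly" order of the text.
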
Potential ORFs (open reading frames) were identified using the ORF-predictor server. An ORF cut-off of 200 bp was used. [...] Contigs that had a protein motif of a cytochrome P450 or a protein domain of a CCE or a GST, as well as contigs that corresponded to the target sites of the most important chemical classes of insecticides, were searched by BLASTn against all the assembled processed reads using an E-value cut-off of 1E-4. Each contig was reassembled from the reads that returned a BLAST hit and manually curated using Geneious software v.4.8.5 (Biomatters Ltd, Auckland, New Zealand) to check for potential frame-shifts and SNPs. Nucleotide sequences were dynamically translated using the ExPASy Proteomics Server (Swiss Institute of Bioinformatics). All the identified sequences were searched by BLASTx against all the assembled contigs in the iceblast server using an E-value cut-off of 1E-4, and results with more than 99% similarity to the query sequence were eliminated as allelic variants (of those sequences, only the longest contigs with the best coverage were manually curated). MEGA 4.0 software was used to perform multiple sequence alignment of P450s, CCEs, GSTs and nAChRs prior to phylogenetic analysis and to construct consensus phylogenetic trees using the neighbour-joining method. Bootstrap analysis with 1,000 replicates was performed to evaluate the branch strength of each tree. The manually curated, re-assembled contigs that encoded an insecticide target were investigated for the presence of SNPs arising from nucleotide divergence between the two strains. […]
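
To make the 200 bp ORF cut-off concrete, the following Python sketch scans all six reading frames of a contig and keeps stop-free stretches of at least 200 bp. It is a deliberately naive stand-in and does not reproduce the algorithm of the ORF-predictor server. The function names, the use of stop-free stretches rather than ATG-initiated ORFs (chosen here because EST contigs are often partial) and the coordinate convention are all assumptions of this example; only the 200 bp threshold comes from the protocol text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_BP = 200  # cut-off stated in the protocol

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) of stop-free codon stretches in one frame (0, 1 or 2).

    Coordinates are 0-based, end-exclusive, and exclude the stop codon.
    """
    seq = seq.upper()
    start = frame
    for pos in range(frame, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            if pos > start:
                yield start, pos
            start = pos + 3
    # Trailing open stretch, truncated to the last complete codon.
    end = frame + ((len(seq) - frame) // 3) * 3
    if end > start:
        yield start, end


def find_orfs(seq, min_len=MIN_ORF_BP):
    """Return (strand, frame, start, end) for stretches of at least min_len bp.

    For the '-' strand, coordinates refer to the reverse-complemented sequence.
    Results are sorted from longest to shortest.
    """
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame):
                if end - start >= min_len:
                    found.append((strand, frame, start, end))
    found.sort(key=lambda orf: orf[3] - orf[2], reverse=True)
    return found
```

Because a sequence without stop codons is open in several frames at once, a scan like this typically returns more than one candidate per contig; choosing among them (for example by agreement with the BLASTx hit frame) is left out of the sketch.
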

Pipeline specifications

Software tools: Blast2GO, BLASTX, BLASTN, InterProScan, OrfPredictor, Geneious, MEGA
Applications: Phylogenetics, Transcription analysis
Organisms: Trialeurodes vaporariorum