Computational protocol: Ovarian Transcriptome Analysis of Vitellogenic and Non-Vitellogenic Female Banana Shrimp (Fenneropenaeus merguiensis)

Protocol publication

[…] Before further analysis, the raw sequence reads were preprocessed by eliminating low-quality reads and adapter sequences introduced during construction of the cDNA library. The data were then further cleaned with the SOAPnuke program to remove reads containing more than 10% unknown bases (N) and reads defined as low quality because more than 50% of their bases had a quality value ≤ 5. The sequence reads of only the non-vitellogenic library were assembled de novo using Trinity RNA-seq software (version 2.0), and SOAPaligner/SOAP2 was then used to map the clean reads to the reference genome. The resulting contigs and singletons are referred to as unigenes.

For functional annotation of the unigenes (protein function, Clusters of Orthologous Groups (COG) functional annotation and Gene Ontology (GO) functional annotation), the sequences were first searched with blastx against protein databases including the NCBI non-redundant (Nr) protein database, Swiss-Prot, KEGG and finally the COG database (e-value < 0.00001). They were then aligned to the nucleotide database NT with blastn (e-value < 0.00001). The Blast2GO program (version 2.2.5) was used to obtain GO annotations, after which WEGO software was used to obtain the GO functional classifications and the distribution of the genes. To further refine the understanding of the biological functions of particular genes, the unigenes were submitted for KEGG pathway analysis through the online KEGG Automatic Annotation Server (KAAS) (http://www.genome.jp/kegg/kaas/). [...]

To identify the differentially expressed genes (DEGs) between the two samples, the false discovery rate (FDR) method was used to determine the p-value threshold. Both the FDR and the p-value were required to be ≤ 0.001. A change in expression level of 2-fold or more, that is, an absolute value of log2Ratio(vitellogenic stage RPKM/non-vitellogenic stage RPKM) ≥ 1, was used to judge the significance of gene expression differences.
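The read-cleaning rule described above (discard a read if more than 10% of its bases are N, or if more than 50% of its bases have a quality value ≤ 5) can be illustrated with a minimal Python sketch. This is not SOAPnuke's implementation. The function names `keep_read` and `phred33_to_scores` are hypothetical, and the Phred+33 quality encoding is an assumption that the protocol does not state.

```python
def phred33_to_scores(quality_string):
    """Decode a FASTQ quality string, ASSUMING Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def keep_read(sequence, quality_scores,
              max_n_fraction=0.10, low_quality_cutoff=5,
              max_low_quality_fraction=0.50):
    """Return True if the read passes both filters stated in the protocol.

    A read is rejected when more than 10% of its bases are N, or when
    more than 50% of its bases have a quality value <= 5.
    """
    if not sequence or len(sequence) != len(quality_scores):
        return False  # empty or malformed record (handling is an assumption)
    n_fraction = sequence.upper().count("N") / len(sequence)
    low_fraction = (
        sum(q <= low_quality_cutoff for q in quality_scores) / len(quality_scores)
    )
    return (n_fraction <= max_n_fraction
            and low_fraction <= max_low_quality_fraction)


# Illustrative reads (made-up data, not from the study)
good = keep_read("ACGTACGTAC", [30] * 10)             # no N, high quality
too_many_n = keep_read("NNACGTACGT", [30] * 10)       # 20% N
low_quality = keep_read("ACGTACGTAC", [5] * 6 + [30] * 4)  # 60% low quality
```

Both thresholds are strict ("more than"), so a read with exactly 10% N or exactly 50% low-quality bases is kept in this sketch.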
NOISeq was used to screen for differentially expressed genes between the vitellogenic and non-vitellogenic ovaries. RPKM (reads per kilobase per million reads) quantifies gene expression from RNA sequencing data by normalizing read counts for gene length and the number of mappable reads. […]
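As a rough illustration of the quantities above, the sketch below computes RPKM with the standard formula (read count × 10^9 / (total mapped reads × gene length in bases)) and then applies the stated cut-offs of |log2Ratio| ≥ 1 and FDR ≤ 0.001. It does not reproduce NOISeq. The FDR value is assumed to come from an upstream statistical test, the function names are hypothetical, and the small floor used to avoid dividing by zero is an assumption that the protocol does not describe.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (total_mapped_reads * gene_length_bp)


def log2_ratio(rpkm_vitellogenic, rpkm_non_vitellogenic, floor=0.01):
    """log2(vitellogenic RPKM / non-vitellogenic RPKM).

    `floor` replaces zero expression so the ratio stays finite; this
    value is an ASSUMPTION, not something stated in the protocol.
    """
    numerator = max(rpkm_vitellogenic, floor)
    denominator = max(rpkm_non_vitellogenic, floor)
    return math.log2(numerator / denominator)


def is_deg(log2_ratio_value, fdr, min_abs_log2=1.0, max_fdr=0.001):
    """Apply the two thresholds quoted in the text."""
    return abs(log2_ratio_value) >= min_abs_log2 and fdr <= max_fdr


# Made-up numbers for one gene: 2,000 bp long, 10 million mapped reads per library
non_vit = rpkm(500, 2000, 10_000_000)    # 25.0
vit = rpkm(1000, 2000, 10_000_000)       # 50.0
ratio = log2_ratio(vit, non_vit)         # 1.0, i.e. a 2-fold increase
selected = is_deg(ratio, fdr=0.0005)     # True
```

A negative log2Ratio indicates lower expression in the vitellogenic stage. Taking the absolute value means that up-regulated and down-regulated genes are selected with the same cut-off.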

Pipeline specifications

Software tools SOAPnuke, Trinity, SOAPaligner, BLASTX, BLASTN, Blast2GO, WEGO, KAAS, NOISeq
Databases COGs, UniProt, KEGG, KEGG PATHWAY
Application RNA-seq analysis
Organisms Fenneropenaeus merguiensis