Computational protocol: Analysis of codon usage pattern in Taenia saginata based on a transcriptome dataset

[…] In this study, a total of 91,487 T. saginata unigenes were obtained. Based on sequence similarity with known proteins, 59,262 unigenes were annotated. Of these, 57,607 were annotated against the NCBI non-redundant (Nr) protein database, 24,860 were assigned to the Clusters of Orthologous Groups (COG) protein database, 26,476 were assigned Gene Ontology (GO) terms, and 43,575 were assigned to 200 pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Among the annotated unigenes, 61,941 coding sequences (CDSs) were obtained by the BLASTx algorithm. All CDSs were analyzed using the FrameDP software, which can self-train directly on EST clusters instead of requiring curated cDNA sets to train the underlying ESTScan and DECODER software. To minimise sampling error, only CDSs longer than 300 bp were used for this study. The final sequence collection, containing 11,399 CDSs, was used for our analyses. […]
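The length filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `parse_fasta` and `filter_by_length` and the constant `MIN_CDS_LENGTH` are hypothetical, and the sketch assumes the predicted CDSs are available as plain FASTA text and that "longer than 300 bp" means strictly greater than 300 nucleotides.

```python
from typing import Dict, Iterable

# Threshold taken from the text: only CDSs longer than 300 bp are kept.
# The constant name is an assumption made for this sketch.
MIN_CDS_LENGTH = 300


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA-formatted lines into a {sequence_id: sequence} mapping."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            header = line[1:].split()
            # Use the first whitespace-delimited token as the identifier.
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def filter_by_length(records: Dict[str, str],
                     min_length: int = MIN_CDS_LENGTH) -> Dict[str, str]:
    """Keep only sequences strictly longer than min_length nucleotides."""
    return {name: seq for name, seq in records.items() if len(seq) > min_length}
```

With a file of predicted CDSs (the file name here is a placeholder), usage might look like `kept = filter_by_length(parse_fasta(open("cds.fasta")))`. Whether the cutoff in the study was strict or inclusive is not stated in the text, so the comparison operator may need adjusting.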

Pipeline specifications

Software tools: BLASTX, FrameDP, ESTScan
Databases: KEGG
Application: Transcription analysis
Organisms: Taenia saginata, Homo sapiens
Diseases: Ataxia Telangiectasia