Computational protocol: Genome Sequence of Enterococcus pernyi, a Pathogenic Bacterium for the Chinese Oak Silkworm, Antheraea pernyi

Protocol publication

[…] The Chinese oak silkworm (Antheraea pernyi [Guérin-Méneville, 1855]), which belongs to Lepidoptera: Saturniidae, is the best-known wild silkworm used for insect food and silk production. Rearing of the Chinese oak silkworm has a history of about 400 years in China. Empty-gut disease is one of the most important diseases of A. pernyi; it seriously reduces the yield of tussah cocoons and causes great economic losses. The pathogen of the disease was established as Streptococcus pernyi sp. nov., based on morphological, physiological, biochemical, and serological characteristics. However, S. pernyi sp. nov. has since been reclassified and renamed Enterococcus pernyi, based on phylogenetic analysis of 16S rRNA and tuf gene sequences. The pathogen of empty-gut disease in A. pernyi is now defined as E. pernyi in the taxonomy of the National Center for Biotechnology Information. The genome of E. pernyi was sequenced to gain a better understanding of the taxonomic status of this bacterium and to provide more genomic information for further studies on preventing and curing the disease.

The complete genome sequence was determined by Illumina Solexa technology at Novogene Bioinformatics Technology Co., Ltd. (Beijing, China). Sequence assembly was performed using SOAPdenovo version 2.04. Coding sequences (CDSs) were predicted using GeneMarkS software and further annotated by BLASTp searches against the NCBInr, COG, GO, KEGG, Swiss-Prot, and TrEMBL databases. tRNAscan-SE, RNAmmer, and Rfam were used to predict tRNAs, rRNAs, and small RNAs, respectively. Genomic islands and prophages were predicted using IslandPath-DIMOB and PHAST software, respectively.

The genome size of E. pernyi is 3.09 Mb with a G+C content of 38.35%. A total of 626 Mb of clean data were generated, reaching a genome coverage depth of over 200-fold.
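
The reported coverage depth can be checked with simple arithmetic: depth is the total number of sequenced bases divided by the genome size. The sketch below uses only the two figures given in the text (626 Mb of clean data, 3.09 Mb genome); the function name `coverage_depth` is an illustrative assumption, not part of the published pipeline.

```python
def coverage_depth(total_bases: float, genome_size: float) -> float:
    """Return mean sequencing depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Figures taken from the text; rounding to Mb is the source's own.
clean_data_bp = 626e6    # 626 Mb of clean data
genome_size_bp = 3.09e6  # 3.09 Mb genome

depth = coverage_depth(clean_data_bp, genome_size_bp)
print(f"Estimated depth: {depth:.1f}x")
```

The result is slightly above 200, which agrees with the "over 200-fold" statement.
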
Sequences were assembled into 23 contigs with a total length of 3,181,210 bp (largest, 603,828 bp; smallest, 654 bp) and an N50 contig size of 370,188 bp. The final assembly comprised 9 scaffolds with a total length of 3,188,572 bp (largest, 3,086,269 bp; smallest, 654 bp) and an N50 scaffold size of 3,086,269 bp. The genome contains 3,153 CDSs with an average length of 854 bp, which together represent 84.48% of the whole genome. Only 224 CDSs (7.1%) could not be annotated in any database; 2,916, 1,537, 1,577, 1,487, 1,242, and 2,812 CDSs were annotated in NCBInr, COG, GO, KEGG, Swiss-Prot, and TrEMBL, respectively. In addition, 65 tRNAs, 18 rRNAs, and 13 small RNAs were identified. The genome also contains 9 genomic islands with an average length of 14,058 bp and 3 prophages with an average length of 37,430 bp. No clustered regularly interspaced short palindromic repeats (CRISPRs) were identified in the genome. […]
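
For readers unfamiliar with the N50 statistic quoted above, here is a minimal sketch of how it is commonly computed: sort the sequence lengths from largest to smallest and report the length at which the running sum first reaches half of the total. The function name `n50` and the toy lengths are illustrative assumptions; the individual contig and scaffold lengths of this assembly are not given in the text, so only the reported largest scaffold and total length are used for the consistency check.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half the assembly.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy lengths (hypothetical, not from the paper): total 300, half 150.
toy_n50 = n50([100, 80, 60, 40, 20])
print(toy_n50)  # 80, because 100 + 80 = 180 is the first running sum >= 150

# Consistency check using only figures reported in the text:
largest_scaffold = 3_086_269
total_scaffold_length = 3_188_572
# The largest scaffold alone covers more than half of the assembly,
# so the scaffold N50 must equal the largest scaffold, as reported.
largest_is_n50 = 2 * largest_scaffold >= total_scaffold_length
print(largest_is_n50)
```

Because the largest scaffold (3,086,269 bp) exceeds half of the 3,188,572 bp total, the scaffold N50 necessarily equals the largest scaffold, which matches the value stated in the text.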

Pipeline specifications

Software tools: SOAPdenovo, GeneMarkS, BLASTP, tRNAscan-SE, RNAmmer, IslandPath, PHAST
Databases: UniProt, Rfam, KEGG
Applications: Genome annotation, Phylogenetics
Organisms: Bombyx mori, Antheraea pernyi