Computational protocol: Complete genome sequence of a serotype 11A, ST62 Streptococcus pneumoniae invasive isolate

Protocol publication

[…] The generated sequences were annotated by identifying coding genes through cross-prediction with the FGENESB package (http://www.softberry.com/), the GeneMark program, and the GLIMMER program. We considered an open reading frame (ORF) prediction reliable when it was identified by all three prediction tools. Discrepant ORFs were manually verified with the Artemis viewer and by identification of putative ribosomal binding sites. Each gene was functionally classified by assigning a cluster of orthologous groups (COG) number or a Kyoto Encyclopedia of Genes and Genomes (KEGG) number, and each predicted protein was compared against every protein in the non-redundant (nr) protein database (http://ncbi.nlm.nih.gov). To associate a function with a predicted gene, we used a minimum cut-off of 30% identity and 80% coverage of the gene length, checking at least two best hits among the COG, KEGG, and non-redundant protein databases. The rRNA genes were identified by the FGENESB tool on the basis of sequence conservation, while tRNA genes were detected with the tRNAscan-SE program. The BLASTp algorithm was used to search for protein similarities with other pneumococcal genomes or deposited sequences referred to in the present study, using the following criteria: >50% similarity at the amino acid level and >50% coverage of protein length. […]
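The two filtering rules described above (agreement among the three gene finders, and the 30% identity / 80% coverage cut-off for assigning a function) can be illustrated with a minimal Python sketch. This is not the authors' code: the `Orf` and `Hit` records, the function names, and the tool keys are hypothetical, ORFs are assumed to agree only when their coordinates and strand match exactly, and "two best hits" is read here as the two passing hits with the highest identity.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    start: int   # 1-based start coordinate
    end: int     # 1-based end coordinate, inclusive
    strand: str  # "+" or "-"


class Hit(NamedTuple):
    subject: str             # database entry identifier (hypothetical)
    percent_identity: float  # identity over the aligned region, in percent
    aligned_length: int      # aligned length, same units as gene_length


def split_by_agreement(predictions):
    """Split ORFs into (consensus, discrepant) sets.

    `predictions` maps a tool name to the set of ORFs it predicted.
    Assumption: two predictions are the same ORF only if start, end,
    and strand are identical.
    """
    all_sets = list(predictions.values())
    consensus = set.intersection(*all_sets)       # found by every tool
    discrepant = set.union(*all_sets) - consensus  # needs manual review
    return consensus, discrepant


def passes_cutoff(hit, gene_length, min_identity=30.0, min_coverage=80.0):
    """Apply the minimum identity and gene-length coverage cut-off."""
    coverage = 100.0 * hit.aligned_length / gene_length
    return hit.percent_identity >= min_identity and coverage >= min_coverage


def best_hits(hits, gene_length, n=2):
    """Return up to n passing hits, highest identity first."""
    passing = [h for h in hits if passes_cutoff(h, gene_length)]
    return sorted(passing, key=lambda h: h.percent_identity, reverse=True)[:n]


if __name__ == "__main__":
    a, b = Orf(1, 300, "+"), Orf(500, 900, "-")
    preds = {"FGENESB": {a, b}, "GeneMark": {a, b}, "GLIMMER": {a}}
    consensus, discrepant = split_by_agreement(preds)
    print("consensus:", consensus)
    print("discrepant:", discrepant)
```

In practice, gene finders often disagree only on the start codon, so a real comparison would probably match ORFs on stop coordinate and strand rather than on exact coordinates. The stricter BLASTp criteria (>50% similarity, >50% coverage) would need a similarity field and strict inequalities instead of the thresholds used here.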

Pipeline specifications

Software tools: FGENESB, GeneMark, Glimmer, tRNAscan-SE, BLASTP
Application: Genome annotation
Organisms: Streptococcus pneumoniae, Homo sapiens, Mycoplasma pneumoniae, Streptococcus pneumoniae AP200, Streptococcus pyogenes, Finegoldia magna, Anaerococcus prevotii, Clostridioides difficile
Chemicals: Erythromycin