Computational protocol: Exopolysaccharide (EPS) Synthesis by Oenococcus oeni: From Genes to Phenotypes

Protocol publication

[…] Genomic sequences were recovered from databases or produced by GeT-PlaGe Genotoul (Castanet Tolosan, France) and Macrogen (Seoul, Korea) (unpublished). All 36 new sequences were annotated with RAST (Rapid Annotation using Subsystem Technology, rast.nmpdr.org) and KAAS (KEGG Automatic Annotation Server). These sequences have been deposited at DDBJ/EMBL/GenBank under the accession numbers listed in the original publication. The versions described in this paper for eps gene content are versions XXXX01000000.

Multilocus sequence typing (MLST) was performed for all strains according to the procedure described by Bilhère et al., with some modifications. The sequence type (ST) of each strain was built from six housekeeping genes (gyrB, g6pd, pgm, dnaE, purK and rpoB), whose sequences were obtained by genome analysis in the Seed Viewer application of RAST. Sequences were processed with BioEdit 7.2.3, and the phylogenetic tree was constructed by the neighbor-joining method with a Kimura two-parameter distance model, using MEGA 4. Bootstrap values were obtained after 1,000 iterations.

From the three genome sequences publicly available at the beginning of our work (strains O. oeni PSU-1, ATCC BAA-1163 and AWRI B429), we created a database of 82 protein sequences (panel "initial database") potentially associated with EPS metabolism, including glycosyltransferases, flippases (wzx) and polymerases (wzy), but also glycoside hydrolases and proteins involved in the synthesis of precursors (sugar nucleotides). The 47 other annotated genome sequences were then searched by BLASTP for orthologs of these 82 proteins. Once an ortholog was identified, the genomic environment of the gene was examined.
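To make the distance model behind the MLST tree concrete, the Kimura two-parameter distance between two aligned sequences can be sketched as below. This is only an illustration of the standard formula d = -1/2 ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. It is not MEGA 4's implementation. The function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites containing gaps or
    ambiguous bases are skipped (an assumption, not the paper's setting).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous characters
        compared += 1
        if a == b:
            continue
        # Transition: purine<->purine or pyrimidine<->pyrimidine
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)
```

In a pipeline like the one described, each pairwise distance between concatenated housekeeping-gene sequences would fill a distance matrix that a neighbor-joining routine then turns into a tree.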
In addition, all genes encoding proteins that differed from those in the initial database (identity <70%) but displayed significant homology (BLASTP or TBLASTX cutoff of 1e−30), suggesting related enzymatic activity, were listed and their genomic environment was analyzed. A second analysis searched the proteins deduced from the annotated genomes for the conserved motifs of glycoside hydrolases and glycosyltransferases. Both methods gave the same results, i.e. the same list of eps genes and proteins.

To assign protein functions, we used the Pfam database (http://pfam.sanger.ac.uk/). Glycosyltransferase genes were also assigned to GT families based on the CAZy database. Genes were named according to the bacterial polysaccharide gene nomenclature (BPGN) system, which is applicable to all species, distinguishes different classes of genes and provides a single name for all genes of a given function. The prefix wo- was chosen in reference to Oenococcus. The genes in cluster eps1 were named woa-, and those in cluster eps2 wob-, woc-, wod- and woe-. The capital letter A was used only for the initial transferase. […]
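The homology screen described above (a 1e−30 cutoff, with hits below 70% identity listed separately) can be sketched as a filter over BLAST tabular output. This is a minimal sketch under stated assumptions. It assumes the 12-column tab-separated layout produced by BLAST+ `-outfmt 6`. It treats hits at or above 70% identity as ortholog candidates, which is an interpretation of the text rather than a criterion the authors state. The function name `classify_hits` and the labels are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-30   # significance cutoff quoted in the protocol
IDENTITY_SPLIT = 70.0   # percent identity; using it as the split is an assumption


def classify_hits(tabular_text):
    """Classify BLAST hits from 12-column tabular output.

    Columns assumed: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    Returns a list of (query, subject, label) tuples.
    """
    results = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity = float(row[2])
        evalue = float(row[10])
        if evalue > EVALUE_CUTOFF:
            continue  # not significant at the chosen cutoff
        if identity >= IDENTITY_SPLIT:
            label = "ortholog_candidate"
        else:
            label = "related_homolog"
        results.append((query, subject, label))
    return results
```

Either class of hit would still need the manual step the protocol describes: inspecting the genomic environment of the gene before calling it part of an eps cluster.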

Pipeline specifications

Software tools: RAST, KAAS, BioEdit, MEGA, BLASTP, TBLASTX
Databases: DDBJ, CAZy, Pfam
Applications: Phylogenetics, Nucleotide sequence alignment
Organisms: Oenococcus oeni
Chemicals: Galactose, Glucose, Rhamnose, Sucrose