Glycosylation site detection software tools | Post-translational modification data analysis
Glycosylation is a post-translational modification of proteins that has only recently been identified in prokaryotes. In glycosylation, a glycan moiety is enzymatically attached to a protein. The modification is known to influence biological properties such as the activity, solubility, folding, conformation, stability, half-life, and immunogenicity of cellular proteins, thereby modulating their structure and function in a variety of cellular and extracellular processes. Determining the glycosylation sites (glycosites) is an important aspect of glycoprotein characterization. Experimental characterization of glycosites and glycoproteins, however, can be difficult, technically demanding, and time-consuming, owing to the labile nature of the modification and the lack of high-sensitivity yet cost-effective methods for glycoprotein detection. Computational algorithms and models that predict glycosites in protein sequences are therefore very useful for complementing and facilitating such studies.
Predicts N-glycosylation sites in human proteins using artificial neural networks that examine the sequence context of Asn-Xaa-Ser/Thr sequons. Each submission may contain at most 2,000 sequences and 200,000 amino acids in total, and each sequence may be no longer than 4,000 amino acids.
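As a rough illustration of the first step behind such a predictor, the sketch below locates candidate Asn-Xaa-Ser/Thr sequons in a sequence and checks a batch against the submission limits quoted above. It is not the tool's own code: the function names are hypothetical, and excluding proline at the Xaa position is a common convention assumed here rather than something stated in the description. A sequon is only a candidate site; the neural networks decide which candidates are likely to be glycosylated.

```python
import re

# Asn-Xaa-Ser/Thr sequon. Excluding Pro at the Xaa position is an assumption
# (a widely used convention), not something the tool description states.
# The lookahead lets overlapping sequons (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def within_limits(sequences, max_seqs=2000, max_total=200_000, max_len=4000):
    """Check a batch against the submission limits quoted in the text."""
    lengths = [len(s) for s in sequences]
    return (
        len(sequences) <= max_seqs
        and sum(lengths) <= max_total
        and all(n <= max_len for n in lengths)
    )


if __name__ == "__main__":
    demo = "MNGSANPTNKT"  # made-up sequence, for illustration only
    print(find_sequons(demo))
    print(within_limits([demo]))
```

Positions are reported 1-based, following the usual convention for residue numbering.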
Produces neural network predictions of mucin-type GalNAc O-glycosylation sites in mammalian proteins. NetOGlyc is based on a carefully selected, enlarged database of 299 O-glycosylation sites extracted from O-GLYCBASE, on the averaged output of eight independently trained networks, and on an additional variable threshold that depends on surface accessibility. The validity of the prediction method was assessed by four-fold cross-validation on independent test sets.
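The combination of network averaging and an accessibility-dependent threshold can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the linear threshold rule, and the `base` and `slope` values are all hypothetical and are not taken from NetOGlyc itself.

```python
from statistics import mean


def ensemble_score(network_scores):
    """Average the per-site scores of the independently trained networks."""
    return mean(network_scores)


def variable_threshold(accessibility, base=0.5, slope=0.2):
    """Hypothetical rule: exposed residues get a lower decision threshold.

    `accessibility` is assumed to be a relative surface accessibility in
    [0, 1]; the linear form and the default values are illustrative only.
    """
    return base - slope * (accessibility - 0.5)


def predict_site(network_scores, accessibility):
    """Call a site glycosylated if the averaged score clears the threshold."""
    return ensemble_score(network_scores) > variable_threshold(accessibility)


if __name__ == "__main__":
    scores = [0.5] * 8  # one score per network, made-up values
    print(predict_site(scores, accessibility=1.0))  # fully exposed residue
    print(predict_site(scores, accessibility=0.0))  # fully buried residue
```

Averaging several independently trained networks reduces the variance of any single model, while the threshold lets the same averaged score be interpreted differently for buried and exposed residues.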
Provides Internet-based access to reasonable 3D models of glycoproteins. The aims of GlyProt are (i) to evaluate whether a potential N-glycosylation site is spatially accessible, (ii) to generate reasonable three-dimensional models of glycoproteins with user-definable glycan moieties, and (iii) to indicate how physicochemical parameters can change between the different glycoforms of a protein.
Allows users to predict O-GlcNAcylation sites. O-GlcNAcPRED-II combines four components: (i) a K-means principal component analysis oversampling technique (KPCA) and a fuzzy undersampling method (FUS); (ii) eight types of features to encode each protein peptide; (iii) four types of classifiers, random forest (RF), k-nearest neighbor (KNN), naive Bayesian (NB) and support vector machine (SVM), used as the sub-classifiers of a rotation forest; and (iv) majority voting.
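The final majority-voting step can be illustrated with a short sketch. It assumes each sub-classifier emits a binary label (1 for a predicted O-GlcNAcylation site, 0 otherwise) and that ties are resolved as negative; both the function name and the tie rule are assumptions made for illustration, not details taken from the tool.

```python
def majority_vote(votes, tie_label=0):
    """Combine binary sub-classifier labels (1 = site, 0 = non-site).

    Returning `tie_label` on a tie is an assumption made for this sketch;
    with an even number of voters some tie-breaking rule is needed.
    """
    positives = sum(1 for v in votes if v == 1)
    negatives = len(votes) - positives
    if positives == negatives:
        return tie_label
    return 1 if positives > negatives else 0


if __name__ == "__main__":
    # Made-up labels from four sub-classifiers (e.g. RF, KNN, NB, SVM).
    print(majority_vote([1, 1, 0, 1]))
    print(majority_vote([1, 0, 1, 0]))
```

Voting over heterogeneous classifiers tends to help when their errors are only weakly correlated, which is one reason to mix model families such as RF, KNN, NB and SVM.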
Identifies glycosylation sites from the amino acid sequences of proteins. EnsembleGly is a web server that predicts O-, N-, and C-linked glycosylation sites using ensemble learning. Machine learning methods of this kind offer one of the most cost-effective approaches to constructing predictive models in applications where representative training data are available.