Allows protein sequence analysis. ANTHEPROT interactively couples multiple alignments with secondary structure predictions. It can submit tasks to a remote server and retrieve data from a remote Web server. This tool is a complete solution for Intranet protein sequence analysis for universities, biological research institutes or biomedical companies. It lets users integrate secondary structure predictions into multiple alignments and edit those alignments interactively.


Permits analysis of further expression and solubility datasets. Protein_ML serves as a template for machine learning on high-throughput expression and solubility data. It reconciles systems-level, multi-omic and computational biology with high-throughput protein expression and solubility. The tool is based on multiple machine learning methods, including linear regression, support vector machines (SVMs), random forest decision trees, and neural networks. It can characterize the diverse landscape of expression and solubility characteristics.


Predicts protein solubility. Protein-sol returns the predicted solubility and other properties of an amino acid sequence in a graphical format. It was demonstrated with E. coli thioredoxin, which is known to enhance the solubility of co-produced proteins in E. coli. The tool highlights lysine and arginine content, which is relevant when modifying protein solubility. It can interpret subdomain structures and introduces the feature of windowed net charge, which may inform on charge-charge interactions between subdomains.
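As a rough illustration of what a windowed net charge and a lysine/arginine tally could look like, here is a minimal Python sketch. It is not Protein-sol's implementation: the function names, the integer charge model (K and R as +1, D and E as -1, histidine and termini ignored) and the window length are all assumptions made for the example.

```python
def net_charge(segment: str) -> int:
    """Net charge of a segment under a simple integer model (assumption):
    lysine (K) and arginine (R) count +1, aspartate (D) and glutamate (E) count -1.
    Histidine and the chain termini are ignored."""
    positive = sum(segment.count(residue) for residue in "KR")
    negative = sum(segment.count(residue) for residue in "DE")
    return positive - negative


def windowed_net_charge(sequence: str, window: int = 21) -> list:
    """Net charge of every full-length sliding window along the sequence.
    The default window length is an arbitrary choice for this sketch."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        net_charge(sequence[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]


def lys_arg_counts(sequence: str) -> tuple:
    """Return (lysine count, arginine count) for the sequence."""
    sequence = sequence.upper()
    return sequence.count("K"), sequence.count("R")
```

For the toy sequence `KKDDE` with `window=3`, the windows are `KKD`, `KDD` and `DDE`, so `windowed_net_charge` returns `[1, -1, -3]`. Neighbouring stretches with opposite signs are the kind of pattern that could hint at charge-charge interactions between subdomains.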

Periscope / PERIplasmic expreSsion Classifier for sOluble Protein Expression

A predictor for soluble protein expression in the periplasm of Escherichia coli. Periscope is a sequence-based predictor with a two-stage architecture that estimates the expression level and yield of soluble protein in the periplasm of E. coli. Given an input protein sequence, Periscope classifies it as having a low, medium or high expression level and reports the probability of each predicted class. Next, it estimates the protein yield in soluble form upon expression in the periplasm of E. coli. Periscope was built from a total of 98 non-redundant protein samples along with 7903 initial features. Its two-stage architecture consists of a first-stage support vector machine (SVM) classifier and second-stage support vector regression (SVR) models. Periscope records an overall prediction accuracy of 78% and a Pearson's correlation coefficient of 0.77 when tested on an independent test dataset.
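The two-stage flow described above (classify first, then estimate yield) can be sketched as follows. This is only an illustration of the control flow, not Periscope's code: the function name, the toy stand-in models and the choice to route each sequence to a regressor specific to its predicted class are all assumptions. In a real system the stand-ins would be replaced by a trained SVM classifier and trained SVR models.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]


def two_stage_predict(
    features: Features,
    classifier: Callable[[Features], Dict[str, float]],
    regressors: Dict[str, Callable[[Features], float]],
) -> Tuple[str, Dict[str, float], float]:
    """Stage 1: get class probabilities and pick the most probable class.
    Stage 2: pass the same features to the regressor for that class
    (per-class routing is an assumption of this sketch)."""
    probabilities = classifier(features)
    label = max(probabilities, key=probabilities.get)
    yield_estimate = regressors[label](features)
    return label, probabilities, yield_estimate


# Hypothetical stand-in for a trained classifier.
def toy_classifier(features: Features) -> Dict[str, float]:
    if features[0] > 1.0:
        return {"low": 0.1, "medium": 0.2, "high": 0.7}
    return {"low": 0.6, "medium": 0.3, "high": 0.1}


# Hypothetical stand-ins for trained regressors, one per expression class.
toy_regressors = {
    "low": lambda f: 1.0 * sum(f),
    "medium": lambda f: 5.0 * sum(f),
    "high": lambda f: 10.0 * sum(f),
}
```

Calling `two_stage_predict([2.0, 1.0], toy_classifier, toy_regressors)` selects the `"high"` class and then applies only the `"high"` regressor, returning the label, the full probability table and the yield estimate together.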


A web server for aqueous solubility prediction available through the ChemDB chemoinformatics portal. Aquasol predicts the aqueous solubility of small molecules using ensembles of undirected graph recursive neural networks (UG-RNNs). One important difference between UG-RNN-based approaches and other methods is their ability to automatically extract internal representations from the molecular graphs that are well suited to the specific task. This is an important advantage for a problem like aqueous solubility prediction, where the optimal feature set is not known and may even vary from one dataset to another. It also saves time and avoids other costs and limitations associated with relying on human expertise to select features.