SMS / STING Millennium Suite

Provides a variety of algorithms and validated data, wrapped up in a user-friendly web interface. STING Millennium Suite (SMS) is a web-based suite of programs and databases providing visualization and complex analysis of molecular sequence and structure for the data deposited at the Protein Data Bank (PDB). It brings together a number of protein analysis tools at a single web server. SMS enables a quick estimate of the level of engagement of each amino acid within its own protein chain and, more importantly from a functional standpoint, in the mechanism of binding to substrate and/or inhibitor.


A web server for ab initio protein contact and tertiary structure prediction without using any templates. In particular, CoinFold predicts contacts by joint evolutionary coupling (EC) analysis via group graphical lasso (GGL) of multiple (distantly) related protein families which may have divergent sequences but similar folds (i.e. co-evolution patterns). By enforcing co-evolution pattern consistency among a set of related families, we can significantly improve contact prediction accuracy. CoinFold further improves prediction accuracy by integrating supervised learning with this joint EC analysis. Since EC analysis and supervised learning use different types of information, their combination leads to much better prediction. Finally, CoinFold predicts secondary structure using a new in-house tool DeepCNF and then tertiary structure by feeding predicted contacts and secondary structure to the Crystallography & NMR System (CNS) software package, but without using any templates. It significantly outperforms other servers of similar category in both contact prediction and 3D model prediction, especially for those proteins without very good templates.


A set of Random Forest algorithm based models, for predicting residue–residue contact maps. ProC_S3 is based on a collection of 1490 non–redundant, high-resolution protein structures using >1280 sequence-based features. ProC_S3 delivers a 3-fold cross-validated accuracy of 26.9% with coverage of 4.7% for top L/5 predictions (L is the number of residues in a protein) of long-range contacts (sequence separation ≥24). It helps in better understanding the biophysics behind the problem.
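
The accuracy and coverage figures quoted for top L/5 long-range predictions can be illustrated with a short sketch. This is not ProC_S3 code: the function name, the input layout (a dictionary of scored residue pairs and a set of true contacts) and the toy data are assumptions made for illustration, and the toy example lowers the sequence-separation threshold so that a 10-residue protein has long-range pairs at all.

```python
def top_l5_long_range(scores, true_contacts, length, min_sep=24):
    """Accuracy and coverage of the top L/5 long-range predictions.

    scores: dict mapping (i, j) residue pairs (i < j) to a predicted score.
    true_contacts: set of (i, j) pairs that are contacts in the structure.
    length: number of residues L in the protein.
    min_sep: minimum sequence separation for a pair to count as long-range.
    """
    # Keep only long-range predictions, ranked from highest to lowest score.
    ranked = sorted(
        (pair for pair in scores if pair[1] - pair[0] >= min_sep),
        key=lambda pair: scores[pair],
        reverse=True,
    )
    top = ranked[: max(1, length // 5)]
    true_long = {p for p in true_contacts if p[1] - p[0] >= min_sep}
    hits = sum(1 for p in top if p in true_long)
    accuracy = hits / len(top) if top else 0.0            # correct / predicted
    coverage = hits / len(true_long) if true_long else 0.0  # correct / all true
    return accuracy, coverage


# Toy data (hypothetical): 10 residues, separation threshold lowered to 4.
toy_scores = {(0, 9): 0.9, (1, 8): 0.7, (0, 5): 0.6, (2, 3): 0.99, (3, 9): 0.2}
toy_true = {(0, 9), (0, 5), (3, 9), (2, 3)}
toy_accuracy, toy_coverage = top_l5_long_range(toy_scores, toy_true, 10, min_sep=4)
```

With L = 10 only the two best long-range pairs are evaluated, which is why accuracy can be fairly high while coverage of all true long-range contacts stays low.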

COUSCOus / Contact predictiOn Using Shrinked COvariance

A residue-residue contact detection method that approaches contact inference in a similar manner to PSICOV, by applying the sparse inverse covariance estimation technique. COUSCOus combines the best shrinkage approach, the empirical Bayes covariance estimator and GLasso. On the original PSICOV benchmark test set and on proteins from the Critical Assessment of techniques for protein Structure Prediction 11 (CASP11) experiments, COUSCOus appears to significantly outperform PSICOV.

GREMLIN / Generative REgularized ModeLs of proteINs

A method to learn an undirected probabilistic graphical model of the amino acid composition within the multiple sequence alignments. GREMLIN employs regularization to penalize complex models and thus reduce the tendency to over-fit the data. The strength of measured co-evolution is strongly predictive of residue-residue contacts in the 3D structure of the protein. GREMLIN has also been referred to as a maximum-entropy model or a global statistical model.

RBO Aleph

A protein structure prediction web server for template-based modeling, protein contact prediction and ab initio structure prediction. The server has a strong emphasis on modeling difficult protein targets for which templates cannot be detected. RBO Aleph's unique features are (i) the use of combined evolutionary and physicochemical information to perform residue-residue contact prediction and (ii) leveraging this contact information effectively in conformational space search.


A Bayesian statistical model using knob-socket information that maximizes contact prediction accuracy from a combination of priors and posteriors. The resulting program was then compared over 3 different and difficult test sets to gauge the overall performance on contact predictions against current leading methods. The first is the set of 150 structural families used to comprehensively compare a number of current contact prediction routines that was originally used to characterize PSICOV. The last 2 are the more challenging sets of structures from CASP10.

R2C / Residue-Residue contact

A residue-residue contact predictor that combines machine learning-based and correlated mutation analysis-based methods, together with a two-dimensional Gaussian noise filter to enhance long-range residue contact prediction. Our results show that the outputs of the machine learning-based method are concentrated, with better performance on short-range contacts, while the predictions of the correlated mutation analysis-based approach are widespread, with higher accuracy on long-range contacts. The proposed query-driven dynamic fusion strategy takes full advantage of the two different methods, resulting in an improvement in overall accuracy.


Analyzes contact maps. PROTMAP2D aims to ease comparative analysis of protein structures, mainly in cases where the mutual position and interactions of residues have to be taken into account. The software allows quantitative and qualitative analysis of contact maps of protein structures such as individual models, different models of the same protein, ensembles of structures and trajectories. In addition, the software can calculate full distance maps for individual models and convert them into contact maps.
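
The conversion from a full distance map to a contact map can be sketched as follows. This is a generic illustration, not PROTMAP2D's own code: the function names, the use of one representative atom per residue and the 8 Å cutoff are all assumptions.

```python
import math


def distance_map(coords):
    """Full pairwise distance matrix for a list of (x, y, z) coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(dmap, cutoff=8.0):
    """Binary contact map: 1 where two different residues lie within cutoff."""
    return [
        [1 if i != j and d <= cutoff else 0 for j, d in enumerate(row)]
        for i, row in enumerate(dmap)
    ]


# Toy coordinates (hypothetical): four residues spaced 3.8 Å apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_contacts = contact_map(distance_map(toy_coords))
```

Keeping the distance map as a separate step means the same matrix can be thresholded at several cutoffs without recomputing the distances.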

pyMDmix / python Molecular Dynamics simulations with mixed solvents

Identifies high-affinity interaction spots over macromolecular systems by means of molecular dynamics simulations using solvent mixtures as solvation conditions. pyMDMix is a Python module and a user interface that aims to ease the application of this technique. It allows easy setup of several simulations for the same system under different conditions: solvent, temperature, restraining schemes, etc. Moreover, once the simulations are done, analysis tools help to quality-check them and to extract useful information from simulations in aqueous-organic environments.

COLORS / improving COntact prediction using LOw-Rank and Sparse matrix decomposition

Removes background correlations. COLORS is based on the low-rank and sparse matrix decomposition (LRS) technique. It can differentiate true correlations from background correlations according to their different characteristics. The tool uses a matrix to measure correlations among residues in the target protein. The capabilities of the method to predict contacts were evaluated using the mean prediction precision.


Uses the naive Bayes classifier (NBC) to combine eight state-of-the-art contact methods that are built on co-evolution and machine learning approaches. NeBcon (Neural-network and Bayes-classifier based contact prediction) is an algorithm for sequence-based protein contact prediction, built on multiple contact prediction programs, which are machine-learning, co-evolution and meta-server based. It first uses the naive Bayes classifier to calculate the posterior probability of multiple contact predictors. A neural network is then trained on actual contact maps using the secondary structure, solvent accessibility and Shannon entropy of multiple sequence alignments, in combination with the posterior probability scores calculated from the predictors.
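
The first step, turning the outputs of several predictors into one posterior probability with a naive Bayes classifier, can be sketched in a few lines. This is a textbook naive Bayes combination under stated assumptions, not NeBcon's implementation: the function name, the binning of predictor outputs into "high" and "low", and every probability in the tables are hypothetical.

```python
def naive_bayes_posterior(observations, likelihoods, prior):
    """Posterior probability of a contact given several predictor outputs.

    observations: one binned output per predictor, e.g. "high" or "low".
    likelihoods: one dict per predictor mapping a bin to the pair
        (P(bin | contact), P(bin | non-contact)).
    prior: P(contact) before looking at any predictor.
    """
    p_contact, p_non_contact = prior, 1.0 - prior
    # Naive Bayes assumes the predictors are conditionally independent,
    # so their likelihoods are simply multiplied together.
    for obs, table in zip(observations, likelihoods):
        like_contact, like_non_contact = table[obs]
        p_contact *= like_contact
        p_non_contact *= like_non_contact
    return p_contact / (p_contact + p_non_contact)


# Hypothetical likelihood table shared by two predictors.
toy_table = {"high": (0.6, 0.1), "low": (0.4, 0.9)}
both_high = naive_bayes_posterior(["high", "high"], [toy_table, toy_table], 0.1)
both_low = naive_bayes_posterior(["low", "low"], [toy_table, toy_table], 0.1)
```

In this toy setting, agreement between two predictors lifts a 10% prior to a posterior of about 80%, while two low scores push it to roughly 2%.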

STING Contacts

Identifies and visualizes amino acid contacts within protein structures and across protein interfaces. STING Contacts calculates atomic contacts among amino acids based on a table of predefined pairs of atom types and their distances, and then displays them in a number of different forms. The inventory of currently listed contacts includes hydrogen bonds (in nine different flavors), hydrophobic interactions, charge–charge interactions, aromatic stacking and disulfide bonds.


Contact maps are a convenient method for the structural biologist to identify structural features through two-dimensional simplification. Binary (yes/no) contact maps with a single cutoff distance can be generalized to show continuous distance ranges. RRDistMaps is a UCSF Chimera tool to compute such generalized maps in order to analyze pairwise variations in intramolecular contacts. An interactive utility, RRDistMaps visualizes conformational changes, both local (e.g., binding-site residues), and global (e.g., hinge motion), between unbound and bound proteins through distance patterns. Users can target residue pairs in RRDistMaps for further navigation in Chimera. The interface contains the unique features of identifying long-range residue motion and aligning sequences to simultaneously compare distance maps.
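
One way to picture how distance patterns expose conformational change is a difference map between the distance matrices of two conformations. The sketch below is a generic illustration, not the RRDistMaps implementation: the function names, the assumption that both conformations list the same residues in the same order, and the toy coordinates are all hypothetical.

```python
import math


def distance_map(coords):
    """Full pairwise distance matrix for a list of (x, y, z) coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def distance_change(unbound, bound):
    """Element-wise difference (bound minus unbound) of two distance maps."""
    d_unbound, d_bound = distance_map(unbound), distance_map(bound)
    return [
        [b - u for u, b in zip(row_u, row_b)]
        for row_u, row_b in zip(d_unbound, d_bound)
    ]


def largest_changes(diff, top=3):
    """Residue pairs (i < j) with the largest absolute distance change."""
    n = len(diff)
    pairs = [(abs(diff[i][j]), i, j) for i in range(n) for j in range(i + 1, n)]
    pairs.sort(reverse=True)
    return [(i, j) for _, i, j in pairs[:top]]


# Toy conformations (hypothetical): the third residue swings towards the first.
toy_unbound = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
toy_bound = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.0, 4.0, 0.0)]
toy_diff = distance_change(toy_unbound, toy_bound)
toy_top_pair = largest_changes(toy_diff, top=1)
```

Because the comparison uses internal distances only, the two structures do not need to be superimposed first.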

CMWeb / Contact Map Web Viewer

An online tool for studying basic properties of residue-residue contact formation and contact clusters. CMWeb can be used for inspecting proteins only with 3D structure, for visualizing contact maps, for linking contacts and displaying them in 3D structures and in multiple sequence alignments, for predicting residue contacts using various contact prediction methods (currently five prediction methods are implemented) and for calculating various statistics on contacts.


An add-on to the molecular visualization program PyMOL. CMPyMOL combines the protein 3D visualization capabilities of PyMOL and the protein's 2D contact map with an interactive interface for scientific analysis. Launching CMPyMOL automatically invokes the PyMOL executable and generates a contact map (for a specified cut-off distance) for a given Protein Data Bank (PDB) file. Visualization of multi-frame PDB trajectories is also supported. CMPyMOL allows for manual selection of interacting residues on the 2D contact map while the program highlights the corresponding residues in the PyMOL 3D visualization. This provides an intuitive bridge between the 2D and 3D representations of the protein.

MemConP / Membrane Contact Prediction

A contact prediction method for α-helical transmembrane proteins, in which evolutionary couplings are combined with a machine learning approach. MemConP achieves a substantially improved accuracy (precision: 56.0%, recall: 17.5%, MCC: 0.288) compared to the use of either machine learning or co-evolution methods alone. The method also achieves 91.4% precision, 42.1% recall and an MCC of 0.490 in predicting helix-helix interactions based on predicted contacts. The approach was trained and rigorously benchmarked by cross-validation and independent testing on up-to-date non-redundant datasets of 90 and 30 experimental three-dimensional structures, respectively.
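
For readers unfamiliar with the three measures quoted above, the following sketch computes precision, recall and the Matthews correlation coefficient (MCC) from confusion-matrix counts using their standard definitions. The function name and the counts in the example are hypothetical and are not taken from the MemConP benchmark.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Precision, recall and MCC from confusion-matrix counts.

    tp/fp: predicted contacts that are true/false.
    fn/tn: pairs predicted as non-contacts that are contacts/non-contacts.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Hypothetical counts for 1000 residue pairs.
toy_precision, toy_recall, toy_mcc = binary_metrics(tp=40, fp=10, fn=60, tn=890)
```

MCC is useful here because contacts are rare: unlike plain accuracy, it is not inflated by the large number of correctly predicted non-contacts.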


A balanced network deconvolution (BND) algorithm to identify an optimized dependency matrix without a limit on the eigenvalue range in the applied network systems. The algorithm was used to filter contact predictions of five widely used co-evolution methods. On proteins from three benchmark datasets (CASP9, CASP10 and the PSICOV database), BND can improve the medium- and long-range contact predictions at the L/5 cutoff by 55.59% and 47.68%, respectively, without additional CPU cost.


An algorithm for protein residue-residue contact prediction. SVM-SEQ generates predictions based only on sequence information, where secondary structures, solvent accessibility, sequence profile and sequence separations derived from the sequences are trained on contact maps by the support vector machine (SVM) technique. Based on the same number of predictions, the accuracy of the contact prediction by SVM-SEQ is comparable to the top sequence-based machine-learning methods published in the literature and in the recent CASP7 experiments.

plmDCA / pseudolikelihood maximization Direct-Coupling Analysis

Separates direct from indirect interactions in the context of protein sequences. plmDCA was applied to 21-state Potts models describing the statistical properties of families of evolutionarily related proteins. It outperforms existing approaches to the direct-coupling analysis, the latter being based on standard mean-field techniques. plmDCA should provide a natural choice for analysts interested in applying state-of-the-art protein structure prediction (PSP) to their protein of interest, as well as for researchers looking to further extend the theory and practical applicability of direct-coupling analysis (DCA).