Structural domain detection software tools | Protein data analysis
Protein structures are composed of modular elements known as domains. These units are reused throughout nature, and each usually serves a particular function within the structure. It is therefore useful to be able to break up a protein of interest into its component domains, for example prior to similarity searching.
Predicts the 3D structure of a protein sequence. Phyre is a web application that searches for known homologues, builds a hidden Markov model (HMM) of the target sequence based on the detected homologues, and scans it against a database of HMMs of known protein structures. It also provides advanced features such as batch submission of large numbers of protein sequences for modelling, and Phyre Investigator, which allows users to analyze model quality, function and the effects of mutations.
An independent web server that integrates our leading methods for structure and function prediction. The server provides a simple unified interface that aims to make complex protein modelling data more accessible to life scientists. The server web interface is designed to be intuitive and integrates a complex set of quantitative data, so that 3D modelling results can be viewed on a single page and interpreted by non-expert modellers at a glance.
A computer program using network flow algorithms for protein domain decomposition. DomainParser partitions a protein structure into domains. Using various types of structural information, including the hydrophobic moment profile, the authors developed an effective method for assessing the most probable number of domains in a structure. The core of this method is a neural network, which is trained to discriminate correctly partitioned domains from incorrectly partitioned ones. When compared with the manual decomposition results given in the SCOP database, the DomainParser algorithm achieves a decomposition accuracy of 81.9%.
Enables users of the CDD (Conserved Domain Database) resource to examine curated hierarchies. Used in concert, CDD and CDTree serve as a powerful tool in protein classification, as they allow users to analyze protein sequences in the context of domain family hierarchies.
Identifies the structural domains and determines the evolutionary superfamilies of a query protein structure. fastSCOP uses 3D-BLAST to quickly scan a large structural classification database, and then selects from the hit lists the top 10 hit domains that have different superfamily classifications. fastSCOP is robust and can be a useful server for recognizing the evolutionary classifications and the protein functions of novel structures.
Delineates the energy hierarchy of protein domain structure. DHcL is a web server that detects domains at different levels of this hierarchy. It identifies closed loops and van der Waals locks, which constitute a structural basis for the protein domain hierarchy. This application can be a useful tool for rapid analysis of protein structures and their alternative domain decompositions. It maintains a regularly updated database of domains, closed loops and van der Waals locks for all X-ray structures.
An accurate and sensitive superfamily discrimination method that combines information from both sequence and structure to produce highly accurate domain alignments. The method employs the same underlying threading algorithm as pGenTHREADER; however, it aligns sequences to a domain-based template library rather than a chain-based template library. The use of smaller regions of structure for templates means that different features of the alignments are required for optimal scoring. The final prediction score is produced by an SVM trained on a combination of five feature inputs: template coverage, alignment score, template length, solvation and pairwise potentials.
A fast docking algorithm for assembling multi-domain protein structures, guided by the ab initio folding potential. AIDA can be extended to discontinuous domains (i.e. domains with 'inserted' domains). This server also provides access to a recursive protocol, which combines template-based modeling with domain assembly in an iterative method suitable for automated domain assignment, modeling and assembly for a one-stop structure prediction of multi-domain proteins.
Allows users to predict fold, molecular function and functional sites at the domain level. COPRED allows any user to access this method and generate predictions using, in the simplest case, only the query sequence as input. The predictions can be inspected in a graphical interactive interface and downloaded in a number of standard formats.
Predicts two-class β-turns and the individual β-turn types. To do so, NetTurnP uses evolutionary information and predicted protein sequence features. It was tested on a dataset of 426 non-homologous protein chains. The tool obtained Matthews correlation coefficient values of 0.36 and 0.31 for the type-specific β-turn predictions of types I and II, respectively. It can predict whether or not an amino acid is located in a β-turn.
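The Matthews correlation coefficient quoted above summarizes a two-class prediction (for example, residue in a β-turn or not) as a single value between -1 and 1. The following is a minimal sketch of how such a value can be computed from per-residue labels. The function names `confusion_counts` and `matthews_corrcoef` are illustrative and are not taken from NetTurnP, and returning 0.0 when the denominator is zero is a convention assumed here.

```python
import math


def confusion_counts(true_labels, predicted_labels):
    """Count (tp, tn, fp, fn) for binary labels, where 1 marks the positive class."""
    tp = tn = fp = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == 1 and guess == 1:
            tp += 1
        elif truth == 0 and guess == 0:
            tn += 1
        elif truth == 0 and guess == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical per-residue labels: 1 = in a turn, 0 = not in a turn.
    observed = [1, 1, 0, 0, 0, 1, 0, 0]
    predicted = [1, 0, 0, 0, 1, 1, 0, 0]
    print(matthews_corrcoef(*confusion_counts(observed, predicted)))
```

A value of 1 indicates perfect agreement, 0 indicates no better than chance, and -1 indicates complete disagreement, which is why the measure is often preferred over plain accuracy when one class is much rarer than the other.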
Predicts enzyme cofactor specificity using only primary amino acid sequence information. Cofactory identifies potential cofactor binding Rossmann folds and predicts the specificity for the cofactors FAD(H2), NAD(H), and NADP(H). The Rossmann fold sequence search is carried out using hidden Markov models (HMM) whereas artificial neural networks are used for specificity prediction. Training was carried out using experimental data from protein-cofactor structure complexes. The overall performance was benchmarked against an independent evaluation set obtaining Matthews correlation coefficients of 0.94, 0.79, and 0.65 for FAD(H2), NAD(H), and NADP(H), respectively.
Discovers residue ranges that are suitable for the global superposition of protein domains. CYRANGE is applicable to structure bundles of both high and low precision and to multi-domain proteins. It can deal with symmetric multimers and protein complexes. This tool can automatically recognize ordered regions and allows users to represent the structure in a clear manner. It assists users in the choice of residue ranges for the superposition of protein structures.
A clustering-based approach to domain identification, which works equally well on individual chains or entire complexes. The method is simple and fast, taking only a few milliseconds to run, and works by clustering either vectors representing secondary structure elements, or buried alpha-carbon positions, using average-linkage clustering. Each resulting cluster corresponds to a domain of the structure. The method is competitive with others, achieving 70% agreement with SCOP on a large non-redundant data set, and 80% on a set more heavily weighted in multi-domain proteins on which both SCOP and CATH agree.
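The clustering step described above can be illustrated with a small, naive average-linkage routine applied to 3D coordinates. This is only a sketch under stated assumptions: the function name `average_linkage`, the distance `cutoff` used as a stopping rule, and the toy coordinates are all hypothetical, and the published method's choice of input vectors and stopping criterion may differ.

```python
import math
from itertools import combinations


def average_linkage(points, cutoff):
    """Agglomerative clustering with average linkage.

    points: list of (x, y, z) tuples, e.g. alpha-carbon positions.
    cutoff: merging stops once the closest pair of clusters has an
            average inter-point distance above this value (assumed rule).
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def average_distance(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        best_pair, best_dist = None, float("inf")
        for x, y in combinations(range(len(clusters)), 2):
            d = average_distance(clusters[x], clusters[y])
            if d < best_dist:
                best_pair, best_dist = (x, y), d
        if best_dist > cutoff:
            break
        x, y = best_pair  # x < y, so deleting y leaves index x valid
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Two well-separated toy groups of points standing in for two domains.
    coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0),
              (20, 0, 0), (21, 0, 0), (20, 1, 0)]
    print(average_linkage(coords, cutoff=5.0))
```

This naive version recomputes every pairwise average at each step, which is adequate for illustration but slower than the optimized implementations found in scientific libraries.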
An integrated computational framework to predict optimal structural domains and identify target molecules for antibodies. PAT automatically analyses various structural properties, evaluates the folding stability, and identifies possible structured units in a given protein sequence. PAT is able to identify the traditional domains with strongly conserved stretches of protein sequence and putative structural units with parts of the protein that adopt stable folds.
A Hidden Markov Model based method, capable of predicting the topology of transmembrane proteins and the existence of kinase specific phosphorylation and N/O-linked glycosylation sites along the protein sequence. It integrates a novel feature in transmembrane protein topology prediction, which results in improved performance for topology prediction and reliable prediction of phosphorylation and glycosylation sites.
Provides a visualization tool for interactively fitting atomic protein domain structures into cryo-electron microscopy (cryo-EM) density maps. To achieve the best match, MVP-Fit can conveniently adjust the loop and tail outliers of individual domains to accommodate the local conformational changes that accompany the rigid-body rotation and translation of protein domains. MVP-Fit is based on MVP, a visualization system that quickly and accurately generates triangulated isosurfaces for density maps. The software is freely available for download.
Predicts protein structure. Prime is based on homology modeling and fold recognition. It uses a combination of sequence and secondary structure information to calculate alignments. The tool is able to generate accurate homology models for further structure-based studies. Users can adjust and specify parameters to optimize the quality of predictions. It is particularly useful for early structural investigations or functional annotation in cases of low or no sequence identity.
Assigns domain boundaries in a given structure using the superpositions stored in DBAli. ModDom is a web application that relies on the relationship between recurrent structures and structural units to predict domain boundaries. The method consists of the following steps: (i) building a residue co-occurrence matrix based on structural alignments selected from the DBAli database and (ii) clustering the residue co-occurrences to find common fragments in the query protein structure.
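The two steps above can be sketched in a simplified form. In this illustration each structural alignment is reduced to the set of query residue indices it covers, which is an assumed input format and not the DBAli data model. The grouping step uses a simple count threshold with connected components in place of ModDom's actual clustering procedure, and the names `cooccurrence_matrix`, `group_residues` and `min_count` are hypothetical.

```python
def cooccurrence_matrix(n_residues, aligned_sets):
    """Count how often each pair of query residues is covered by the same alignment.

    aligned_sets: iterable of sets of residue indices (0-based), one set
                  per structural alignment (assumed input format).
    """
    matrix = [[0] * n_residues for _ in range(n_residues)]
    for covered in aligned_sets:
        indices = sorted(covered)
        for i in indices:
            for j in indices:
                matrix[i][j] += 1
    return matrix


def group_residues(matrix, min_count):
    """Group residues linked by at least min_count co-occurrences.

    Connected components are used here as a stand-in for a real
    clustering method.
    """
    n = len(matrix)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and matrix[i][j] >= min_count:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(component))
    return groups


if __name__ == "__main__":
    # Toy example: residues 0-2 and 3-5 recur together; one alignment spans 2-3.
    alignments = [{0, 1, 2}, {0, 1, 2}, {3, 4, 5}, {3, 4, 5}, {2, 3}]
    matrix = cooccurrence_matrix(6, alignments)
    print(group_residues(matrix, min_count=2))
```

Residues that are repeatedly aligned together end up in the same group, so the boundaries between groups suggest candidate domain boundaries in the query structure.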