
RaptorX

A protein structure prediction server excelling at predicting 3D structures for protein sequences without close homologs in the Protein Data Bank (PDB). Given an input sequence, RaptorX predicts its secondary and tertiary structures as well as solvent accessibility and disordered regions. RaptorX also assigns the following confidence scores to indicate the quality of a predicted 3D model: P-value for the relative global quality, GDT (global distance test) and uGDT (un-normalized GDT) for the absolute global quality, and RMSD for the absolute local quality of each residue in the model.
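
To make the two global scores concrete, here is a minimal Python sketch of the commonly used GDT_TS definition: the mean, over cutoffs of 1, 2, 4 and 8 Å, of the percentage of residues whose error falls within the cutoff. It assumes the per-residue errors are already available for one fixed superposition, and it assumes that "un-normalized" simply means the same sum without division by the sequence length; RaptorX's exact formula may differ. The function names are hypothetical.

```python
CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # distance cutoffs in angstroms


def cutoff_counts(errors):
    """Count, for each cutoff, the residues whose error is within that cutoff."""
    return [sum(1 for e in errors if e <= c) for c in CUTOFFS]


def gdt(errors):
    """GDT_TS-style score on a 0-100 scale (normalized by sequence length)."""
    counts = cutoff_counts(errors)
    return 100.0 * sum(counts) / (len(CUTOFFS) * len(errors))


def ugdt(errors):
    """Assumed un-normalized variant: same sum, not divided by length."""
    return sum(cutoff_counts(errors)) / len(CUTOFFS)


# Hypothetical per-residue errors (angstroms) for a five-residue model.
errors = [0.5, 1.5, 3.0, 7.0, 10.0]
print(gdt(errors), ugdt(errors))
```

Under this assumption the un-normalized score grows with protein length, so it is better suited than the percentage to judging whether a long, partly correct model still contains a usefully large well-modeled region.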

DescFold / Descriptor-based Fold Recognition System

Permits remote homology identification. DescFold relies on a support vector machine (SVM) learning algorithm and was trained on the Lindahl dataset. It performs reasonably in every benchmark category, although its relative ranking changes slightly across the three categories, and it is competitive with existing methods. Its design allows more descriptors to be incorporated into the fold recognition system, which yields better performance.

GREMLIN / Generative REgularized ModeLs of proteINs

A method to learn an undirected probabilistic graphical model of the amino acid composition within the multiple sequence alignments. GREMLIN employs regularization to penalize complex models and thus reduce the tendency to over-fit the data. The strength of measured co-evolution is strongly predictive of residue-residue contacts in the 3D structure of the protein. GREMLIN has also been referred to as a maximum-entropy model or a global statistical model.

proFold

Obsolete
A web server for protein fold classification. proFold is an ensemble classifier combining the protein structural and functional information. proFold uses a feature extraction method combining the existing methods of the global description of amino acid sequence, position specific scoring matrix (PSSM), and protein functional information proposed by other researchers. This feature extraction method extracts eight types of secondary structure states from Protein Data Bank (PDB) files by the Definition of Secondary Structure in Proteins (DSSP) software.

SFoldRate

Predicts the folding rates of proteins of diverse classes based only on the amino acid sequence of the protein. SFoldRate works even when protein sequences are based on alphabets of only two residue types. The effectiveness of the model can be demonstrated with a jackknife test, in which the coefficients ws of the model are calculated with one protein omitted and the folding rate of the omitted protein is then predicted. The result is significantly better (R = 0.82) than predictions based on chain length (R = 0.69).
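
The jackknife (leave-one-out) procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it fits an ordinary least-squares line to a single hypothetical sequence-derived predictor, which is not SFoldRate's actual model, and the data values are invented.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def jackknife_predictions(xs, ys):
    """Predict each protein's rate from a model fitted without that protein."""
    preds = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(slope * xs[i] + intercept)
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical predictor values and measured log folding rates.
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
log_rate = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
print(pearson_r(jackknife_predictions(predictor, log_rate), log_rate))
```

Because every prediction comes from a model that never saw the protein being predicted, the resulting correlation is a less optimistic estimate than one computed on the fitted data.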

Cofactory

Predicts enzyme cofactor specificity using only primary amino acid sequence information. Cofactory identifies potential cofactor binding Rossmann folds and predicts the specificity for the cofactors FAD(H2), NAD(H), and NADP(H). The Rossmann fold sequence search is carried out using hidden Markov models (HMM) whereas artificial neural networks are used for specificity prediction. Training was carried out using experimental data from protein-cofactor structure complexes. The overall performance was benchmarked against an independent evaluation set obtaining Matthews correlation coefficients of 0.94, 0.79, and 0.65 for FAD(H2), NAD(H), and NADP(H), respectively.
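
For readers unfamiliar with the reported metric, the Matthews correlation coefficient is computed from the four cells of a binary confusion matrix. The sketch below uses the standard formula; the function name and the convention of returning 0.0 for an empty row or column are assumptions, and no Cofactory data is reproduced.

```python
from math import sqrt


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for one cofactor class versus all others.
print(matthews_cc(tp=40, tn=45, fp=5, fn=10))
```

The score ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and it remains informative when the classes are imbalanced.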

NIAS-Server / Neighbors Influence of Amino Acids and Secondary Structures

A server that supports the analysis of the conformational preferences of amino acid residues in proteins. NIAS is a web-based tool that extracts information about the conformational preferences of amino acid residues and secondary structures from experimentally determined protein templates. This information is useful, for example, to characterize folds and local motifs in proteins and to study molecular folding, and it can help with complex problems such as protein structure prediction and protein design.

FoRSA / Fold Recognition using a Structural Alphabet

Obsolete
A fold recognition algorithm based on calculating the conditional probability that the amino acid sequence of a protein fits a particular fold. We use a structural alphabet known as "Protein Blocks" (PBs), a library of 16 local structural prototypes named a to p and based on a sliding window of pentapeptides, to encode existing folds into PB sequences. The method relies on 16 amino acid occurrence matrices, one for each PB, to calculate the conditional probability that a window of 15 residues has a local structure corresponding to a particular PB. These probabilities are used to score dynamic programming based global and local alignments of query amino acid sequences to PB sequences derived from a library of known folds.

PESS / Protein Empirical Structure Space

Facilitates sensitive protein fold recognition via an empirical structure space and first-nearest neighbor (1NN) classifiers. PESS identifies folds with less threading, which makes it appropriate for classifying large, proteome-scale datasets. It addresses the problem of fold recognition and can be used to identify novel structure groups. When used with a 1NN classifier, this software requires only a single training example per fold, allowing users to make predictions for all currently known folds in the Structural Classification of Proteins-extended (SCOPe) database.
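
The reason one training example per fold is enough for a 1NN classifier can be seen in a minimal Python sketch: the query is simply assigned the label of the closest stored point. The three-dimensional coordinates and fold names below are hypothetical and do not come from the PESS structure space.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def nearest_fold(query, fold_examples):
    """Return the fold whose single stored example lies closest to the query."""
    return min(fold_examples, key=lambda fold: dist(query, fold_examples[fold]))


# One hypothetical example point per fold in an assumed 3D structure space.
fold_examples = {
    "fold_a": (0.0, 0.0, 1.0),
    "fold_b": (5.0, 5.0, 0.0),
    "fold_c": (-3.0, 4.0, 2.0),
}

print(nearest_fold((4.5, 4.0, 0.2), fold_examples))
```

Adding a newly characterized fold requires only a new entry in the dictionary, with no retraining step.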

DeepSF

Obsolete
Classifies proteins of variable length into all known folds defined in the SCOP 1.75 database. DeepSF is a 1D deep convolutional neural network method that directly extracts hidden features from a protein sequence of any length through convolution transformation and then classifies the sequence into one of thousands of folds. The software was tested using three test datasets: (1) new proteins in the SCOP 2.06 database, (2) template-based targets in past CASP experiments, and (3) template-free targets in past CASP experiments.

ModLink+

Uses both sequence similarity and protein–protein interactions to assign a Structural Classification Of Proteins (SCOP) fold and a family classification to uncharacterized proteins. ModLink+ also includes an improved procedure for extrapolating links that iteratively varies the number of interactions required to consider a protein a hub. It is applicable to a significant number of sequences for which fold assignment with other methods fails. It can also extend the structural coverage of sequences beyond the predictions of PSI-BLAST, and it improves the coverage of HHSearch and PRC at the same accuracy.

SSThread

A template-free protein structure prediction program. SSThread predicts the structure of contacting pairs of α-helices and β-strands that are derived from experimental structures, followed by the assembly of overlapping pair predictions to generate an ensemble of core structures. The loops are then predicted using a database search and cyclic coordinate descent. Predictions are scored using a coarse-grained knowledge-based potential, secondary structure prediction and contact map prediction. All-atom predictions are then generated using SIDEpro and GROMACS. The best predictions are identified using the all-atom knowledge-based potential dDFIRE.

DN-Fold

A deep learning network method to predict whether a given query-template protein pair belongs to the same structural fold. The input is derived from the protein sequences and from structural features extracted from the protein pair. Specifically, DN-Fold takes as input pairwise protein features such as sequence or family information, sequence alignment, sequence–profile alignment, profile–profile alignment, structural features, and structure-seeded profiles.

FOLDpro

A two-stage machine learning and information retrieval approach to fold recognition. First, we use alignment methods to derive pairwise similarity features for query-template protein pairs. We also use global profile-profile alignments in combination with predicted secondary structure, relative solvent accessibility, contact map and beta-strand pairing to extract pairwise structural compatibility features. Second, we apply support vector machines to these features to predict the structural relevance (i.e. in the same fold or not) of the query-template pairs. For each query, the continuous relevance scores are used to rank the templates. The FOLDpro approach is modular, scalable and effective.

SPARKS-X

A fold recognition server. SPARKS-X improves on the single-method fold recognition technique SPARKS by changing the alignment scoring function and incorporating the SPINE-X techniques, which improve the prediction of secondary structure, backbone torsion angles and solvent accessible surface area. SPARKS-X was tested with the SALIGN benchmark for alignment accuracy, the Lindahl and SCOP benchmarks for fold recognition, and the CASP 9 blind test for structure prediction. The method was compared to several state-of-the-art techniques such as HHPRED and BoostThreader. Results show that SPARKS-X is one of the best single-method fold recognition techniques.

PROSPECT / PROtein Structure Prediction and Evaluation Computer Toolkit

A threading-based protein structure prediction system. PROSPECT is designed particularly for the recognition of the fold template whose sequence has insignificant homology to the target sequence. The system finds optimal alignments for a given energy function with any combination of the following terms: (1) mutation energy (including position-specific score matrix derived from multiple-sequence alignments), (2) singleton energy (including matching scores to the predicted secondary structures), (3) pairwise contact potential (distance dependent or independent), and (4) alignment gap penalties.

RF-Fold

A random forest method to recognize protein folds. RF-Fold was systematically validated by varying the input features and the class distribution of the training datasets on a standard fold recognition dataset. A random forest consisting of 500 decision trees yielded a lower error rate than a single decision tree on a highly imbalanced dataset. The random forest also delivered good, steady performance regardless of the ratio of negative to positive examples. Compared with 17 other fold recognition methods, the performance of RF-Fold is generally comparable to the best performance.

SEGMER

A segmental threading algorithm designed to recognize substructure motifs from the Protein Data Bank (PDB) library. SEGMER first splits target sequences into segments that consist of 2-4 consecutive or non-consecutive secondary structure elements (alpha-helix, beta-strand). The sequence segments are then threaded through the PDB to identify conserved substructures. It often identifies better conserved structure motifs than whole-chain threading methods, especially when no similar global fold exists in the PDB.

partiFold-Align

Uses dynamic programming schemes to simultaneously enumerate the complete space of structures and sequence alignments and compute the optimal solution. partiFold-Align is an algorithm for the simultaneous alignment and folding of pairs of unaligned protein sequences. This tool exploits sparsity in the set of super-secondary structure pairings and alignment candidates to attain an effectively cubic running time. It also yields better secondary structure prediction where current approaches fail.

FUGUE

A program for recognizing distant homologues by sequence-structure comparison. It utilizes environment-specific substitution tables and structure-dependent gap penalties, where scores for amino acid matching and insertions/deletions are evaluated depending on the local environment of each amino acid residue in a known structure. Given a query sequence (or a sequence alignment), FUGUE scans a database of structural profiles, calculates the sequence-structure compatibility scores and produces a list of potential homologues and alignments.

QUARK

A computer algorithm for ab initio protein folding and protein structure prediction, which aims to construct the correct protein 3D model from the amino acid sequence only. QUARK models are built from small fragments (1-20 residues long) by replica-exchange Monte Carlo simulation under the guidance of an atomic-level knowledge-based force field. QUARK was ranked as the No. 1 server in free modeling (FM) in the CASP9 and CASP10 experiments. Since no global template information is used in the QUARK simulation, the server is suitable for proteins without homologous templates.
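
The core step of replica-exchange Monte Carlo is the attempted swap of conformations between two replicas running at different temperatures. The Python sketch below shows the standard Metropolis acceptance rule for that swap, with the Boltzmann constant set to 1. It is a generic illustration: the replica layout and function names are hypothetical, and QUARK's own temperature schedule, move set and force field are not reproduced.

```python
import math
import random


def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis acceptance probability for exchanging two replicas (k_B = 1)."""
    beta_i, beta_j = 1.0 / temp_i, 1.0 / temp_j
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def maybe_swap(replicas, i, j, rng=random):
    """Try to exchange the conformations held at temperature slots i and j."""
    a, b = replicas[i], replicas[j]
    p = swap_probability(a["energy"], b["energy"], a["temp"], b["temp"])
    if rng.random() < p:
        # Temperatures stay with their slots; conformations and energies move.
        a["state"], b["state"] = b["state"], a["state"]
        a["energy"], b["energy"] = b["energy"], a["energy"]
        return True
    return False


# Hypothetical two-replica system with placeholder conformation labels.
replicas = [
    {"temp": 1.0, "energy": 10.0, "state": "conformation_A"},
    {"temp": 2.0, "energy": 5.0, "state": "conformation_B"},
]
print(maybe_swap(replicas, 0, 1), replicas[0]["state"])
```

Swaps that move a lower-energy conformation to the colder replica are always accepted, while the reverse is accepted only occasionally, which lets conformations trapped in local minima escape through the hotter replicas.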

MUSTER / MUlti-Sources ThreadER

A protein threading algorithm to identify template structures from the PDB library. MUSTER generates sequence-template alignments by combining sequence profile-profile alignment with multiple types of structural information. It combines various sequence and structure information into single-body terms that can be conveniently used in a dynamic programming search: (1) sequence profiles; (2) secondary structures; (3) structure fragment profiles; (4) solvent accessibility; (5) dihedral torsion angles; (6) hydrophobic scoring matrix.
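
Single-body terms are convenient because the score for pairing a query position with a template position depends only on that pair, so a standard dynamic programming alignment applies directly. The Python sketch below illustrates this with a Needleman-Wunsch style global alignment and just two terms (residue identity and secondary-structure agreement). The weights, the linear gap penalty and the function names are hypothetical, and MUSTER's actual six-term scoring function is not reproduced.

```python
def position_score(q, t, w_seq=1.0, w_ss=0.5):
    """Weighted sum of single-body terms for one query/template position pair."""
    q_aa, q_ss = q
    t_aa, t_ss = t
    seq_term = 1.0 if q_aa == t_aa else -1.0  # stand-in for a profile score
    ss_term = 1.0 if q_ss == t_ss else 0.0    # secondary-structure agreement
    return w_seq * seq_term + w_ss * ss_term


def global_alignment_score(query, template, gap=-1.0):
    """Best global alignment score under a linear gap penalty."""
    n, m = len(query), len(template)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + position_score(query[i - 1], template[j - 1]),
                dp[i - 1][j] + gap,  # query residue aligned to a gap
                dp[i][j - 1] + gap,  # template residue aligned to a gap
            )
    return dp[n][m]


# Each position is (amino acid, secondary structure state); values are invented.
query = [("A", "H"), ("L", "H"), ("G", "C")]
template = [("A", "H"), ("K", "H"), ("L", "H"), ("G", "C")]
print(global_alignment_score(query, template))
```

Further terms such as solvent accessibility or torsion angles would enter as additional weighted summands in `position_score` without changing the alignment recursion.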

KineticDB

Obsolete
Provides diverse data on protein folding kinetics. KineticDB is a relational database that covers single-domain proteins, separate protein domains and short peptides without disulfide bonds in their native structure. Each record relates to a single protein folding kinetics measurement extracted from the original paper and gives details of the experimentally studied protein, its best available tertiary structure, the experimental conditions, a reference to the original paper and the experimental results.

CATHEDRAL

An iterative protocol for determining the location of previously observed protein folds in novel multidomain protein structures. CATHEDRAL builds on the features of a fast secondary-structure–based method (using graph theory) to locate known folds within a multidomain context and a residue-based, double-dynamic programming algorithm. This algorithm is used to align members of the target fold groups against the query protein structure to identify the closest relative and assign domain boundaries. To increase the fidelity of the assignments, a support vector machine is used to provide an optimal scoring scheme.