
RCSB PDB / The Research Collaboratory for Structural Bioinformatics Protein Data Bank

Helps students and researchers to understand all aspects of biomedicine and agriculture, from protein synthesis to health and disease. RCSB PDB is a database that provides a resource powered by the Protein Data Bank (PDB) archive - information about the 3D shapes of proteins, nucleic acids, and complex assemblies. Users can perform simple and complex queries on the data, analyze, and visualize the results.


Develops many interactive web-based databases and software to help life scientists understand the complexity of systems biology. Its systems biology efforts focus on understanding cellular networks, protein interactions involved in cell signaling, and mechanisms of cell survival and apoptosis, leading to the development or identification of drug candidates against a variety of diseases.


A protein structure prediction server excelling at predicting 3D structures for protein sequences without close homologs in the Protein Data Bank (PDB). Given an input sequence, RaptorX predicts its secondary and tertiary structures as well as solvent accessibility and disordered regions. RaptorX also assigns the following confidence scores to indicate the quality of a predicted 3D model: P-value for the relative global quality, GDT (global distance test) and uGDT (un-normalized GDT) for the absolute global quality, and RMSD for the absolute local quality of each residue in the model.
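The quality measures named above can be illustrated with a minimal sketch. It assumes the model and the native structure are already superimposed and given as matching lists of C-alpha coordinates, and it uses the conventional 1, 2, 4 and 8 Å cutoffs. The function names are illustrative, not RaptorX's. RaptorX estimates these scores for a model without knowing the native structure, and a full GDT calculation also searches over many superpositions, which this sketch does not do.

```python
import math

def per_residue_distances(model, native):
    """Distances between matching C-alpha atoms of two pre-superimposed structures."""
    return [math.dist(a, b) for a, b in zip(model, native)]

def rmsd(distances):
    """Root-mean-square deviation over all residues."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

def ugdt(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Un-normalized GDT: residue count within each cutoff, averaged over cutoffs."""
    return sum(sum(d <= c for d in distances) for c in cutoffs) / len(cutoffs)

def gdt(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT as a percentage: uGDT divided by the number of residues."""
    return 100.0 * ugdt(distances, cutoffs) / len(distances)

# Toy coordinates (in angstroms), invented for illustration only.
native = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
model = [(0, 0, 0.5), (1, 0, 1.5), (2, 0, 3.0), (3, 0, 9.0)]
dists = per_residue_distances(model, native)
print(ugdt(dists), gdt(dists), round(rmsd(dists), 3))
```

Because uGDT is not divided by the sequence length, it grows with protein size, whereas GDT stays on a 0 to 100 scale.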


Predicts oligomerization, functional sites, and conformational changes in transmembrane proteins. EVfold_membrane applies a maximum entropy approach to infer evolutionary co-variation in pairs of sequence positions within a protein family and then generates all-atom models with the derived pairwise distance constraints. The method predicts the structures of 11 transmembrane proteins of unknown structure, including six pharmacological targets. It appears to achieve a useful level of accuracy.


Predicts protein 3D structure using single-template homology modelling. CPHmodels was created to provide a front-end that is easy to understand for users without any prior knowledge of homology modelling. It provides a result that is as accurate as possible. The tool is based on an optimized alignment scoring function and employs a double-sided Z-score to rank individual template hits. One of its major features is speed: for most queries, the server responds in under 20 minutes.


A web server that predicts structural properties of a protein sequence without using any templates. RaptorX-Property outperforms other servers, especially for proteins without close homologs in PDB or with a very sparse sequence profile (i.e. one that carries little evolutionary information). This server employs a powerful in-house deep learning model, DeepCNF (Deep Convolutional Neural Fields), to predict secondary structure (SS), solvent accessibility (ACC) and disordered regions (DISO). DeepCNF models not only the complex sequence-structure relationship, through a deep hierarchical architecture, but also the interdependency between adjacent property labels.


Stores 3D structures of proteins and other molecules, with descriptive text and hyperlinks that change the adjacently displayed structures to coincide with points made in the text. Proteopedia is a wiki-based web-resource for all scientists who need to utilize three-dimensional (3D) structural information in their research, and for educators requiring a medium for compelling presentation of structure-function relationships. 3D scenes of molecules and molecular complexes can be created easily by Proteopedia users and immediately shared with and viewed by all.

I-TASSER / Iterative Threading ASSEmbly Refinement

Allows automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence, I-TASSER constructs 3D structural models by reassembling fragments excised from threading templates. The I-TASSER server provides a confidence score (C-score) to estimate the global accuracy of the models. The I-TASSER Suite pipeline was tested in community-wide structure and function prediction experiments, including CASP10 and CAMEO.

SABBAC / Structural Alphabet-based protein BackBone reconstruction from Alpha-Carbon trace

Assists users in reconstructing protein backbones. SABBAC is an online tool that relies on fragment selection and assembly. It encodes the alpha-carbon trace using a structural alphabet derived from a hidden Markov model. At each position in the structure, it selects a small set of candidates from a complete set of over 150 candidate fragments that describe all the letters of the structural alphabet.


Implements an algorithm that uses the Euclidean distance transform (EDT) to convert the target protein structure into a 3D gray-scale image, where depths of atoms in the protein can be conveniently and precisely derived from the minimum distance of the pixels to the surface of the protein. EDTSurf constructs triangulated surfaces for macromolecules. It generates three major macromolecular surfaces: van der Waals surface, solvent-accessible surface and molecular surface (solvent-excluded surface). EDTSurf also identifies cavities inside macromolecules. Furthermore, EDTSurf has been extended to calculate atom depth and residue depth relative to the solvent-accessible surface.
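The distance-transform idea can be shown with a minimal brute-force sketch. It assumes the molecule has already been rasterized into a set of occupied voxels on a small grid, and it measures depth as the distance from an occupied voxel to the nearest empty voxel. The function name and the toy cube are illustrative, and this is not EDTSurf's own, much faster algorithm.

```python
import itertools
import math

def euclidean_distance_transform(occupied, shape):
    """Brute-force EDT: map each occupied voxel to its distance from the nearest empty voxel."""
    all_voxels = itertools.product(*(range(n) for n in shape))
    empty = [v for v in all_voxels if v not in occupied]
    return {v: min(math.dist(v, e) for e in empty) for v in occupied}

# Toy "molecule": a solid 5x5x5 cube of voxels centered in a 7x7x7 grid.
occupied = set(itertools.product(range(1, 6), repeat=3))
depth = euclidean_distance_transform(occupied, (7, 7, 7))

print(depth[(1, 1, 1)])  # corner voxel, directly next to empty space
print(depth[(3, 3, 3)])  # center voxel, deepest point of the cube
```

The brute-force search costs time proportional to the number of occupied voxels times the number of empty ones, so it is only practical for toy grids like this one.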


Provides a distance-dependent atomic potential for protein structure modeling and structure decoy recognition. RW was derived from 1,383 high-resolution Protein Data Bank (PDB) structures using an ideal random-walk chain as the reference state. The RW potential has been extensively optimized and tested on a variety of protein structure decoy sets and demonstrates significant power in protein structure recognition and a strong correlation with the RMSD of decoys to the native structures. RW is freely available for download.

CNS / Crystallography & NMR System

Provides a flexible multi-level hierarchical approach for the most commonly used algorithms in macromolecular structure determination. CNS allows heavy atom searching, experimental phasing (including MAD and MIR), density modification, crystallographic refinement with maximum likelihood targets, and NMR structure calculation using NOEs, J-coupling, chemical shift, and dipolar coupling data. CNS is the result of an international collaborative effort among several research groups.


A comparative modeling web-server for protein structure modelling closely connected to ModBase. ModWeb accepts one or many sequences in the FASTA format and calculates their models using ModPipe based on the best available templates from the Protein Data Bank (PDB). Alternatively, ModWeb also accepts a protein structure as input and calculates models for all identifiable sequence homologs in the UniProt database. The latter mode is a useful tool for structural genomics efforts to assess the impact of a newly determined protein structure on the modeling of sequences of unknown structure. It is also used to identify new members of sequence superfamilies with at least one member of known structure. The results of ModWeb calculations are available through the ModBase interface as private datasets protected with passwords.


Identifies protein modifications observed in 3-D structures archived in the Protein Data Bank (PDB). BioJava-ModFinder collects information on more than 400 types of protein modifications curated from annotations in PDB, RESID, and PSI-MOD. These modifications are classified into three categories: modified residues, attachment modifications, and cross-links. A systematic method identifies these modifications in 3-D protein structures. This package was integrated with the RCSB PDB web application and added protein modification annotations to the sequence diagram and structure display. By scanning all 3-D structures in the PDB using BioJava-ModFinder, more than 30,000 structures were identified with protein modifications, which can be searched, browsed, and visualized on the RCSB PDB website.


Calculates exact pairwise sequence alignment using edit distance. Edlib supports the global alignment method and two semi-global alignment methods: one where gaps at the end of the query sequence are not penalized, and one where gaps at both the start and the end of the query sequence are not penalized. It can find the optimal alignment path for all three supported alignment methods in linear space. The tool was compared to the SeqAn library, the Parasail library and the original implementation of Myers’s algorithm. In contrast to the other libraries, Edlib becomes significantly faster as sequence similarity increases.
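The alignment modes can be illustrated with a minimal sketch of edit distance computed by plain dynamic programming, keeping only two rows in memory. This is not Edlib's implementation or API: the function name and the mode labels `global`, `prefix` and `infix` are chosen here for illustration, and the sketch returns only the distance, not the alignment path.

```python
def edit_distance(query, target, mode="global"):
    """Edit distance between query and target using two DP rows (linear space).

    mode="global": both sequences are aligned end to end.
    mode="prefix": trailing target characters after the query ends are free.
    mode="infix":  target characters before and after the query are free.
    """
    if mode not in ("global", "prefix", "infix"):
        raise ValueError(f"unknown mode: {mode}")
    n = len(target)
    # Row 0: cost of aligning an empty query against target[:j].
    prev = [0] * (n + 1) if mode == "infix" else list(range(n + 1))
    for i, q in enumerate(query, start=1):
        curr = [i] + [0] * n
        for j, t in enumerate(target, start=1):
            curr[j] = min(prev[j] + 1,             # skip a query character
                          curr[j - 1] + 1,         # skip a target character
                          prev[j - 1] + (q != t))  # match or mismatch
        prev = curr
    # Global must consume the whole target; the other modes may stop anywhere.
    return prev[n] if mode == "global" else min(prev)

print(edit_distance("kitten", "sitting"))           # global
print(edit_distance("ACGT", "ACGTTTTT", "prefix"))  # query is a prefix of target
print(edit_distance("CGT", "AACGTAA", "infix"))     # query occurs inside target
```

Recovering the alignment path in linear space needs extra machinery, such as a divide-and-conquer traceback, which is left out here.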

SPIDER / Sequence-based Prediction of Local and Nonlocal Structural Features for Proteins

Predicts different sets of structural protein properties. SPIDER is an iterative deep-learning neural network. It obtains secondary structure, torsion angles, Cα−atom based angles and dihedral angles, and solvent accessible surface area. It utilises both local and nonlocal structural information in iterations. At each iteration, SPIDER employs a deep-learning neural network to predict a structural property based on structural properties predicted in the previous iteration.


Ranks and clusters macromolecular structures, including proteins and RNAs. uQlust combines versatile structural profiles of both proteins and nucleic acids with a linear-time algorithm for comparing all pairs of models using 1D-Jury, and with profile hashing for efficient, low-memory clustering of macromolecular structures, including hierarchical clustering. While dramatically reducing computation time and memory requirements with respect to existing methods, uQlust yields comparable accuracies in protein and RNA clustering and model quality assessment.


A mutation prediction tool which is based on per residue root mean square deviation values of superimposed 3D protein models. MODICT predicts the effect of mutations on the structure of the protein. The algorithm takes into account the global structural changes in the 3D protein model. The mathematical model underlying MODICT can also incorporate the information from conservation and weight scores. MODICT is not only a prediction tool, but also a tool to scrutinize changes in the protein structure independent of the score.

PSCPP / Protein Side-Chain Packing Problem

Estimates the side-chain conformation of every residue of a protein. PSCPP aims to find a set of rotamers from a rotamer library that minimizes the given scoring function. The method is composed of three parts: (1) a rotamer library, (2) a scoring function (SF), and (3) a search algorithm. It performs a relaxation process through a molecular dynamics simulation that considers only the asymmetric unit of monomeric proteins surrounded by water, in order to take into account a realistic environment for the protein.
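The three parts of the side-chain packing problem can be sketched in a few lines. Everything here is a toy assumption made for illustration: the rotamer library holds a handful of invented chi1 angles, `toy_score` is a made-up scoring function, and the search is exhaustive. This is not the PSCPP method.

```python
import itertools

def pack_side_chains(library, score):
    """Exhaustive search for the rotamer assignment with the lowest score."""
    positions = sorted(library)
    candidates = itertools.product(*(library[p] for p in positions))
    best = min(candidates, key=lambda combo: score(dict(zip(positions, combo))))
    return dict(zip(positions, best))

# (1) Toy rotamer library: residue position -> candidate chi1 angles in degrees.
library = {1: [-60, 180], 2: [-60, 60, 180]}

# (2) Toy scoring function: a per-residue term plus a pairwise clash penalty.
def toy_score(assignment):
    self_term = sum(abs(chi) / 180 for chi in assignment.values())
    clash = 5.0 if assignment[1] == assignment[2] else 0.0
    return self_term + clash

# (3) Search algorithm: try every combination and keep the best one.
print(pack_side_chains(library, toy_score))
```

Exhaustive enumeration grows exponentially with the number of residues, which is why practical packing methods rely on heuristic or pruned searches.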

tetraBASE / tetrahedron-based backbone statistical energy model

Enables realistic modeling of the through-space packing of polypeptide backbones. tetraBASE derives statistical energies from known sequence and structural data of native proteins and their complexes. It can consider the effects of peptide local conformation, local structural environment, and inter-residue geometries on amino-acid sequences. This method provides a representation of inter-backbone site packing geometries.


Classifies the protein fold by combining the results from two algorithms, HH-fold and support vector machine (SVM)-fold. TA-fold is an ensemble approach proposed to combine the results of these algorithms. HH-fold is a template-based fold assignment algorithm using the Hidden Markov Model (HMM)-HMM alignment program HHsearch. SVM-fold is a support vector machine-based ab-initio classification algorithm. When there are homologous templates to the query protein, the HH-fold prediction is reported. Otherwise, the SVM-fold prediction is returned.
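The selection rule described above can be written as a short sketch. The function name, the shape of the inputs, and the 0.9 confidence cutoff used to decide whether a homologous template exists are all assumptions made for illustration. The fold labels in the usage lines are placeholders.

```python
def ta_fold(hh_hit, svm_fold, confidence_cutoff=0.9):
    """Return the template-based fold if a confident template hit exists, else the SVM fold.

    hh_hit: None, or a (fold_label, confidence) pair from a template search.
    svm_fold: fold label predicted by the ab-initio classifier.
    confidence_cutoff: hypothetical threshold for calling a template homologous.
    """
    if hh_hit is not None:
        fold, confidence = hh_hit
        if confidence >= confidence_cutoff:
            return fold
    return svm_fold

print(ta_fold(("fold_A", 0.99), "fold_B"))  # confident template hit is reported
print(ta_fold(("fold_A", 0.30), "fold_B"))  # weak hit, falls back to the SVM
print(ta_fold(None, "fold_B"))              # no template at all
```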


Simultaneously explores protein sequence space and protein structure space by cross-modal learning. CMsearch has several advantages over existing methods: (i) instead of exploring a single space built from a mixture of sequence and structure similarities, CMsearch builds two separate spaces and explores them simultaneously; (ii) CMsearch is completely different from threading methods because it uses not only sequence and structure information, but also sequence and structure space information; (iii) CMsearch is a generic framework, such that any sequence similarity metric and any structure similarity metric can be adopted.


A de novo protein structure prediction method that performs stepwise synthesis and assembly of foldon units via conditional sampling from a novel united-residue probabilistic model, which captures local conformational bias of backbone and side chain simultaneously in a united residue representation. The rationale for choosing united-residue representation is to integrate both backbone and side chain during structure modeling. It is found that (1) stepwise sampling produces lower energy conformations with higher accuracy than random sampling when everything else remains the same; (2) UniCon3D attains comparable performance with top five automated methods of CASP11 and CASP10 in a dataset of 30 and 15 difficult target domains, respectively; and (3) UniCon3D outperforms a baseline counterpart of UniCon3D that performs traditional random sampling as well as GDFuzz3D and FT-COMAR, two state-of-the-art approaches for de novo protein structure prediction aided by residue-residue contacts in a dataset containing 45 CASP10 targets.

FALCON@home

A protein structure prediction server focusing on remote homologue identification. The design of FALCON@home is based on the observation that a structural template, especially for remote homologous proteins, consists of conserved regions interweaved with highly-variable regions. The highly-variable regions lead to vague alignments in threading approaches. Thus, FALCON@home first extracts conserved regions from each template and then aligns a query protein with conserved regions only rather than the full-length template directly. This helps avoid the vague alignments rooted in highly-variable regions, improving remote homologue identification.

IBiSS / Integrative Biology of Sequences and Structures

A web-based tool designed for interactively displaying 3D structures and selected sequences of subunits from large macromolecular complexes, thus allowing simultaneous structure-sequence analysis, such as identifying conserved residues involved in catalysis or protein-protein interfaces. The tool comprises a graphical user interface and uses a rapid-access internal database containing the relevant pre-aligned multiple sequences across all available species, along with 3D structural information. These annotations are automatically retrieved and updated from UniProt and from crystallographic and cryo-EM data available in the Protein Data Bank (PDB) and the Electron Microscopy Data Bank (EMDB).

SCEC / Structural Class prediction based on Evolutionary Collocation based sequence representation

Predicts four structural classes (all-α, all-β, α/β and α+β) based on a protein sequence. SCEC lets users perform structural class prediction through a web service that uses a PSI-BLAST profile-based collocation of amino acid (AA) pairs. It provides a representation that can be extended to other protein prediction tasks such as fold, solvent accessibility, membrane protein type, and enzyme family predictions.