Provides a motif descriptor database. PROSITE offers an annotated collection of biologically meaningful motif descriptors dedicated to the identification of protein families and domains. This database uses two kinds of motif descriptors: (i) patterns, or regular expressions, which retain only the most significant residue information and discard the rest, and (ii) generalized profiles, which are quantitative motif descriptors that consider the overall similarity along the entire length of domains or proteins.
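Since PROSITE patterns are essentially regular expressions, the idea can be illustrated with a minimal Python sketch. The function name `prosite_to_regex` is hypothetical, and the translation covers only a simplified subset of the pattern syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` repeats, and `<` / `>` terminal anchors); it is not the PROSITE scanning software.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper covering a simplified subset of the syntax.
    """
    pattern = pattern.strip().rstrip(".")  # patterns end with a period
    parts = []
    for element in pattern.split("-"):  # elements are separated by '-'
        prefix = suffix = ""
        if element.startswith("<"):  # '<' anchors to the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # '>' anchors to the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":  # 'x' matches any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # {..} lists excluded residues
        # Single letters and [..] classes are already valid regex syntax.
        quantifier = "{" + repeat + "}" if repeat else ""
        parts.append(prefix + core + quantifier + suffix)
    return "".join(parts)


if __name__ == "__main__":
    # N-glycosylation site pattern; the sequence below is made up.
    regex = prosite_to_regex("N-{P}-[ST]-{P}.")
    for hit in re.finditer(regex, "MKNGSAANPTQ"):
        print(hit.start(), hit.group())
```

A pattern of this kind either matches or does not, which is why profiles are used when a quantitative score over a whole domain is needed. Note that `re.finditer` reports non-overlapping matches only.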
Provides a set of multiple sequence alignments and hidden Markov models (HMMs) for protein families. Pfam is constructed by capturing the diversity of a set of evolutionarily related sequences. It aligns a representative subset of the entire set of matching sequences to build the seed alignment. This database provides more than 17000 entries which are related by similarity of sequence, structure or profile-HMM.
Stores best representative profiles (BRP) of protein families. 3PFDB is a database designed to find the best representative sequence (BRS) for each Pfam family. Users can also search new sequences against the representative profiles using two sequence homology detection methods, HMMER and FASSM. This approach was tested on a dataset of over 100 families.
Consists of an online protein sequence collection. SUPERFAMILY is both a database and a website resource offering a variety of methods to explore whole proteins and domains. The database focuses on the superfamily level and also provides protein domain assignments at the family level. The website also offers a server-side pipeline for the timely processing of whole-genome protein annotations.
Provides known and new protein domains identified by Co-Occurrence Domain Detection (CODD) in several major human pathogens selected from the EuPathDB database. EuPathDomains can be queried by protein names, domain identifiers, or Pfam or InterPro identifiers. It lets users limit the search to an organism or a taxon. This database improves domain coverage in all genomes by localizing new occurrences of domains that are already known.
Provides integrated access to proteome sequence comparison data. MTB-PCDB is a comprehensive database that helps users navigate and retrieve information easily for analysis. It includes the five strains of Mycobacterium tuberculosis (H37Rv, H37Ra, CDC 1551, F11 and KZN 1435) that have been completely sequenced so far. This information also facilitates the design of new antitubercular vaccines and therapeutic agents based on the identified virulence-associated mutations.