Protein family databases | Comparison analysis

A protein family is a group of proteins that share a common evolutionary origin, reflected by their related functions and similarities in sequence or structure.

COGs
Dataset

COGs the Clusters of Orthologous Groups of proteins

Provides clusters of orthologous groups (COGs) and updated annotation of those COGs. COGs is a database where organisms are sorted according to the NCBI Taxonomy database. Each gene entry in a COG is…

InterPro
Web
Desktop

InterPro

Provides functional analysis of protein sequences. InterPro is software that classifies sequences into protein families and predicts the presence of important domains and sites. The…

PANTHER
Web

PANTHER Protein ANalysis THrough Evolutionary Relationships

A widely used online resource for comprehensive protein evolutionary and functional classification, and includes tools for large-scale biological data analysis. The latest version of PANTHER, 10.0,…

SMART
Dataset

SMART Simple Modular Architecture Research Tool

A web resource providing simple identification and extensive annotation of protein domains and the exploration of protein domain architectures. In the current version, SMART contains manually curated…

SCOP
Dataset

SCOP Structural Classification of Proteins

Orders all proteins of known structure, according to their evolutionary and structural relationships. SCOP focuses on knowledge-based expert analysis and classification of proteins that are…

PASS2
Dataset

PASS2 Protein Alignments organised as Structural Superfamilies

Provides structure-based sequence alignments of protein domain superfamilies in correspondence with Structural Classification of Proteins (SCOP) definitions. PASS2 deals with distantly related…

Pfam
Dataset

Pfam

Pfam is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).

PROSITE
Dataset

PROSITE

Provides a motif descriptor database. PROSITE offers an annotated collection of biologically meaningful motif descriptors dedicated to the identification of protein families and domains. This…

PRINTS-S
Dataset

PRINTS-S

Allows protein sequence analysis and genome annotation. PRINTS-S stores motifs in the form of un-gapped, local sequence alignments. It can model relationships explicitly by defining parent–child…

ProDom
Dataset

ProDom

A comprehensive set of protein domain families automatically generated from the UniProt Knowledgebase (UniProtKB).

pHMM-tree
Desktop
Web

pHMM-tree

Builds a phylogeny of protein families using the distance matrix of their profile hidden Markov models (pHMMs). Although not designed for subfamily classification, if given a set of protein…

TIGRFAMs
Dataset

TIGRFAMs

A resource consisting of curated multiple sequence alignments, Hidden Markov Models (HMMs) for protein sequence classification, and associated information designed to support automated annotation of…

SPARCLE
Dataset

SPARCLE Subfamily Protein Architecture Labeling Engine

A resource for the functional characterization and labeling of protein sequences that have been grouped by their characteristic conserved domain architecture. SPARCLE interface proposes to associate…

ECOD
Dataset

ECOD Evolutionary Classification Of protein Domains

A hierarchical classification of protein domains according to their evolutionary relationships. ECOD classifies proteins with experimentally determined spatial structures from the Protein Data Bank…

SYSTERS
Dataset

SYSTERS

Aims to provide a meaningful partitioning of the whole protein sequence space by a fully automatic procedure.

GPU MrBayes
Desktop

GPU MrBayes

Improves the computational efficiency of Bayesian phylogenetic inference on protein data sets. GPU MrBayes is an efficient task mapping strategy which makes better use of GPU cores and GPU…

iProClass
Dataset

iProClass

Provides value-added views for UniProt protein entries and Protein Information Resource Super Family (PIRSF) entries with extensive annotation information and graphical displays. iProClass offers a…

VIDA
Dataset

VIDA

Contains a collection of homologous protein families derived from open reading frames from complete and partial virus genomes.

ProtoNet
Dataset

ProtoNet

Consists of a data structure of protein families. ProtoNet aims to achieve an automatic hierarchical clustering of the protein sequence space. The database generates automatically, with no…

SMS
Web

SMS STING Millennium Suite

Provides a variety of algorithms and validated data, wrapped up in a user-friendly web interface. STING Millennium Suite (SMS) is a new web-based suite of programs and databases providing…

SUPERFAMILY
Dataset

SUPERFAMILY

Consists of an online resource and protein sequence collection. SUPERFAMILY is both a database and website resource that offers a variety of methods to analyze whole proteins and domains. The…

GenDiS
Dataset

GenDiS Genomic Distribution of Protein Structural Domain Superfamilies

Provides structural assignments to genes listed within the non-redundant protein sequence database at the superfamily level. GenDiS is a compendium of sequence domains of evolutionarily related…

CyanoLyase
Dataset

CyanoLyase

A manually curated sequence and amino acid motif database gathering all the different phycobilin lyases and related protein sequences available in public databases. CyanoLyase provides an extensive…

CUBE-DB
Dataset

CUBE-DB

Finds subfamily-specific residues. CUBE-DB is a database of pre-calculated results which includes visualizations and modifiable spreadsheets. This database supports the detection of functional…

LEAPdb
Dataset

LEAPdb Late Embryogenesis Abundant Proteins database

Harbors a comprehensive data set for late embryogenesis abundant proteins (LEAP) with tools designed for their online analysis. LEAPdb provides a curated archive of LEAP families to navigate,…

PRINTS
Dataset

PRINTS

Stores a collection of diagnostic protein family fingerprints. PRINTS is a public domain database. Each fingerprint has been defined and iteratively refined using database scanning procedures within…

FUnkFams
Dataset

FUnkFams Function Unknown Families of homologous proteins

Provides information about gene families. FUnkFams supplies information to detect protein families without annotation domains and identifies them in metagenomics data. It also assigns annotations to…

Plasmobase
Dataset

Plasmobase

Provides a platform for the comparative study of Plasmodium genomes. Plasmobase is a database that reports known and new protein domains identified by DAMA and CLADE on the 11 fully sequenced genomes…

PIRSF
Web

PIRSF

The PIRSF concept is used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order that reflects their evolutionary relationships.

FIGfams
Dataset

FIGfams

Provides a collection of over 100 000 protein families. FIGfams are sets of isofunctional homologues. Each one contains a set of proteins that are end-to-end homologous and share a common function…

dcGO
Dataset

dcGO

A comprehensive ontology database for protein domains. It is updated fortnightly, and the website provides downloads, search, browse, phylogenetic context and other data-mining facilities.

CyanoClust
Dataset

CyanoClust

A database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. CyanoClust contains protein homology information for 38 cyanobacteria, 59 plastids and 1 Paulinella…

ProtoBug
Dataset

ProtoBug

A database and resource of protein families in Arthropod genomes. ProtoBug platform presents the relatedness of complete proteomes from 17 insects as well as a proteome of the crustacean, Daphnia…

3PFDB
Dataset

3PFDB

Stores best representative profiles (BRP) of protein families. 3PFDB is a database designed to find a best representative sequence (BRS) for each Pfam family. The database implements the two methods…

SCOOP
Desktop

SCOOP Simple Comparison Of Outputs Program

A simple way to compare protein families. Rather than directly comparing profiles, SCOOP compares the database search outputs that result from searching a sequence database with the profiles. It…

ADDA
Web

ADDA Automatic Domain Decomposition Algorithm

An automatic algorithm for domain decomposition and clustering of all protein domain families. Alignments derived from an all-on-all sequence comparison are used to define domains within protein…

3DM
Desktop

3DM

Used to extract clues about the function of specific amino acids and to predict the effects of mutations. 3DM is a protein superfamily knowledgebase based upon a structure-based multiple sequence alignment…

PrePRINTS
Dataset

PrePRINTS

Provides increased protein family coverage based on the PRINTS database. PrePRINTS is an automatically generated database that contains conserved motifs used to characterise a protein family and…

prosextract
Desktop

prosextract

Processes the PROSITE motif database for use by patmatmotifs. prosextract reads the PROSITE files from the specified directory and writes an output file with the following information:…

SUPFAM
Dataset

SUPFAM

Aids function association in genome analysis by remote homology detection. SUPFAM is a database that includes sequence families of yet unknown structure in a known superfamily of known structures.…

PairsDB
Dataset

PairsDB Pairs Database

Offers a way to facilitate the establishment of family relationships between all known protein sequences. PairsDB aims to assign functions to novel proteins and to identify conserved parts in the…

FingerPRINTScan
Web

FingerPRINTScan

Allows users to discover gene function. FingerPRINTScan exploits contextual information to identify distant sequence relationships. It aims to improve confidence in the identification of fingerprints…

MACHOS
Algorithm

MACHOS MArkov Clusters of HOmologous Subsequences

Delineates homologous sequence families by resolving individual proteins into subsequences. MACHOS is a method that (i) resolves proteins into sensible fragments which can show conflicting homology…

ProFITS
Dataset

ProFITS Protein Families Involved in the Transduction of Signalling

Categorizes transcription factors (TFs), protein kinases/phosphatases (PKs/PPs) and ubiquitin-proteasome system (UPS)-related genes in maize. ProFITS is a database that provides users with a…

ELISA
Dataset

ELISA

Introduces and applies a novel algorithm for quantitative functional comparison between domains. ELISA is a database created to solve a long-standing problem in the domain evolution community of…
