Protein family databases | Comparison analysis

A protein family is a group of proteins that share a common evolutionary origin, reflected by their related functions and similarities in sequence or structure.

COGs
Dataset

COGs Clusters of Orthologous Groups of proteins

Provides clusters of orthologous groups (COGs) and updated annotation of those COGs. COGs is a database where organisms are sorted according to the NCBI Taxonomy database. Each gene entry in a COG is…

InterPro
Web
Desktop

InterPro

Provides functional analysis of protein sequences. InterPro is a resource that classifies sequences into protein families and predicts the presence of important domains and sites. The…

PANTHER
Web

PANTHER Protein ANalysis THrough Evolutionary Relationships

A widely used online resource for comprehensive evolutionary and functional classification of proteins that includes tools for large-scale biological data analysis. The latest version of PANTHER, 10.0,…

SMART
Dataset

SMART Simple Modular Architecture Research Tool

A web resource providing simple identification and extensive annotation of protein domains and the exploration of protein domain architectures. In the current version, SMART contains manually curated…

Pfam
Dataset

Pfam

The database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs).

PROSITE
Dataset

PROSITE

Consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them.
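
To make the idea of a PROSITE pattern concrete, here is a minimal Python sketch that translates the most common pattern elements into a regular expression and scans a sequence with it. It assumes the usual pattern conventions (elements separated by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repetition, `<` and `>` for terminal anchors). The function names `prosite_to_regex` and `find_motifs` and the test sequence are hypothetical, the pattern `N-{P}-[ST]-{P}` is the familiar N-glycosylation site pattern, and less common cases such as an anchor inside brackets are not handled. Profiles, the other PROSITE descriptor type, are not covered.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repetition count such as (3) or (2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # excluded residues
            core = "[^" + core[1:-1] + "]"
        # "[ABC]" and single residue letters are already valid regex.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motifs(pattern: str, sequence: str):
    """Return (0-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence used only for illustration.
    print(prosite_to_regex("N-{P}-[ST]-{P}."))
    print(find_motifs("N-{P}-[ST]-{P}.", "MKNGSAANPTQ"))
```

Note that `finditer` reports non-overlapping matches only, so a scanner meant for real annotation work would need a lookahead or a manual sliding window.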

ProDom
Dataset

ProDom

A comprehensive set of protein domain families automatically generated from the UniProt Knowledgebase (UniProtKB).

pHMM-tree
Desktop
Web

pHMM-tree

Builds a phylogeny of protein families using the distance matrix of their profile hidden Markov models (pHMMs). Although not designed for subfamily classification, if given a set of protein…

TIGRFAMs
Dataset

TIGRFAMs

A resource consisting of curated multiple sequence alignments, Hidden Markov Models (HMMs) for protein sequence classification, and associated information designed to support automated annotation of…

SPARCLE
Dataset

SPARCLE Subfamily Protein Architecture Labeling Engine

A resource for the functional characterization and labeling of protein sequences that have been grouped by their characteristic conserved domain architecture. SPARCLE interface proposes to associate…

ECOD
Dataset

ECOD Evolutionary Classification Of protein Domains

A hierarchical classification of protein domains according to their evolutionary relationships. ECOD classifies proteins with experimentally determined spatial structures from the Protein Data Bank…

SYSTERS
Dataset

SYSTERS

Aims to provide a meaningful partitioning of the whole protein sequence space by a fully automatic procedure.

GPU MrBayes
Desktop

GPU MrBayes

Improves the computational efficiency of Bayesian phylogenetic inference on protein data sets. GPU MrBayes uses an efficient task-mapping strategy that makes better use of GPU cores and GPU…

iProClass
Dataset

iProClass

Provides value-added views for UniProt protein entries and Protein Information Resource Super Family (PIRSF) entries with extensive annotation information and graphical displays. iProClass offers a…

VIDA
Dataset

VIDA

Contains a collection of homologous protein families derived from open reading frames from complete and partial virus genomes.

ProtoNet
Dataset

ProtoNet

Consists of a data structure of protein families. ProtoNet aims to achieve an automatic hierarchical clustering of the protein sequence space. The database generates automatically, with no…

SMS
Web

SMS STING Millennium Suite

Provides a variety of algorithms and validated data, wrapped up in a user-friendly web interface. STING Millennium Suite (SMS) is a new web-based suite of programs and databases providing…

SUPERFAMILY
Dataset

SUPERFAMILY

A database of structural and functional annotation for all proteins and genomes. The SUPERFAMILY annotation is based on a collection of hidden Markov models, which represent structural protein…

GenDiS
Dataset

GenDiS Genomic Distribution of Protein Structural Domain Superfamilies

Provides structural assignments to genes listed within the non-redundant protein sequence database at the superfamily level. GenDiS is a compendium of sequence domains of evolutionarily related…

CyanoLyase
Dataset

CyanoLyase

A manually curated sequence and amino acid motif database gathering all the different phycobilin lyases and related protein sequences available in public databases. CyanoLyase provides an extensive…

CUBE-DB
Dataset

CUBE-DB

Finds subfamily-specific residues. CUBE-DB is a database of pre-calculated results that includes visualizations and modifiable spreadsheets. This database serves the detection of functional…

PRINTS
Dataset

PRINTS

Stores a collection of diagnostic protein family fingerprints. PRINTS is a public domain database. Each fingerprint has been defined and iteratively refined using database scanning procedures within…

PIRSF
Web

PIRSF

The PIRSF concept is used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order that reflects their evolutionary relationships.

FIGfams
Dataset

FIGfams

Provides a collection of over 100 000 protein families. FIGfams are sets of isofunctional homologues. Each one contains a set of proteins that are end-to-end homologous and share a common function…

dcGO
Dataset

dcGO

A comprehensive ontology database for protein domains. It is updated fortnightly, and the website provides downloads, search, browse, phylogenetic context and other data-mining facilities.

CyanoClust
Dataset

CyanoClust

A database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. CyanoClust contains protein homology information for 38 cyanobacteria, 59 plastids and 1 Paulinella…

ProtoBug
Dataset

ProtoBug

A database and resource of protein families in arthropod genomes. The ProtoBug platform presents the relatedness of complete proteomes from 17 insects as well as a proteome of the crustacean, Daphnia…

3PFDB
Dataset

3PFDB

Stores best representative profiles (BRP) of protein families. 3PFDB is a database designed to find a best representative sequence (BRS) for each Pfam family. The database implements the two methods…

SCOOP
Desktop

SCOOP Simple Comparison Of Outputs Program

A simple way to compare protein families. Rather than directly comparing profiles, SCOOP compares the database search outputs that result from searching a sequence database with the profiles. It…
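
The idea of comparing search outputs instead of profiles can be illustrated with a small Python sketch. It is not SCOOP's scoring scheme, which is not described here. As a stand-in, it uses the Jaccard index of two hit sets, and the family names and sequence identifiers are invented for illustration.

```python
def hit_overlap(hits_a, hits_b):
    """Jaccard index of two sets of database hits (0.0 = disjoint, 1.0 = identical)."""
    a, b = set(hits_a), set(hits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical search outputs: sequence identifiers matched by each family profile.
search_hits = {
    "family_A": {"seq1", "seq2", "seq3", "seq4"},
    "family_B": {"seq3", "seq4", "seq5"},
    "family_C": {"seq9"},
}

if __name__ == "__main__":
    names = sorted(search_hits)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = hit_overlap(search_hits[first], search_hits[second])
            print(first, second, round(score, 2))
```

A raw overlap measure like this one ignores how many shared hits would be expected by chance, so it should be read only as a first approximation of relatedness between families.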

ADDA
Web

ADDA Automatic Domain Decomposition Algorithm

An automatic algorithm for domain decomposition and clustering of all protein domain families. Alignments derived from an all-on-all sequence comparison are used to define domains within protein…

3DM
Desktop

3DM

Used to extract clues about the function of specific amino acids and predict effects of mutations. 3DM is a protein superfamily knowledgebase based upon a structure-based multiple sequence alignment…

prosextract
Desktop

prosextract

Processes the PROSITE motif database for use by patmatmotifs. prosextract reads the PROSITE files from the specified directory and writes an output file with the following information:…

TRIBE-MCL
Algorithm

TRIBE-MCL

Generates accurate protein families using the Markov Cluster (MCL) formalism for graph clustering by flow simulation. TRIBE-MCL is an algorithm that allows the efficient and rapid clustering of any…
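
The flow-simulation idea behind Markov clustering can be sketched in a few lines of pure Python. This is a toy illustration of the general MCL loop (expansion by matrix squaring, then inflation by element-wise powering and column normalization), not the TRIBE-MCL implementation. The function names, the inflation value of 2.0, the fixed iteration count, and the six-protein similarity graph are all assumptions made for the example. In practice the similarity scores would come from an all-against-all sequence comparison.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total
    return out


def expand(m):
    """Expansion step: square the matrix (simulates two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to a power, then renormalize columns."""
    return normalize_columns([[v ** power for v in row] for row in m])


def markov_cluster(similarity, inflation=2.0, iterations=50, eps=1e-6):
    """Cluster nodes of a symmetric similarity matrix; returns sorted index lists."""
    n = len(similarity)
    # Add self-loops so that every column has non-zero mass.
    m = [[1.0 if i == j else similarity[i][j] for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each row that still carries flow lists the members of one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > eps) for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)


# Hypothetical similarity graph: two tight triangles joined by one weak edge.
proteins = ["A1", "A2", "A3", "B1", "B2", "B3"]
edges = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
         (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0,
         (2, 3): 0.1}
similarity = [[0.0] * len(proteins) for _ in proteins]
for (i, j), score in edges.items():
    similarity[i][j] = similarity[j][i] = score

if __name__ == "__main__":
    for cluster in markov_cluster(similarity):
        print([proteins[i] for i in cluster])
```

Higher inflation values generally produce finer-grained families, so the parameter acts as the main granularity control in this kind of clustering.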

MACHOS
Algorithm

MACHOS MArkov Clusters of HOmologous Subsequences

Delineates homologous sequence families by resolving individual proteins into subsequences. MACHOS is a method that (i) resolves proteins into sensible fragments which can show conflicting homology…
