Protein sequence databases

One of the essential requirements of the proteomics community is a high-quality, annotated, nonredundant protein sequence database with an archival service and stable identifiers to enable protein identification and characterization.

IGC
Dataset

IGC integrated gene catalog

Represents a comprehensive resource for further investigations of the gut microbiome, covering strains with a diverse range of occurrence frequencies. IGC allows rapid and multi-omic profiling of the…

EBI
Dataset

EBI EMBL-EBI - The European Bioinformatics Institute

Provides access to several biological data resources. EBI is a database that covers the entire range of biological sciences, from raw DNA sequences to curated proteins, chemicals, structures, systems,…

UniProt
Dataset

UniProt Universal Protein Resource

A comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), the UniProt Archive…

RefSeq
Dataset

RefSeq Reference Sequence

Maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records. The RefSeq project leverages the data submitted to the International Nucleotide…

neXtProt
Dataset

neXtProt

Offers seamless integration of and navigation through protein-related data. neXtProt contains proteomics data for over 85% of human proteins. Moreover, this tool includes over 8000 phenotypic…

PATRIC
Dataset

PATRIC Pathosystems Resource Integration Center

Aims to assist scientists in infectious-disease research. PATRIC is a National Institutes of Health (NIH)-supported bioinformatics resource center that has been built to enable comparative genomic…

VectorBase
Dataset

VectorBase

A National Institute of Allergy and Infectious Diseases supported Bioinformatics Resource Center (BRC) for invertebrate vectors of human pathogens. VectorBase currently hosts the genomes of 35…

HPRD
Dataset

HPRD Human Protein Reference Database

Provides access to experimentally derived information about the human proteome including protein–protein interactions (PPIs), post-translational modifications (PTMs) and tissue expression. HPRD is…

UniMES
Dataset
CyBase
Dataset

CyBase

Provides information about cyclic proteins. CyBase aims to compile a standardized library gathering protein sequences, nucleic acid sequences, 3D structures and assay results. Moreover, the database…

RiboDB
Dataset

RiboDB

Provides a user-friendly tool allowing the rapid retrieval of ribosomal protein (r-protein) sequences for user-defined sets of prokaryotic species. The current version of RiboDB contains 90…

NCBI Protein
Dataset

NCBI Protein

A collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.

TGD Wiki
Dataset

TGD Wiki Tetrahymena genome database Wiki

Gathers information about the Tetrahymena thermophila genome sequence. TGD Wiki provides a curation interface that allows users to update information about each gene: gene names, descriptions, Gene…

Uniclust
Dataset

Uniclust

Analyses protein sequence, predicts function and searches sequence. Uniclust databases cluster UniProtKB sequences at the level of 90%, 50% and 30% pairwise sequence identity. The sequences in the…

PDBSite
Dataset

PDBSite

Provides annotated protein functional sites. PDBSite integrates physicochemical, structural and functional characteristics of the sites. It was used to search novel functional sites in the mutants of…

eF-site
Dataset

eF-site electrostatic surface of Functional-site

Offers molecular surfaces of protein functional sites. eF-site is an online resource that scans each molecular surface at the same time with atomic mode. It provides electrostatic-surface of…

HAMAP
Web

HAMAP High-quality Automated and Manual Annotation of Proteins

A system for the classification and annotation of protein sequences. HAMAP consists of a collection of manually curated family profiles for protein classification, and associated annotation rules…

ProtClustDB
Dataset

ProtClustDB NCBI Protein Clusters Database

Provides information about proteins. ProtClustDB is a resource with two functions: (1) update RefSeq genomes with curated gene and protein information; (2) provide a central…

UniRef
Dataset

UniRef UniProt Reference clusters

Gives access to clustered sets of sequences from the UniProt Knowledgebase and selected UniParc records. UniRef is designed to remove sequence redundancy and reduce the number of sequences…

CCDB
Dataset

CCDB CyberCell DataBase

A comprehensive collection of detailed enzymatic, biological, chemical, genetic, and molecular biological data about E. coli (strain K12, MG1655).

EcoGene
Dataset

EcoGene

A database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12.

PMD
Dataset

PMD Protein Mutant Database

Provides a compilation of protein mutant data, with information on functional and/or structural influences brought about by amino acid mutations at specific positions of a protein. PMD is an…

LIS
Dataset

LIS Legume Information System

A genomic data portal (GDP) for the legume family. LIS provides access to genetic and genomic information for major crop and model legumes. With more than two dozen domesticated legume species, there…

MisPred
Dataset

MisPred Miss Predict Protein Database

Identifies erroneous (abnormal, incomplete and mispredicted) protein sequences in public databases. MisPred is a database that contains more than 80800 erroneous sequences identified in 19…

ChromDB
Dataset

ChromDB The chromatin database

Compiles information about chromatin-related proteins. ChromDB includes 7474 proteins: 3328 from plants, 1779 from animals, 2143 from fungi, 167 from stramenopiles, and 57 from protists. The…

SmProt
Dataset

SmProt Small Proteins database

Contains information about small proteins. SmProt is a database that provides a user-friendly website for users to submit, browse, search, BLAST, download or export data about small proteins.…

YRC PDR
Dataset

YRC PDR Yeast Resource Center Public Data Repository

Serves as a single point of access for the experimental data produced from many collaborations typically studying Saccharomyces cerevisiae (baker's yeast).

UniProtKB
Dataset

UniProtKB UniProt KnowledgeBase

A protein database partially curated by experts, consisting of two sections: UniProtKB/Swiss-Prot (containing reviewed, manually annotated entries) and UniProtKB/TrEMBL (containing unreviewed,…

UniParc
Dataset

UniParc UniProt Archive

A comprehensive and non-redundant database that contains most of the publicly available protein sequences in the world. Proteins may exist in different source databases and in multiple copies in the…

AAindex
Dataset

AAindex Amino-Acid Index

A database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. AAindex consists of three sections: AAindex1 for the amino…

CDD
Dataset

CDD Conserved Domain Database

Provides a public repository for annotation of proteins. CDD includes more than 56 000 records from all source databases, grouped into 5 600 multi-model superfamilies. The database incorporates…

LEGER
Dataset

LEGER

Supports functional Listeria genome analyses by combining information obtained by applying bioinformatics methods and from public databases to improve the original annotations. LEGER offers three…

LenVarDB
Dataset

LenVarDB

Provides information about length-variant protein domains. LenVarDB systematically and automatically gathers sequences, aligns them to pre-existing structure-guided Protein Alignments organized as…

PON-NMR
Web

PON-NMR

Classifies amino acid substitutions (AASs) in mismatch repair (MMR) proteins. PON-NMR ranks variants and prioritizes them for experimental validation. It is based on a machine learning method and is…

PROF_PAT
Dataset

PROF_PAT

Provides comparisons of amino acid sequences of interest with the bank of patterns in interactive mode. PROF_PAT is a database of patterns, constructed for groups of related proteins so that…

PHYTOPROT
Web

PHYTOPROT

Enables an immediate and intuitive representation of the relationships between sequences in a given cluster.

CIPRO
Dataset

CIPRO Ciona Intestinalis PROtein database

An integrated protein database for the tunicate species C. intestinalis. The database is unique in two respects: first, because of its phylogenetic position, Ciona is a suitable model for understanding…

PhID
Dataset

PhID

Gathers information on network-pharmacology-related interactions at the systemic level. PhID aims to provide a repository for visualizing relationships between entities such as drugs, targets,…

LiverWiki
Dataset

LiverWiki

Provides the research community with comprehensive liver-related data and allows the community to share their liver-related data flexibly and efficiently. LiverWiki integrates liver-related…

hivmut
Dataset

hivmut HIV Mutation

A database of mutagenesis and mutation information on Human Immunodeficiency Virus (HIV). Hivmut describes the phenotypes of 7,608 unique mutations at 2,520 sites in the HIV proteome, resulting from…

SIFGD
Dataset

SIFGD Setaria italica Functional Genomics Database

Provides search and analysis tools for bioinformatics analyses of gene function or regulatory modules. SIFGD was designed to integrate existing data from publications, to improve the proportion of…

RefProtDom
Dataset
UniProt…
Dataset

UniProt proteomes

Provides 'proteome' sets of proteins thought to be expressed by organisms whose genomes have been completely sequenced. UniProt proteomes is a database that gives access to “Reference…

Hypo
Dataset

Hypo

Offers a set of over 1000 curated records of the known ‘unknown’ regions in the human genome. Hypo presents a Biomarkers Matcher, along with a local gene card Matcher, which retrieves the Symbol,…

Kfits
Web
Desktop

Kfits

Fits noisy kinetic data. Kfits aims to differentiate and isolate signal from outliers. It has been tested on two very different datasets obtained by light scattering or by ThT fluorescence. This tool…

CHOmine
Dataset

CHOmine

Provides a comprehensive overview and thus a valuable resource for finding relevant data. CHOmine is an InterMine based data warehouse for Chinese Hamster Ovary (CHO) data that connects gene…

JCDB
Dataset

JCDB Jatropha curcas DataBase

A database which offers gene annotation of Jatropha curcas, also known as Barbados nut, purging nut or physic nut. Jatropha curcas is currently attracting much attention as a plant with high…

PRDB
Dataset

PRDB Protein Repeat DataBase

Performs a global comparative analysis of protein tandem repeats. PRDB is a curated database that includes the protein tandem repeats found in sequence databanks by the T-REKS program. It contains…

GPDE
Desktop

GPDE Griss Proteomics Database Engine

A biological proteomic database specifically designed for clinical proteomics and biomarker discovery. GPDE combines experiments based on investigated cell types thereby supporting customizable…

MUFOLD-DB
Dataset

MUFOLD-DB

Allows users to collect and process the weekly Protein Data Bank (PDB) files. MUFOLD-DB provides a web interface for users to browse, search and download several types of data. This database includes…

RNaseP Database
Dataset

RNaseP Database Ribonuclease P Database

Provides a compilation of ribonuclease P sequences, sequence alignments, secondary structures, three-dimensional models and accessory information. RNaseP Database contains information on bacterial,…

PISCES
Web

PISCES

A public server for culling sets of protein sequences from the Protein Data Bank (PDB) by sequence identity and structural quality criteria.

BioAfrica
Dataset

BioAfrica

Provides information on Human Immunodeficiency Virus (HIV) proteins. BioAfrica is a resource collaborating with the Swiss-Prot ViralZone group of the Swiss Institute of Bioinformatics (SIB). The database…

Harmonizome
Dataset

Harmonizome

A collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into approximately 72…

SInCRe
Dataset

SInCRe Structural Interactome Computational Resource

An integrated database for Mycobacterium tuberculosis H37Rv (Mtb) that collates information on protein sequences, domain assignments, functional annotation and 3D structural information along with…

CHOPIN
Dataset

CHOPIN

A database of models of the genome of Mycobacterium tuberculosis (Mtb). The CHOPIN database assigns structural domains and generates homology models for 2911 sequences, corresponding to approximately…

FixPred
Dataset

FixPred

A collection of protein sequences corrected by the FixPred pipeline. The database contains corrected UniProtKB/Swiss-Prot and NCBI/RefSeq sequences from Homo sapiens, Mus musculus, Rattus norvegicus,…

HypoxiaDB
Dataset

HypoxiaDB

Provides access to hypoxia-regulated proteins. HypoxiaDB is a comprehensive non-redundant catalog of proteins where manual curation along with the information from other resources has been…

STATdb
Dataset

STATdb

Holds information about signal transducers and activators of transcription (STAT). STATdb is an online repository including STAT sequences, representing the known STATome, integrating existing…

MuteinDB
Dataset

MuteinDB The Mutein Database

Provides information about kinetic characteristics of muteins. MuteinDB is a platform to collect, catalog, and store experimentally derived data about muteins. These data come from publicly available…

CSANDS
Dataset

CSANDS Coding Sequence and Structure

Simplifies a comprehensive analysis of codon usage. CSANDS provides an accurate, curated mapping between over 4400 protein structures and the mRNA that encodes them. It makes a comprehensive analysis…

dbDiarrhea
Dataset

dbDiarrhea

Gathers information about proteins involved in diarrhea pathogenesis. dbDiarrhea is an open source database which compiles over 800 proteins from 14 different species. Besides, the repository also…

CliPro
Dataset

CliPro

Reflects alterations between normal versus diseased conditions. CliPro contains proteins, diseases, biomarker, biofluid, protein type and plasma catalogues. Enables comparison between protein…

Uniprot DAT…
Desktop

Uniprot DAT File Parser

Can read a Uniprot .Dat file and parse out the information for each entry, creating a series of tab delimited text files or creating a FASTA file. Uniprot DAT File Parser is a command-line (console)…
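
As a rough illustration of the kind of conversion described above, the following Python sketch reads entries in the UniProtKB flat-file (.dat) layout and emits FASTA records or tab-delimited rows. It is not the Uniprot DAT File Parser itself: the function names (`parse_dat`, `to_fasta`, `to_tsv_row`), the chosen output columns and the sample entry are all assumptions made for this example, and only the `ID`, `AC`, `SQ` and `//` line types are handled.

```python
from typing import Dict, Iterable, Iterator, List

# Made-up entry in UniProtKB flat-file layout, for illustration only.
SAMPLE = """\
ID   TEST_HUMAN              Reviewed;                10 AA.
AC   P00001; Q00002;
DE   RecName: Full=Hypothetical test protein;
SQ   SEQUENCE   10 AA;  1111 MW;  0000000000000000 CRC64;
     MKTAYIAKQR
//
"""


def parse_dat(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict (id, accession, sequence) per flat-file entry."""
    entry_id = ""
    accessions: List[str] = []
    seq_parts: List[str] = []
    in_sequence = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("//"):
            # '//' terminates an entry: emit it and reset the state.
            if entry_id:
                yield {
                    "id": entry_id,
                    "accession": accessions[0] if accessions else "",
                    "sequence": "".join(seq_parts),
                }
            entry_id, accessions, seq_parts, in_sequence = "", [], [], False
        elif in_sequence:
            # Sequence lines are indented and split into space-separated blocks.
            seq_parts.append(line.replace(" ", ""))
        elif line.startswith("ID   "):
            entry_id = line[5:].split()[0]
        elif line.startswith("AC   "):
            accessions += [a.strip() for a in line[5:].split(";") if a.strip()]
        elif line.startswith("SQ   "):
            in_sequence = True


def to_fasta(entry: Dict[str, str], width: int = 60) -> str:
    """Format an entry as FASTA; the header layout is an assumption."""
    seq = entry["sequence"]
    body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    return f">{entry['accession']}|{entry['id']}\n{body}"


def to_tsv_row(entry: Dict[str, str]) -> str:
    """Format an entry as accession, entry name and sequence length."""
    return "\t".join([entry["accession"], entry["id"], str(len(entry["sequence"]))])


if __name__ == "__main__":
    for entry in parse_dat(SAMPLE.splitlines()):
        print(to_fasta(entry))
        print(to_tsv_row(entry))
```

Reading line by line keeps memory use flat, which matters because full UniProtKB .dat releases are very large; a real file would be passed as an open file handle instead of `SAMPLE.splitlines()`.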

eBLOCKs
Dataset

eBLOCKs

Enumerates several protein blocks with varied conservation levels for each functional domain. eBLOCKs contains blocks from an unclassified sequence database. This database is built with three major…

MulPSSM
Dataset

MulPSSM Multiple position-specific scoring matrices

Consists of position-specific scoring matrices (PSSMs) of protein domain sequence and structural families. MulPSSM is a web-based database which allows users to select datasets of PSSMs corresponding…

IndelFR
Dataset

IndelFR Indel Flanking Region Database

Provides data regarding indels and their flanking regions within homologous domains. IndelFR offers sequence and structure information of more than 2 900 000 indels and their flanking regions,…

DetoxiProt
Dataset

DetoxiProt

Facilitates annotation and acquisition of information about detoxification proteins. DetoxiProt is a database for detoxification protein researchers in the fields of physiology, pharmacology,…

MannDB
Dataset

MannDB

Gathers sequence analyses for complete proteomes of bacterial and viral pathogens from several governmental agencies' lists of bio-threat agents. MannDB is an open source database that…

Cyano2DBase
Dataset

Cyano2DBase

A protein-gene linkage map of the unicellular cyanobacterium Synechocystis. Cyano2DBase is intended to generate important functional information about ORFs that have no easily assignable role.…
