Protein sequence databases | Sequence analysis

One of the essential requirements of the proteomics community is a high-quality, annotated, nonredundant protein sequence database with an archival service and stable identifiers to enable protein identification and characterization.

IGC
Dataset

IGC integrated gene catalog

Represents a comprehensive resource for further investigations of the gut microbiome, covering strains with a diverse range of occurrence frequencies. IGC allows rapid and multi-omic profiling of the…

EBI
Dataset

EBI EMBL-EBI - The European Bioinformatics Institute

Supplies access to several biological data resources. EBI is a database that covers the entire range of biological sciences: raw DNA sequences to curated proteins, chemicals, structures, systems,…

UniProt
Dataset

UniProt Universal Protein Resource

A comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), the UniProt Archive…

RefSeq
Dataset

RefSeq Reference Sequence

Offers annotation for over 95 000 genomes. RefSeq is a high-quality online resource. It also assigns informative names to genes, provides some annotation for every…

neXtProt
Dataset

neXtProt

Offers a seamless integration of and navigation through protein-related data. neXtProt contains proteomics data for over 85% of human proteins. Moreover, this tool includes over 8000 phenotypic…

PATRIC
Dataset

PATRIC Pathosystems Resource Integration Center

Aims to assist scientists in infectious-disease research. PATRIC is a National Institutes of Health (NIH)-supported bioinformatics resource center that has been built to enable comparative genomic…

VectorBase
Dataset

VectorBase

A National Institute of Allergy and Infectious Diseases supported Bioinformatics Resource Center (BRC) for invertebrate vectors of human pathogens. VectorBase currently hosts the genomes of 35…

HPRD
Dataset

HPRD Human Protein Reference Database

Provides access to experimentally derived information about the human proteome including protein–protein interactions (PPIs), post-translational modifications (PTMs) and tissue expression. HPRD is an…

UniMES
Dataset

UniMES UniProt Metagenomic and Environmental Sequences

A repository specifically developed for metagenomic and environmental data.

CyBase
Dataset

CyBase

Provides information about cyclic proteins. CyBase aims to compile a standardized library gathering protein sequences, nucleic acid sequences, 3D structures and assay results. Moreover, the database…

RiboDB
Dataset

RiboDB

Provides a user-friendly tool allowing the rapid retrieval of ribosomal protein (r-protein) sequences for user-defined sets of prokaryotic species. The current version of RiboDB contains 90…

NCBI Protein
Dataset

NCBI Protein

A collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB.

TGD Wiki
Dataset

TGD Wiki Tetrahymena genome database Wiki

Gathers information about the Tetrahymena thermophila genome sequence. TGD Wiki provides a curation interface that allows users to update information about each gene: gene names, descriptions, Gene…

Uniclust
Dataset

Uniclust

Supports protein sequence analysis, function prediction and sequence searches. Uniclust databases cluster UniProtKB sequences at the level of 90%, 50% and 30% pairwise sequence identity. The sequences in the…

PDBSite
Dataset

PDBSite

Provides annotated protein functional sites. PDBSite integrates physicochemical, structural and functional characteristics of the sites. It was used to search novel functional sites in the mutants of…

eF-site
Dataset

eF-site electrostatic surface of Functional-site

Offers molecular surfaces of protein functional sites. eF-site is an online resource that scans each molecular surface at the same time with atomic mode. It provides electrostatic-surface of…

ProtClustDB
Dataset

ProtClustDB NCBI Protein Clusters Database

Provides information about proteins. ProtClustDB is a resource with two functions: (1) update RefSeq genomes with curated gene and protein information; (2) provide a central…

UniRef
Dataset

UniRef UniProt Reference clusters

Gives access to clustered sets of sequences from the UniProt Knowledgebase and selected UniParc records. UniRef is designed to remove sequence redundancy and reduce the number of sequences…

CCDB
Dataset

CCDB CyberCell DataBase

A comprehensive collection of detailed enzymatic, biological, chemical, genetic, and molecular biological data about E. coli (strain K12, MG1655).

EcoGene
Dataset

EcoGene

A database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12.

PMD
Dataset

PMD Protein Mutant Database

Compiles protein mutant data, with information on functional and/or structural influences brought about by amino acid mutations at specific positions of a protein. PMD is an…

LIS
Dataset

LIS Legume Information System

A genomic data portal (GDP) for the legume family. LIS provides access to genetic and genomic information for major crop and model legumes. With more than two dozen domesticated legume species, there…

MisPred
Dataset

MisPred Miss Predict Protein Database

Allows users to identify erroneous (abnormal, incomplete and mispredicted) protein sequences in public databases. MisPred is a database that contains more than 80800 erroneous sequences identified in 19…

ChromDB
Dataset

ChromDB The chromatin database

Compiles information about chromatin-related proteins. ChromDB includes over 7474 proteins from plants (3328), animals (1779), fungi (2143), stramenopiles (167), and protists (57). The…

SmProt
Dataset

SmProt Small Proteins database

Contains information about small proteins. SmProt is a database that provides a user-friendly website for users to submit, browse, search, blast, download or export data about small proteins.…

YRC PDR
Dataset

YRC PDR Yeast Resource Center Public Data Repository

Serves as a single point of access for the experimental data produced from many collaborations typically studying Saccharomyces cerevisiae (baker's yeast).

OWL
Dataset

OWL

Provides a non-redundant composite of the major publicly available primary sources, including a translated nucleic acid sequence database. OWL can be useful in the molecular biology community for…

UniProtKB
Dataset

UniProtKB UniProt KnowledgeBase

A protein database partially curated by experts, consisting of two sections: UniProtKB/Swiss-Prot (containing reviewed, manually annotated entries) and UniProtKB/TrEMBL (containing unreviewed,…

UniParc
Dataset

UniParc UniProt Archive

A comprehensive and non-redundant database that contains most of the publicly available protein sequences in the world. Proteins may exist in different source databases and in multiple copies in the…

AAindex
Dataset

AAindex Amino-Acid Index

A database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. AAindex consists of three sections: AAindex1 for the amino…

CDD
Dataset

CDD Conserved Domain Database

Provides a public repository for annotation of proteins. CDD includes more than 56 000 records from all source databases, grouped into 5 600 multi-model superfamilies. The database incorporates…

LEGER
Dataset

LEGER

Supports functional Listeria genome analyses by combining information obtained by applying bioinformatics methods and from public databases to improve the original annotations. LEGER offers three…

LenVarDB
Dataset

LenVarDB

Provides information about length-variant protein domains. LenVarDB systematically and automatically gathers sequences, aligns them to pre-existing structure-guided Protein Alignments organized as…

PROF_PAT
Dataset

PROF_PAT

Provides comparisons of amino acid sequences of interest with the bank of patterns in interactive mode. PROF_PAT is a database of patterns, constructed for groups of related proteins so that…

PHYTOPROT
Web

PHYTOPROT

Enables an immediate and intuitive representation of the relationships between sequences in a given cluster.

CIPRO
Dataset

CIPRO Ciona Intestinalis PROtein database

An integrated protein database for the tunicate species C. intestinalis. The database is unique in two respects: first, because of its phylogenetic position, Ciona is a suitable model for understanding…

PhID
Dataset

PhID

Gathers network-pharmacology-related interaction information at the systemic level. PhID aims to provide a repository for visualizing relationships between entities such as drugs, targets,…

LiverWiki
Dataset

LiverWiki

Provides the research community with comprehensive liver-related data and allows the community to share their liver-related data flexibly and efficiently. LiverWiki integrates liver-related…

hivmut
Dataset

hivmut HIV Mutation

A database of mutagenesis and mutation information on Human Immunodeficiency Virus (HIV). Hivmut describes the phenotypes of 7,608 unique mutations at 2,520 sites in the HIV proteome, resulting from…

SIFGD
Dataset

SIFGD Setaria italica Functional Genomics Database

Provides search and analysis tools for bioinformatics analyses of gene function or regulatory modules. SIFGD was designed to integrate existing data from publications, to improve the proportion of…

RefProtDom
Dataset

RefProtDom a protein database with improved domain boundaries and homology relationships

Informs about relationships and alignment boundaries between query domains and target library homologs. RefProtDom uses a diverse set of full-length, multi-domain proteins in the target library. It…

UniProt…
Dataset

UniProt proteomes

Provides 'proteome' sets of proteins thought to be expressed by organisms whose genomes have been completely sequenced. UniProt proteomes is a database that gives access to “Reference…

CHOmine
Dataset

CHOmine

Provides a comprehensive overview and thus a valuable resource for finding relevant data. CHOmine is an InterMine-based data warehouse for Chinese Hamster Ovary (CHO) data that connects gene…

JCDB
Dataset

JCDB Jatropha curcas DataBase

A database which offers gene annotation of Jatropha curcas, also known as Barbados nut, purging nut or physic nut. Jatropha curcas is currently attracting much attention as a plant with high…

PRDB
Dataset

PRDB Protein Repeat DataBase

Performs a global comparative analysis of protein tandem repeats. PRDB is a curated database that includes the protein tandem repeats found in sequence databanks by the T-REKS program. It contains…

GPDE
Desktop

GPDE Griss Proteomics Database Engine

A biological proteomic database specifically designed for clinical proteomics and biomarker discovery. GPDE combines experiments based on investigated cell types thereby supporting customizable…

MUFOLD-DB
Dataset

MUFOLD-DB

Allows users to collect and process the weekly Protein Data Bank (PDB) files. MUFOLD-DB provides a web interface for users to browse, search and download several types of data. This database includes…

A Syllabus of…
Dataset

A Syllabus of Human Hemoglobin Variants

Provides information about known human Hb variants. A Syllabus of Human Hemoglobin Variants contains more than 600 Hb variants. This resource is useful for researchers wishing to better understand…

RNaseP Database
Dataset

RNaseP Database Ribonuclease P Database

Provides a compilation of ribonuclease P sequences, sequence alignments, secondary structures, three-dimensional models and accessory information. RNaseP Database contains information on bacterial,…

Hypo
Dataset

Hypo

Offers a set of over 1000 curated records of the known ‘unknown’ regions in the human genome. Hypo presents a Biomarkers Matcher, along with a local gene card Matcher, which retrieves the Symbol,…
