ProtClustDB specifications


Unique identifier: OMICS_21322
Name: ProtClustDB
Alternative name: Protein Clusters Database
Restrictions to use: None
Community driven: Yes
Data access: File download, Browse
User data submission: Not allowed
Maintained: Yes


  • William Klimke

Publication for Protein Clusters Database

ProtClustDB citations


Contribution of increased mutagenesis to the evolution of pollutants degrading indigenous bacteria

PLoS One
PMCID: 5544203
PMID: 28777807
DOI: 10.1371/journal.pone.0182484

[…] ed contigs using the program BLASTP version 2.2.31 with default parameters except for the E value cutoff which was 10⁻³. The database file for searches was created from sequences obtained from NCBI’s Protein Clusters database (1976 sequences of different proteins involved in DNA replication and repair). The ORFs of the selected contigs were subsequently predicted using the program Prodigal version […]


Transposable element assisted evolution and adaptation to host plant within the Leptosphaeria maculans Leptosphaeria biglobosa species complex of fungal pathogens

BMC Genomics
PMCID: 4210507
PMID: 25306241
DOI: 10.1186/1471-2164-15-891

[…] formed on all versus all BLASTp results from protein sequences of 80 annotated genomes at NCBI (Additional file : Table S13). This procedure is similar to that used to generate clusters in the Entrez Protein Clusters database at NCBI (ProtClustDB). The following filters were applied: cluster members were required to have compositional BLAST hits covering at least 70% of each protein length and a p […]


Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools

BMC Genomics
PMCID: 3618134
PMID: 23547799
DOI: 10.1186/1471-2164-14-211

[…] OG clusters, Pfam, TIGRfam, Gene Ontology etc. Functional annotations may be further “grouped” into metabolically relevant “pathways” such as COG functional categories, Entrez Protein Clusters (ProtClustDB), FIGfams-subsystems, KEGG and MetaCyc pathway collections, etc. Thereafter, annotated genomes might be maintained by integrated network systems such as RAST-SEED, IMG and other […]


Population Diversity of ORFan Genes in Escherichia coli

Genome Biol Evol
PMCID: 3514957
PMID: 23034216
DOI: 10.1093/gbe/evs081

[…] Clusters of homologous sequences used in this study ultimately derive from the Protein Clusters database, a collection of automatically clustered Reference Sequence proteins from complete genomes (of prokaryotes, plasmids, viruses, organelles, and complete and incomplete geno […]


Solving the Problem: Genome Annotation Standards before the Data Deluge

Stand Genomic Sci
PMCID: 3236044
PMID: 22180819
DOI: 10.4056/sigs.2084864

[…] Automated and Manual Annotation of microbial and chloroplast Proteomes (HAMAP), the Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology groups (KO) that uses NCBI Reference Sequences, and NCBI's Protein Clusters database that includes prokaryote, viral, and selected eukaryotic organism groups (ProtClustDB). The TIGRFAMs and HAMAP projects contain only curated families, whereas KEGG […]


CDD: specific functional annotation with the Conserved Domain Database

Nucleic Acids Res
PMCID: 2686570
PMID: 18984618
DOI: 10.1093/nar/gkn845

[…] the time is spent formatting the output. Users of the service can choose between the default search set, which collates NCBI-curated domain models and those imported from SMART, Pfam, COGs and NCBI's Protein Clusters database, and individual search sets, such as all of the above and the KOGs collection, which is not part of the default set. When users submit protein query sequences to NCBI's pro […]


ProtClustDB institution(s)
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, USA

