UniqueProt specifications

Information


Unique identifier: OMICS_20063
Name: UniqueProt
Alternative name: UniqueProt2
Software type: Application/Script
Interface: Command line interface
Restrictions to use: None
Input data: A file of protein sequences
Input format: FASTA
Operating system: Unix/Linux
Programming languages: Fortran
License: GNU General Public License version 3.0
Computer skills: Advanced
Stability: Stable
Maintained: Yes

UniqueProt in publications (11)
PMCID: 5665533
PMID: 29091950
DOI: 10.1371/journal.pone.0187306

[…] proteome. The solvent-accessible residues (‘exposed residues’) on the GPCRs were predicted using the PredictProtein package. The representative set of human proteins was obtained by using the UniqueProt software on the human proteome obtained from UniProt. Using the frequencies of each amino acid in the human proteome, we calculated the log base 10 fold change by dividing the frequency […]

PMCID: 5199198
PMID: 28025334
DOI: 10.1093/database/baw139

[…] proteomes. There are a host of software programs and methods aimed at minimizing redundancy within protein sequence databases. These include skipredundant, Decrease Redundancy, PISCES, UniqueProt, CD-HIT, FSA-BLAST, BLASTClust (ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html), MinSet, BlastCuller, FastCluster, Leaf, UCLUST and compressive genomics. […]

PMCID: 5054392
PMID: 27713481
DOI: 10.1038/srep34516

[…] roughly 470,000 proteins. We removed from our sets all proteins that were annotated as ‘uncharacterized’, ‘putative’, or ‘fragment’. We reduced sequence redundancy independently in each set using UniqueProt, ascertaining that no pair of proteins in one set had alignment length of less than 35 residues or a positive HSSP-value (HVAL ≥ 0). After redundancy reduction our sequence-unique sets […]

PMCID: 4670006
PMID: 26673203
DOI: 10.5256/f1000research.7734.r11122

[…] the yeast (S. cerevisiae) proteome from UniProt (proteome ID: UP000002311) as FASTA files including only the reviewed proteins (UniProtKB/Swiss-Prot). Removal of duplicates applying the method UniqueProt2 (with 100% pairwise sequence identity, keeping the longer sequence) left 5667 proteins. We considered the 16 nuclear chromosomes (matched through http://www.yeastgenome.org, […]

PMCID: 4100115
PMID: 24886813
DOI: 10.3390/ijms15069670

[…] used in LocTree. Eighteen localization classes are covered for eukaryotic proteins in LocTree2 as opposed to six in LocTree. Also, the sequence bias was reduced compared to LocTree by using the UniqueProt approach. This corresponds to a threshold of 20% and 25% sequence identity for sequences longer than 250 amino acids for the development and testing datasets, respectively. The highest […]
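
Several excerpts above refer to an HSSP-value (HVAL) cutoff and to identity thresholds near 20% and 25% for long alignments. The following is a minimal sketch of how such a value can be computed, assuming the length-dependent HSSP curve for percent identity as commonly given in the literature (Rost, 1999). The function names and the default cutoff are hypothetical and chosen for illustration; this is not UniqueProt's own code or command-line interface.

```python
import math


def hssp_threshold(length: int) -> float:
    """Percent identity on the assumed HSSP curve for an alignment of `length` residues."""
    if length <= 11:
        return 100.0
    if length > 450:
        return 19.5
    # Assumed curve: 480 * L^(-0.32 * (1 + e^(-L/1000)))
    return 480.0 * length ** (-0.32 * (1.0 + math.exp(-length / 1000.0)))


def hval(percent_identity: float, length: int) -> float:
    """Distance of an alignment from the curve; positive means above the curve."""
    return percent_identity - hssp_threshold(length)


def is_redundant(percent_identity: float, length: int, cutoff: float = 0.0) -> bool:
    """Treat a pair as redundant when its HVAL reaches the (hypothetical) cutoff."""
    return hval(percent_identity, length) >= cutoff


if __name__ == "__main__":
    # For a 250-residue alignment, compare 20% and 25% identity against the curve.
    for pide in (20.0, 25.0):
        print(pide, round(hval(pide, 250), 2), is_redundant(pide, 250))
```

Under this assumed curve, the threshold for a 250-residue alignment sits close to 20% identity, so raising the cutoff by five units moves it to roughly 25%, which matches the way the last excerpt pairs those two percentages.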


UniqueProt institution(s)
CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA; Institute of Physical Biochemistry, University Witten/Herdecke, Witten, Germany; Columbia University Center for Computational Biology and Bioinformatics (C2B2), New York, NY, USA; North East Structural Genomics Consortium (NESG), Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
UniqueProt funding source(s)
Supported by grants R01-GM63029-01 from the National Institutes of Health (NIH) and 1-R01-LM07329-01 from the National Library of Medicine (NLM).
