Protein annotation software tools | Sequence data analysis
The last decade has seen remarkable growth in protein databases. This growth comes at a price: a growing number of submitted protein sequences lack functional annotation. Approximately 32% of sequences submitted to UniProtKB, the most comprehensive protein database, are labelled as 'Unknown protein' or similar.
Serves as an integrated, multi-modular toolbox for sequence analysis and data management. Vector NTI Advance comprises several application modules and a set of data analysis and management tools. Its key features include gel simulation, in-silico cloning, primer design for amplification and sequencing, multiple sequence alignment, restriction analysis and contig assembly.
Gives access to many free software tools for sequence analysis. EMBOSS aims to serve the molecular biology community and supports the creation and release of software in an open source spirit. It brings tools for sequence analysis together into a seamless whole. It is free of charge.
Automatically assigns K numbers to genes in the genome, enabling reconstruction of KEGG pathways and BRITE hierarchies. The method is based on sequence similarities, bi-directional best hit information and some heuristics, and has achieved a high degree of accuracy when compared with the manually curated KEGG GENES database.
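The bi-directional best hit idea mentioned above can be illustrated with a minimal Python sketch. It assumes similarity scores are already available as nested dictionaries. The names `forward`, `reverse`, `ref_to_k` and the identifiers in the example data are hypothetical placeholders. The additional heuristics the tool applies are not reproduced here.

```python
def best_hit(scores):
    """Return the identifier with the highest score, or None if there are no hits."""
    if not scores:
        return None
    return max(scores, key=scores.get)


def bidirectional_best_hits(forward, reverse):
    """Pair each query gene with a reference gene when each is the other's best hit.

    forward[query][ref] and reverse[ref][query] hold similarity scores
    (assumed input format, for illustration only).
    """
    pairs = {}
    for query, hits in forward.items():
        ref = best_hit(hits)
        if ref is not None and best_hit(reverse.get(ref, {})) == query:
            pairs[query] = ref
    return pairs


def assign_k_numbers(pairs, ref_to_k):
    """Transfer the K number of the paired reference gene to each query gene."""
    return {q: ref_to_k[r] for q, r in pairs.items() if r in ref_to_k}


# Hypothetical example data: gene names, scores and the K number are placeholders.
forward = {
    "geneA": {"ref1": 250.0, "ref2": 90.0},
    "geneB": {"ref1": 120.0, "ref3": 110.0},
}
reverse = {
    "ref1": {"geneA": 245.0, "geneB": 118.0},
    "ref2": {"geneA": 88.0},
    "ref3": {"geneB": 105.0},
}
ref_to_k = {"ref1": "K00001"}

pairs = bidirectional_best_hits(forward, reverse)
assignments = assign_k_numbers(pairs, ref_to_k)
```

In this toy data only `geneA` receives a K number: `geneB` has `ref1` as its best hit, but `ref1` scores higher against `geneA`, so the hit is not bi-directional.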
Permits users to detail and annotate probesets on Affymetrix GeneChip microarrays. NetAffx allows users to search for probesets matching specified criteria such as annotation terms, as well as to identify any probesets relevant to a user-specified DNA sequence. It furnishes protein annotations derived by sequence homology, using the GRAPA method on collections of hidden Markov models (HMMs), representing well-characterized protein families.
Helps users interpret mass spectrometry (MS) data using annotated protein sequence databases. PeptideMass generates the theoretical peptide masses of any protein in the SWISS-PROT database, or of any user-specified sequence. It also aids peptide and protein characterization, and is useful for identifying proteins by peptide mass fingerprinting and by amino acid composition.
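As a rough illustration of how theoretical peptide masses can be derived from a sequence, the Python sketch below digests a protein with a simplified trypsin rule (cleave after K or R unless followed by P) and sums standard monoisotopic residue masses plus one water. The function names are hypothetical, and the sketch makes no claim about the tool's actual options, such as modifications, missed cleavages or charge state.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one H2O is added per free peptide


def tryptic_digest(sequence):
    """Split after K or R, except when the next residue is P (simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def theoretical_masses(sequence):
    """Map each tryptic peptide to its theoretical mass."""
    return {p: peptide_mass(p) for p in tryptic_digest(sequence)}


# Arbitrary example sequence, not taken from any database entry.
masses = theoretical_masses("AKRPGK")
```

A peptide mass fingerprint search would then compare such a list of theoretical masses against the observed peaks within a chosen tolerance.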
Calculates the geometries of hydrogen bonds. HBPLUS can describe neighbor interactions and compute hydrogen positions. It is able to consider hydrogens that can occupy more than one position and include amino-aromatic H-bonds. This tool investigates H-bonding near Asn, Gln and His side-chains and suggests optimal conformations.
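The kind of geometry involved can be sketched in Python as follows: given coordinates for a donor atom, its hydrogen and an acceptor atom, compute the donor-acceptor and hydrogen-acceptor distances and the donor-hydrogen-acceptor angle, then apply cutoffs. The function names are hypothetical and the default thresholds are illustrative assumptions, not values taken from the tool's documentation.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_deg(a, b, c):
    """Angle in degrees at point b, formed by the segments b-a and b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def hbond_geometry(donor, hydrogen, acceptor):
    """Return the three geometric descriptors of a candidate hydrogen bond."""
    return {
        "d_da": distance(donor, acceptor),
        "d_ha": distance(hydrogen, acceptor),
        "angle_dha": angle_deg(donor, hydrogen, acceptor),
    }


def is_hbond(donor, hydrogen, acceptor,
             max_da=3.9, max_ha=2.5, min_angle=90.0):
    """Apply distance and angle cutoffs (illustrative assumed values)."""
    g = hbond_geometry(donor, hydrogen, acceptor)
    return (g["d_da"] <= max_da and g["d_ha"] <= max_ha
            and g["angle_dha"] > min_angle)


# Idealised linear arrangement along the x axis, coordinates in angstroms.
geom = hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
linear_ok = is_hbond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
too_far = is_hbond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0))
```

The sketch takes the hydrogen position as given. Where hydrogens are absent from the structure, their positions would first have to be computed from the heavy-atom geometry.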
Allows users to determine various properties of each protein in an entire proteome. PA permits researchers to perform several tasks: (1) prediction of the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein; (2) prediction of the subcellular localization; or (3) creation of a custom classifier to predict a new property. Moreover, this tool can be used for any user-specified ontology.