Protein annotation software tools | Sequence data analysis
The last decade has seen remarkable growth in protein databases. This growth comes at a price: a growing number of submitted protein sequences lack functional annotation. Approximately 32% of the sequences submitted to UniProtKB, the most comprehensive protein database, are labelled as 'Unknown protein' or similar.
An ontology for protein chemical modifications. PSI-MOD is a hierarchical controlled vocabulary that forms a directed acyclic graph: the terms and definitions for protein chemical modifications are the nodes, and the specific relationships that logically link them are the edges. The PSI-MOD search allows users to query the full content of a number of existing public resources about protein modification and to navigate easily through the hierarchy of modifications.
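Navigating such a hierarchy amounts to walking the edges of a directed acyclic graph. The following is a minimal sketch under stated assumptions: the term identifiers (`EX:...`), the `TOY_PARENTS` table and the two helper functions are hypothetical and invented for illustration. They are not real PSI-MOD entries, and this is not the PSI-MOD search implementation.

```python
from collections import deque

# Hypothetical toy ontology: each term maps to its direct parents.
# Identifiers are invented for illustration; they are not real PSI-MOD terms.
TOY_PARENTS = {
    "EX:modified_residue": [],
    "EX:phosphorylated_residue": ["EX:modified_residue"],
    "EX:modified_serine": ["EX:modified_residue"],
    "EX:phosphoserine": ["EX:phosphorylated_residue", "EX:modified_serine"],
}


def ancestors(term, parents):
    """Return every term reachable by following parent edges upward."""
    seen = set()
    queue = deque(parents[term])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(parents[current])
    return seen


def descendants(term, parents):
    """Return every term that has `term` among its ancestors."""
    return {other for other in parents if term in ancestors(other, parents)}


if __name__ == "__main__":
    print(sorted(ancestors("EX:phosphoserine", TOY_PARENTS)))
    print(sorted(descendants("EX:modified_residue", TOY_PARENTS)))
```

Because a term may have more than one parent, the structure is a graph rather than a tree, which is why the traversal tracks the terms it has already visited.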
Allows users to access and preprocess structural data for all kinds of life science research, and gives an immediate visual impression of the overall protein structure and the ligand molecules it contains. ProteinPlus is a server of special interest to life scientists with an occasional need to work with protein structures. It offers six services addressing the most important tasks at the beginning of structure analysis (Protoss; PoseView; EDIA; SIENA; DoGSiteScorer; HyPPI). Users can choose a service of interest, set additional tool configurations and start the calculation.
Gives access to many free software tools for sequence analysis. EMBOSS aims to serve the molecular biology community. It supports the creation and release of software in an open source spirit, and it integrates tools for sequence analysis into a seamless whole. It is free of charge and open source.
Detects protein structural and functional features from sequence. BAR transfers statistically validated annotation among sequences that enter a cluster after the alignment has been constrained with stringent similarity criteria. The latest version allows users to query new sequences, to search by features such as ligands and organism, or to use the server to investigate cross-cluster connections.
A high-throughput tool for more reliable functional annotation. PANNZER predicts Gene Ontology (GO) classes and free text descriptions about protein functionality. It uses weighted k-nearest neighbour methods with statistical testing to maximize the reliability of a functional annotation.
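The idea of weighted k-nearest neighbour voting can be sketched as follows. This is a simplified illustration, not PANNZER's actual scoring function: the function name `weighted_knn_scores`, the similarity values and the normalisation are assumptions, and the statistical testing step is omitted.

```python
from collections import defaultdict


def weighted_knn_scores(hits, k):
    """Score labels by similarity-weighted votes from the k most similar hits.

    hits: list of (similarity, set_of_labels) pairs with positive similarity.
    Returns (label, score) pairs sorted from highest to lowest score.
    """
    nearest = sorted(hits, key=lambda hit: hit[0], reverse=True)[:k]
    total_weight = sum(similarity for similarity, _ in nearest)
    if total_weight <= 0:
        return []
    votes = defaultdict(float)
    for similarity, labels in nearest:
        for label in labels:
            votes[label] += similarity
    scored = [(label, weight / total_weight) for label, weight in votes.items()]
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Invented example: similarity of each database hit and the GO terms it carries.
EXAMPLE_HITS = [
    (0.9, {"GO:0005524", "GO:0016301"}),
    (0.8, {"GO:0005524"}),
    (0.4, {"GO:0003677"}),
    (0.1, {"GO:0003677"}),
]

if __name__ == "__main__":
    for label, score in weighted_knn_scores(EXAMPLE_HITS, k=3):
        print(f"{label}\t{score:.3f}")
```

Weighting each vote by similarity means that a term carried by two close neighbours outranks a term carried by one distant neighbour, even when both fall inside the k nearest hits.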
An ontological representation of protein-related entities. PRO defines and describes taxon-specific and taxon-neutral protein-related entities in three major areas: proteins related by evolution; proteins produced from a given gene; and protein-containing complexes. PRO thus serves as a tool for referencing protein entities at any level of specificity. PRO organizes these entities into classes describing proteins derived from homologs (‘family level’ classes), from a single gene (‘gene level’ classes), from a single transcript (‘sequence level’ classes), or from a set of modifications (‘modification level’ classes). Each of these categories of classes is neutral with respect to taxonomy, but there are also taxon-specific versions (e.g. ‘organism-gene level’), thus allowing PRO to highlight connections and differences within and across species.
A web app for the automatic identification of hot spots and the design of smart libraries for engineering protein stability, catalytic activity, substrate specificity and enantioselectivity. HotSpot Wizard integrates sequence, structural and evolutionary information obtained from 3 databases and 20 computational tools. The method provides comprehensive annotations of protein structures and assists protein engineers with the rational design of site-specific mutations and focused libraries for site-directed mutagenesis and focused directed evolution experiments.
A powerful computational platform for probabilistic protein domain annotation. Compared to SIFTER 2.0, SIFTER-T achieved an 87-fold performance improvement on published test data sets for the module that recovers known annotations and a 72.3% speed increase for the gene tree generation module on quad-core machines, as well as a major decrease in memory usage during the realignment phase.
Provides Gene Ontology (GO) annotations for query protein sequences based on the functional classification of the domain-based CATH-Gene3D resource. FunFHMMer also provides valuable information for the prediction of functional sites. The FunFHMMer web server can be queried using a protein sequence in the FASTA format or by entering UniProt/GenBank sequence identifiers as input in the text area on the webpage.
Allows comparative genomics queries that combine functional information and protein families with taxonomic classification. GOTaxExplorer offers four query types: (1) selection of sets, (2) comparison of sets of Pfam families, (3) semantic comparison of sets of gene ontology (GO) terms and (4) functional comparison of sets of gene products. This software lets users customize sets of GO terms, families or taxonomic groups.
An automated bioinformatic pipeline for prediction and classification of Rab GTPases. Rabifier2 is a major update over the pipeline originally developed to identify Rab GTPases and classify them into subfamilies based on the protein sequence. It provides major improvements in Rab annotation, both in terms of speed and accuracy. It is used to annotate Rab diversity across Eukaryotes, which can be explored through the web.
Offers an interface for the analysis of protein families. PipeAlign automates the initial stages of the analysis process. It retrieves homologous sequences and other related information, and organizes this information hierarchically in the context of a multiple alignment of complete sequences (MACS). This tool also contains features for the refinement, validation and clustering of existing multiple sequence alignments.
Classifies and annotates protein sequences with the HAMAP database. Hamap-Scan can be used on large datasets such as whole proteome sequences. This tool provides a heuristic filter to score and select possible candidate matches.
Allows users to determine various properties of each protein in an entire proteome. PA permits researchers to perform several tasks: (1) prediction of the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein; (2) prediction of the subcellular localization; or (3) creation of a custom classifier to predict a new property. Moreover, this tool can be used for any user-specified ontology.
Furnishes a de novo protein annotation system. ANNIE is a web interface that provides about twenty algorithms covering the first steps of segment-based sequence analysis. The platform aims to supply a review of the possible functional assignments in protein sequence sets. Results can be examined separately or displayed together through a sequence viewer that includes a histogram and a taxonomy view.
Provides search and programmatic access to protein and associated genomics data, such as curated protein sequence positional annotations from UniProtKB, as well as mapped variation and proteomics data from large scale data sources (LSS). The Proteins API can retrieve the genomic sequence coordinates for proteins in UniProtKB. It lets users ask questions based on their field of expertise and gain an integrated overview of the available protein annotations, helping them understand the role of proteins in biological processes.
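Programmatic access of this kind typically means requesting one resource per UniProtKB accession over HTTP. The following is a minimal sketch: the base URL, the resource names in `RESOURCES` and the helper names `build_url` and `fetch_json` are assumptions made for illustration and should be verified against the official Proteins API documentation before use.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL and resource names; verify against the official documentation.
BASE_URL = "https://www.ebi.ac.uk/proteins/api"
RESOURCES = {"proteins", "features", "coordinates", "variation", "proteomics"}


def build_url(resource, accession):
    """Build the request URL for one resource and one UniProtKB accession."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    return f"{BASE_URL}/{resource}/{urllib.parse.quote(accession, safe='')}"


def fetch_json(resource, accession, timeout=30):
    """Fetch one record as parsed JSON (performs a network request)."""
    request = urllib.request.Request(
        build_url(resource, accession),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # P05067 is used here only as an example accession.
    print(build_url("coordinates", "P05067"))
```

Keeping URL construction separate from the network call makes the request easy to inspect and test without contacting the server.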
Maps protein features onto 3D structures. FeatureMap3D is a web application that works in 3 steps: (i) it searches for the PDB match that offers the best combination of homology and resolution, and provides a file containing the mapping between the input data and the PDB match; (ii) it provides a PyMOL script that colors the matched PDB structure according to the coverage and quality of the hit; and (iii) it generates a publication-quality ray-traced image.
Serves for statistical modeling, visualization, discovery and annotation of protein motif specificity determinants. PSSMSearch takes a set of known functional motifs, defines the preferences of a motif-recognizing pocket, and uses this information to search for novel regions of the proteome matching the preferences of the pocket. Furthermore, this tool was designed to find human and viral docking motifs for a human phosphatase.
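The general approach of deriving position-specific preferences from known motifs and scanning sequences for matches can be sketched with a basic log-odds position-specific scoring matrix (PSSM). This is a generic illustration under stated assumptions, not PSSMSearch's own scoring scheme: the function names, the uniform background, the pseudocount and the example peptides are all invented.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length peptides (uniform background)."""
    length = len(peptides[0])
    if any(len(peptide) != length for peptide in peptides):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for position in range(length):
        column = [peptide[position] for peptide in peptides]
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; return hits sorted best first."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(aa not in AMINO_ACIDS for aa in window):
            continue  # skip windows with non-standard residues
        score = sum(pssm[i][aa] for i, aa in enumerate(window))
        hits.append((start, window, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    # Invented peptides and sequence, for illustration only.
    matrix = build_pssm(["RSPAK", "KSPGR", "RSPLK"])
    for start, window, score in scan("MAARSPAKLLGG", matrix)[:3]:
        print(start, window, round(score, 2))
```

The pseudocount keeps unseen residues from producing a score of minus infinity, so that a window with one unexpected residue is penalised rather than excluded outright.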
Allows users to automatically annotate structural models. 3DBIONOTES lets users: (i) display the macromolecular structure they submitted or queried; (ii) view the biomedical and biochemical data gathered from different sources; and (iii) obtain the alignment between the selected UniProt sequence and the active chain of the structural model. However, the tool was only available for structures already released in structural databases.
Annotates protein sequence alignments with three-dimensional structural information. JOY serves as a post-processor to a protein structural alignment program: it takes an alignment file and generates annotated alignments. Users can visualize 3D structural information in a sequence alignment to understand the conservation of amino acids in their specific local environments.