Protein subcellular localization detection software tools | Sequence data analysis
The function of a protein is generally related to its subcellular localization. Therefore, knowing its subcellular localization is helpful in understanding its potential functions and roles in biological processes.
Predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes. SignalP is a neural network–based method which can discriminate signal peptides from transmembrane regions. The software incorporates a prediction of cleavage sites and a signal peptide/non-signal peptide prediction based on a combination of several artificial neural networks.
Provides a suite of methods important for the prediction of protein structural and functional features. PredictProtein is a web server that incorporates over 30 tools. This software searches up-to-date public sequence databases, creates alignments, and predicts aspects of protein structure and function. It can help when little is known about the protein in question. For medium-to-high throughput analyses, downloadable software packages and the PredictProtein Machine Image (PPMI) are available.
Provides machine learning and visualization methods for interrogating and analyzing quantitative mass spectrometry (MS) data to infer protein sub-cellular localization. pRoloc is an R package suited for spatial proteomics data analysis that performs sub-cellular localization prediction from experimental and condition-specific MS-based quantitative proteomics data. The software allows classification of proteins into tens of sub-cellular compartments.
Allows users to predict the subcellular location of eukaryotic proteins. TargetP is a web application that scores N-terminal presequences in a submitted protein. The software reports the predicted localization based on the presence of a chloroplast transit peptide (cTP), mitochondrial targeting peptide (mTP) or secretory pathway signal peptide (SP). Options include a choice between the Plant and Non-Plant versions, user-defined cutoffs and optional cleavage site prediction.
A tool for the analysis of subcellular proteomics data, based on the use of standardized lists of subcellular markers. Using MetaMass, its authors analyzed data from 11 studies, mapping the subcellular location of 5,970 proteins. The analysis revealed large variations in the performance of subcellular fractionation protocols as well as systematic biases in protein annotation databases. The Excel and R versions of MetaMass should enhance transparency and reproducibility in subcellular proteomics.
A computational method that, given any eukaryotic protein sequence, performs three different tasks: i) the detection of targeting peptides; ii) their classification as mitochondrial or chloroplastic; and iii) the precise localization of the cleavage sites in an organelle-specific framework. TPpred3 reportedly outperforms state-of-the-art methods in all three tasks.
Predicts the subcellular location of human proteins from their amino acid sequences. Hum-mPLoc represents sequences by multi-view complementary features, i.e., context vocabulary annotation-based gene ontology (GO) terms, peptide-based functional domains, and residue-based statistical features. The major updates include: i) taking into consideration feature correlation and the hierarchical structure of GO terms; ii) extracting residue features from different segments of the N- and C-terminals; and iii) use of the latest versions of the gene ontology, the conserved domain database and the SWISS-PROT database.
A computer program for the prediction of classical importin-alpha/beta pathway-specific nuclear localization signals (NLSs). cNLS Mapper calculates NLS activities by using NLS profiles and an additivity-based motif scoring algorithm. This calculation method reportedly achieved significantly higher prediction accuracy, in terms of both sensitivity and specificity, than the methods available at the time.
Provides a multi-label predictor that can improve the prediction quality for the subcellular localization of animal proteins. pLoc-mAnimal can deal with proteins that occur in multiple locations. It is a useful high-throughput tool for annotating the subcellular location(s) of animal proteins. The tool was evaluated using the jackknife test, in which it reportedly achieved an absolute true success rate 37% higher than that of the state-of-the-art predictor.
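To make the "absolute true" success rate concrete, here is a minimal Python sketch of the metric as it is commonly defined for multi-label predictors: a protein counts as correct only when its predicted set of locations exactly matches the annotated set. The function name and the toy location sets are illustrative assumptions and are not taken from pLoc-mAnimal.

```python
def absolute_true_rate(true_sets, predicted_sets):
    """Fraction of proteins whose predicted location set equals the true set.

    Both arguments are sequences of iterables of location labels, aligned
    by protein. A partially correct prediction counts as a miss.
    """
    if len(true_sets) != len(predicted_sets):
        raise ValueError("true_sets and predicted_sets must have equal length")
    if not true_sets:
        raise ValueError("at least one protein is required")
    exact = sum(
        1 for t, p in zip(true_sets, predicted_sets) if set(t) == set(p)
    )
    return exact / len(true_sets)


# Hypothetical annotations for four proteins (illustration only).
example_true = [{"nucleus"}, {"cytoplasm", "nucleus"}, {"membrane"}, {"mitochondrion"}]
example_pred = [{"nucleus"}, {"nucleus"}, {"membrane"}, {"mitochondrion", "cytoplasm"}]
example_rate = absolute_true_rate(example_true, example_pred)
```

Because under-prediction and over-prediction are both penalized, this is the strictest of the usual multi-label metrics, which is why gains on it are reported separately.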
A probabilistic generative model for protein localization. MDLoc takes advantage of the location inter-dependencies and location-feature dependency to provide a generalizable method for predicting multiple locations for proteins.
An extension of the PSORT II program for protein subcellular location prediction. WoLF PSORT converts protein amino acid sequences into numerical localization features based on sorting signals, amino acid composition and functional motifs such as DNA-binding motifs. After conversion, a simple k-nearest neighbor classifier is used for prediction. Using HTML, the evidence for each prediction is shown in two ways: (i) a list of proteins of known localization with the most similar localization features to the query, and (ii) tables with detailed information about individual localization features. WoLF PSORT not only provides subcellular localization prediction with competitive accuracy, but also provides detailed information relevant to protein localization to help users form their own hypotheses.
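The k-nearest neighbor step described above can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not WoLF PSORT's implementation: the plain Euclidean distance, the two-dimensional feature vectors and the location labels are all hypothetical, and the real program's feature set and distance weighting are not reproduced here.

```python
import math
from collections import Counter


def knn_predict(query, training, k=3):
    """Classify `query` by majority vote among its k nearest neighbors.

    `training` is a list of (feature_vector, label) pairs. The neighbors
    are returned alongside the label so they can be shown as evidence.
    """
    if k < 1 or k > len(training):
        raise ValueError("k must be between 1 and the training set size")
    # Sort known proteins by Euclidean distance to the query features.
    ranked = sorted(training, key=lambda item: math.dist(query, item[0]))
    neighbors = ranked[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0], neighbors


# Hypothetical localization features for proteins of known location.
toy_training = [
    ([0.9, 0.1], "nucl"),
    ([0.8, 0.2], "nucl"),
    ([0.1, 0.9], "mito"),
    ([0.2, 0.8], "mito"),
    ([0.5, 0.5], "cyto"),
]
toy_label, toy_neighbors = knn_predict([0.85, 0.15], toy_training, k=3)
```

Returning the neighbors together with the predicted label mirrors the idea of listing the known proteins most similar to the query, so that a user can inspect why a prediction was made.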
An SVM-based method for predicting the subcellular localization of eukaryotic proteins using various protein features. Three features are taken into consideration in the development of the method: i) physicochemical properties, ii) amino acid composition and iii) dipeptide composition. The prediction accuracies of the amino acid composition, physicochemical properties and dipeptide composition modules are 78.1%, 77.8% and 82.4%, respectively.
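As an illustration of two of these feature types, the Python sketch below computes amino acid composition (a 20-dimensional vector of residue fractions) and dipeptide composition (a 400-dimensional vector of adjacent-pair fractions) from a sequence. The function names, the residue ordering and the choice to skip non-standard residues are assumptions made for this example; the tool's own preprocessing may differ.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return 20 residue fractions, ordered as in AMINO_ACIDS."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 adjacent-pair fractions, ordered AA, AC, ..., YY."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skips pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts.values()]
```

Either vector can then be passed to a classifier such as an SVM. Dipeptide composition retains some local sequence-order information that plain amino acid composition discards.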
Determines protein subcellular localization. MultiLoc2 integrates several sub-predictors based on overall amino acid composition, the presence of sorting signals, phylogenetic profiles and gene ontology (GO) terms. This tool is based on an extendable protein profile vector and supports the integration of heterogeneous and relevant information. It is useful for novel proteins without relevant sequence similarity to annotated proteins.
Predicts nine sub-cellular localizations. SubCons employs a Random Forest (RF) classifier to combine four predictors. It can generate Position-Specific Scoring Matrices (PSSMs) and offers users the option to submit entire proteomes. This tool helps users understand the localization of a protein, in particular because it scales to complete genomes. It provides state-of-the-art predictions, and a confidence score rates the reliability of each prediction.
Allows users to determine various properties of each protein in an entire proteome. PA permits researchers to perform several tasks: (1) prediction of the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein; (2) prediction of the subcellular localization; or (3) creation of a custom classifier to predict a new property. Moreover, this tool can be used for any user-specified ontology.
Predicts protein localization. PLPD can estimate the likelihood of a specific localization for a protein by using the Density-induced Support Vector Data Description (D-SVDD). D-SVDD is extended in this algorithm to predict protein subcellular localization. It uses three measures to assess and refine the protein localization predictor. The PLPD approach is complementary to other methods such as the nearest neighbor or the covariant discriminant method.