Protein complex detection software tools | Interaction data analysis
Identification of protein complexes is crucial for understanding the principles of cellular organization and function. As protein-protein interaction datasets grow, a general trend is to represent the interactions as a network and to develop effective algorithms that detect significant complexes in such networks.
Allows qualitative and quantitative predictions of protein complexes from data on protein-protein interactions, protein domains and protein abundances. SiComPre simulates protein-protein interactions on a large scale to perform its predictions. Users can customize all parameters and settings of the simulations. This tool can be used to compare complexomes simulated with different settings, enabling context-dependent prediction of protein complexes. It also aims to investigate context-dependent changes in the complexome.
A graph clustering algorithm that is able to handle weighted graphs and readily generates overlapping clusters. Owing to these properties, ClusterONE is especially useful for detecting protein complexes in protein-protein interaction networks with associated confidence values. ClusterONE is available as a standalone command-line application, or as a plugin to Cytoscape or ProCope.
A fast program locating and visualizing overlapping, densely interconnected groups of nodes in undirected graphs, and allowing the user to easily navigate between the original graph and the web of these groups. We show that in gene (protein) association networks CFinder can be used to predict the function(s) of a single protein and to discover novel modules. CFinder is also very efficient for locating the cliques of large sparse graphs.
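The overlapping groups that CFinder reports follow the clique percolation idea: k-cliques that share k-1 nodes are chained into one community, so a node can belong to several communities. The sketch below is a brute-force illustration of that idea on a toy graph, not CFinder's own implementation; the function names and the example edges are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_clique_communities(edges, k):
    """Brute-force clique percolation; only suitable for small graphs."""
    adj = build_adjacency(edges)
    # Every k-node subset whose members are all pairwise connected.
    k_cliques = [
        frozenset(nodes)
        for nodes in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(nodes, 2))
    ]
    # Union-find over cliques: adjacent cliques share exactly k-1 nodes.
    parent = list(range(len(k_cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(k_cliques)), 2):
        if len(k_cliques[i] & k_cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of all cliques in one connected group.
    communities = {}
    for i, clique in enumerate(k_cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(sorted(members) for members in communities.values())


# Hypothetical toy network: two triangles sharing an edge, plus a third
# triangle attached through the single node D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
             ("D", "E"), ("D", "F"), ("E", "F")]
communities = k_clique_communities(toy_edges, k=3)
```

On this toy graph the two triangles sharing the edge B-C merge into one community, while node D ends up in both resulting communities.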
Identifies and predicts protein complexes. RNSC relies on modeling the protein-protein interaction (PPI) network with a graph where nodes represent proteins and edges correspond to interactions. It applies principles of both graph theory and gene ontology to identify likely protein complexes with scalable accuracy. It filters clusters according to several criteria: cluster size, cluster density and functional homogeneity.
Finds complexes in weighted protein-protein interaction (PPI) networks. CMC uses a maximal-clique approach in three steps: (1) discovering all maximal cliques in the weighted PPI network; (2) ranking the cliques by their weighted density; and (3) merging or removing highly overlapping cliques. The tool enumerates all maximal cliques using a depth-first search strategy.
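The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the CMC code: weighted density is taken here as the sum of edge weights divided by the number of node pairs, step 3 only removes overlapping cliques (it never merges them), and all names, thresholds and edge weights are hypothetical.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques depth-first (Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), set(adj), set())
    return cliques


def detect_complexes(weighted_edges, overlap_threshold=0.5, min_size=3):
    adj, weight = {}, {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
        weight[frozenset((u, v))] = w

    def weighted_density(clique):
        pairs = list(combinations(clique, 2))
        if not pairs:
            return 0.0
        return sum(weight[frozenset(p)] for p in pairs) / len(pairs)

    # Step 1: all maximal cliques of at least min_size proteins.
    cliques = [c for c in maximal_cliques(adj) if len(c) >= min_size]
    # Step 2: rank by weighted density, densest first.
    ranked = sorted(cliques, key=weighted_density, reverse=True)
    # Step 3 (simplified): drop a clique that overlaps too much with a
    # higher-ranked clique that was already kept.
    kept = []
    for clique in ranked:
        if all(len(clique & other) / len(clique) < overlap_threshold
               for other in kept):
            kept.append(clique)
    return [(sorted(c), weighted_density(c)) for c in kept]


# Hypothetical weighted PPI network (protein, protein, confidence).
toy_network = [("A", "B", 0.9), ("A", "C", 0.8), ("B", "C", 0.9),
               ("B", "D", 0.4), ("C", "D", 0.5), ("D", "E", 0.7),
               ("D", "F", 0.6), ("E", "F", 0.8)]
complexes = detect_complexes(toy_network)
```

In the toy network the low-confidence clique B-C-D shares two of its three proteins with the higher-ranked clique A-B-C and is therefore removed.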
Predicts protein complexes by sampling. PPSampler is based on the Metropolis-Hastings algorithm, a Markov chain Monte Carlo (MCMC) method that constructs samples from a given probability distribution. It replaces the sum of the weights of protein-protein interactions (PPIs) within a cluster with a generalized density of the cluster. The tool was evaluated using precision, recall, and F-measure computed on the predicted clusters.
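For readers unfamiliar with Metropolis-Hastings, the following generic sketch shows how the algorithm draws samples from a distribution known only up to a normalizing constant. It assumes a symmetric proposal and uses a hypothetical four-state toy target; it does not reproduce PPSampler's scoring function or its proposal over protein clusters.

```python
import math
import random


def metropolis_hastings(log_target, propose, initial, n_samples, rng):
    """Generic Metropolis-Hastings sampler for a symmetric proposal."""
    state = initial
    samples = []
    for _ in range(n_samples):
        candidate = propose(state, rng)
        log_ratio = log_target(candidate) - log_target(state)
        # Always accept uphill moves; accept downhill moves with
        # probability target(candidate) / target(state).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            state = candidate
        samples.append(state)
    return samples


# Hypothetical toy target: four states with unnormalized weights 1:2:3:4.
TOY_WEIGHTS = [1.0, 2.0, 3.0, 4.0]


def toy_log_target(state):
    return math.log(TOY_WEIGHTS[state])


def ring_proposal(state, rng):
    """Step to the left or right neighbour on a ring (symmetric)."""
    return (state + rng.choice((-1, 1))) % len(TOY_WEIGHTS)


rng = random.Random(0)
samples = metropolis_hastings(toy_log_target, ring_proposal, 0, 40000, rng)
frequencies = [samples.count(s) / len(samples) for s in range(4)]
```

With enough samples the empirical frequencies should approach the normalized weights 0.1, 0.2, 0.3 and 0.4, even though the sampler never computes the normalizing constant.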
A clustering method which decomposes a network into overlapping clusters and which is, therefore, capable of correct assignment of multifunctional proteins. The principle of OCG is to cover the graph with initial overlapping classes that are iteratively fused into a hierarchy according to an extension of Newman's modularity function. By applying OCG to a human protein-protein interaction network, we show that multifunctional proteins are revealed at the intersection of clusters and demonstrate that the method outperforms other existing methods on simulated graphs and PPI networks.
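As background for the modularity criterion, the sketch below computes the standard Newman modularity of a non-overlapping partition of an unweighted graph. OCG optimizes an extension of this function for overlapping classes, which is not reproduced here; the function name and the toy graph are hypothetical.

```python
def modularity(edges, communities):
    """Standard Newman modularity Q for a non-overlapping partition.

    Q = sum over communities of  L_c / m - (d_c / (2 m)) ** 2,
    where m is the number of edges, L_c the number of edges inside
    community c and d_c the summed degree of its nodes.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    q = 0.0
    for community in communities:
        members = set(community)
        internal = sum(1 for u, v in edges if u in members and v in members)
        degree_sum = sum(degree.get(node, 0) for node in members)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Hypothetical toy graph: two triangles joined by a single bridge edge.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("D", "E"), ("D", "F"), ("E", "F"),
             ("C", "D")]
q_split = modularity(toy_edges, [{"A", "B", "C"}, {"D", "E", "F"}])
q_single = modularity(toy_edges, [{"A", "B", "C", "D", "E", "F"}])
```

Splitting the toy graph into its two triangles gives a higher modularity than keeping all nodes in one class, which is the kind of comparison a hierarchical fusion procedure uses to decide where to stop.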
An algorithm for inferring protein complexes from weighted interaction graphs. By using graph topological patterns and biological properties as features, we model each complex subgraph by a probabilistic Bayesian network (BN). We use a training set of known complexes to learn the parameters of this BN model. The log-likelihood ratio derived from the BN is then used to score subgraphs in the protein interaction graph and identify new complexes.
Detects both parent and child modules in the gene ontology (GO) hierarchy. SR-MCL is an algorithm that helps users find functional modules or protein complexes and predict the function of unannotated proteins. It is based on manipulation of transition probabilities or stochastic flows between nodes of the graph. It aims to find locally dense sub-networks instead of globally clustering a graph.
A network refinement model based on the structural interface data of protein pairs for protein complex prediction. A simple PPI network, which is represented as a static entity, includes competitive interactions that cannot participate in complex formation together. In the proposed framework, a SPIN construction reserves sets of non-competitive interactions by considering mutual exclusions among the interactions in a network. This allows network-clustering algorithms to identify stable clusters that may correspond to actual protein complexes.
Allows prediction and evaluation of protein complexes from purification datasets, and integrates efficient implementations of the major prediction methods. ProCope can be useful both for applying published methods to new datasets to obtain reproducible predictions and for developing and evaluating new prediction methods. The software provides a graphical user interface (GUI), command-line tools suitable for batch job processing and a Java application programming interface (API). The GUI can also be used as a Cytoscape plugin.
Supports analysis and visual inspection of complexome profiling (CP) data. NOVA offers highly flexible and interactive inspection, exploration, and analysis of complexome data. It displays the migration profiles as a heat map with mouse-driven functions for visual inspection and data management.
Assists users in performing and evaluating simulations in the field of structure-based models (SBMs). eSBMTools is organized in modules that can be loaded into Python projects. It can be used at all stages of SBM simulations: from generating the SBM itself, through manipulation of the model and generation of configuration files, to extensive post-processing of simulation data. It also provides interfaces with a standard build of the GROMACS software suite.
Parses the PSI-MI 2.5 (PSI-25) XML files generated by the IntAct data repository, which collects, curates and stores thousands of protein interactions. Rintact is an R package that provides two main functions: (i) psi25interaction, which takes either a PSI-25 XML file from IntAct or a URL from which such an XML file can be obtained; and (ii) psi25complex, which also takes a PSI-25 XML file or URL as input, but requires that the file contain protein complex membership information.
Allows users to evaluate modules in poorly studied network regions without sacrificing the ability to faithfully evaluate well-studied communities. CommWalker is a module evaluation framework that takes the heterogeneity of annotation across the network into account. It can highlight modules with the potential to uncover functional structure in network regions where such advances are most needed.
Recognizes protein complexes by using multiple network alignments (MNAs). NEOComplex takes the result of an arbitrary MNA algorithm as input to provide orthology information. It is able to discover some biological examples that cannot be found by conventional complex identification tools, including particularly sparse protein complexes, and it can achieve a good balance between precision and sensitivity.
Characterizes the community structure of complex networks. SurpriseMe combines several community detection algorithms and calculates distances among the solutions they provide. It supplies distance matrices based on variation of information (VI) and normalized mutual information (NMI) to help users compare those solutions.
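The two measures named above can be computed from the cluster labels of two solutions. The sketch below is a minimal illustration, not SurpriseMe's code: it uses natural logarithms and normalizes mutual information by the arithmetic mean of the two entropies, which is one of several conventions in use. The function names and label vectors are hypothetical.

```python
import math
from collections import Counter


def _entropy(counts, n):
    """Shannon entropy (in nats) of a distribution given as counts."""
    return -sum((c / n) * math.log(c / n) for c in counts)


def compare_partitions(labels_a, labels_b):
    """Return (VI, NMI) for two labelings of the same nodes."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same nodes")
    n = len(labels_a)
    h_a = _entropy(Counter(labels_a).values(), n)
    h_b = _entropy(Counter(labels_b).values(), n)
    h_joint = _entropy(Counter(zip(labels_a, labels_b)).values(), n)

    mutual_info = h_a + h_b - h_joint
    vi = h_a + h_b - 2 * mutual_info
    # Two trivial one-cluster partitions are treated as identical.
    nmi = 1.0 if h_a + h_b == 0 else 2 * mutual_info / (h_a + h_b)
    return vi, nmi


# Hypothetical solutions from three algorithms on the same four nodes.
solution_1 = [0, 0, 1, 1]
solution_2 = [1, 1, 0, 0]   # same grouping, different label names
solution_3 = [0, 1, 0, 1]   # grouping independent of solution_1

vi_same, nmi_same = compare_partitions(solution_1, solution_2)
vi_diff, nmi_diff = compare_partitions(solution_1, solution_3)
```

Identical groupings give a VI of 0 and an NMI of 1 regardless of how the clusters are named, while independent groupings give an NMI near 0; applying the function to every pair of solutions yields the kind of distance matrix described above.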
Determines interaction signs in a physical network by exploiting network-based integer linear programming (ILP) formulations. NetworkAnnotation predicts signs of unannotated physical interactions from large-scale gene-knockout experiments by using various plausible models of cellular signaling. The application gives users access to a table compiling the investigated signs, their confidence scores and their type.
Identifies and visualizes functional modules with consideration of the core-peripheral structure of modules. The core idea of the NCMine algorithm is (i) the extraction of complete subgraph-like structures from networks based on a node-weighting scheme and (ii) merging local clusters based on module overlap. In a comparative analysis with other methods, NCMine achieved better performance.
A graph theoretic clustering algorithm that detects densely connected regions in large protein-protein interaction networks that may represent molecular complexes. The method is based on vertex weighting by local neighborhood density and outward traversal from a locally dense seed protein to isolate the dense regions according to given parameters. The algorithm has the advantage over other graph clustering methods of having a directed mode that allows fine-tuning of clusters of interest without considering the rest of the network and allows examination of cluster interconnectivity, which is relevant for protein networks.
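The two stages described above, vertex weighting and outward traversal from a seed, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the published algorithm: each vertex is weighted by the order k of the highest k-core of its neighborhood (the vertex plus its neighbors) multiplied by that core's density, and a neighbor joins the cluster when its weight is within a fixed fraction of the seed's weight. Parameter names, defaults and the toy graph are hypothetical, and post-processing is omitted.

```python
def density(adj, nodes):
    """Fraction of possible edges present among the given nodes."""
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(len(adj[u] & nodes) for u in nodes) / 2
    return edges / (n * (n - 1) / 2)


def highest_k_core(adj, nodes):
    """Return (k, core) for the highest non-empty k-core among nodes."""
    best_k, best_core = 0, set(nodes)
    k = 0
    while True:
        k += 1
        core = set(best_core)
        changed = True
        while changed:
            changed = False
            for u in list(core):
                if len(adj[u] & core) < k:
                    core.discard(u)
                    changed = True
        if not core:
            return best_k, best_core
        best_k, best_core = k, core


def vertex_weights(adj):
    weights = {}
    for v in adj:
        k, core = highest_k_core(adj, adj[v] | {v})
        weights[v] = k * density(adj, core)
    return weights


def grow_cluster(adj, weights, seed, weight_cutoff=0.2):
    """Traverse outward from seed, adding sufficiently heavy neighbours."""
    threshold = weights[seed] * (1 - weight_cutoff)
    cluster, frontier = {seed}, [seed]
    while frontier:
        node = frontier.pop()
        for neighbour in adj[node]:
            if neighbour not in cluster and weights[neighbour] >= threshold:
                cluster.add(neighbour)
                frontier.append(neighbour)
    return cluster


# Hypothetical toy network: a 4-clique A-B-C-D with a tail D-E-F.
toy_adj = {
    "A": {"B", "C", "D"}, "B": {"A", "C", "D"}, "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"}, "E": {"D", "F"}, "F": {"E"},
}
weights = vertex_weights(toy_adj)
seed = max(weights, key=weights.get)
cluster = grow_cluster(toy_adj, weights, seed)
```

In the toy network the four clique members receive the highest weights, so the traversal stops at D and leaves the sparse tail E-F outside the cluster.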