Computational protocol: Protein Complex Identification by Integrating Protein-Protein Interaction Evidence from Multiple Sources

Protocol publication

[…] After acquiring predicted PPI pairs, existing computational methods developed to identify protein complexes from PPI networks can be used. We employed seven state-of-the-art protein complex identification algorithms here: COACH, CMC, CFinder, MCODE, IPCA, ClusterONE and MCL.

COACH is based on a core-attachment method and detects protein complexes from PPI networks. It mines protein complex cores from neighborhood graphs and forms protein complexes by including attachments into cores. Proteins placed in the same protein complex core are functionally similar and tend to be colocalized.

CMC finds complexes from the weighted PPI network based on maximal cliques. It first uses an iterative scoring method (AdjustCD) to assign a weight to each protein pair. The weight of a protein pair indicates the reliability of the interaction between the two proteins. It then generates all the maximal cliques from the weighted PPI network. Finally, it removes or merges highly overlapping clusters based on their interconnectivity to determine protein complexes.

Adamcsek et al. provided software called CFinder to find functional modules in PPI networks. CFinder detects k-clique percolation clusters as functional modules using the Clique Percolation Method. In particular, a k-clique is a clique with k nodes, and two k-cliques are adjacent if they share (k - 1) common nodes. A k-clique percolation cluster is then constructed by linking all the adjacent k-cliques into a bigger subgraph.

The MCODE algorithm proposed by Bader et al. is one of the first computational methods to detect protein complexes based on the proteins' connectivity values in the PPI network. MCODE first weights every node by its local neighborhood density, then selects seed nodes with high weights as initial clusters and augments these clusters by traversing outward from the seeds. In addition, MCODE has an optional post-processing step with operations such as filtering non-dense subgraphs and generating overlapping clusters.

IPCA is a modified DPClus algorithm that expands clusters starting from seeded vertices. It performs better than DPClus because it uses a new topological structure for protein complexes, which combines subgraph diameter (or average vertex distance) and subgraph density.

The ClusterONE algorithm consists of three major steps (Online Methods). First, starting from a single seed vertex, a greedy procedure adds or removes vertices to find groups with high cohesiveness. In the second step, the extent of overlap between each pair of groups is quantified, and groups whose overlap score is above a specified threshold are merged. In the third step, complex candidates that contain fewer than three proteins or whose density is below a given threshold are discarded. Note that this method can detect potentially overlapping protein complexes.

MCL (Markov Clustering) is a method that identifies protein complexes by simulating random walks in PPI networks. It alternates two steps: expansion and inflation. The expansion step assigns new probabilities for all pairs of nodes, while the inflation step rescales the probabilities of these walks in the graph, favoring more probable walks over less probable ones. Iterating expansion and inflation separates the PPI network into many parts, which are taken as protein complexes. […]
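The k-clique percolation idea behind CFinder can be illustrated with a minimal sketch. This is not CFinder's implementation: it enumerates k-cliques by brute force, which is only practical for very small graphs, and the function name `k_clique_communities` and the toy edge list are hypothetical, chosen for illustration.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return k-clique percolation clusters of an undirected graph.

    Two k-cliques are adjacent when they share exactly k - 1 nodes;
    each cluster is the union of a connected group of adjacent k-cliques.
    Brute-force enumeration: suitable for toy graphs only.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Enumerate every k-subset of nodes and keep the fully connected ones.
    nodes = sorted(neighbors)
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(b in neighbors[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: join cliques that share k - 1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each connected group of cliques yields one (possibly overlapping) cluster.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Toy network: triangles ABC and BCD share an edge; triangle DEF shares only node D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
             ("D", "E"), ("D", "F"), ("E", "F")]
communities = k_clique_communities(toy_edges, k=3)
```

With k = 3, triangles ABC and BCD percolate into one cluster because they share two nodes, while DEF stays separate; node D belongs to both clusters, which shows how the method yields overlapping modules.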
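The second and third ClusterONE steps described above (merging overlapping groups, then discarding small or sparse candidates) can be sketched as follows. This is a simplified illustration rather than the ClusterONE implementation: the overlap score used here, |A ∩ B|² / (|A| · |B|), the pairwise merging loop, and all threshold values are assumptions, and the function names are hypothetical.

```python
from itertools import combinations


def overlap_score(a, b):
    """Assumed overlap measure: |A & B|^2 / (|A| * |B|), for non-empty sets."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def density(group, edges):
    """Fraction of possible internal edges that are present (unweighted)."""
    n = len(group)
    if n < 2:
        return 0.0
    internal = sum(1 for u, v in edges if u in group and v in group)
    return internal / (n * (n - 1) / 2)


def merge_and_filter(groups, edges, overlap_threshold=0.7,
                     min_size=3, min_density=0.5):
    """Merge groups whose overlap exceeds the threshold, then filter.

    `edges` must list each undirected edge once. Threshold values are
    illustrative assumptions, not defaults taken from any tool.
    """
    merged = [set(g) for g in groups]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            if overlap_score(merged[i], merged[j]) > overlap_threshold:
                merged[i] |= merged[j]   # absorb group j into group i
                del merged[j]
                changed = True
                break                    # restart scan after each merge
    return sorted(
        sorted(g) for g in merged
        if len(g) >= min_size and density(g, edges) >= min_density
    )


# Toy input: a 5-clique on a..e, an isolated pair x-y, and a sparse path p-q-r plus s.
toy_edges = list(combinations("abcde", 2)) + [("x", "y"), ("p", "q"), ("q", "r")]
candidates = [{"a", "b", "c", "d"}, {"a", "b", "c", "d", "e"},
              {"x", "y"}, {"p", "q", "r", "s"}]
complexes = merge_and_filter(candidates, toy_edges)
```

In this toy input the first two candidates merge into one dense group, the pair is dropped for having fewer than three proteins, and the sparse group is dropped for low density.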
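The expansion and inflation steps of MCL can be made concrete with a small pure-Python sketch. It assumes an unweighted adjacency matrix, adds self-loops before normalization, and uses an inflation exponent of 2; these choices, the convergence tolerance, and the function names are illustrative assumptions, not settings reported in the protocol.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix, i.e. take two-step random walks."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iterations=50, tol=1e-9):
    """Cluster a graph given as a symmetric 0/1 adjacency matrix."""
    n = len(adjacency)
    # Self-loops (an assumed, commonly used preprocessing step).
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iterations):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that keeps non-negligible mass is an attractor; the columns
    # it covers form one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an edge list."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a


# Toy network: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
toy = adjacency_from_edges(6, [(0, 1), (0, 2), (1, 2),
                               (3, 4), (3, 5), (4, 5), (2, 3)])
clusters = mcl(toy)
```

On this toy network the walk probability across the bridging edge shrinks with every round of inflation, so the two triangles end up as separate clusters. Larger inflation values generally produce finer partitions.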

Pipeline specifications

Software tools: CFinder, MCODE, IPCA, ClusterONE, DPClus
Application: Protein interaction analysis
Organisms: Saccharomyces cerevisiae