COMSAT is a hybrid framework for residue contact prediction of transmembrane (TM) proteins that integrates a support vector machine (SVM) method and a mixed integer linear programming (MILP) method. It consists of two modules: COMSAT_SVM, which is trained mainly on position-specific scoring matrix features, and COMSAT_MILP, which is an ab initio method based on optimization models. Contacts predicted by the SVM model are ranked by SVM confidence scores, and a threshold is trained to improve the reliability of the predicted contacts. The hybrid contact prediction scheme was tested on two independent TM protein sets, with a contact defined as a Cα-Cα distance of 14 Å. COMSAT shows satisfactory results when compared with 12 other state-of-the-art predictors, and its prediction accuracy is more robust as the length and complexity of TM proteins increase.
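The contact definition and the score-based filtering described above can be illustrated with a minimal Python sketch. This is not COMSAT's implementation: the function names `contact_pairs` and `rank_and_filter`, the `min_separation` parameter, the inclusive cutoff comparison, and the example threshold value are all assumptions made for illustration. Only the 14 Å Cα-Cα cutoff and the idea of ranking predictions by confidence and applying a threshold come from the description.

```python
import math

# 14 Å Cα-Cα cutoff, as stated in the description above.
CONTACT_CUTOFF = 14.0


def contact_pairs(ca_coords, cutoff=CONTACT_CUTOFF, min_separation=1):
    """Return the set of residue index pairs (i, j), i < j, in contact.

    ca_coords: list of (x, y, z) Cα coordinates in Å, one per residue.
    min_separation: hypothetical minimum sequence separation between the
    two residues (an assumption; the description does not specify one).
    The inclusive comparison (<=) is also an assumption.
    """
    pairs = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                pairs.add((i, j))
    return pairs


def rank_and_filter(scored_pairs, threshold):
    """Rank predicted contacts by confidence and keep those at or above a threshold.

    scored_pairs: dict mapping (i, j) to a confidence score.
    threshold: cutoff on the score; in COMSAT this value is trained,
    here it is simply passed in by the caller.
    """
    ranked = sorted(scored_pairs.items(), key=lambda kv: kv[1], reverse=True)
    return [(pair, score) for pair, score in ranked if score >= threshold]


# Toy example: three residues placed on a line.
coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
true_contacts = contact_pairs(coords)

# Made-up confidence scores for the three possible pairs.
scores = {(0, 1): 0.9, (0, 2): 0.2, (1, 2): 0.6}
predicted = rank_and_filter(scores, threshold=0.5)
```

In this toy case only residues 0 and 1 are within 14 Å of each other, while the filtered prediction list keeps the two pairs whose scores reach the assumed threshold of 0.5, in descending order of confidence.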
Centre for High Performance Computing, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Center for Cloud Computing, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Department of Chemical Engineering, Texas A&M University, College Station, TX, USA; Texas A&M Energy Institute, Texas A&M University, College Station, TX, USA
COMSAT funding source(s)
This work was funded by the National Science Foundation of China (Grant Number: 11204342); the Shenzhen Peacock Plan (Grant Number: KQCX20130628112914299); the Science Technology and Innovation Committee of Shenzhen Municipality (Grant Number: JCYJ20120615140912201); and the National High Technology Research and Development Program of China (Grant Number: 2015AA020109).