Computational protocol: A semi-supervised boosting SVM for predicting hot spots at protein-protein Interfaces

Protocol publication

[…] Firstly, we used PSAIA, the implementation proposed by Mihel et al. [], to generate features based on solvent accessible surface area (ASA), relative solvent accessible surface area (RASA), depth index (DI) and protrusion index (PI), which are defined as follows:

• Accessible surface area (ASA, usually expressed in Å²) is the atomic surface area of a molecule (protein, DNA, etc.) that is accessible to a solvent.
• Relative ASA (RASA) is the ratio of the calculated ASA to the reference ASA. The reference ASA of a residue X is obtained from a Gly-X-Gly peptide in extended conformation [].
• Depth index (DI): the depth of an atom i ($\mathrm{DPX}_i$) is defined as the distance between atom i and the closest solvent accessible atom j. That is, $\mathrm{DPX}_i = \min(d_1, d_2, d_3, \ldots, d_n)$, where $d_1, d_2, d_3, \ldots, d_n$ are the distances between atom i and all solvent accessible atoms.
• Protrusion index (PI) is defined as $V_{ext}/V_{int}$. Here, $V_{int}$ is the number of atoms within a sphere of fixed radius R multiplied by the mean atomic volume found in proteins; $V_{ext}$ is the difference between the volume of the sphere and $V_{int}$, that is, the remaining volume of the sphere.

From each of ASA and RASA, five residue attributes can be derived:

• total (the sum of all atom values);
• backbone (the sum of all backbone atom values);
• side-chain (the sum of all side-chain atom values);
• polar (the sum of all oxygen and nitrogen atom values);
• non-polar (the sum of all carbon atom values).

From each of DI and PI, four residue attributes can be obtained:

• total mean (the mean of all atom values);
• side-chain mean (the mean of all side-chain atom values);
• maximum (the maximum of all atom values);
• minimum (the minimum of all atom values).

Therefore, 36 features were generated by PSAIA from the unbound and bound states (18 per state: five each for ASA and RASA, and four each for DI and PI).

In addition, the relative changes of ASA, DI and PI between the unbound and bound states of the residues were calculated as in Xia et al.'s work [], and 13 more features were generated by the
equations below:

$$\mathrm{RcASA} = \frac{\mathrm{ASA}_{unbound} - \mathrm{ASA}_{bound}}{\mathrm{ASA}_{unbound}},$$

$$\mathrm{RcDI} = \frac{\mathrm{DI}_{bound} - \mathrm{DI}_{unbound}}{\mathrm{DI}_{bound}},$$

$$\mathrm{RcPI} = \frac{\mathrm{PI}_{unbound} - \mathrm{PI}_{bound}}{\mathrm{PI}_{unbound}}.$$

Furthermore, we generated some useful features following the strategy of KFC2 []. Residues' solvent accessible surface is used in the following features and is calculated by NACCESS [].

DELTA_TOT describes the difference between the solvent accessible surfaces in the unbound and bound states:

$$\mathrm{DELTA\_TOT} = \mathrm{ASA}_{unb} - \mathrm{ASA}_{bnd}.$$

SA_RATIO5 is the ratio of solvent accessible surface area over maxASA, which stands for the residue's maximum solvent accessible surface area as a tripeptide []:

$$\mathrm{SA\_RATIO5} = \mathrm{DELTA\_TOT} \times \frac{\mathrm{maxASA}}{\mathrm{ASA}_{unb}}.$$

Another form of ratio of solvent accessible surface area, CORE_RIM, is given by:

$$\mathrm{CORE\_RIM} = \frac{\mathrm{DELTA\_TOT}}{\mathrm{ASA}_{unb}}.$$

This feature is quite like the relative change in total ASA described before. The main difference lies in that PSAIA treats each chain separately during the calculation []. In our work we will use at most one of these two features in order to avoid a bias.

POS_PER is defined as below, where i is the sequence number of the residue and N is the total number of the interface residues:

$$\mathrm{POS\_PER} = \mathrm{CORE\_RIM} \times \frac{i}{N}.$$
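The relative-change and ASA-based formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function and argument names are hypothetical, the inputs are assumed to be per-residue values already produced by PSAIA or NACCESS, and returning 0.0 when a denominator is zero is an assumed convention that the text does not specify.

```python
def relative_change(reference, other):
    """Return (reference - other) / reference.

    Returning 0.0 for a zero reference is an assumed convention,
    not something stated in the protocol.
    """
    return (reference - other) / reference if reference else 0.0


def psaia_relative_changes(unbound, bound):
    """Relative changes of ASA, DI and PI for one residue attribute.

    `unbound` and `bound` are hypothetical dicts with keys "ASA", "DI", "PI".
    Note that RcDI is normalised by the bound value, unlike RcASA and RcPI.
    """
    return {
        "RcASA": relative_change(unbound["ASA"], bound["ASA"]),
        "RcDI": relative_change(bound["DI"], unbound["DI"]),
        "RcPI": relative_change(unbound["PI"], bound["PI"]),
    }


def kfc2_asa_features(asa_unb, asa_bnd, max_asa, i, n_interface):
    """DELTA_TOT, SA_RATIO5, CORE_RIM and POS_PER for one residue.

    i is the sequence number of the residue and n_interface is the
    total number of interface residues, as in the POS_PER definition.
    """
    delta_tot = asa_unb - asa_bnd
    core_rim = delta_tot / asa_unb if asa_unb else 0.0
    sa_ratio5 = delta_tot * max_asa / asa_unb if asa_unb else 0.0
    return {
        "DELTA_TOT": delta_tot,
        "SA_RATIO5": sa_ratio5,
        "CORE_RIM": core_rim,
        "POS_PER": core_rim * i / n_interface,
    }


if __name__ == "__main__":
    # Made-up numbers, used only to exercise the formulas.
    print(psaia_relative_changes(
        {"ASA": 120.0, "DI": 1.5, "PI": 0.8},
        {"ASA": 30.0, "DI": 2.0, "PI": 0.6},
    ))
    print(kfc2_asa_features(120.0, 30.0, 200.0, i=3, n_interface=12))
```

Because CORE_RIM and RcASA share the same form, the two values coincide whenever they are fed the same ASA inputs; they differ in practice only because PSAIA and NACCESS compute the underlying ASA values differently.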
ROT4 and ROT5 are the total numbers of side-chain rotatable single bonds of the residues within 4.0 Å and 5.0 Å of the target residue, respectively.

HP5 is the sum of the hydrophobic values of all neighbors of a residue within 5 Å.

FP9N, FP9E, FP10N and FP10E were calculated directly by FADE [], an efficient method for calculating atomic density.

PLAST4 and PLAST5 were calculated as:

$$\mathrm{PLAST4} = \frac{\mathrm{WT\_ROT4}}{\mathrm{ATMN4} \times \mathrm{maxASA}}, \qquad \mathrm{PLAST5} = \frac{\mathrm{WT\_ROT5}}{\mathrm{ATMN5} \times \mathrm{maxASA}},$$

where WT_ROT4 and WT_ROT5 are the weighted counts of side-chain rotatable single bonds within 4 Å and 5 Å of a residue, respectively, and ATMN4 and ATMN5 are the total numbers of atoms surrounding the residue within 4 Å and 5 Å, respectively. […]
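As an illustration of the neighborhood-based features, the sketch below computes an HP5-style value for one residue. Several choices here are assumptions that the text does not confirm: the Kyte-Doolittle hydropathy scale stands in for the unspecified "hydrophobic values", two residues count as neighbors when any pair of their atoms lies within the cutoff, and the data layout and function names are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy scale. Which scale the protocol uses is not
# stated, so this choice is an assumption made for illustration.
KYTE_DOOLITTLE = {
    "ALA": 1.8, "ARG": -4.5, "ASN": -3.5, "ASP": -3.5, "CYS": 2.5,
    "GLN": -3.5, "GLU": -3.5, "GLY": -0.4, "HIS": -3.2, "ILE": 4.5,
    "LEU": 3.8, "LYS": -3.9, "MET": 1.9, "PHE": 2.8, "PRO": -1.6,
    "SER": -0.8, "THR": -0.7, "TRP": -0.9, "TYR": -1.3, "VAL": 4.2,
}


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between two lists of (x, y, z) atoms."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def neighbors_within(target, residues, cutoff):
    """Ids of residues with at least one atom within `cutoff` Å of the target.

    `residues` maps a residue id to (three-letter name, list of atom coords).
    The any-atom-pair neighbor criterion is an assumption.
    """
    target_atoms = residues[target][1]
    return [
        rid for rid, (_, atoms) in residues.items()
        if rid != target and min_distance(target_atoms, atoms) <= cutoff
    ]


def hydrophobic_sum(target, residues, cutoff=5.0):
    """Sum of hydropathy values over all neighbors of the target residue."""
    return sum(
        KYTE_DOOLITTLE[residues[rid][0]]
        for rid in neighbors_within(target, residues, cutoff)
    )


if __name__ == "__main__":
    # Toy structure with one atom per residue; coordinates are made up.
    toy = {
        1: ("LEU", [(0.0, 0.0, 0.0)]),
        2: ("ILE", [(3.0, 0.0, 0.0)]),
        3: ("ASP", [(0.0, 4.5, 0.0)]),
        4: ("LYS", [(10.0, 0.0, 0.0)]),
    }
    print(neighbors_within(1, toy, 5.0))
    print(hydrophobic_sum(1, toy))
```

The same `neighbors_within` helper could serve the 4 Å and 5 Å variants of the other features by changing `cutoff`, although the rotatable-bond counts and weights they need are not given in the text and are therefore left out.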

Pipeline specifications