Computational protocol: Statistically derived asymmetric membrane potentials from α-helical and β-barrel membrane proteins

Protocol publication

[…] Our implicitly derived membrane potentials can be used to identify membrane protein (MP) topology. We compared against topologies from OPM, which are obtained either from the literature or from the UniProt database; UniProt in turn retrieves the topology from the primary literature or, if unknown, uses the sequence-based TMHMM topology predictor. For scoring topologies, we considered only residues in the membrane but distinguished lipid-accessible from lipid-inaccessible residues. Each residue was assigned a free energy score depending on the amino acid type and therefore the functional form (see Supplementary Tables –), its lipid accessibility, and the depth of the residue as obtained from its Cα atom. The scores over all residues were then summed to a total score; the lower the score, the more energetically favorable the inserted topology.

For α-helical proteins, we predict the same topology as in OPM for 79.9% (191/239) of the proteins (Fig. ) when considering both lipid-accessible and lipid-inaccessible residues. We also compute prediction accuracies on lipid-accessible residues only, for a fair comparison with other potentials for which only values on lipid-accessible residues are available. While these values could be higher, one should consider that (1) the implicit potentials are almost symmetric for helical proteins, except for Arg and Lys, in support of the positive-inside rule; (2) we disregard the score difference between the OPM and inverted topologies; in fact, both might have very similar scores; (3) the error rate of the topology in the OPM database is unknown. For proteins without experimental data or an identified topology in the literature, TMHMM might still predict the topology incorrectly; its authors state an accuracy of 77.5%.

Figure 5

Nevertheless, compared to other potentials and methods, the prediction accuracies of our potentials are somewhat higher.
The asymmetric Ez potential of Schramm & DeGrado on lipid-accessible residues yields an accuracy of 72.8%, while the sequence-based state-of-the-art methods OCTOPUS and the meta-server TopCons (both of which contain a machine learning component) predict OPM topologies for 72.1% and 74.8% of the protein chains, respectively. For β-barrel proteins, our potentials predict OPM topologies for 91.7% of the proteins for lipid-accessible and lipid-inaccessible residues and 90.6% for lipid-accessible residues only. We believe that this value is higher for β-barrels because the asymmetry in the potentials is more pronounced than for α-helical proteins. For comparison, the asymmetric potential of Lin & Liang achieves 82.3% and sequence-based BOCTOPUS predictions achieve 71.5% when compared to OPM.

However, a truly fair comparison between structure-based and sequence-based methods is difficult, because the latter run on individual chains while the former benefit from the interconnection of the chains that constitute the protein structure. Further, our potential is trained on data similar to what OPM contains; while the protein embedding might be different, the topology is taken from OPM. Moreover, there is likely overlap between the training databases for the sequence-based methods and what OPM contains. […]
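
The scoring procedure described above (per-residue scores summed over membrane residues, with the lower total taken as the more favorable topology) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `Residue`, `topology_score`, `predict_orientation`, and `accuracy_percent` are hypothetical, as are the 15 Å membrane half-thickness, the sign convention (negative depth taken as the inside), and the toy potential values. The actual functional forms are those of the Supplementary Tables and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class Residue:
    aa: str                 # one-letter amino acid code
    depth: float            # signed Cα depth along the membrane normal, in Å (0 = center)
    lipid_accessible: bool  # True if the residue faces the lipid


# Hypothetical lookup: (amino acid, lipid accessibility) -> score as a function of depth.
Potential = Dict[Tuple[str, bool], Callable[[float], float]]


def topology_score(residues: Iterable[Residue], potential: Potential,
                   half_thickness: float = 15.0, flip: bool = False) -> float:
    """Sum per-residue scores over residues inside the membrane slab.

    With flip=True the protein is scored in the inverted topology,
    i.e. every depth is mirrored through the membrane center.
    """
    total = 0.0
    for r in residues:
        if abs(r.depth) > half_thickness:
            continue  # only residues in the membrane are scored
        z = -r.depth if flip else r.depth
        total += potential[(r.aa, r.lipid_accessible)](z)
    return total


def predict_orientation(residues: Iterable[Residue], potential: Potential,
                        half_thickness: float = 15.0) -> str:
    """Return the orientation with the lower (more favorable) total score."""
    residues = list(residues)
    as_given = topology_score(residues, potential, half_thickness)
    inverted = topology_score(residues, potential, half_thickness, flip=True)
    return "as_given" if as_given <= inverted else "inverted"


def accuracy_percent(n_match: int, n_total: int) -> float:
    """Percentage of proteins whose predicted topology matches the reference."""
    return round(100.0 * n_match / n_total, 1)


# Toy potential (made-up numbers): Lys is favored at negative depth,
# loosely mimicking the positive-inside rule; Leu is depth-independent.
toy_potential: Potential = {
    ("K", True): lambda z: 0.1 * z,
    ("L", True): lambda z: -1.0,
}

toy_protein = [
    Residue("K", -12.0, True),
    Residue("L", 0.0, True),
    Residue("K", 20.0, True),  # outside the slab, ignored
]

print(predict_orientation(toy_protein, toy_potential))
print(accuracy_percent(191, 239))  # the α-helical count reported in the text
```

Because the decision uses only the sign of the score difference, two orientations with nearly identical totals are treated the same as a clear-cut case, which is caveat (2) in the text.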

Pipeline specifications

Software tools: TMHMM, Asymmetric Ez, OCTOPUS, TOPCONS, BOCTOPUS
Applications: Protein structure analysis, Membrane protein analysis
Organisms: Bacteria
Chemicals: Amino Acids