Computational protocol: Calculating an optimal box size for ligand docking and virtual screening against experimental and predicted binding pockets

Protocol publication

[…] In order to optimize the search space, we perform a series of docking calculations for each target using a cubic box whose edge length ranges from 2 to 36 Å in increments of 2 Å to ensure fine-grained sampling. Next, we analyze docking accuracy as a function of the size of a query compound by calculating the ratio of the radius of gyration of a ligand (Rg) to the box size. Rg is defined as follows:

$$R_g = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left|\vec{r}_k - \vec{r}_{\mathrm{center}}\right|^2} \tag{1}$$

where $N$ is the total number of ligand heavy atoms, the vector $\vec{r}_k$ holds the Cartesian coordinates of the $k$-th heavy atom, and $\vec{r}_{\mathrm{center}}$ is the geometric center of the ligand.

By default, we calculate Rg for a single low-energy conformer generated for each query compound by obconformer from Open Babel. For comparison, we also calculated the average values of Rg ± standard deviation using sets of 100 random rotamers generated by obrotamer (Open Babel) for PDB-bench ligands. [...]

DUD-E, an enhanced version of the DUD dataset, comprises a diverse set of 101 proteins including many pharmacologically important targets such as ion channels and GPCRs. DUD-E features 22,886 experimentally validated active compounds, with an average of 224 ligands per protein target, and over 1,000,000 decoy molecules, approximately 50 per active compound. These decoys have chemical properties similar to those of the corresponding active compounds but different topologies. The DUD-E dataset therefore allows rigorous and unbiased testing of docking algorithms, scoring functions and virtual screening tools. As with the PDB-bench dataset, we carried out docking calculations using experimental pocket centers calculated from 101 representative complex structures included in DUD-E (the D101 set). Furthermore, we evaluate the accuracy of virtual screening for a subset of 77 proteins whose binding sites were successfully predicted by eFindSite (the D77 set).
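
The radius of gyration defined above, and its ratio to each box size in the 2 to 36 Å scan, can be illustrated with a minimal sketch. It uses only the Python standard library and assumes the heavy-atom coordinates have already been extracted as (x, y, z) tuples in Å. The function names and the toy coordinates are hypothetical and are not part of the published protocol, and the square root follows the standard definition of Rg.

```python
import math

def radius_of_gyration(coords):
    """Rg (in Å) of a list of heavy-atom (x, y, z) coordinates."""
    n = len(coords)
    if n == 0:
        raise ValueError("at least one heavy atom is required")
    # Geometric center: unweighted mean of the coordinates.
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    # Mean squared distance of the atoms from the geometric center.
    msd = sum(
        sum((c[i] - center[i]) ** 2 for i in range(3)) for c in coords
    ) / n
    return math.sqrt(msd)

def rg_to_box_ratios(rg, min_edge=2, max_edge=36, step=2):
    """Map each cubic box edge length (Å) in the scan to Rg / edge."""
    return {edge: rg / edge for edge in range(min_edge, max_edge + 1, step)}

# Hypothetical four-atom "ligand" at the corners of a square with 2 Å sides.
toy_ligand = [
    (1.0, 1.0, 0.0),
    (1.0, -1.0, 0.0),
    (-1.0, 1.0, 0.0),
    (-1.0, -1.0, 0.0),
]
rg = radius_of_gyration(toy_ligand)
ratios = rg_to_box_ratios(rg)
```

Every atom of the toy ligand lies √2 Å from the center, so Rg is √2 Å, and `ratios` holds one Rg-to-box-size value for each of the 18 edge lengths in the scan.
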
A binding site prediction is considered successful when the distance between the predicted and experimental pocket centers is below 8 Å. […]
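
The 8 Å success criterion amounts to a single distance check, sketched below under the assumption that both pocket centers are available as (x, y, z) tuples in Å. The function name and the example coordinates are hypothetical, and how eFindSite reports a predicted center is not covered here.

```python
import math

def pocket_prediction_successful(predicted_center, experimental_center,
                                 cutoff=8.0):
    """True if the two pocket centers lie strictly less than `cutoff` Å apart."""
    distance = math.dist(predicted_center, experimental_center)
    return distance < cutoff

# Hypothetical centers 5 Å apart: counted as a successful prediction.
hit = pocket_prediction_successful((10.0, 12.0, 5.0), (13.0, 16.0, 5.0))

# Hypothetical centers exactly 8 Å apart: not below the cutoff.
miss = pocket_prediction_successful((0.0, 0.0, 0.0), (8.0, 0.0, 0.0))
```

The comparison is strict because the text says "below 8 Å". Targets that pass this check would make up a subset analogous to D77.
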

Pipeline specifications

Software tools: Open Babel, eFindSite
Databases: DUD-E
Applications: Drug design, Protein interaction analysis
Organisms: Homo sapiens
Chemicals: NADP