Computational protocol: Sampling Enrichment toward Target Structures Using Hybrid Molecular Dynamics-Monte Carlo Simulations

Protocol publication

[…] Native-like structures are selected solely by their secondary structures. We select three structural fragments with typical secondary structures from the PDB: helix (PDB code 1VCS, residues 37–56), sheet (PDB code 1PIN, residues 9–28) and coil (PDB code 1GWP, residues 80–99). Secondary structures are assigned using STRIDE, with its codes grouped into three classes: sheet (E, B), helix (H, I, G) and coil (C, T). Every residue in each fragment is then substituted by one of the 20 natural amino acids, using the Mutator plugin in VMD, to create mono-residue peptides. These mono-residue peptides have different secondary-structure preferences, and they may evolve into an energetically favorable secondary structure during MD simulations in a given force field. They are therefore a rational choice for evaluating the overall performance of the hybrid MD-MC method, regardless of whether the target structure is energetically favorable and achievable in MD simulations. Although these mono-residue peptides do not represent the large diversity of real proteins, they cover all residue types and the main secondary-structure classes, which makes the evaluation theoretically sound. These 60 (3 × 20) structures are relaxed to obtain native-like structures in three steps: (i) relocation of atoms and surrounding water molecules with 2000 iterations of conjugate gradient energy minimization; (ii) equilibration at T = 310 K through a 1 ns NVT-ensemble MD simulation; and (iii) equilibration at T = 310 K through a 1 ns NPT-ensemble MD simulation. A harmonic potential with a force constant of 1.0 kcal/mol/Å² is applied to all non-hydrogen atoms to minimize structural change in all three steps. The final structures are taken as the native-like structures, which are then used as the initial structures in the forward simulations and as the target structures in the backward simulations. [...]
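The bookkeeping behind the 60 starting structures and the three-class grouping of STRIDE codes can be sketched as follows. This is a minimal illustration, not the authors' scripts: the fragment table and code grouping come from the text above, while the function names and the use of three-letter residue names as labels are assumptions made for the example.

```python
from collections import Counter

# Fragments named in the protocol: class -> (PDB code, first residue, last residue).
FRAGMENTS = {
    "helix": ("1VCS", 37, 56),
    "sheet": ("1PIN", 9, 28),
    "coil": ("1GWP", 80, 99),
}

# The 20 natural amino acids (three-letter names used only as labels here).
AMINO_ACIDS = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]

# Three-class grouping of STRIDE one-letter codes, as listed in the text.
STRIDE_TO_CLASS = {
    "E": "sheet", "B": "sheet",
    "H": "helix", "I": "helix", "G": "helix",
    "C": "coil", "T": "coil",
}

# One mono-residue peptide per (fragment, amino acid) pair: 3 x 20 = 60 systems.
SYSTEMS = [(frag, aa) for frag in FRAGMENTS for aa in AMINO_ACIDS]


def reduce_stride(codes):
    """Map a string of STRIDE codes to a list of three-class labels."""
    return [STRIDE_TO_CLASS[c] for c in codes.upper()]


def dominant_class(codes):
    """Return the most frequent three-class label in a STRIDE string."""
    return Counter(reduce_stride(codes)).most_common(1)[0][0]
```

For example, `dominant_class("HHHHGGTC")` classifies a mostly helical 8-residue assignment as `helix`.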
A wide distribution of decoys from the forward simulations is important both for selecting diverse and representative structures for the sampling enrichment study and for constructing a pseudo-energy function that can reliably guide simulations toward the target structure. The forward NVT MD simulations are carried out for 5 ns at 310 K, 340 K and 370 K. In each trajectory, structures are saved every 4 ps, which produces 1250 decoys per trajectory. Thus, for each mono-residue peptide, a pool of 11250 (1250 × 3 × 3) decoys from 3 native-like structures at 3 simulation temperatures is collected.
The initial simulation configurations are prepared using VMD by merging the mono-residue peptides into a TIP3P water box with an edge size of 13 Å. Sodium or chloride ions are added to neutralize the system. The MD simulations are carried out with periodic boundary conditions using NAMD v2.9. A multiple time stepping integration scheme is used to accelerate the electrostatic potential computation: short-range non-bonded interactions are computed every step using a cutoff of 10 Å with a switching distance of 8 Å, and long-range electrostatic interactions are calculated every 2 steps using the particle-mesh Ewald method with a grid spacing of 1 Å⁻¹. The integration time step is 2 fs, with bonds involving hydrogen atoms constrained using SHAKE. Langevin dynamics with a damping coefficient of 1 ps⁻¹ is applied to all non-hydrogen atoms to keep the temperature constant. The Nosé-Hoover Langevin piston with an interval of 200 fs and a damping timescale of 100 fs is used to maintain a constant pressure of 1 atm. [...]
The clustering of decoys from the forward MD simulations is performed using SPICKER with an initial cut-off RMSD of 4 Å. The cut-off RMSD is self-adjusted so that the first and largest cluster (Top 1) covers 15%–70% of all input structures. The final cut-off RMSD is 4 Å for all mono-residue peptides except poly-Gly, for which it is 4.6 Å.
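The self-adjusting cut-off described above can be illustrated with a small sketch. It is not SPICKER's code: the 0.1 Å adjustment step, the neighbor-counting definition of the largest cluster, and the function names are assumptions chosen for illustration. Only the 4 Å starting value and the 15%–70% window come from the text.

```python
def top_cluster_fraction(rmsd, cutoff):
    """Fraction of structures within `cutoff` of the best-connected structure.

    `rmsd` is a symmetric pairwise RMSD matrix given as a list of lists; each
    structure counts as its own neighbor because rmsd[i][i] == 0.
    """
    n = len(rmsd)
    best = max(
        sum(1 for j in range(n) if rmsd[i][j] <= cutoff) for i in range(n)
    )
    return best / n


def adjust_cutoff(rmsd, cutoff=4.0, lo=0.15, hi=0.70, step=0.1, max_iter=100):
    """Nudge the cut-off until the top cluster covers between lo and hi.

    The step size and iteration cap are hypothetical; they only keep this
    toy loop well behaved.
    """
    for _ in range(max_iter):
        frac = top_cluster_fraction(rmsd, cutoff)
        if frac < lo:
            cutoff += step  # top cluster too small: loosen the cut-off
        elif frac > hi:
            cutoff -= step  # top cluster too large: tighten the cut-off
        else:
            break
    return round(cutoff, 3)


# Toy data: 10 "structures" on a line, 4.25 Å apart, with |x_i - x_j| as RMSD.
positions = [4.25 * i for i in range(10)]
toy_rmsd = [[abs(a - b) for b in positions] for a in positions]
final_cutoff = adjust_cutoff(toy_rmsd)
```

In the toy data every structure is isolated at 4 Å (10% coverage), so the loop loosens the cut-off until nearest neighbors fall inside it.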
For each mono-residue sequence, the center structures of the three most populated clusters (Top 3 models) are selected as the initial structures for the backward simulations. [...] Sampling enrichment is evaluated by comparing the hybrid MD-MC simulations with parallel MD simulations in the backward direction, i.e. both the MD-MC simulations and the parallel MD simulations start from the same initial structures. The framework of the hybrid MD-MC simulations is presented in the accompanying figure. Consecutive MC judgments are made every 4 ps along the MD simulation trajectory, and the decoys are saved after each MC judgment for analysis. To make an MC judgment, the SAXS intensity profile of a given structure is computed using Fast-SAXS-pro. The acceptance probability in the MC judgment is given by the Metropolis criterion, i.e., $\min\{\exp[-(E_n - E_{n-1})/k_B T],\ 1\}$. Here, $T$ is the simulation temperature, $k_B$ is the Boltzmann constant, and $E_n$ is the pseudo-energy of the structure at the $n$th MC iteration. The pseudo-energy function is taken to be proportional to the discrepancy in the scattering intensity profiles between the target structure and the $n$th structure in the hybrid simulation. We perform 20 ns MD and MD-MC simulations at 310 K and 370 K, starting from the 60 models (20 sequences × Top 3 models) obtained from the clustering procedure. Since the MD simulations are not biased toward any target structure, we perform only one MD simulation for each model (i.e. 60 simulation trajectories).
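The MC judgment described above can be sketched in a few lines. This is a hedged illustration rather than the published implementation: the text only states that the pseudo-energy is proportional to the discrepancy between scattering profiles, so the sum-of-squares form, the `scale` factor, the kcal/mol units and all function names below are assumptions. The intensity profiles would come from a SAXS calculator such as Fast-SAXS-pro, which is not called here.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); units are an assumption


def pseudo_energy(i_target, i_model, scale=1.0):
    """Assumed discrepancy: scaled sum of squared intensity differences."""
    return scale * sum((a - b) ** 2 for a, b in zip(i_target, i_model))


def acceptance_probability(e_new, e_old, temperature):
    """Metropolis criterion: min{exp[-(E_n - E_{n-1}) / (k_B T)], 1}."""
    delta = e_new - e_old
    if delta <= 0.0:
        return 1.0  # moves that lower the pseudo-energy are always accepted
    return math.exp(-delta / (K_B * temperature))


def metropolis_accept(e_new, e_old, temperature, rng=random):
    """Draw a uniform random number and accept or reject the new structure."""
    return rng.random() < acceptance_probability(e_new, e_old, temperature)


# Toy profiles: the candidate is closer to the target than the previous decoy.
target = [1.00, 0.80, 0.50, 0.20]
previous = [1.00, 0.60, 0.30, 0.10]
candidate = [1.00, 0.75, 0.45, 0.20]
accepted = metropolis_accept(
    pseudo_energy(target, candidate), pseudo_energy(target, previous), 310.0
)
```

Because the candidate profile is closer to the target, its pseudo-energy is lower and the move is accepted with probability 1. An uphill move is accepted with a probability that grows with temperature.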
In the MD-MC simulations, where SAXS-derived information about the target structure is incorporated into the MC pseudo-energy function, we run each of the 60 models toward the three native-like target structures, which results in 180 simulation trajectories in total. Additionally, to provide a solid statistical view of sampling enrichment and to minimize the bias from over-sampling in the valleys of the energy landscape, we randomly select 600 target structures and 600 initial structures for the MD-MC simulations. These additional simulations are carried out at 370 K for 1 ns, with a 0.5 ps interval between MC judgments in each trajectory. […]
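The trajectory and MC-judgment counts quoted above follow from simple arithmetic, sketched here as a quick consistency check. The variable and function names are illustrative. The counts are taken as stated in the text, which does not say whether the 60 and 180 trajectories are per temperature, so no temperature factor is applied.

```python
def n_judgments(length_ns, interval_ps):
    """Number of MC judgments (saved decoys) in one trajectory."""
    return round(length_ns * 1000.0 / interval_ps)


N_SEQUENCES = 20  # mono-residue sequences
TOP_MODELS = 3    # cluster centers kept per sequence
N_TARGETS = 3     # native-like target structures per sequence

n_initial_models = N_SEQUENCES * TOP_MODELS        # starting models
n_md_trajectories = n_initial_models               # one unbiased MD run each
n_mdmc_trajectories = n_initial_models * N_TARGETS # each model toward each target

judgments_main = n_judgments(20, 4)    # 20 ns runs, one judgment every 4 ps
judgments_extra = n_judgments(1, 0.5)  # 1 ns runs, one judgment every 0.5 ps

print(n_md_trajectories, n_mdmc_trajectories, judgments_main, judgments_extra)
```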

Pipeline specifications

Software tools: STRIDE, VMD, NAMD, SPICKER, Fast-SAXS
Applications: Small-angle scattering, Protein structure analysis
Diseases: Genetic Diseases, Inborn