Computational protocol: Molecular dynamics analysis of the aggregation propensity of polyglutamine segments

Protocol publication

[…] The polyQ tract is the only region shared by the otherwise very dissimilar polyQ proteins associated with polyglutamine diseases, and in all cases the polyQ expansion causes the disease. The threshold length of the polyQ segment that triggers these diseases is around 35 to 40 residues, except in SCA6, which has a shorter threshold of around 19 repeats [–]. It is therefore of interest to study the solvation behavior of polyQ segments shorter than 20 and longer than 40 repeats, to find common features in how solvent interactions may affect the folding of such a diverse set of proteins.
We performed MD simulations for polyQ monomers with 18 repeats (Q18), 46 repeats (Q46) and 32 repeats (Q32). These correspond to a length below the lowest known disease threshold, a length above the highest known normal threshold, and the average of these two lengths, respectively. A randomly selected extended structure of polyQ was used as the starting structure of the MD simulations. To avoid complications due to charged termini [], the polyQ sequences were capped with an acetyl group at the N-terminus and an N-methylamide group at the C-terminus, i.e. the structures considered here are [acetyl-(Gln)n-N-methylamide], where n = 18, 32, and 46 denotes the number of glutamines. xLEaP [] was used to build the initial configurations, and the AMBER ff99SB force field [] was used with a TIP3P water box to model the solvent explicitly. A local minimization of the polyQ monomers was done in vacuum before the water box was added. The TIP3P water was added as a truncated octahedral box around the polyQ monomer, with a buffer distance of 9.0 Å between the monomer and the edges of the box. A second minimization, with a non-bonded cutoff distance of 9 Å, was then performed on the solvated system to minimize the energy of the whole system.
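The system construction described above can be sketched as a small Python helper that writes a LEaP input script for a capped polyQ monomer. This is only an illustration under stated assumptions, not the authors' actual input: the function names, the file prefix, and the unit name `mol` are hypothetical; the sketch assumes that `leaprc.ff99SB` also loads the `TIP3PBOX` solvent unit (as in Amber 14-era installations); and it solvates the freshly built chain, whereas the protocol solvated the vacuum-minimized monomer.

```python
def capped_polyq_sequence(n_gln):
    """Residue names for acetyl-(Gln)n-N-methylamide (ACE and NME caps)."""
    return ["ACE"] + ["GLN"] * n_gln + ["NME"]


def build_tleap_input(n_gln, buffer_angstrom=9.0, prefix=None):
    """Return the text of a LEaP script for one capped polyQ monomer.

    Hypothetical helper: file names and the unit name 'mol' are placeholders.
    """
    prefix = prefix or f"Q{n_gln}"
    residues = " ".join(capped_polyq_sequence(n_gln))
    lines = [
        # Assumption: this leaprc also makes TIP3PBOX available.
        "source leaprc.ff99SB",
        # 'sequence' builds a linear chain from the listed residues.
        f"mol = sequence {{ {residues} }}",
        # Topology and coordinates for the vacuum minimization.
        f"saveAmberParm mol {prefix}_vac.prmtop {prefix}_vac.inpcrd",
        # Truncated octahedral TIP3P box with the stated buffer distance.
        f"solvateOct mol TIP3PBOX {buffer_angstrom:.1f}",
        f"saveAmberParm mol {prefix}_wat.prmtop {prefix}_wat.inpcrd",
        "quit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for n in (18, 32, 46):
        print(build_tleap_input(n))
```

Writing the returned text to a file and passing it to `tleap -f <file>` would be one way to use it; in a faithful reproduction, the solvation step would start from the vacuum-minimized coordinates instead.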
The whole system was then heated from 0 K to 310 K and equilibrated for 50 ps, followed by molecular dynamics simulations for 105 ns at a temperature of 310 K and a constant pressure of 1 atm. The temperature was maintained with the Berendsen thermostat, using a coupling time of 0.1 ps. The pressure was maintained by isotropic position scaling, using a relaxation time of 1 ps. The integration time step was 2 fs, and results were recorded every 1 ps.
For each polyQ monomer, six independent runs were performed, using different randomly selected initial structures and different random seeds for initialization; the results presented here are averages over these six runs. This procedure was adopted to increase sampling of the conformational space while keeping the MD simulation time manageable. All the MD simulations were done using the Amber 14 molecular simulation package [], which supports a GPU-accelerated PMEMD module that implements the Particle Mesh Ewald (PME) method for electrostatics []. All calculations were performed using the clusters at the Center for High Performance Computing (CHPC) at the University of Utah. Each computing node in the cluster has two Nvidia 2090 GPUs and 12 Intel Xeon (Westmere X5660) processors. After a preliminary study to optimize the efficiency of the GPU-accelerated computing nodes (results not shown), we ran one simulation per GPU to obtain the best throughput with the settings of our cluster.
The Cpptraj utility in the Amber 14 toolbox [] was used for most of the analysis. The MD trajectories were re-imaged back to the primary box and, to speed up the analysis, only 1 in every 100 frames was processed, that is, 100 ps per frame in the new trajectory. The secondary structure, hydrogen bonds, solvent bridges, radius of gyration, and solvent surface area were calculated using Cpptraj for each simulation trajectory.
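The run length and frame counts implied by these settings can be checked with a few lines of integer arithmetic. The sketch below is only a bookkeeping aid: the function name and the dictionary keys are hypothetical, and the 80 ns analysis window is taken from the analysis described later in the protocol.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def frame_bookkeeping(total_ns=105, dt_fs=2, save_every_ps=1,
                      stride=100, analysed_ns=80):
    """Step and frame counts for one production run (integer arithmetic)."""
    # Number of MD integration steps in the production run.
    n_steps = total_ns * FS_PER_NS // dt_fs
    # Steps between two saved frames (results recorded every 1 ps).
    steps_per_frame = save_every_ps * FS_PER_PS // dt_fs
    # Frames written during the run.
    n_saved_frames = n_steps // steps_per_frame
    # Frames kept after taking 1 in every `stride` frames.
    n_strided_frames = n_saved_frames // stride
    # Time between frames in the strided trajectory.
    ps_per_strided_frame = save_every_ps * stride
    # Strided frames that fall in the analysed tail of the run.
    n_analysed_frames = analysed_ns * 1_000 // ps_per_strided_frame
    return {
        "n_steps": n_steps,
        "steps_per_frame": steps_per_frame,
        "n_saved_frames": n_saved_frames,
        "n_strided_frames": n_strided_frames,
        "ps_per_strided_frame": ps_per_strided_frame,
        "n_analysed_frames": n_analysed_frames,
    }


if __name__ == "__main__":
    for key, value in frame_bookkeeping().items():
        print(f"{key}: {value}")
```

With the stated settings, each 105 ns run corresponds to 52,500,000 steps and 105,000 saved frames, of which 1,050 remain after striding; 800 of those cover the last 80 ns of each run.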
The Rg value of the polyQ segments was calculated for each frame of the last 80 ns and used to calculate the exponent b in the scaling relation Rg ~ N^b. To obtain b, each data point was log-transformed and a linear regression was performed; b corresponds to the slope of the fitted line.
For each polyQ length, the results of the six independent simulations were averaged, so that all values reported here are averages over these six runs. Only the last 80 ns of the MD trajectories were considered, to avoid transient effects (see ). Pearson's product-moment correlation coefficient, r, was used to measure the strength and direction of any linear correlation between the two variables of interest, and the p value was used to test significance. Statistical analyses were performed using R [], figures were plotted with the ggplot2 package [] and Gnuplot [], and VMD was used for trajectory visualization []. […]
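The log-log fit and the correlation coefficient described above can be illustrated with a minimal pure-Python sketch. The authors performed their statistics in R; Python is used here only as a stand-in, the function names are hypothetical, and the Rg values in the usage example are synthetic numbers generated from an exact power law, not simulation results.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scaling_exponent(lengths, rg_values):
    """Exponent b in Rg ~ N^b: the slope of log(Rg) against log(N)."""
    log_n = [math.log(n) for n in lengths]
    log_rg = [math.log(rg) for rg in rg_values]
    slope, _intercept = linear_fit(log_n, log_rg)
    return slope


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    lengths = [18, 32, 46]
    # Synthetic data: an exact power law with exponent 1/3 (illustration only).
    rg_synthetic = [2.0 * n ** (1.0 / 3.0) for n in lengths]
    print("fitted exponent b:", scaling_exponent(lengths, rg_synthetic))
    print("Pearson r:", pearson_r(lengths, rg_synthetic))
```

Because `scaling_exponent` takes paired lists, it can be fed either one averaged Rg value per chain length or one (N, Rg) pair per analysed frame. The significance test (p value) is not reproduced here.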

Pipeline specifications

Software tools AMBER, Ggplot2, Gnuplot, VMD
Databases PolyQ
Applications Miscellaneous, Protein structure analysis