Computational protocol: Structural dissection of human metapneumovirus phosphoprotein using small angle x-ray scattering

Protocol publication

[…] Different strategies were used to obtain all-atom models for each construct. One thousand backbone models of monomeric P1–60 were generated with the program flexible-meccano, with an α-helical propensity of 50% for residues 14–26, which were observed to form an α-helix when complexed with the HMPV N protein. Protein side chains were then added using the program SCCOMP.
Initial models of P135–237 were built from three families of models extracted from our previous HMPV P modeling study that were found to accurately reproduce SAXS data for P158–237. All three models are composed of a tetrameric coiled coil ranging from residues 168 to 198, with disordered residues at the N-terminus that have residual α-helical structure. In the first model, residues 202–219 adopt α-helical structures that pack laterally against the C-terminal part of the coiled coil region, while residues 220–237 are extended. In the second model, residues 208–237 form an α-helix, consistent with secondary structure predictions, and in the third model, residues 168–237 are in an extended conformation with no secondary or tertiary structure. The region composed of residues 135–158 (including the N-terminal His6-3C site) was taken from a LOMETS model adopting a relatively extended conformation, which was grafted onto the three initial models of P158–237. Two additional starting models were generated by shortening the coiled coil by one helix turn at its C-terminus in the third model (to match the x-ray structure) and by further removing residual helical structure in the N-terminal region (which had been carried over from previous classical MD simulations).
All five starting models of P158–237 were then simulated in GROMACS using either an atomistic coarse-grained structure-based model (SBM) or explicit solvent classical molecular dynamics simulations (MDS).
In the case of the SBM MDS, a timestep of 0.0005 time units was used and the simulation was coupled to a temperature bath via Langevin dynamics. A single 100 ns trajectory was obtained for each starting model, and snapshots were extracted every 50 ps, leading to an ensemble of 5000 models. In the case of classical MDS, we generated multiple trajectories for an aggregated simulation time of ~660 ns. MDS was performed using the amber99SBws force field, which was developed to reproduce the properties of intrinsically disordered proteins. At the beginning of each simulation, the protein was immersed in a box of SPC/E water, with a minimum distance of 0.9 nm between protein atoms and the edges of the box. NaCl was then added to a concentration of 150 mM using genion. Long-range electrostatics were treated with particle-mesh Ewald summation. Bond lengths were constrained using the P-LINCS algorithm. The integration time step was 5 fs. The v-rescale thermostat and the Parrinello–Rahman barostat were used to maintain a temperature of 300 K and a pressure of 1 atm. Each system was energy minimized using 1,000 steps of steepest descent and equilibrated for 500 ps with restrained protein heavy atoms prior to production simulations. Snapshots were extracted every 200 ps from each trajectory, leading to the generation of ~3300 additional models of P135–237.
In order to generate models of P135–294 and of P1–294, we adopted a similar strategy in which residues 238–294 and residues 1–134 were grafted onto the existing P135–237 models with or without the α-helical secondary structure elements (residues 14–26 and residues 251–262). Model types that were not selected through ensemble optimization of P135–237 were not considered for this procedure.
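As a rough illustration of the classical MDS setup described above, the following Python sketch collects the stated parameters as GROMACS `.mdp`-style options and checks the snapshot count implied by the aggregated simulation time. It is not the authors' input file: the `integrator` and `constraints` values are assumptions (the text only states that bond lengths were constrained with P-LINCS), and the helper names `render_mdp` and `n_snapshots` are hypothetical.

```python
# Parameters taken from the text, written as GROMACS .mdp option names.
# Entries marked ASSUMPTION are not stated in the protocol.
MDP_SETTINGS = {
    "integrator": "md",               # ASSUMPTION: leap-frog integrator
    "dt": 0.005,                      # 5 fs time step, expressed in ps
    "coulombtype": "PME",             # particle-mesh Ewald electrostatics
    "constraint_algorithm": "lincs",  # P-LINCS is the parallel LINCS variant
    "constraints": "all-bonds",       # ASSUMPTION: which bonds are constrained
    "tcoupl": "v-rescale",            # v-rescale thermostat
    "ref_t": 300,                     # target temperature in K
    "pcoupl": "Parrinello-Rahman",    # Parrinello-Rahman barostat
    "ref_p": 1.01325,                 # 1 atm expressed in bar
}


def render_mdp(settings):
    """Format a dict of options as aligned 'key = value' lines."""
    width = max(len(key) for key in settings)
    return "\n".join(f"{key:<{width}} = {value}" for key, value in settings.items())


def n_snapshots(total_time_ns, stride_ps):
    """Number of frames obtained when saving one frame every stride_ps."""
    return round(total_time_ns * 1000.0 / stride_ps)


if __name__ == "__main__":
    print(render_mdp(MDP_SETTINGS))
    # ~660 ns aggregated, one snapshot every 200 ps
    print(n_snapshots(660.0, 200.0))  # 3300
```

A complete `.mdp` file would need further options (for example coupling groups and time constants) that the protocol does not report, so they are left out here rather than guessed.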
We then used the SBM approach to generate ensembles for P135–294 and P1–294, yielding ~5000 P135–294 and ~8000 P1–294 models. [...] For each model from each ensemble, theoretical SAXS patterns were calculated with the program CRYSOL and ensemble optimization fitting was performed with GAJOE. GAJOE uses a genetic algorithm to select, from a large pool of conformers, optimized sub-ensembles that minimize the discrepancy $\chi^{2}_{\mathrm{exp}}$ between the experimental and calculated curves according to the following equation:
$$\chi_{\mathrm{exp}}^{2} = \frac{1}{K-1} \sum_{j=1}^{K} \left[ \frac{\mu I(Q_j) - I_{\mathrm{exp}}(Q_j)}{\sigma(Q_j)} \right]^{2} \tag{1}$$
where K is the number of points in the experimental curve, σ is the standard deviation and µ is a scaling factor. The optimum selected ensemble size and relative weights of the models were determined automatically by GAJOE. For each curve, the ensemble optimization procedure was repeated at least 20 times, and the Rg distributions of the optimized ensembles were built from these repeats. [...] Our initial goal was the crystallization of a complex of the HMPV M2-1 protein bound to a P construct including the putative M2-1 binding region. The expression and purification of HMPV M2-1 have been described previously. In an attempt to crystallize the M2-1 – P135–237 complex, vapour diffusion crystallization trials of a 1:1 mixture of these two proteins at 7 mg/ml in 20 mM Tris, pH 7.5, 150 mM NaCl were set up using a Cartesian Technologies pipetting system. Although we were not able to grow crystals of a complex, crystals that later proved to harbour only the P oligomerization region were obtained after extended time periods. The P2₁ crystal (form 1) of HMPV Pcore grew at 20 °C after 291–344 days with mother liquor containing 20% polyethylene glycol (PEG) 6000, 200 mM NaCl, and 100 mM Tris, pH 8.0. The P2₁2₁2₁ crystal (form 2) grew at 20 °C after 132–185 days with mother liquor containing 25% PEG 3350, 200 mM MgCl₂, and 100 mM Tris, pH 8.5.
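To make the χ² discrepancy used in the ensemble optimization step concrete, here is a minimal Python sketch of that formula applied to a weighted sub-ensemble of calculated curves. It is not GAJOE's implementation: the function names are hypothetical, and the least-squares choice of the scaling factor µ is an assumption (a common convention), since the text does not say how µ is determined.

```python
def ensemble_average(curves, weights):
    """Weighted average of calculated intensity curves (one list per conformer)."""
    total = sum(weights)
    n_points = len(curves[0])
    return [
        sum(w * curve[j] for w, curve in zip(weights, curves)) / total
        for j in range(n_points)
    ]


def optimal_scale(i_calc, i_exp, sigma):
    """ASSUMPTION: mu chosen by weighted least squares, which minimises chi^2."""
    num = sum(c * e / s**2 for c, e, s in zip(i_calc, i_exp, sigma))
    den = sum(c * c / s**2 for c, s in zip(i_calc, sigma))
    return num / den


def chi_squared(i_calc, i_exp, sigma, mu=None):
    """chi^2 = 1/(K-1) * sum_j [ (mu*I(Q_j) - I_exp(Q_j)) / sigma(Q_j) ]^2"""
    k = len(i_exp)
    if k < 2:
        raise ValueError("need at least two data points")
    if mu is None:
        mu = optimal_scale(i_calc, i_exp, sigma)
    total = sum(((mu * c - e) / s) ** 2 for c, e, s in zip(i_calc, i_exp, sigma))
    return total / (k - 1)


if __name__ == "__main__":
    # Toy numbers only; these are not data from the study.
    pool = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
    avg = ensemble_average(pool, [0.5, 0.5])
    print(chi_squared(avg, [2.1, 1.9, 2.0], [0.1, 0.1, 0.1]))
```

A genetic algorithm such as the one described would evaluate this kind of score repeatedly for candidate sub-ensembles and keep those with the lowest χ².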
Crystals were frozen in liquid nitrogen after being cryoprotected with 25% glycerol. Diffraction data were recorded on beamlines I03 (P2₁ crystal) and I04 (P2₁2₁2₁ crystal) at Diamond Light Source, Didcot, UK. Data reduction was carried out automatically with XIA2. [...] The HMPV Pcore data sets were phased by molecular replacement with PHASER using the previously published structure (PDB ID 4BXT). The structures from both crystal forms were subjected to multiple rounds of manual building in COOT and refinement in PHENIX. We made use of translation-libration-screw (TLS) parameters and 8-fold torsion-angle non-crystallographic symmetry (NCS) restraints as implemented in PHENIX. For the 1.6 Å data of crystal form 1 we additionally carried out anisotropic atomic displacement parameter (ADP) refinement. The structures were validated with the wwPDB Validation Service (https://validate-rcsb-1.wwpdb.org/). Refinement statistics are given in Table . The final coordinates and structure factors have been deposited in the PDB with accession codes 5OIX and 5OIY. [...] Structure-related figures were prepared with the PyMOL Molecular Graphics System (DeLano Scientific LLC). Protein interfaces were analysed with the PISA webserver. Mapping of sequence conservation onto the Pcore structure was carried out with the ConSurf server using P sequences from the Pneumoviridae family members human metapneumovirus (HMPV), avian metapneumovirus (AMPV), canine pneumonia virus (CPV), murine pneumonia virus (MPV), bovine respiratory syncytial virus (BRSV), and human respiratory syncytial virus (HRSV). Sequences were aligned using PROMALS3D and Jalview in order to analyse the conservation of the protein binding sites that were identified in RSV. The putative functional regions were assigned based on homology and previously reported studies.
The linear net charge per residue (NCPR) for HMPV P was calculated using the Classification of Intrinsically Disordered Ensemble Regions (CIDER) webserver. […]
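For readers unfamiliar with the quantity, a linear NCPR profile can be sketched in a few lines of Python. This is an illustration of the general idea rather than CIDER's own code: the window length of 5 residues, the shortening of windows at the sequence ends, and the treatment of histidine as neutral are all assumptions, and the example sequence is made up (it is not HMPV P).

```python
POSITIVE = set("KR")  # Lys, Arg counted as +1
NEGATIVE = set("DE")  # Asp, Glu counted as -1; His is treated as neutral (ASSUMPTION)


def residue_charge(aa):
    """Formal charge assigned to a one-letter amino acid code."""
    if aa in POSITIVE:
        return 1
    if aa in NEGATIVE:
        return -1
    return 0


def linear_ncpr(sequence, window=5):
    """Net charge per residue averaged over a sliding window centred on each position.

    Windows are shortened at the sequence ends (ASSUMPTION).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    charges = [residue_charge(aa) for aa in sequence.upper()]
    half = window // 2
    profile = []
    for i in range(len(charges)):
        lo = max(0, i - half)
        hi = min(len(charges), i + half + 1)
        profile.append(sum(charges[lo:hi]) / (hi - lo))
    return profile


if __name__ == "__main__":
    toy_sequence = "MEEDKKRSAGDE"  # hypothetical sequence for illustration only
    for position, value in enumerate(linear_ncpr(toy_sequence), start=1):
        print(position, round(value, 2))
```

Stretches where the profile stays well above or below zero correspond to positively or negatively charged patches along the chain.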

Pipeline specifications

Software tools LOMETS, GROMACS, CRYSOL, xia2, Coot, PHENIX
Databases wwPDB
Applications Small-angle scattering, Protein structure analysis
Organisms Human metapneumovirus, Dipturus trachyderma