Computational protocol: Flexible torsion-angle noncrystallographic symmetry restraints for improved macromolecular structure refinement

Protocol publication

[…] Torsion NCS restraints in phenix.refine are implemented using the same torsion-based ‘top-out’ potential as described in Headd et al. (2012). Briefly, a torsion restraint is added for each NCS-related torsion angle in the working model. In proteins, this set of angles includes all protein side-chain χ angles and the backbone ϕ, ψ and ω angles. Improper dihedral C–N–Cα–Cβ and N–C–Cα–Cβ restraints are also added for each protein residue to preserve Cβ geometry, with each torsion restrained to the ideal value for the given residue type (Lovell et al., 2003). For nucleic acids, this set of torsion angles includes all seven backbone torsions, as well as any defined base χ angles. Only macromolecules (protein and/or nucleic acid) are handled in the current implementation. Non-standard amino acids and nucleic acids are also supported automatically. It should also be noted that explicit torsion restraints are an improvement over the 1,4-distance-based approach described in Usón et al. (1999), resolving the 180° ambiguity that exists for some 1,4 distances, such as χ2 for the p90 and p−90 Trp rotamers.

As discussed in Headd et al. (2012), the ‘top-out’ potential is defined by σ and limit parameters, with the latter parameterized in degrees to control at what difference between related torsions the target is smoothly reduced to zero. Our implementation is similar to the Welsch robust estimator function (Dennis & Welsch, 1978), and is conceptually similar to the local NCS potentials implemented in REFMAC5 (Murshudov et al., 2011) and BUSTER (Smart et al., 2012). The target for each set of NCS-related torsions is defined to be their average (except as noted below), which is updated after each refinement step that moves individual sites, including real-space refinement, reciprocal-space refinement and Asn/Gln/His side-chain orientation correction.
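The behaviour of a ‘top-out’ restraint around an averaged torsion target can be illustrated with a minimal Python sketch. It is not the phenix.refine implementation: it assumes a Welsch-type form, (l²/σ²)[1 − exp(−Δ²/l²)], which is harmonic for small Δ and levels off for large Δ, and it assumes that the target is a circular mean computed from the average sines and cosines. The function names and the default σ and limit values are illustrative choices and are not taken from the text.

```python
import math


def angle_diff(a_deg, b_deg):
    """Signed smallest difference a - b in degrees, wrapped to [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def circular_mean(angles_deg):
    """Average of periodic angles via atan2 of the mean sine and mean cosine."""
    n = len(angles_deg)
    mean_sin = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    mean_cos = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    return math.degrees(math.atan2(mean_sin, mean_cos))


def top_out_residual(delta_deg, sigma=2.5, limit=15.0):
    """Assumed Welsch-type residual: ~ (delta/sigma)^2 near zero, plateau at (limit/sigma)^2."""
    weight = 1.0 / sigma ** 2          # harmonic weight for small deviations
    top = weight * limit ** 2          # plateau value reached for large deviations
    return top * (1.0 - math.exp(-weight * delta_deg ** 2 / top))


def ncs_torsion_residual(angles_deg, sigma=2.5, limit=15.0):
    """Sum of top-out residuals of NCS-related torsions about their circular mean."""
    target = circular_mean(angles_deg)
    return sum(
        top_out_residual(angle_diff(a, target), sigma, limit) for a in angles_deg
    )
```

With these assumed defaults, two copies that differ by a few degrees are pulled together almost harmonically, whereas copies in different rotamers (tens of degrees apart) contribute a nearly constant residual and therefore almost no gradient, which is the property that allows genuine local differences to persist.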
The residuals for the torsion NCS restraints are calculated using a ‘top-out’ functional form in which Δi is the difference between the ith torsion and its NCS-related average, σ is a user-definable standard deviation parameter, l is the limit parameter and n is the total number of added reference restraints. It should be noted that the average of two torsion angles is calculated by taking the arctangent of the quotient of their average sines and cosines.

Atomic displacement parameters (ADPs) for NCS-related atoms may be restrained using the same parameterization as for global, Cartesian-based NCS restraints, but we have found that allowing the ADPs to be refined independently typically results in improved R factors (data not shown), as NCS-related chains will often have considerably different ADPs. This observation is consistent with previous reports (Smart et al., 2012), and is also supported by the variation in TLS (translation, libration, screw-rotation) models observed for NCS-related chains in some cases (Burnley et al., 2012).

For this manuscript, all refinements in phenix.refine were carried out using Phenix v.1.8.4-1496. [...] Because the torsion-based NCS restraints allow for local differences, including rotamer differences between NCS-related side chains, these restraints can be safely used even at high resolution without much risk of negatively impacting the refinement. To test the efficacy of our restraints in a real experimental context, we performed molecular replacement (MR) in Phaser (McCoy et al., 2007) with the data for RNAse S (PDB entry 1sar; Sevcik et al., 1991), which are nominally at 1.8 Å resolution (>90% complete to 1.85 Å resolution) using the A chain from a related 2.0 Å resolution RNAse S structure (PDB entry 1rsn; Sevcik et al., 2005) as a search model.
Crystallographic symmetry operators and origin shifts were applied to the resultant MR solution using phenix.find_alt_orig_sym_mate (Oeffner et al., 2012) to place the model in the same frame of reference as the deposited 1sar model. AutoBuild (Terwilliger, 2002; Terwilliger et al., 2008) model rebuilding was then performed using three different protocols: no NCS in model refinement, global (Cartesian-based) NCS as part of model refinement and torsion NCS with rotamer correction and consistency checks as part of model refinement. Ordered water picking was disabled, and default settings were used otherwise. Using the traditional global NCS target, both Arg63 side chains are distorted to outlier conformations, while the flexible torsion-based NCS target allows these residues to reach and maintain valid rotamer states automatically (Fig. 3). Validation statistics are summarized in Table 1. Running AutoBuild without NCS and with torsion NCS produces final models with similar R factors, with the torsion NCS model having a slightly smaller R free–R work gap (Brünger, 1992), consistent with a less overfitted model. By comparison, the global NCS model has much higher R factors. There is one fewer rotamer outlier in the torsion NCS refined model, consistent with rotamer correction as part of the torsion NCS method. Compared with the deposited structure, the full-atom r.m.s.d.s for each AutoBuild model are 0.543, 0.508 and 0.625 Å for no NCS, torsion NCS and global NCS, respectively. Full-atom r.m.s.d.s were calculated using VMD (Humphrey et al., 1996). The clashscore is slightly elevated when using either NCS implementation: 1.74 versus 1.39 when using no NCS restraints. Visual inspection reveals that the difference is a single clash between the CG atom of ArgA40 and the HA2 atom of GlyB61. 
These atoms clash to some degree in all three refinements, but the refinement with no NCS restraints produces a model in which the overlap is just below the cutoff of 0.4 Å, resulting in the lower clashscore.

The two NCS-related chains in the asymmetric unit exhibit conformational differences supported by the density, particularly in the loop region surrounding Arg63. As seen in Fig. 4(a), ArgA63 is best fitted as an mtm180 rotamer, while ArgB63 is best fitted as a ptm−180 rotamer (Fig. 4b), which is consistent with the rotamers observed in the deposited 1sar structure. [...] To test the effectiveness of torsion-based NCS restraints in a typical molecular replacement-driven workflow, we randomly selected 56 protein structures from the PDB between 2.0 and 3.0 Å resolution that have between two and four NCS copies, contain no ligands and are between 100 and 300 amino acids in length. This data set is summarized in Supplementary Table S1. We also required each structure to have a homologue with sequence similarity of >90% but <100% for use in molecular replacement. We required this high level of similarity to limit the need for any manual rebuilding, allowing us to test the effectiveness of torsion NCS restraints in a fully automated mode of operation within Phenix. Once phased using molecular replacement in Phaser (McCoy et al., 2007), MR solutions were placed in the same frame of reference as the deposited PDB entry using phenix.find_alt_orig_sym_mate (Oeffner et al., 2012), and were then processed with AutoBuild (Terwilliger, 2002; Terwilliger et al., 2008) using the rebuild-in-place option with no NCS in refinement and no placing of waters. Following AutoBuild, models were refined using phenix.refine for ten macro-cycles, refining individual sites and individual ADPs, and optimizing target weights for xyz sites. Each refinement was repeated with no NCS, global NCS and torsion NCS restraints. As shown in Fig.
5(a), the use of torsion NCS restraints and rotamer correction generally results in the same or a lower R free value when compared with using no NCS restraints. By comparison, global NCS restraints often result in much larger values of R free, often coupled with significant distortions of the model. In a handful of cases the global NCS restraints result in a slightly lower R free value, but visual examination reveals no significant structural differences between these models and those refined with torsion NCS restraints. We chose to report residual R free values, R free(NCS) − R free(no NCS), rather than absolute R free values because at this early stage of refinement the trend in R free is more revealing than its absolute magnitude. Refinements would need to be completed, including building the handful of missing side chains and placing any ordered solvent and/or ions, for comparison with published R free values to be revealing.

To test the relative contribution of the torsion NCS term and rotamer correction, we also ran these refinements with torsion NCS restraints alone and rotamer correction alone. As shown in Fig. 5(b), rotamer correction alone generally results in R free values greater than or equal to the combined approach, with 18/56 cases (∼32%) resulting in a worse R free than using no NCS restraints at all. By comparison, using torsion NCS restraints alone produces results that correlate more closely with the combined approach, with only 4/56 cases (∼7%) resulting in an R free value worse than using no NCS at all. Combining torsion NCS restraints with rotamer correction gives the most consistent results across this data set.

These refinement results were also compared with refinements carried out using REFMAC5 (Murshudov et al., 2011). We ran REFMAC5 both with and without local NCS restraints to allow us to calculate internally consistent R free(NCS) − R free(no NCS) values. As shown in Fig.
5(a), refinement in REFMAC5 using local NCS restraints exhibits the same trend of improvement in R free over refinement in REFMAC5 without local NCS restraints as observed for the phenix.refine results. On average, the addition of local NCS restraints in REFMAC5 changes R free by −0.62%, while torsion NCS restraints with rotamer correction in phenix.refine change R free by −0.47%. Both methods at worst produce the same R free as refinement without NCS restraints, except in a handful of cases (PDB entries 1c03, 2fxk and 2o9f for torsion NCS with rotamer correction, 2o9f for REFMAC5). These results suggest that both NCS parameterizations are suitable automated strategies for the moderate-resolution cases presented in this test set.

Geometric validation metrics demonstrate similar results. As shown in Fig. 6(a), the rotamer outlier percentage from refinements using torsion NCS restraints with rotamer correction is similar to that from refinements with no NCS restraints (average of 1.41 and 1.43%, respectively), with many cases of a higher rotamer outlier percentage when using global NCS restraints (average of 2.62%) or REFMAC5 (average of 2.64%). As shown in Fig. 6(b), torsion NCS restraints alone and rotamer correction alone exhibit a similar trend to that of the combined approach, with torsion NCS alone having an average outlier percentage of 1.48% and rotamer correction alone having an average outlier percentage of 1.39%.

Ramachandran analysis produces similar results. As shown in Fig. 7(a), refinement with no NCS, torsion NCS with rotamer correction or refinement with REFMAC5 all produce similar percentages of Ramachandran outliers, with averages of 0.46, 0.42 and 0.52%, respectively. The results from the refinements using global NCS restraints are slightly worse, with an average Ramachandran outlier percentage of 0.59%. Fig.
7(b) illustrates that refinements with torsion NCS alone produce models with slightly better average Ramachandran outlier percentages than refinements with rotamer correction alone (0.39 versus 0.47%) but, like rotamer outlier percentages, the trend is similar.

As shown in Fig. 8(a), refinement using torsion NCS restraints with rotamer correction produces slightly elevated clashscores compared with refinement using no NCS restraints (averages of 3.10 and 2.97, respectively), while refinement using global NCS restraints causes elevated clashscores in many cases (average clashscore of 3.67). Refinements with REFMAC5 produce the highest clashscores across the test set, with an average of 4.39. Rotamer correction alone and torsion NCS restraints alone produce similar results to the combined approach (average clashscores of 3.01 and 3.05, respectively), with the slightly better performance by rotamer correction alone likely to be owing to an increased emphasis on not introducing steric clashes, coupled with fewer restraints on the overall model. As described in Chen et al. (2010), clashscore is defined as the number of steric clashes >0.4 Å per 1000 atoms. These differences in clashscore, therefore, are minimal, but serve to show that in general the use of torsion NCS restraints results in models approximately as good as, if not better than, those refined with no NCS restraints, and that these restraints are usually safe to use at this working resolution range, even very early in the refinement process.

Interestingly, as shown in Fig. 5(b), of the three cases in which torsion NCS with rotamer correction results in higher R free values than with no NCS restraints at all, rotamer correction alone corrects this problem in two cases (PDB entries 1c03 and 2o9f) and torsion NCS restraints alone correct this problem in the other case (PDB entry 2fxk). Closer inspection of 1c03 reveals that the model has perfect Ramachandran statistics for all refinements (Fig.
9), limiting the benefit of torsion NCS restraints on the backbone. The refinement without any NCS restraints produces the lowest rotamer outlier percentage for this example, suggesting that rotamer correction is too aggressive in this case and, combined with torsion NCS restraints, produces a poorer model. For 2fxk, an improvement in rotamer outlier percentage with rotamer correction comes at the cost of an increase in the number of Ramachandran outliers, leading to an overfitted model, explaining the overall improvement for the torsion NCS only refinement. Finally, for 2o9f, all models fall into the bottom third of each geometry validation metric (third from last in the Ramachandran outlier percentage), suggesting that the refinement of models that are quite far from the correct global minimum requires more concerted motions than are possible with the addition of simple restraints, and that the addition of too many restraints further limits the ability of the model to move towards this minimum.

On occasion, the final model following refinement using torsion NCS restraints will have a slightly higher rotamer outlier percentage or clashscore than a comparable model refined using no NCS restraints. In our experience, this almost always indicates an area of the model that requires concerted rebuilding beyond the capacity of current automated refinement methods. For example, from the 56 models selected for this test, the final torsion NCS-refined model for the 2.0 Å resolution ubiquitin-conjugating enzyme structure (PDB entry 1jbb; VanDemark et al., 2001) has a rotamer outlier percentage of 1.49% (a total of four outliers) versus a default-refined model outlier percentage of 1.12% (a total of three outliers), coupled with an elevated clashscore (2.85 versus 2.65). The difference is an outlier for LeuB88 using torsion NCS restraints versus a tp rotamer when using no NCS restraints.
While the LeuA88 side chain is an mt rotamer, the model around the side chain is too distorted for either the rotamer outlier or rotamer consistency routines to correct this change. As shown in Fig. 7, however, the use of torsion NCS restraints is able to refine to similar backbone orientations between the A and B chains, causing the incorrect side chain to stand out as an outlier. Using no NCS restraints, this side chain distributes the error across the local backbone, refining to a false-positive tp rotamer (Figs. 7a and 10a). The clashscore is also eased by distributing the error across the backbone, explaining the higher observed clashscore with torsion NCS restraints. The ϕ/ψ values around LeuA88 (−138.5°, 140.3°) are quite different from those around LeuB88 (−155.5°, 125.9°) when refined with no NCS. Conversely, the ϕ/ψ values are quite similar when refined using flexible torsion NCS restraints [(−145.5°, 134.3°) and (−147.9°, 130.7°)]. Outliers such as these can be corrected using more aggressive refinement methods or through simple rebuilding in a graphical building program such as Coot (Emsley et al., 2010). In this case, the side chain is corrected to an mt rotamer using Coot (Fig. 10b), and subsequent refinement confirms that this is a preferable rotamer for this side chain (Fig. 10c).

Following five additional macro-cycles of refinement, the torsion NCS-refined model with corrected LeuB88 has improved R work/R free values (0.1942/0.2441) compared with the model with corrected LeuB88 refined with no NCS restraints (0.2015/0.2503). The final rotamer outlier percentages favor the torsion NCS-refined model (1.12 versus 1.49%). […]
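The comparison metrics used throughout this section are simple differences and ratios, and a minimal Python sketch makes the arithmetic explicit. The helper names are hypothetical. The count of 268 evaluated residues for 1jbb is not stated in the text: it is inferred here from the reported four outliers at 1.49% and three outliers at 1.12%, and should be treated as an assumption.

```python
def residual_r_free(r_free_ncs, r_free_no_ncs):
    """Residual R free as defined in the text: R free(NCS) - R free(no NCS)."""
    return r_free_ncs - r_free_no_ncs


def outlier_percentage(n_outliers, n_residues):
    """Percentage of evaluated residues flagged as outliers."""
    return 100.0 * n_outliers / n_residues


def clashscore(n_clashes, n_atoms):
    """Number of steric clashes (overlap > 0.4 A) per 1000 atoms."""
    return 1000.0 * n_clashes / n_atoms


# 1jbb after correcting LeuB88: R free values quoted in the text
delta = residual_r_free(0.2441, 0.2503)   # negative value: torsion NCS gives the lower R free

# Assumed residue count (268), inferred from the reported outlier counts
pct_four = outlier_percentage(4, 268)
pct_three = outlier_percentage(3, 268)
```

A negative residual R free indicates that the NCS-restrained refinement gave the lower R free, which is the sign convention used for the averages reported for REFMAC5 and phenix.refine.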

Pipeline specifications

Software tools: REFMAC5, PHENIX, VMD, Coot
Application: Protein structure analysis