Computational protocol: Outcome of the First Electron Microscopy Validation Task Force Meeting

Protocol publication

[…] Map variance and local resolution determination, such as bootstrap-based variance maps and local Fourier Shell Correlation (FSC) measurements, can provide additional measures to help interpret structures. For maps with resolution better than 20 Å by the 0.5 FSC criterion, RMEASURE can be used to estimate resolution and signal-to-noise ratio directly from the map, based on the correlation of neighboring Fourier transform terms. Possible bias from a starting model or overfitting of noise should also be estimated, and statistics provided where possible. [...]

The usefulness of a model strongly depends on its accuracy; different applications that use the models have varied requirements for model accuracy and precision. As with structural models derived by other techniques, accuracy can be estimated globally for the whole model or locally for each specific part (e.g., residue). There are three sets of fundamentally different criteria for assessing a model based on a 3DEM map, all of which should generally be satisfied.

First, the conformation of a subunit and the interfaces between subunits can be assessed without regard to the 3DEM map. The corresponding criteria for assessing the internal consistency of a model with known molecular constraints (e.g., on geometry, conformation, and molecular interactions) include those proposed by the PDB working groups focused on the assessment of crystallographic, NMR, and modeled structures.

Second, a model can be assessed with regard to the 3DEM map. A sample set of corresponding criteria for agreement of the model with the map is produced by the EMFIT program (2001), including atomic clashes, component interactions, chemical properties, fit to the map, as well as a composite criterion that quantifies model quality relative to a background distribution. Other programs that provide statistical measures for assessing a model in the context of a 3DEM map include CoAn and E2HSTAT (available in EMAN2).
A correlation coefficient between a map determined by EM and a map calculated from a model can also be used, as can residue-based and overall real-space R values. Comparisons of the cross-correlation to other metrics, such as those borrowing from machine learning techniques, enable systematic and objective evaluation of scoring functions. More studies on the evaluation tools themselves are needed.

Finally, a model can be assessed with regard to additional data about the structure that were not used in model calculation. Such data may include cross-linking, antibody labeling, sites of specific labels (such as carbohydrate moieties), proximity of known features to recognizable positions in the map, chemical properties consistent with the environment, and spectroscopic measurements.

Assessment criteria should be as independent as possible from the objective function that is optimized during fitting. At low resolution, a large number of non-EM-derived constraints are typically used in model construction, potentially reducing the informative value of certain assessment criteria. For example, analysis of the main-chain stereochemistry of a rigid-body fitted structure has no bearing on the accuracy and quality of the obtained model, but rather reflects the quality of the high-resolution structure that was used to fit the low-resolution data. Consideration of the modeling and fitting procedures is therefore an important component of the assessment.

Methods for estimating model accuracy are being developed; no accurate or dominant method has yet emerged. There is a great need to assess model quality based on the data-to-parameter ratio and precision, but anecdotal evidence suggests that such methods are not yet reliable. Approaches that begin to address this issue include cross-validation, quantifying the best-fitting model relative to alternative fits, and fitting to the mirror-image map. The predicted accuracy should depend on the map variance.
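The map-to-model correlation coefficient and the overall real-space R value mentioned above can be illustrated with a minimal sketch. It assumes both maps have already been sampled on the same grid, flattened to equal-length lists of voxel densities, and placed on a common scale; the function names `map_correlation` and `real_space_r` are hypothetical and do not come from any of the programs named in the text. The R value is written in the commonly used form, the sum of absolute density differences divided by the sum of absolute density sums.

```python
import math


def map_correlation(em_map, model_map):
    """Pearson correlation between two equal-length lists of voxel densities."""
    if len(em_map) != len(model_map) or not em_map:
        raise ValueError("maps must be non-empty and the same length")
    n = len(em_map)
    mean_em = sum(em_map) / n
    mean_model = sum(model_map) / n
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((a - mean_em) * (b - mean_model) for a, b in zip(em_map, model_map))
    var_em = sum((a - mean_em) ** 2 for a in em_map)
    var_model = sum((b - mean_model) ** 2 for b in model_map)
    denom = math.sqrt(var_em * var_model)
    if denom == 0:
        raise ValueError("a map with constant density has no defined correlation")
    return cov / denom


def real_space_r(em_map, model_map):
    """Overall real-space R: sum|obs - calc| / sum|obs + calc| (assumed form)."""
    if len(em_map) != len(model_map) or not em_map:
        raise ValueError("maps must be non-empty and the same length")
    num = sum(abs(a - b) for a, b in zip(em_map, model_map))
    den = sum(abs(a + b) for a, b in zip(em_map, model_map))
    if den == 0:
        raise ValueError("R is undefined when all density sums are zero")
    return num / den


if __name__ == "__main__":
    observed = [0.1, 0.4, 0.9, 0.3]    # toy EM densities (made-up values)
    calculated = [0.2, 0.5, 0.8, 0.2]  # toy model-derived densities
    print(map_correlation(observed, calculated))
    print(real_space_r(observed, calculated))
```

A residue-based value would apply the same functions to only the voxels surrounding one residue. Note that the correlation is insensitive to a linear rescaling of either map, whereas the R value is not, which is why the sketch assumes the maps are already on a common scale.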
In assessing the quality of a map, all of these criteria need to be satisfied within reasonable tolerance. The EMFIT program addresses this problem by averaging the attributes, each expressed as the number of standard deviations above the mean of random fits. It also needs to be determined whether a map computed from a flexibly fitted model fits within the error bars of the original map just as well as the original model does (if it does, there is no information in the map to justify flexible fitting). The Bayesian inferential structure determination approach originally proposed for NMR structure determination could also be applied to EM-based modeling. Finally, accuracy measures that convey the suitability of models for specific applications need to be established. […]
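The averaging idea attributed to EMFIT above, each attribute expressed as the number of standard deviations above the mean of random fits, can be sketched as follows. This is not EMFIT's actual code: the function names, the attribute names, and the use of the sample standard deviation are assumptions made for illustration, and the sketch further assumes that every attribute has been oriented so that larger values mean a better fit.

```python
from statistics import mean, stdev


def z_score(value, random_values):
    """Standard deviations by which `value` exceeds the mean of the random fits."""
    spread = stdev(random_values)  # sample standard deviation (an assumption)
    if spread == 0:
        raise ValueError("random fits show no spread for this attribute")
    return (value - mean(random_values)) / spread


def composite_score(fit, random_fits):
    """Average the per-attribute z-scores of one fit.

    fit:         dict mapping attribute name -> value for the candidate fit
    random_fits: dict mapping attribute name -> list of values from random fits
    """
    scores = [z_score(fit[name], random_fits[name]) for name in fit]
    return mean(scores)


if __name__ == "__main__":
    # Hypothetical attribute names and made-up numbers, for illustration only.
    candidate = {"map_fit": 4.0, "contacts": 20.0}
    background = {"map_fit": [1.0, 2.0, 3.0], "contacts": [10.0, 20.0, 30.0]}
    print(composite_score(candidate, background))
```

Expressing each attribute relative to the background of random fits puts quantities with different units on one dimensionless scale before they are averaged.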

Pipeline specifications

Software tools: Rmeasure, EMFit, EMAN
Databases: wwPDB
Application: cryo-EM