The routine resolve_cryo_em is a tool for carrying out density modification of cryo-EM maps
Density modification with resolve_cryo_em is carried out using two half-maps, along with the FSC-based resolution and a sequence file specifying the contents of the map.
Density modification can also be carried out by using the initial density-modified map or a single supplied model as a basis for generating multiple models, then using model-based density as part of the density modification target to yield a model-based density modified map.
The tutorial video is available on the Phenix YouTube channel and covers the following topics:
Density modification with resolve_cryo_em is based on two ideas. One is that the errors in Fourier coefficients representing a cryo-EM map are (to some extent) uncorrelated. This means that one Fourier coefficient does not know about the errors in another one. (Note that this is not including errors that are correlated simply because the molecule is small and is placed in a large map. Correlated errors in this context are those where one Fourier coefficient has been adjusted to compensate for errors in another one.)
The other is that some features in a map are known in advance. This could include features such as the flatness of the solvent region, distributions of map values in the solvent and macromolecule region, similarities of symmetry-related regions.
Then the way density modification works is that Fourier coefficients for the map are adjusted to agree both with the original map and with the expected features. This improves the Fourier coefficients, and the key result is that the map improves everywhere, not just where the information about expected features was available.
Unique features of density modification for cryo-EM are that two half-maps with independent errors are available in cryo-EM (allowing estimation of errors), and that the errors in Fourier coefficients are (more or less) distributed as two-dimensional Gaussians (i.e both phase and amplitude errors). This leads to many differences in implementation density modification in crystallography though core elements are identical.
Normally you will access the functionality of resolve_cryo_em by running the ResolveCryoEM tool in the Phenix GUI. You can also run it from the command line. You may wish to run it from the command line in background with multiprocessing if you are running with denmod_with_models=True as this can take a long time (perhaps 1 day x 16 processors if you have 250 residues in the unique part of your model).
Half-maps: Supply two unmasked half maps. They can be sharpened but it does not make much of a difference. They must be unmasked and they must still have noise in the solvent region. They can be boxed with a soft boundary (if they are boxed specify box_before_analysis=False so it isn't done again)
Sequence file: Supply a sequence file with the sequence of the molecule. Be sure to put in all copies of the molecule (i.e. a 24-mer needs 24 chains). (You can also supply one copy of the molecule and specify copies=24 if you want.)
Model file: You can supply a model; if you do then the model will be randomized and refined against the half-maps to yield 8 models for each half map. These will be used in model-based density modification.
Automatically-generated models: You can specify density_modify_with_model=True and not supply any model. In this case 8 models will be automatically generated for each half-map and used in model-based density modification.
The basic inputs to resolve_cryo_em are:
Two unmasked half-maps sequence file or molecular mass or solvent fraction
Optional inputs are:
Full map (Recombined with density-modified half maps) Model (Used to generate multi-model representation for density modification) Symmetry file (Supply with model or if automatic model generation is used to limit the modeling to the unique part of the map. Alternatively, used in density modification in cases where the symmetry is not part of the reconstruction process). Mask file (Used to identify the region where the macromolecule is present)
The procedure used by resolve_cryo_em has several steps:
Boxing of maps: If the supplied maps are much larger than the molecule, the maps are trimmed down to about 5 A bigger than the largest dimension of the molecule (estimated from a low-res mask and the molecular volume based on sequence or as specified) in each direction, using a soft mask at the edges of the box. Resolution estimate and half-map sharpening of maps: The half-maps are compared as a function of resolution and the resolution (FSC=0.143) is estimated and the maps are sharpened based on the estimated map quality of the full (averaged) map. A full map is calculated. Generation of map-value (density) histograms: The full map is analyzed to identify the distribution of map values in the solvent and macromolecule region. These histograms are optionally used in density modification. Map-phasing of half-maps: Each half map is used in a process of map-based estimation of new Fourier coefficients using a maximum-likelihood procedure. In essence, one Fourier coefficient is removed from the map at a time. Then a new value of that Fourier coefficient is found that maximizes the likelihood of the map given all the other Fourier coefficients. The likelihood is calculated from the agreement of the histograms of the (new) map with expected histograms. For example, if the solvent region is expected to be flat, then the new map has a high likelihood if it is flat in the solvent region. By varying the one Fourier coefficient the flatness of the solvent region will change. Similarly the histograms of density in the region of the macromolecule will change depending on the value of the one Fourier coefficient. The best value for each Fourier coefficient is then used to calculated a map-phasing map. Estimation of errors: Fourier coefficients for the two starting half-maps and the two density-modified maps are compared to give FSC values as a function of resolution. These FSC values are used to estimate correlated and uncorrelated errors in the four maps and to identify optimal weighting between original and density-modified maps. Recombination of original and map-phasing half-maps: Based on the estimated errors in original and map-phasing half-maps, all four maps are recombined to create a new density-modified map. Additionally, each half-map and associated map-phasing half map are recombined to create two new density-modified half-maps. Optional real-space and sigma weighting: The smoothed local rms differences between original half maps and between density-modified half maps are used (optionally) to identify location-specific weighting for the original and density-modified maps. The variance of Fourier coefficients among the four maps are used (optionally) to weight individual final Fourier coefficients. Optional starting with unsharpened maps. The input half maps are used in the density modification step without adjusting the resolution dependence of the maps with half-map sharpening. This can be useful for high-resolution maps. Note that whether or not this option is selected, normally an overall B-factor is still applied to the input maps based on the nominal resolution. The difference therefore is in the details of the resolution dependence which is adjusted in the default case based on a comparison of half-maps, but not with density modify unsharpened maps. The overall B-factor applied before density modification is controlled by the "remove_aniso" and "b_iso" keywords. By default remove_aniso=True and b_iso is target_b_ratio*resolution (target_b_ratio=10), so before density modification whatever resolution dependence and anisotropy is present in the each map is adjusted to yield an overall B of b_iso and no anisotropy. Optional alternative final sharpening of maps. The final sharpening normally consists of two parts. One is scaling the map in shells of resolution based on the estimated correlation of the map coefficients at each resolution with the true ones. This is controlled by the keyword final_scale_with_fsc_ref. The second part of final scaling is scaling Fourier coefficients in each shell of resolution either to match the low-resolution shell or to match the scaled half-maps, or to be part-way between these (controlled by the keyword geometric_scaling). Optional spectral scaling and local sharpening. The final map is optionally scaled with a resolution-dependent scale factor representing the radial part of a typical Fourier transform of a macromolecule. The final map is optionally locally resolution-filtered (local sharpening). The final map is also optionally blurred slightly with a blurring dependent on the overall resolution of the map. Optional use of an input full map. If you supply a full map in addition to the half-maps then the full map will be recombined with the two map-phasing half-maps instead of the average of the two original half-maps being recombined. This could improve the final map if your full map has been filtered in some special way. Optional density modification with multiple models. You can generate multiple-model representations of each of your two half maps either by supplying a single model (it will be randomized and refined against each half-map) or by specifying density_modify_with_model=True (models will be generated automatically). The multiple models will be used to estimate the density at each point near the models along with a guess of the uncertainty of that density. These are used to create a target map for density modification that is based on model density and the map from standard density modification, weighted according to their uncertainty estimates. This target map is then used in density modification (as a target for the region containing the model), where Fourier coefficients that lead to a map that is more similar to the target density are considered more likely thatn those that do not. This procedure necessarily can introduce model bias, but this is greatly reduced by the use of multiple models, uncertainty estimates, and the likelihood-based density modification. You may with to use a finer resolution for density modification (dm_resolution) if you include model information because the model can provide higher-resolution information that simple density modification. Density modification including symmetry that is not part of reconstruction symmetry. If your molecule has symmetry that was not used in the reconstruction process, it can be used in density modification. Supply a symmetry_file (.ncs_spec file) that you can obtain with the find_ncs tool in phenix (uses a model or density) and it will be included.
Density modification with model-building adds additional cycles to the density modification procedure in which multiple models are built using map_to_model and the averaged density and uncertainty in the average density is used to combine the model density with the initial density-modified map.
The procedure includes:
Create initial density-modified half-maps and full map Create N (typically 16) variants of the full map by changing the resolution cutoff, spectral_scaling, and blurring of the map. Build a model into each modified full map. If symmetry file supplied, build only the unique part of the model. Refine some of the models against half-map 1 and some against half-map 2 Create one composite model based on all models Create model density for each half-map based on the models refined againt that map. This model density will have a mean value and variance for each point in the map near to at least 3 models. Create composite density for each half-maps by combining the model density with the density-modified half-map, weighting the model density according to its consistency among models. Density-modify each composite half-map, and create a new set of density modified half-maps and full map, as in the procedure for standard density modification. Sharpen the resulting maps using model-based sharpening with the composite model. Note that this procedure takes a lot of computation.
The R-value for density modification is quite different from a refinement R. Basically the statistical (maximum-likelihood) density modification procedure allows calculation of each structure factor based on the values of all the other structure factors and the expectations about the map (i.e, flat solvent, distribution of density values in protein, ncs, etc.) These "map-phasing" structure factors are (at least in the first cycle) independent of the starting structure factors and can be compared with the measured data and an R-value obtained. The values of these R-values range from about 0.25 to 0.5 in most cases, and when the map is very good usually the R values are smaller. These R values do not involve any atomic model so they are quite unlike a refinement R.
For discussion of how map-likelihood structure factors are calculated in Resolve, see the Map-likelihood phasing reference below.
You can use resolve_cryo_em to density-modify a cryo-EM map:
phenix.resolve_cryo_em half_map_A.mrc half_map_B.mrc seq_file=seq.dat
If density modification is working well, you should see:
An improvement in reported resolution. Typically this is small..from 0.05 A to 0.5 A. If it is zero...it probably did not work. More fine detail. Density modification mostly changes the high-resolution part of your map. (The low-resolution part of your map is generally already very good, and usually if it is not then density modification can't fix it. Occasionally low and high-resolution aspects of a map can improve.) Compare your original (auto-sharpened) map with your density-modified map and see if side chains and carbonyl oxygens are more clear and if the connectivity of the chain improves. If you supply or generate models, you should see improved high-resolution detail in the region of the model. You are not likely to see improvement away from the model as you would in crystallography. The multiple models generated are basically identifying what locations in your map are compatible with being the location of an atom, based on the density already there and the geometrical restrictions on a model. That information then is included in density modification and improves density in the region of the model but is not strong enough to improve density elsewhere.
Usually you should first just run with all defaults, supplying two half-maps and a sequence file and nominal resolution. See how the map looks and use it as a baseline. Then some things to try to improve the map are:
Add a full map file to recombine it with density-modified half-maps. This can be helpful if your full map has been processed in a way that makes it much better than an average of your half-maps. Add a mask file to exclude part of the map. If your map has some huge noise peaks or just density that is not part of your molecule, you can try to exclude it by supplying a mask file (a map file with 1 where your molecule is located and 0 elsewhere). If you have a model, you can supply the model and it will be used to generate a multi-model representation of your molecule. This multi-model representation will be used in density modification. Note that you should supply a symmetry file if there is symmetry in your reconstruction so that only the unique part needs to be analyzed. If you don't have a model but your resolution is high (about 3.5 A or better), you can improve your map by automatically generating a multi-model representation of your structure. This is used in the same way as the multi-model representation starting with a supplied model, and a symmetry file is helpful to identify just the unique part of the molecule. You can try changing the resolution (overall) and the density-modification resolution. Normally the resolution is the gold-standard resolution from comparing your half-maps. However varying this may help. The density-modification resolution is normally set automatically to be somewhat finer than the overall resolution, but its exact value does have an effect so you could try different values. You can try changing some of the defaults that occasionally help. One is setting density_modify_unsharpened_maps=True (density modifies the original half-maps without auto-sharpening them). Another is setting final_scale_with_fsc_ref=False (applies final resolution-dependent scaling that is part-way between standard half-map FSC-based sharpening and constant amplitudes over all resolutions). These usually are best for high_resolution maps (< 3 A). Real_space_weighting, sigma_weighting, and spectral_scaling may improve the final map. Real-space weighting weights original and density-modified maps based on their local uncertainties. Sigma-weighting weights them by uncertainties in individual Fourier coefficients. Spectral scaling scales the final map by resolution to match the resolution-dependence of a typical protein.
Density modification changes the sharpening of your map, so just because the map looks better or worse doesn't necessarily mean that anything important has happened.
If you have two maps and you want to know which is the better one, here are some things you can do:
1. Create a matched version of your maps that have the same resolution-dependence using phenix.auto_sharpen. Run it like this: phenix.auto_sharpen n_bins=100 auto_sharpen_methods=external_map_sharpening external_map_file=target_map.map map_to_change.map resolution=4.1 sharpened_map_file=map_to_change_matched.map Now you can look at target_map.map and map_to_change_matched.map and differences you see are due to intrinsic properties of the maps, not just sharpening. 2. If you have a good model, refine the model against each map. Then use phenix.mtriage to estimate the resolution where the FSC(map,model)==0.5 for each map. The map with a lower value of this resolution may be better. 3. As in #2, once you have a pair of maps and a model refined against each map, you can run phenix.auto_sharpen with each map and model and it will print out the average FSC for each. The one with the higher overall FSC may be better.
If the half-maps have been masked the procedure may not work well.
If the solvent noise is very non-uniform the procedure may work poorly. By default a rectangular solid region enclosing the molecule is cut out and used in density modification. You can supply a boxed map and set the keyword box_before_analysis=False to avoid this.
If the maps have very prominent density away from the macromolecule this may interfere with density modification. You can get around this by supplying a mask (as a map, 1=inside the molecule).
If there is non-macromolecule but real density in the maps this may interfere with density modification (for example, lipid density).
If your computer does not have a lot of memory and you use multiple processors (nproc bigger than 1) then resolve_cryo_em can crash. If this happens try nproc=1.
Density modification introduces some correlations between half-maps due to solvent flattening. This can have a small effect on the resolution estimates obtained with half-map FSC. The resolution estimates provided by the program are corrected for this effect.
If you use the real_space_weighting or sigma_weighting or sharpening_type=local_final_half_map options there may be some extra correlations between half-maps introduced. Calculating resolution using FSC between these density-modified maps can lead to overstating the resolution. The resolution estimates provided by the program are before applying these weighting schemes (unless you specify local_methods_final_cycle = False and run multiple cycles) so they are not normally affected by this.
The density modification procedure works best in the resolution range of about 4.5 A or better, but occasionally works very well at lower resolutions (not tested beyond 9 A). The procedure does not work in every case (we do not have a great metric for success but about half of cases appear to be improved according to the metrics described above). Varying the parameters can have a substantial effect on the outcome.
Model-based density modification necessarily biases the map towards the models that are built. By building multiple models, the effect of this bias is reduced but not eliminated. For example if the starting map has an error that causes models to be built with a side chain the wrong place, the new model- based density will show even more density in that location. That is, this process can accentuate errors in the original maps if they match plausible locations of atoms in a model. Balancing this, the procedure generally does not appear to yield density at positions where there is no density in the original map even if atoms are located there in a model. It is important that the original or non-model-based maps be consulted to evaluate any specific density in the map.
{{phil:phenix.programs.resolve_cryo_em}}