Density modification of cryo-EM maps with resolve_cryo_em

Author(s)

Purpose

The routine resolve_cryo_em is a tool for carrying out density modification of cryo-EM maps.

Usage

Density modification with resolve_cryo_em can be carried out based on two half-maps, along with the FSC-based resolution and a sequence file specifying the contents of the map.

Density modification can also be carried out by using the initial density-modified map as a basis for generating multiple models, then averaging the model-based maps with the density-modified map to yield a model-based density-modified map.

How resolve_cryo_em works:

Density modification with resolve_cryo_em is based on two ideas. One is that the errors in Fourier coefficients representing a cryo-EM map are (to some extent) uncorrelated. This means that one Fourier coefficient does not know about the errors in another one. (Note that this is not including errors that are correlated simply because the molecule is small and is placed in a large map. Correlated errors in this context are those where one Fourier coefficient has been adjusted to compensate for errors in another one.)

The other is that some features in a map are known in advance. These could include the flatness of the solvent region, the distributions of map values in the solvent and macromolecule regions, and the similarity of symmetry-related regions.

Density modification then works by adjusting the Fourier coefficients for the map so that they agree both with the original map and with the expected features. This improves the Fourier coefficients, and the key result is that the map improves everywhere, not just where the information about expected features was available.

Unique features of density modification for cryo-EM are that two half-maps with independent errors are available (allowing estimation of errors), and that the errors in Fourier coefficients are (more or less) distributed as two-dimensional Gaussians (i.e., there are both phase and amplitude errors). This leads to many differences in implementation from density modification in crystallography, though the core elements are identical.

Using resolve_cryo_em:

Normally you will access the functionality of resolve_cryo_em by running the ResolveCryoEM tool in the Phenix GUI. You can also run it from the command line. If you are running with denmod_with_models=True, you may wish to run it from the command line in the background with multiprocessing, as this can take a long time (for example, 1 day on 16 processors for 250 residues in the unique part of the model).

Half-maps: Supply two unmasked half-maps. They can be sharpened, but this does not make much of a difference.

Sequence file: Supply a sequence file with the sequence of the molecule. Be sure to put in all copies of the molecule (i.e. a 24-mer needs 24 chains). (You can also supply one copy of the molecule and specify copies=24 if you want.)

Procedure used by resolve_cryo_em

The inputs to resolve_cryo_em are:

Two unmasked half-maps
A sequence file, the molecular mass, or the solvent fraction
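
The three alternatives in the second input carry related information: a sequence implies a mass, and a mass implies a molecular volume and therefore a solvent fraction for a given box. The sketch below shows that relationship only. It is not how resolve_cryo_em computes these values; the function name and the constants (about 110 Da per residue and about 1.21 cubic A per Da for protein) are assumptions chosen for illustration.

```python
def solvent_fraction(sequence, copies, box_volume_A3,
                     mass_per_residue_Da=110.0, volume_per_Da_A3=1.21):
    """Rough solvent fraction from a one-letter sequence (assumed constants)."""
    n_residues = sum(1 for c in sequence if c.isalpha())
    mass_Da = n_residues * copies * mass_per_residue_Da
    molecular_volume_A3 = mass_Da * volume_per_Da_A3
    return 1.0 - molecular_volume_A3 / box_volume_A3

# Hypothetical case: 24 copies of a 250-residue chain in a 200 A cubic box
fraction = solvent_fraction("A" * 250, copies=24, box_volume_A3=200.0 ** 3)
print(round(fraction, 3))
```

Nucleic acids and large unmodeled components have different densities, so a real estimate would need separate constants for them.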

The procedure used by resolve_cryo_em has several steps:

Boxing of maps:  If the supplied maps are much larger than the molecule,
the maps are trimmed down to about 5 A bigger than the largest dimension
of the molecule (estimated from a low-res mask and the molecular
volume based on sequence or as specified) in each direction.
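
The boxing step can be pictured with a short NumPy sketch. This is a simplification under stated assumptions, not the resolve_cryo_em code: the function name box_around_mask, the boolean molecule mask, and the uniform grid spacing are all hypothetical, and the sketch pads the bounding box of the mask by a fixed margin on each side.

```python
import numpy as np

def box_around_mask(density, mask, padding_A=5.0, spacing_A=1.0):
    """Crop density to the bounding box of mask plus a padding margin."""
    pad = int(np.ceil(padding_A / spacing_A))
    idx = np.argwhere(mask)                      # grid indices inside the mask
    lower = np.maximum(idx.min(axis=0) - pad, 0)
    upper = np.minimum(idx.max(axis=0) + 1 + pad, density.shape)
    slices = tuple(slice(int(lo), int(hi)) for lo, hi in zip(lower, upper))
    return density[slices], slices

# Toy map: a small "molecule" in a much larger box, 1 A per grid point
density = np.zeros((64, 64, 64))
mask = np.zeros(density.shape, dtype=bool)
mask[20:30, 22:34, 25:31] = True
density[mask] = 1.0

boxed, slices = box_around_mask(density, mask)
print(boxed.shape)
```

Returning the slices as well makes it possible to place the boxed result back into the original grid later.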

Resolution estimate and half-map sharpening of maps: The half-maps are
compared as a function of resolution and the resolution (FSC=0.143)
is estimated. The maps are then sharpened based on the estimated map quality
of the full (averaged) map, and a full map is calculated.
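
A minimal sketch of the half-map comparison follows. It computes a Fourier shell correlation (FSC) curve with NumPy on synthetic data; the function names, the shell binning, and the test maps are assumptions for illustration and do not reproduce the program's own binning or sharpening. The second function uses the standard relation between half-map FSC and the expected correlation of the averaged map with the true map, which equals 0.5 when the half-map FSC is 0.143.

```python
import numpy as np

def fsc_curve(half_a, half_b, n_shells=8):
    """Return shell-centre frequencies (cycles per voxel) and FSC per shell."""
    fa = np.fft.fftn(half_a)
    fb = np.fft.fftn(half_b)
    axes = [np.fft.fftfreq(n) for n in half_a.shape]
    grids = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grids))
    edges = np.linspace(0.0, 0.5, n_shells + 1)  # corners beyond 0.5 are ignored
    fsc = np.zeros(n_shells)
    for i in range(n_shells):
        shell = (radius >= edges[i]) & (radius < edges[i + 1])
        cross = np.real(np.sum(fa[shell] * np.conj(fb[shell])))
        norm = np.sqrt(np.sum(np.abs(fa[shell]) ** 2) *
                       np.sum(np.abs(fb[shell]) ** 2))
        fsc[i] = cross / norm if norm > 0 else 0.0
    return 0.5 * (edges[:-1] + edges[1:]), fsc

def full_map_correlation(fsc_half):
    """Expected correlation of the averaged map with the true map."""
    return np.sqrt(2.0 * fsc_half / (1.0 + fsc_half))

# Synthetic half-maps: the same signal with independent noise
rng = np.random.default_rng(0)
signal = rng.normal(size=(16, 16, 16))
half_a = signal + 0.5 * rng.normal(size=signal.shape)
half_b = signal + 0.5 * rng.normal(size=signal.shape)
full_map = 0.5 * (half_a + half_b)

freq, fsc = fsc_curve(half_a, half_b)
print(np.round(fsc, 2))
```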

Generation of map-value (density) histograms:  The full map is analyzed
to identify the distribution of map values in the solvent and
macromolecule region.  These histograms are to be used in density
modification.
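
As an illustration of this step, the sketch below builds normalized histograms of map values inside and outside a macromolecule mask. The mask, the synthetic map, the bin count, and the function name are hypothetical; the program derives its own regions and histograms.

```python
import numpy as np

def region_histograms(full_map, macromolecule_mask, n_bins=20):
    """Normalized map-value histograms for macromolecule and solvent regions."""
    value_range = (float(full_map.min()), float(full_map.max()))
    inside, edges = np.histogram(full_map[macromolecule_mask], bins=n_bins,
                                 range=value_range, density=True)
    outside, _ = np.histogram(full_map[~macromolecule_mask], bins=n_bins,
                              range=value_range, density=True)
    return edges, inside, outside

# Synthetic map: narrow solvent distribution, broader macromolecule distribution
rng = np.random.default_rng(1)
mask = np.zeros((24, 24, 24), dtype=bool)
mask[6:18, 6:18, 6:18] = True
full_map = rng.normal(0.0, 0.1, size=mask.shape)
full_map[mask] += rng.gamma(2.0, 0.5, size=int(mask.sum()))

edges, inside, outside = region_histograms(full_map, mask)
print(full_map[~mask].std() < full_map[mask].std())
```

Using the same bin edges for both regions keeps the two histograms directly comparable.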

Map-phasing of half-maps:  Each half map is used in a process of
map-based estimation of new Fourier coefficients using a
maximum-likelihood procedure. In essence, one Fourier coefficient is
removed from the map at a time. Then a new value of that Fourier coefficient
is found that maximizes the likelihood of the map given all the other
Fourier coefficients. The likelihood is calculated from the agreement of
the histograms of the (new) map with expected histograms.  For example,
if the solvent region is expected to be flat, then the new map has a
high likelihood if it is flat in the solvent region.  By varying the one
Fourier coefficient the flatness of the solvent region will change. Similarly
the histograms of density in the region of the macromolecule will change
depending on the value of the one Fourier coefficient. The best value for
each Fourier coefficient is then used to calculate a map-phasing map.
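
The idea of removing one Fourier coefficient and re-estimating it from all the others can be shown in a one-dimensional toy. This sketch is a heavily simplified stand-in for the maximum-likelihood procedure: it uses only the flat-solvent expectation (as a least-squares target with the solvent level assumed to be zero) rather than histograms, and the function name and test map are hypothetical.

```python
import numpy as np

def estimate_coefficient(rho, solvent, k):
    """Re-estimate coefficient k (0 < k < n/2) of a real 1-D map from the
    other coefficients by making the solvent region as flat as possible."""
    n = rho.size
    theta = 2.0 * np.pi * k * np.arange(n) / n
    f_k = np.fft.rfft(rho)[k]
    # Real-space contribution of coefficient k and its Friedel mate
    contribution = (2.0 / n) * (f_k.real * np.cos(theta) - f_k.imag * np.sin(theta))
    rho_without_k = rho - contribution
    # Find a, b (new F_k = a + ib) that cancel the residual solvent density
    design = (2.0 / n) * np.column_stack([np.cos(theta), -np.sin(theta)])[solvent]
    (a, b), *_ = np.linalg.lstsq(design, -rho_without_k[solvent], rcond=None)
    return complex(a, b)

n = 64
rho_true = np.zeros(n)
rho_true[20:40] = 1.0 + 0.5 * np.cos(np.arange(20))   # "macromolecule"
solvent = np.ones(n, dtype=bool)
solvent[20:40] = False
f_true = np.fft.rfft(rho_true)

# Corrupt one coefficient, then re-estimate it from the rest of the map
f_bad = f_true.copy()
f_bad[3] += 5.0 - 2.0j
rho_bad = np.fft.irfft(f_bad, n=n)
recovered = estimate_coefficient(rho_bad, solvent, 3)
print(abs(recovered - f_true[3]) < 1e-8)
```

Because the coefficient being estimated is removed before the fit, the new value does not depend on the error in the original value. In this toy only one coefficient is wrong, so the recovery is exact; in a real map every other coefficient also carries error, and the new estimate is only as good as they allow.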

Estimation of errors:  Fourier coefficients for the two starting
half-maps and the two density-modified maps are compared to give FSC
values as a function of resolution.  These FSC values are used to estimate
correlated and uncorrelated errors in the four maps and to identify
optimal weighting between original and density-modified maps.

Recombination of original and map-phasing half-maps:  Based on the
estimated errors in original and map-phasing half-maps, all four maps
are recombined to create a new density-modified map.  Additionally,
each half-map and associated map-phasing half map are recombined to
create two new density-modified half-maps.
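
A simple way to picture recombination is inverse-variance weighting of two estimates of the same Fourier coefficient. The sketch below assumes the errors in the two estimates are independent and that their variances are known; the program additionally accounts for correlated errors estimated from the FSC values, which this sketch ignores. The function name is hypothetical.

```python
def recombine(original, modified, var_original, var_modified):
    """Inverse-variance weighted average of two estimates of one coefficient."""
    w_orig = 1.0 / var_original
    w_mod = 1.0 / var_modified
    combined = (w_orig * original + w_mod * modified) / (w_orig + w_mod)
    combined_variance = 1.0 / (w_orig + w_mod)
    return combined, combined_variance

# Equal uncertainties: the result is the plain average, with half the variance
value, variance = recombine(1.0 + 1.0j, 3.0 + 3.0j, 1.0, 1.0)
print(value, variance)
```

The less certain estimate is down-weighted, and the combined variance is always smaller than either input variance.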

Optional real-space and sigma weighting:  The smoothed local rms differences
between original half maps and between density-modified half maps are
used (optionally) to identify location-specific weighting for the
original and density-modified maps.  The variance of Fourier coefficients
among the four maps is used (optionally) to weight individual final
Fourier coefficients.

Optional starting with unsharpened maps.  The input half maps are used in the
density modification step without auto-sharpening. (Normally these maps are
auto-sharpened based on half-map sharpening first).  This can be useful
for high-resolution maps.

Optional alternative final sharpening of maps.  The final sharpening normally
consists of two parts.  One is scaling the map in shells of resolution
based on the estimated correlation of the map coefficients at each
resolution with the true ones.  This is controlled by the
keyword final_scale_with_fsc_ref.   The second part of final scaling is
scaling Fourier coefficients in each shell of resolution either to match
the low-resolution shell or to match the scaled half-maps, or to
be part-way between these (controlled by the keyword geometric_scaling).

Optional spectral scaling and local sharpening.  The final
map is optionally scaled with a resolution-dependent scale factor
representing the radial part of a typical Fourier transform of a
macromolecule.  The final map is optionally locally resolution-filtered
(local sharpening).  The final map is also optionally blurred slightly
with a blurring dependent on the overall resolution of the map.

Optional use of an input full map.  If you supply a full map in addition
to the half-maps then the full map will be recombined with the two
map-phasing half-maps instead of the average of the two original
half-maps being recombined.  This could improve the final map if your
full map has been filtered in some special way.

Procedure used by resolve_cryo_em for density modification with model-building

Density modification with model-building adds cycles to the density modification procedure in which multiple models are built using map_to_model. The averaged model density and the uncertainty in that average are then used to combine the model density with the initial density-modified map.

The procedure includes:

Create initial density-modified half-maps and full map

Create N (typically 16) variants of the full map by changing the resolution
cutoff, spectral_scaling, and blurring of the map.

Build a model into each modified full map

Refine some of the models against half-map 1 and some against half-map 2

Create one composite model based on all models

Create model density for each half-map based on the models refined against
that map.  This model density will have a mean value and variance for each
point in the map near to at least 3 models.

Create composite density for each half-map by combining the model density
with the density-modified half-map, weighting the model density according to
its consistency among models.
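
The combination of model density with a density-modified half-map might be sketched as follows. The weighting formula here is an assumption made for illustration (the model mean is trusted more where the models agree), not the scheme used by resolve_cryo_em. The function name, the near_model flags, and the denmod_variance parameter are all hypothetical.

```python
import numpy as np

def composite_density(model_maps, near_model, denmod_map, denmod_variance,
                      min_models=3):
    """Blend mean model density with a density-modified map, point by point."""
    stack = np.asarray(model_maps, dtype=float)    # shape (n_models, ...)
    near = np.asarray(near_model, dtype=bool)      # True where a model is nearby
    denmod = np.asarray(denmod_map, dtype=float)
    count = near.sum(axis=0)
    safe_count = np.maximum(count, 1)
    mean = np.where(near, stack, 0.0).sum(axis=0) / safe_count
    variance = np.where(near, (stack - mean) ** 2, 0.0).sum(axis=0) / safe_count
    variance_of_mean = variance / safe_count
    # Assumed weighting: consistent models (low variance) get weight near 1
    weight = denmod_variance / (denmod_variance + variance_of_mean)
    blended = weight * mean + (1.0 - weight) * denmod
    # Points near fewer than min_models models keep the density-modified value
    return np.where(count >= min_models, blended, denmod)

model_maps = [[2.0, 1.0, 5.0], [2.0, 2.0, 5.0], [2.0, 3.0, 0.0]]
near_model = [[True, True, True], [True, True, True], [True, True, False]]
result = composite_density(model_maps, near_model, [1.0, 0.0, 4.0],
                           denmod_variance=2.0 / 9.0)
print(result)
```

In this toy the first point takes the model value because all models agree, the second is an even blend, and the third keeps the density-modified value because only two models are nearby.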

Density-modify each composite half-map, and create a new set of density
modified half-maps and full map, as in the procedure for standard
density modification.

Sharpen the resulting maps using model-based sharpening with the composite
model.

Note that this procedure takes a lot of computation.

Examples

Standard run of resolve_cryo_em:

You can use resolve_cryo_em to density-modify a cryo-EM map:

phenix.resolve_cryo_em half_map_A.mrc half_map_B.mrc seq_file=seq.dat
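
If you want to launch runs from a script, a small helper like the one below can assemble the command line. The helper name is hypothetical, and it only formats arguments in the name=value style shown above; it does not check that a keyword is valid for your Phenix version. The denmod_with_models keyword is the one described earlier on this page.

```python
import shlex

def build_resolve_cryo_em_command(half_map_a, half_map_b, seq_file, **keywords):
    """Return the argument list for a phenix.resolve_cryo_em run."""
    args = ["phenix.resolve_cryo_em", half_map_a, half_map_b,
            f"seq_file={seq_file}"]
    args += [f"{name}={value}" for name, value in keywords.items()]
    return args

cmd = build_resolve_cryo_em_command("half_map_A.mrc", "half_map_B.mrc",
                                    "seq.dat", denmod_with_models=True)
print(shlex.join(cmd))
# To actually run it (requires a Phenix installation on the PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```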

How to tell how well density modification is working

Density modification changes the sharpening of your map, so just because the map looks better or worse doesn't necessarily mean that anything important has happened.

If you have two maps and you want to know which is the better one, here are some things you can do:

1. Create matched versions of your maps that have the same
resolution dependence using phenix.auto_sharpen. Run it like this:

phenix.auto_sharpen n_bins=100 auto_sharpen_methods=external_map_sharpening \
     external_map_file=target_map.map map_to_change.map resolution=4.1 \
     sharpened_map_file=map_to_change_matched.map

Now you can look at target_map.map and map_to_change_matched.map, and any
differences you see are due to intrinsic properties of the maps, not just
sharpening.

2. If you have a good model, refine the model against each map.  Then use
phenix.mtriage to estimate the resolution where the FSC(map,model)==0.5
for each map. The map with a lower value of this resolution may be better.

3. As in #2, once you have a pair of maps and a model refined against
each map, you can run phenix.auto_sharpen with each map and model and
it will print out the average FSC for each.  The one with the higher
overall FSC may be better.
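
For the comparison in item 2, the resolution at which a map-model FSC curve falls to 0.5 can be read off by interpolation. The sketch below assumes you already have the FSC values as a table of resolution (in A) against FSC; the function name and the example numbers are hypothetical, and the curve is interpolated linearly in 1/resolution.

```python
def resolution_at_threshold(resolutions_A, fsc_values, threshold=0.5):
    """Resolution (A) where FSC first drops below threshold, or None.

    resolutions_A must run from low resolution (large d) to high (small d).
    """
    pairs = list(zip(resolutions_A, fsc_values))
    for (d_lo, f_lo), (d_hi, f_hi) in zip(pairs, pairs[1:]):
        if f_lo >= threshold > f_hi:
            s_lo, s_hi = 1.0 / d_lo, 1.0 / d_hi
            frac = (f_lo - threshold) / (f_lo - f_hi)
            return 1.0 / (s_lo + frac * (s_hi - s_lo))
    return None

# Hypothetical FSC(map, model) table for one map
d_half = resolution_at_threshold([10.0, 5.0, 2.5], [0.9, 0.7, 0.3])
print(round(d_half, 2))
```

Running this for each map with its own refined model gives two numbers to compare; the smaller value indicates agreement with the model extending to higher resolution.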

Possible Problems

If the half-maps have been masked the procedure may not work well.

If the solvent noise is very non-uniform the procedure may work poorly. By default a rectangular solid region enclosing the molecule is cut out and used in density modification. You can supply a boxed map and set the keyword box_before_analysis=False to avoid this.

If the maps have very prominent density away from the macromolecule this may interfere with density modification. You can get around this by supplying a mask (as a map, 1=inside the molecule).

If there is non-macromolecule but real density in the maps this may interfere with density modification (for example, lipid density).

Specific limitations and problems:

Density modification introduces some correlations between half-maps due to solvent flattening. This can have a small effect on the resolution estimates obtained with half-map FSC. The resolution estimates provided by the program are corrected for this effect.

If you use the real_space_weighting or sigma_weighting or sharpening_type=local_final_half_map options there may be some extra correlations between half-maps introduced. Calculating resolution using FSC between these density-modified maps can lead to overstating the resolution. The resolution estimates provided by the program are before applying these weighting schemes (unless you specify local_methods_final_cycle = False and run multiple cycles) so they are not normally affected by this.

The density modification procedure works best at resolutions of about 4.5 A or better.

Model-based density modification necessarily biases the map towards the models that are built. By building multiple models, the effect of this bias is reduced but not eliminated. For example, if the starting map has an error that causes models to be built with a side chain in the wrong place, the new model-based density will show even more density in that location. It is essential that the original or non-model-based maps be consulted to evaluate any specific density in the map.

Literature

Additional information

List of all available keywords