Optimizing a map with a local anisotropy correction using local_aniso_sharpen

Author(s)

local_aniso_sharpen: Tom Terwilliger

Purpose

The routine local_aniso_sharpen is a tool for optimizing a map by applying a local, anisotropic, resolution-dependent scaling factor.

How local_aniso_sharpen works:

Local anisotropic sharpening can be carried out locally, as the name suggests, or on a map as a whole.

The basis for sharpening is an analysis of the resolution-dependent fall-off of amplitudes of Fourier coefficients for the map, along with an analysis of the resolution-dependent fall-off of correlation between two half-maps, or between a map and a model-based map.

When carried out locally, a full map is divided into small boxes. The density near the edges of each box is masked so that it gradually diminishes to zero at the edges. Each small box is treated as a full map to identify its optimal sharpening. Then the optimal sharpening parameters from the small boxes are applied to the full map in a way that has no edge effects and smoothly varies from one place to another in the map.

To identify the optimal anisotropic sharpening of a map based on the information in two half-maps, two analyses are done. The first is an analysis of the resolution-dependent fall-off of rms amplitudes of Fourier coefficients representing the map. This is examined as a function of direction in reciprocal space, and is similar to the calculation normally done to apply an anisotropy correction to a map. This analysis shows the anisotropy of the map itself.

The second is an analysis of the correlation between Fourier coefficients for the two half-maps. This is also done as a function of resolution and direction in reciprocal space. This analysis shows the anisotropy of the errors in the map.

For purposes of this analysis, the optimal map is the one that has the maximal expected correlation to an idealized version of the true map. This idealized map is a map that would be obtained from a model where all the atoms are point atoms (B values of about zero).

For a map with zero error (all correlations at all resolutions and directions equal to 1), the optimal map will be one that has no anisotropy and the same resolution dependence as the idealized map. For such a map, first the anisotropy in the map is removed, then an overall resolution-dependence matching that of the idealized map is imposed by simple multiplication with a resolution_dependent scale factor.

For a map with errors, the map coefficients obtained from the previous step are modified by a local scale factor that reflects the expected signal-to-noise in that map coefficient. The scale factor for a particular map coefficient is given by 1/(1 + E**2), where E is the normalized expected error in that map coefficient. This scale factor will ordinarily be anisotropic and resolution-dependent.

Examples

You can use local_aniso_sharpen with either two half-maps or a map and a model.

Standard run of local_aniso_sharpen with two half-maps:

To run local_aniso_sharpen with two half maps, you can say:

phenix.local_aniso_sharpen half_map_A.mrc half_map_B.mrc seq_file=seq.fa

If you wish, you can specify a nominal resolution. The sequence file is to let the program figure out the volume of the molecule that is present.

To run local_aniso_sharpen with a map and model, you can say:

phenix.local_aniso_sharpen map.mrc model.pdb resolution=3 seq_file=seq.fa

The resolution is again optional.

You can specify whether anisotropy or local sharpening are to be applied:

local_sharpen=True
anisotropic_sharpen=True

Output

Local_aniso_sharpen will provide a summary of the anisotropy of your map if you leave anisotropic_sharpen=True. Here are some of the values that are calculated for a test case, with annotations:

NOTE: All values apply after removal of overall U of:
   (0.81, 0.84, 0.78, -0.03, -0.01, 0.04)

>>  This overall U value has large and similar values for the first 3
>>    numbers, and small values for the last 3.  This indicates that this
>>    map is relatively isotropic.
>>  In this output, the overall U listed above is removed from all other
>>    U values  before printing them out.

>>  All values are reported in normalized units (U values).  See
>>    https://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Iatom_site_U_iso_or_equiv.html

 Estimated anisotropic fall-off of the data relative to ideal
(Positive means amplitudes fall off more in this direction)
 (  X,      Y,      Z,    XY,    XZ,    YZ)
(0.002, 0.089, 0.037, 0.031, 0.042, -0.002)

>> This is the fall-off of the data, after removal of the overall U values,
>>   relative to a standard of values calculated from a beta-galactosidase
>>   model without a bulk solvent correction (available from the phenix
>>   python module cctbx.development.approx_amplitude_vs_resolution)

 Anisotropy of the data
(Positive means amplitudes fall off more in this direction)
 (  X,      Y,      Z,    XY,    XZ,    YZ)
(-0.119, 0.167, -0.133, -0.013, 0.031, 0.064)

>> This is the anisotropy of the data, after removal of overall U values

 Anisotropy of the uncertainties
(Positive means uncertainties decrease more in this direction)
 (  X,      Y,      Z,    XY,    XZ,    YZ)
(0.160, -0.032, -0.035, 0.031, 0.054, -0.010)

>> This is the anisotropy of the uncertainties in scale factors,
>>    after removal of overall U values

 Anisotropy of the scale factors
(Positive means scale factors increase more in this direction)
 (  X,      Y,      Z,    XY,    XZ,    YZ)
(-0.022, -0.121, -0.088, -0.050, -0.040, -0.019)

>> This is the anisotropy of the scale factors,
>>    after removal of overall U values

Possible Problems

If the half-maps are not actually independent the procedure will not work well

If the model is very poor the procedure will not work well

For model-based sharpening, if local sharpening is used, the sharpening is only applied in the region of the model

Specific limitations and problems:

Literature

Additional information

List of all available keywords

job_title = None Job title in PHENIX GUI, not used on command line
input_files
- seq_file = None Sequence file. Include all copies of each sequence so that the full contents of the map can be estimated.
- map_model
  - full_map = None Input full map file
  - half_map = None Input half map files
  - model = None Input model file
output
- sharpened_map_file = default Sharpened map file name
- sharpened_map_file_1 = default Sharpened half map 1 file name
- sharpened_map_file_2 = default Sharpened half map 2 file name
- local_resolution_map_file = default Local resolution map file name
- overwrite = True Overwrite files with same names
- file_name = None Not used
- filename = None Not used
- serial = None Not used
- output_scale_factor = None Scale factor to be applied to output map just before writing. Normally the output map will have a mean of zero and SD of 1. This may lead to the maximum in the map being much greater than 1. You can adjust the output SD with this scale factor.
sharpening
- local_sharpen = False Sharpen locally (alternative is global sharpening). Note: can take a long time
- anisotropic_sharpen = True Use anisotropic sharpening. Can be combined with local sharpening
- model_sharpen = None Model sharpening (default if model is supplied and only one map is supplied)
- n_bins = None Number of bins for sharpening (default 200 overall and 20 local)
- n_boxes = None Number of boxes
- box_size_grid_units = None Size of core region of boxes (not including region where mask is applied, in grid units
- sharpen_all_maps = True Sharpen and write out half-maps in addition to the full map
crystal_info
- resolution = None Nominal resolution of map
- wrapping = None You can specify whether the map is wrapped (can map values outside bounds to inside with cell translations).
control
- multiprocessing = *multiprocessing sge lsf pbs condor pbspro slurm Choices are multiprocessing (single machine) or queuing systems Not implemented
- queue_run_command = None run command for queue jobs. For example qsub. Not implemented
- nproc = 1 Number of processors to use. NOTE: by default multiple processors will only be used in the map-to-model step (this is because multiprocessing requires writing out nproc sets of huge files and it can be very slow with distributed queues.). You can override this with force_nproc = True.
- ignore_symmetry_conflicts = False You can ignore the symmetry information (CRYST1) from coordinate files. This may be necessary if your model has been placed in a box with box_map for example.
- verbose = False Verbose output
guiGUI-specific parameter required for output directory
- output_dir = None