Putting the best parts of several maps together with combine_focused_maps

Author(s)

combine_focused_maps: Tom Terwilliger

Purpose

The routine combine_focused_maps uses rigid-body refinement to place a model in several maps. It then uses the map-model correlation to identify the best parts of each map, and uses the relationships among the models placed in the different maps to superpose those parts and create a composite map.

Usage

How combine_focused_maps works:

Combine-focused-maps uses the fit of chains in a model to each map to identify the correspondence between the maps. Using this approach it is not necessary that the maps are all superimposed or even that they have the same gridding or size.

If the maps all do superimpose (approximately), you can supply a single model. The model will be refined separately against each map with rigid-body refinement. The placements of each chain in each map will then be used to identify the transformations needed to superimpose the parts of each map corresponding to each chain.

If the maps are quite different, then you will need to supply one model for each map, where you have already placed each chain so that it superimposes with the appropriate part of each map. The models should all be the same except that they have been rigid-body-refined against the various maps. You can use the phenix tool phenix.dock_in_map to place the chains in each map if you have not done it in some other way.

Once a set of models has been set up to match the maps, the map-model correlation in the region of each chain in each model is calculated. This correlation is calculated in a special way: the B-values of all the atoms in the model are set to zero before the map-model correlation is evaluated. This step is important to make sure that a perfect high-resolution map will always score better than a perfect low-resolution map. (If B-values are variable then a perfect high-resolution map and a perfect low-resolution map will both have correlations of 1, but with a B-value of zero the high-resolution map will have a higher correlation.)
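The effect of setting B-values to zero can be illustrated with a one-dimensional toy calculation. This is only a sketch and not the calculation Phenix performs: a single atom is represented by a Gaussian peak, the peak widths (2 and 6 grid units) are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def gaussian_profile(sigma, n=201, center=100.0):
    """Sample a 1D Gaussian peak of width sigma on n grid points."""
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]

def correlation(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Toy "maps": the same atom seen at high resolution (narrow peak)
# and at low resolution (broad peak). Widths are assumptions.
high_res_map = gaussian_profile(sigma=2.0)
low_res_map = gaussian_profile(sigma=6.0)

# Variable B-values: the model peak is broadened to match each map,
# so both perfect maps reach a correlation of 1.
cc_fitted_high = correlation(gaussian_profile(2.0), high_res_map)
cc_fitted_low = correlation(gaussian_profile(6.0), low_res_map)

# B-values set to zero: the same sharp model is compared to both maps,
# so the low-resolution map scores lower.
sharp_model = gaussian_profile(2.0)
cc_sharp_high = correlation(sharp_model, high_res_map)
cc_sharp_low = correlation(sharp_model, low_res_map)

print(cc_fitted_high, cc_fitted_low)
print(cc_sharp_high, cc_sharp_low)
```

With the sharp model, only the narrow (high-resolution) peak keeps a correlation of 1, which is the behavior the zero-B-value scoring is meant to produce.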

For each chain in the model, the map-model correlations for each map are then used to identify the weighting on that map.

Alternatively, you can specify what parts of the model are to be grouped together for identifying which map to use in what region. You specify any number of region_selection selections (in Phenix atom selection syntax, like (chain A and resseq 250:360) ). Each one is used to define a region in space and within that region the maps are weighted according to their correlations to the atoms selected with that selection.

An empirical weighting scheme is used. The relative weights of two maps (for this one chain or region) with a difference in map_model_cc of delta_cc are given by:

exp(-delta_cc/delta_cc_norm)

where delta_cc_norm is typically about 0.05. This means a map that has a map-model cc 0.1 less than another map gets a weight of exp(-2), or about 1/7, of the better-fitting map. Using this weighting scheme and transformations calculated from the positions of this chain after rigid-body refinement against each map, a weighted average map is created for each chain in the model. These maps are all superimposed on the reference map and masked around the chain that each is to represent.
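The weighting formula can be written out as a short calculation. The sketch below takes the map-model correlations of one chain (or region) in each map and returns relative weights. The function name is hypothetical, and normalizing the weights to sum to 1 is an assumption made here for convenience; the text above only specifies the ratio between two maps.

```python
import math

def relative_weights(map_model_ccs, delta_cc_norm=0.05):
    """Weights for one chain or region, one entry per map.

    Each map is down-weighted by exp(-delta_cc/delta_cc_norm), where
    delta_cc is its cc deficit relative to the best map.
    Normalization to a sum of 1 is an assumption of this sketch.
    """
    best = max(map_model_ccs)
    raw = [math.exp(-(best - cc) / delta_cc_norm) for cc in map_model_ccs]
    total = sum(raw)
    return [w / total for w in raw]

# Two maps whose map-model cc values for this chain differ by 0.1
weights = relative_weights([0.80, 0.70])
ratio = weights[1] / weights[0]
print(ratio)  # about 0.135, i.e. exp(-2)
```

Because the ratio falls off exponentially, a map that fits a chain only slightly worse than the best map contributes very little to that region of the composite.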

Finally all the weighted average maps are combined to form a composite map.

Additional output files

If you specify get_contribution_maps=True (this is the default), at the end of the procedure you will get one map written out for each input map file, called something like "contribution_map_1.ccp4". This map has values from 0 to 1 showing the contribution of this input map file to the final map. You can use it to color your final composite map, showing how much of the information at each point came from this input map file.
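One way to picture a contribution map is as the fraction of the total weight that each input map carries at each grid point. The sketch below is an assumption about how such values could be formed, not a description of the Phenix code: maps are represented as flat lists of grid values, the per-point weights are taken as given (for example, chain weights multiplied by a mask), and the function name is hypothetical.

```python
def composite_and_contributions(map_values, weights):
    """Weighted average of several maps plus per-map contributions.

    map_values[i][p]: value of map i at grid point p
    weights[i][p]:    weight of map i at grid point p (0 outside its mask)
    Returns (composite, contributions), where contributions[i][p] is the
    fraction (0 to 1) of the weight at point p that comes from map i.
    """
    n_maps = len(map_values)
    n_points = len(map_values[0])
    composite = []
    contributions = [[] for _ in range(n_maps)]
    for p in range(n_points):
        total = sum(weights[i][p] for i in range(n_maps))
        if total == 0:
            # No map covers this point
            composite.append(0.0)
            for i in range(n_maps):
                contributions[i].append(0.0)
            continue
        value = sum(weights[i][p] * map_values[i][p] for i in range(n_maps))
        composite.append(value / total)
        for i in range(n_maps):
            contributions[i].append(weights[i][p] / total)
    return composite, contributions

maps = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
point_weights = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
composite, contributions = composite_and_contributions(maps, point_weights)
print(composite)
print(contributions)
```

In this toy case the first grid point comes entirely from the first map, the second is shared equally, and the third is covered by neither map.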

If you specify get_cc_map=True (also default) you will get a map file called cc_map.ccp4 that shows the map correlation between the model and the final composite map. The correlation map will be calculated over whatever regions in the model were used in forming the composite map (i.e. if you used individual chains, then the correlations will be calculated over each chain).

Examples

Standard run of combine_focused_maps:

Running combine_focused_maps is easy. From the command line you can type:

phenix.combine_focused_maps reference_map.map focused_map_1.map \
   focused_map_2.map model.pdb resolution=4

where reference_map.map is the map (CCP4, mrc or other related format) that will be used as the template on which all other maps will be superimposed, and focused_map_1.map and focused_map_2.map are maps that are focused on some part of the map (and the remainder of those maps may be poor). The model will be rigid-body refined against all three maps (keeping chains fixed as rigid bodies). The best parts of each map will be selected and combined to create a composite map.

Possible Problems

Specific limitations and problems:

If the density in some of your maps is rather poor, the rigid-body refinement step may not work well. One thing to try in this case is to refine with rigid_body_refinement_single_unit=True. This will tend to hold everything together (but it won't give you an adjustment of individual chain positions). Another option is to refine (with rigid-body refinement only) the models before using combine_focused_maps. That way you have full control over the refinement process. Then in combine_focused_maps use rigid_body_refinement=False.

You cannot use region_selection with use_model_symmetry.
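If you launch runs from a script, the options discussed above can be assembled and checked before the job starts. The helper below is a hypothetical convenience function, not part of Phenix. It only builds an argument list in the same form as the command-line example (files first, then keyword=value pairs) and rejects the combination of region_selection with use_model_symmetry. It does not run anything.

```python
def build_command(map_files, model_files, resolution, **keywords):
    """Build an argument list for phenix.combine_focused_maps.

    Hypothetical helper: keyword names are passed through unchanged as
    keyword=value, so they should be taken from the keyword list.
    """
    if keywords.get("use_model_symmetry") and keywords.get("region_selection"):
        raise ValueError(
            "region_selection cannot be combined with use_model_symmetry")
    command = ["phenix.combine_focused_maps"]
    command.extend(map_files)     # the first map is the reference map
    command.extend(model_files)   # one model, or one model per map
    command.append("resolution=%s" % resolution)
    for name, value in keywords.items():
        command.append("%s=%s" % (name, value))
    return command

# Poor density in some maps: refine the whole model as a single unit
cmd = build_command(
    ["reference_map.map", "focused_map_1.map"],
    ["model.pdb"],
    4,
    rigid_body_refinement_single_unit=True,
)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run if you want to start the job from Python. For models that you have already refined yourself, pass rigid_body_refinement=False instead.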

Literature

Additional information

If you supply a model for each map, the models are used to define what parts of each map are included in the analysis. If the model for your target map has chains ABC and one focused map has just chain A, then only chain A from the focused map will be transferred to the target map. If the model for the focused map has chains ABC then weighted versions of the map in the vicinities of ABC will be included.

If you have a model with symmetry (say, chains A B C D are all the same) and just one focused map (say, focused on A), then you can supply your target map, a target model with chains ABCD, one focused map, and a model for the focused map that contains just chain A. Then you add the keyword "use_model_symmetry=True" and the focused map from chain A is applied to A B C and D in the target map.
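The chain-matching rules in the two paragraphs above can be summarized in a few lines. This is a sketch under stated assumptions: chains are matched between models by chain ID, symmetry-related chains are recognized by identical sequence, and the function name and sequences are hypothetical.

```python
def chains_receiving_focused_map(target_chains, focused_chain_ids,
                                 use_model_symmetry=False):
    """List target chains to which a focused map is applied.

    target_chains:     dict of chain ID -> sequence for the target model
    focused_chain_ids: set of chain IDs present in the focused-map model
    """
    # Without symmetry, only chains present in both models are used
    matched = [c for c in target_chains if c in focused_chain_ids]
    if not use_model_symmetry:
        return matched
    # With symmetry, every target chain with the same sequence as a
    # matched chain also receives the focused map
    matched_sequences = {target_chains[c] for c in matched}
    return [c for c in target_chains
            if target_chains[c] in matched_sequences]

target = {"A": "MKVLA", "B": "MKVLA", "C": "MKVLA", "D": "MKVLA"}
plain = chains_receiving_focused_map(target, {"A"})
symmetric = chains_receiving_focused_map(target, {"A"},
                                         use_model_symmetry=True)
print(plain)      # ['A']
print(symmetric)  # ['A', 'B', 'C', 'D']
```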

If you supply half-maps corresponding to each full map, the half-maps will be processed in the same way as the full map, yielding one pair of half-maps at the end of the procedure. You can use these half-maps to estimate the local resolution of the final composite map (using phenix.local_resolution) or for other procedures requiring half maps. Note that the half maps could have discontinuities so they may not be appropriate for all uses of half maps.

List of all available keywords

job_title = None Job title in PHENIX GUI, not used on command line
input_files
- map_file = None Files with CCP4-style maps. May have origin in any location. All maps will be superimposed on the first map. Normally it is assumed that all the map files are approximately superimposed so that a single model_file will approximately match all map files. If the map files do not superimpose, supply one model file per map file, where the model files differ only by rigid-body refinement.
- map_scale = None Scale to apply to map before use. Must be input in same order as map_files. Not compatible with normalize_maps.
- model_file = None Model placed in first map to be used to align maps. Each chain in the model will be used to mark an independent region in each map. Optionally one model can be supplied to match each map. Normally it is assumed that all the map files are approximately superimposed so that a single model_file will approximately match all map files. If the map files do not superimpose, supply one model file per map file, where the model files differ only by rigid-body refinement. NOTE: only the parts of maps marked by a chain in the corresponding model file will be included in averaging.
- half_map_1_file = None Half-map 1 list matching map_file list. If you supply a set of half_map_1 files and half_map_2 files, then whatever is done to the files in the map_file list will be applied to both the half_map_1 files and half_map_2 files.
- half_map_2_file = None Half-map 2 list matching map_file list. If you supply a set of half_map_1 files and half_map_2 files, then whatever is done to the files in the map_file list will be applied to both the half_map_1 files and half_map_2 files.
- region_selection = None Model selection string defining one region to be considered. This is an alternative approach to defining regions (default is to use each chain as a region). Any number of regions can be supplied.
output_files
- composite_map_file = composite_map.ccp4 Output map file with composite map
- output_scale_factor = None Scale factor to be applied to output map just before writing. Normally the output map will have a mean of zero and SD of 1. This may lead to the maximum in the map being much greater than 1. You can adjust the output SD with this scale factor.
- refined_model_file = composite_map.pdb Output model file after rigid-body refinement
- composite_half_map_1_file = composite_half_map_1.ccp4 Output map file with composite half-map 1
- composite_half_map_2_file = composite_half_map_2.ccp4 Output map file with composite half-map 2
crystal_info
- resolution = None High-resolution limit for main search. This can be lower resolution than the data. The search is quicker at lower resolution.
- scattering_table = n_gaussian wk1995 it1992 *electron neutron Choice of scattering table for structure factor calculations. Standard for X-ray is n_gaussian, for cryoEM is electron.
superpose
- use_model_symmetry = False If use_model_symmetry = True then find all the chains in the target model (the first model supplied) that have the same sequence (or use symmetry_chains to ID these chains). Any chains in other models that match one of these chains will be matched to all of them. Using this keyword a single focused map can be applied to all corresponding chains in the target map/model. Note: cannot be combined with region_selection.
- symmetry_chains = None You can specify the chains that are to be considered equivalent. See use_model_symmetry for description. Note: cannot be combined with region_selection.
- rigid_body_refinement = True Run rigid-body refinement on input model vs each map
- rigid_body_refinement_single_unit = False Run rigid-body refinement with just one unit (do not break up into chains)
- remove_water = True Remove waters from input files
- local_weighting = False Use local correlation to weight individual regions of maps. Size of local region determined by local_residues. Alternative is one weight for each chain of each map.
- local_residues = 10 Number of residues to use in local correlations
- delta_cc_norm = 0.05 When scaling, weight a map or part of a map that has map cc of delta_cc less than another map by exp(-delta_cc/delta_cc_norm) less than the other map.
- mask_atoms_atom_radius = None Radius for masking atoms when transforming map. Default is max(3, resolution).
- mask_smoothing_radius = None Radius for smoothing mask. Default is 3 times the resolution
- normalize_maps = True Normalize maps (mean zero, sd 1) before use. Not compatible with map_scale option.
- get_contribution_maps = True Create map-like files showing contribution of each map
- get_cc_map = True Create map-like file showing map-model correlation at end
control
- nproc = 1 Number of processors to use
- random_seed = 171731 Random seed
- verbose = False Verbose output
gui GUI-specific parameter required for output directory
- output_dir = None
- restraint_files = None