Density modification of cryo-EM maps with resolve_cryo_em
Author(s)
- resolve_cryo_em: Tom Terwilliger
Purpose
The routine resolve_cryo_em is a tool for carrying out density modification
of cryo-EM maps.
Usage
Density modification with resolve_cryo_em is carried out using two
half-maps, along with the FSC-based resolution and a sequence file specifying
the contents of the map.
Density modification can also be carried out by using the initial
density-modified map or a single supplied model
as a basis for generating multiple models, then using model-based density
as part of the density modification target to yield a model-based
density-modified map.
How resolve_cryo_em works:
Density modification with resolve_cryo_em is based on two ideas.
One is that the errors in Fourier coefficients representing a cryo-EM map are
(to some extent) uncorrelated. This means that one Fourier coefficient does
not know about the errors in another one. (Note that this is not including
errors that are correlated simply because the molecule is small and is
placed in a large map. Correlated errors in this context are those where one
Fourier coefficient has been adjusted to compensate for errors in another one.)
The other is that some features in a map are known in advance. These could
include features such as the flatness of the solvent region, distributions
of map values in the solvent and macromolecule regions, and similarities of
symmetry-related regions.
Then the way density modification works is that Fourier coefficients for
the map are adjusted to agree both with the original map and with
the expected features. This improves the Fourier coefficients,
and the key result is that the map improves everywhere, not just where
the information about expected features was available.
Unique features of density modification for cryo-EM are that two half-maps with
independent errors are available in cryo-EM (allowing estimation of
errors), and that the errors in Fourier coefficients are (more or less)
distributed as two-dimensional Gaussians (i.e., both phase and amplitude
errors). This leads to many differences in implementation from density
modification in crystallography, though the core elements are identical.
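The role of the two half-maps in error estimation can be illustrated with a minimal sketch. It is not the resolve_cryo_em implementation; the function names shell_fsc and fsc_ref and the toy Fourier coefficients are assumptions made for illustration. The sketch computes the Fourier shell correlation (FSC) between two sets of half-map coefficients in one resolution shell and converts it to the expected correlation of the averaged map with the true map, using the standard relation sqrt(2*FSC/(1+FSC)).

```python
import math


def shell_fsc(f1, f2):
    """Fourier shell correlation between two lists of complex Fourier
    coefficients that belong to the same resolution shell."""
    numerator = sum((a * b.conjugate()).real for a, b in zip(f1, f2))
    norm = math.sqrt(sum(abs(a) ** 2 for a in f1) * sum(abs(b) ** 2 for b in f2))
    return numerator / norm if norm > 0 else 0.0


def fsc_ref(half_map_fsc):
    """Expected correlation of the averaged (full) map with the true map,
    given the half-map FSC. Returns 0 when the half-map FSC is not positive."""
    if half_map_fsc <= 0:
        return 0.0
    return math.sqrt(2.0 * half_map_fsc / (1.0 + half_map_fsc))


# Toy shell: the same "true" coefficients plus different (independent) errors
true_coeffs = [1 + 2j, -3 + 0.5j, 2 - 1j, 0.5 + 1.5j]
errors_1 = [0.3 - 0.2j, -0.1 + 0.4j, 0.2 + 0.1j, -0.3 - 0.3j]
errors_2 = [-0.2 + 0.1j, 0.3 - 0.1j, -0.4 + 0.2j, 0.1 + 0.2j]
half_map_1 = [t + e for t, e in zip(true_coeffs, errors_1)]
half_map_2 = [t + e for t, e in zip(true_coeffs, errors_2)]

fsc = shell_fsc(half_map_1, half_map_2)
print("half-map FSC: %.3f  estimated FSCref: %.3f" % (fsc, fsc_ref(fsc)))
```

With this relation a half-map FSC of 0.143 corresponds to a correlation of about 0.5 between the full map and the true map, which is why FSC=0.143 is used as the resolution criterion.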
Using resolve_cryo_em:
Normally you will access the functionality of resolve_cryo_em by running
the ResolveCryoEM tool in the Phenix GUI. You can also run it from the
command line. You may wish to run it from the command line in the background
with multiprocessing if you are running with density_modify_with_model=True,
as this can take a long time (perhaps 1 day on 16 processors if you have 250
residues in the unique part of your model).
Half-maps: Supply two unmasked half-maps. They can be sharpened but it does
not make much of a difference. They must be unmasked and they must still have
noise in the solvent region. They can be boxed with a soft boundary (if they
are boxed, specify box_before_analysis=False so that boxing isn't done again).
Sequence file: Supply a sequence file with the sequence of the molecule. Be
sure to put in all copies of the molecule (i.e., a 24-mer needs 24 chains).
(You can also supply one copy of the molecule and specify ncs_copies=24 if
you want.)
Model file: You can supply a model; if you do then the model will be randomized
and refined against the half-maps to yield 8 models for each half map. These
will be used in model-based density modification.
Automatically-generated models: You can specify density_modify_with_model=True
and not supply any model. In this case 8 models will be automatically
generated for each half-map and used in model-based density modification.
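For scripted or background runs, the command line can be assembled programmatically. The helper below is a hypothetical convenience function, not part of Phenix. It only builds the argument list from the half-maps, the sequence file and keyword=value pairs named in this document (for example density_modify_with_model and nproc); it does not run the program.

```python
def build_resolve_cryo_em_command(half_map_1, half_map_2, seq_file, **keywords):
    """Return the argument list for a phenix.resolve_cryo_em run.

    Hypothetical helper: keywords are passed through unchanged as
    keyword=value strings, so they must be valid resolve_cryo_em keywords.
    """
    command = [
        "phenix.resolve_cryo_em",
        half_map_1,
        half_map_2,
        "seq_file=%s" % seq_file,
    ]
    for name in sorted(keywords):
        command.append("%s=%s" % (name, keywords[name]))
    return command


command = build_resolve_cryo_em_command(
    "half_map_A.mrc",
    "half_map_B.mrc",
    "seq.dat",
    density_modify_with_model=True,
    nproc=16,
)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run on a machine where Phenix is installed, or joined into a string for a batch script.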
Procedure used by resolve_cryo_em
The basic inputs to resolve_cryo_em are:
Two unmasked half-maps
sequence file or molecular mass or solvent fraction
Optional inputs are:
Full map (Recombined with density-modified half maps)
Model (Used to generate multi-model representation for density modification)
Symmetry file (Supply with model or if automatic model generation is
used to limit the modeling to the unique part of the map. Alternatively,
used in density modification in cases where the symmetry is not part
of the reconstruction process).
Mask file (Used to identify the region where the macromolecule is present)
The procedure used by resolve_cryo_em has several steps:
Boxing of maps: If the supplied maps are much larger than the molecule,
the maps are trimmed down to about 5 A bigger than the largest dimension
of the molecule (estimated from a low-res mask and the molecular
volume based on sequence or as specified) in each direction, using
a soft mask at the edges of the box.
Resolution estimate and half-map sharpening of maps: The half-maps are
compared as a function of resolution and the resolution (FSC=0.143)
is estimated and the maps are sharpened based on the estimated map quality
of the full (averaged) map. A full map is calculated.
Generation of map-value (density) histograms: The full map is analyzed
to identify the distribution of map values in the solvent and
macromolecule region. These histograms are optionally used in density
modification.
Map-phasing of half-maps: Each half map is used in a process of
map-based estimation of new Fourier coefficients using a
maximum-likelihood procedure. In essence, one Fourier coefficient is
removed from the map at a time. Then a new value of that Fourier coefficient
is found that maximizes the likelihood of the map given all the other
Fourier coefficients. The likelihood is calculated from the agreement of
the histograms of the (new) map with expected histograms. For example,
if the solvent region is expected to be flat, then the new map has a
high likelihood if it is flat in the solvent region. By varying the one
Fourier coefficient the flatness of the solvent region will change. Similarly
the histograms of density in the region of the macromolecule will change
depending on the value of the one Fourier coefficient. The best value for
each Fourier coefficient is then used to calculate a map-phasing map.
Estimation of errors: Fourier coefficients for the two starting
half-maps and the two density-modified maps are compared to give FSC
values as a function of resolution. These FSC values are used to estimate
correlated and uncorrelated errors in the four maps and to identify
optimal weighting between original and density-modified maps.
Recombination of original and map-phasing half-maps: Based on the
estimated errors in original and map-phasing half-maps, all four maps
are recombined to create a new density-modified map. Additionally,
each half-map and associated map-phasing half map are recombined to
create two new density-modified half-maps.
Optional real-space and sigma weighting: The smoothed local rms differences
between original half maps and between density-modified half maps are
used (optionally) to identify location-specific weighting for the
original and density-modified maps. The variance of Fourier coefficients
among the four maps is used (optionally) to weight individual final
Fourier coefficients.
Optional starting with unsharpened maps. The input half maps are used in the
density modification step without adjusting the resolution dependence
of the maps with half-map sharpening. This can be useful
for high-resolution maps. Note that whether or not this option is
selected, normally an overall B-factor is still applied to the input maps
based on the nominal resolution. The difference therefore is in the
details of the resolution dependence, which is adjusted in the default
case based on a comparison of half-maps, but not with
density_modify_unsharpened_maps=True. The overall B-factor
applied before density modification is controlled by the "remove_aniso"
and "b_iso" keywords. By default remove_aniso=True and b_iso is
target_b_ratio*resolution (target_b_ratio=10), so before density
modification whatever resolution dependence and anisotropy is present in
each map is adjusted to yield an overall B of b_iso and no anisotropy.
Optional alternative final sharpening of maps. The final sharpening normally
consists of two parts. One is scaling the map in shells of resolution
based on the estimated correlation of the map coefficients at each
resolution with the true ones. This is controlled by the
keyword final_scale_with_fsc_ref. The second part of final scaling is
scaling Fourier coefficients in each shell of resolution either to match
the low-resolution shell or to match the scaled half-maps, or to
be part-way between these (controlled by the keyword geometric_scale).
Optional spectral scaling and local sharpening. The final
map is optionally scaled with a resolution-dependent scale factor
representing the radial part of a typical Fourier transform of a
macromolecule. The final map is optionally locally resolution-filtered
(local sharpening). The final map is also optionally blurred slightly
with a blurring dependent on the overall resolution of the map.
Optional use of an input full map. If you supply a full map in addition
to the half-maps then the full map will be recombined with the two
map-phasing half-maps instead of the average of the two original
half-maps being recombined. This could improve the final map if your
full map has been filtered in some special way.
Optional density modification with multiple models. You can generate
multiple-model representations of each of your two half maps either
by supplying a single model (it will be randomized and refined against
each half-map) or by specifying density_modify_with_model=True (models
will be generated automatically). The multiple models will be used to
estimate the density at each point near the models along with a guess of
the uncertainty of that density. These are used to create a target map
for density modification that is based on model density and the
map from standard density modification, weighted according to their
uncertainty estimates. This target map is then used in density modification
(as a target for the region containing the model), where Fourier coefficients
that lead to a map that is more similar to the target density are considered
more likely than those that do not. This procedure necessarily can
introduce model bias, but this is greatly reduced by the use of multiple
models, uncertainty estimates, and the likelihood-based density modification.
You may wish to use a finer resolution for density modification
(dm_resolution) if you include model information because the model can
provide higher-resolution information than simple density modification.
Density modification including symmetry that is not part of reconstruction
symmetry. If your molecule has symmetry that was not used in the
reconstruction process, it can be used in density modification. Supply a
symmetry_file (.ncs_spec file) that you can obtain with the find_ncs tool
in phenix (uses a model or density) and it will be included.
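The default overall B-factor described in the step on unsharpened maps above (b_iso = target_b_ratio*resolution) can be worked through numerically. This is a minimal sketch, not code from resolve_cryo_em; the function names are assumptions. The amplitude scale uses the standard isotropic B-factor expression exp(-B/(4 d^2)), where d is the resolution of a Fourier coefficient in Angstroms.

```python
import math


def default_b_iso(resolution, target_b_ratio=10.0):
    """Default overall B (A^2) targeted before density modification:
    b_iso = target_b_ratio * resolution."""
    return target_b_ratio * resolution


def b_factor_scale(d_spacing, b_iso):
    """Amplitude scale factor exp(-B / (4 d^2)) that an overall isotropic
    B-factor implies for a Fourier coefficient at resolution d_spacing (A)."""
    return math.exp(-b_iso / (4.0 * d_spacing ** 2))


resolution = 3.0
b_iso = default_b_iso(resolution)
for d in (12.0, 6.0, 4.0, 3.0):
    print("d = %5.1f A   scale = %.3f" % (d, b_factor_scale(d, b_iso)))
```

For a 3 A map the default b_iso is 30 A^2, so coefficients near the resolution limit are down-weighted much more strongly than low-resolution ones.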
Procedure used by resolve_cryo_em for density modification with model-building
Density modification with model-building adds additional cycles to the
density modification procedure in which multiple models are built using
map_to_model and the averaged density and uncertainty in the average density
is used to combine the model density with the initial density-modified map.
The procedure includes:
Create initial density-modified half-maps and full map
Create N (typically 16) variants of the full map by changing the resolution
cutoff, spectral_scaling, and blurring of the map.
Build a model into each modified full map. If symmetry file supplied,
build only the unique part of the model.
Refine some of the models against half-map 1 and some against half-map 2
Create one composite model based on all models
Create model density for each half-map based on the models refined against
that map. This model density will have a mean value and variance for each
point in the map near to at least 3 models.
Create composite density for each half-map by combining the model density
with the density-modified half-map, weighting the model density according to
its consistency among models.
Density-modify each composite half-map, and create a new set of density
modified half-maps and full map, as in the procedure for standard
density modification.
Sharpen the resulting maps using model-based sharpening with the composite
model.
Note that this procedure takes a lot of computation.
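The combination of model density with a density-modified half-map can be sketched for a single grid point. The exact weighting used by resolve_cryo_em is not given here, so inverse-variance weighting is assumed purely for illustration, and the function name combine_point and its arguments are hypothetical. Model density is used only where at least 3 models contribute, as in the procedure above.

```python
from statistics import mean, variance


def combine_point(model_values, denmod_value, denmod_variance, min_models=3):
    """Combine model density and density-modified map density at one point.

    model_values: density from each model that is near this point.
    denmod_value, denmod_variance: value and estimated variance of the
    density-modified half-map at this point.
    Assumption: inverse-variance weighting of the two sources.
    """
    if len(model_values) < min_models:
        # Too few models nearby: keep the density-modified value unchanged
        return denmod_value
    model_mean = mean(model_values)
    # Guard against zero variance when all models agree exactly
    model_variance = max(variance(model_values), 1e-12)
    w_model = 1.0 / model_variance
    w_map = 1.0 / denmod_variance
    return (w_model * model_mean + w_map * denmod_value) / (w_model + w_map)


# Models agree closely: the composite moves toward the model density
print(combine_point([0.98, 1.00, 1.02], denmod_value=0.5, denmod_variance=0.01))
# Models disagree: the composite stays close to the density-modified map
print(combine_point([0.2, 1.0, 1.8], denmod_value=0.5, denmod_variance=0.01))
```

In this sketch, points where the models are consistent with each other are pulled toward the model density, while points where the models disagree are left close to the density-modified map.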
Note on R-values in density modification
The R-value for density modification is quite different from a refinement R.
Basically the statistical (maximum-likelihood) density modification
procedure allows calculation of each structure factor based on the values of
all the other structure factors and the expectations about the map
(i.e., flat solvent, distribution of density values in protein, NCS, etc.).
These "map-phasing" structure factors are (at least in the first cycle)
independent of the starting structure factors and can be compared with the
measured data and an R-value obtained. The values of these R-values range
from about 0.25 to 0.5 in most cases, and when the map is very good usually
the R values are smaller. These R values do not involve any atomic model
so they are quite unlike a refinement R.
For discussion of how map-likelihood structure factors are calculated in
Resolve, see the Map-likelihood phasing reference below.
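As an illustration of the comparison described above, the following sketch computes a conventional R-value between two sets of amplitudes. It assumes the usual definition R = sum| |F1| - k|F2| | / sum|F1| with a single overall scale factor k; the exact formula and scaling used by Resolve are not stated here, and the function name and sample amplitudes are hypothetical.

```python
def r_value(original_amplitudes, map_phasing_amplitudes):
    """Conventional R-value between original and map-phasing amplitudes.

    Assumption: one overall scale factor k matches the sums of the two
    amplitude lists before the differences are taken.
    """
    sum_original = sum(original_amplitudes)
    sum_map_phasing = sum(map_phasing_amplitudes)
    k = sum_original / sum_map_phasing
    difference = sum(
        abs(f_o - k * f_m)
        for f_o, f_m in zip(original_amplitudes, map_phasing_amplitudes)
    )
    return difference / sum_original


original = [10.0, 20.0, 30.0, 15.0]
map_phasing = [13.0, 17.0, 26.0, 19.0]
print("R = %.3f" % r_value(original, map_phasing))
```

No atomic model enters this calculation: both amplitude sets come from maps, which is why such a value cannot be compared with a refinement R.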
Examples
Standard run of resolve_cryo_em:
You can use resolve_cryo_em to density-modify a cryo-EM map:
phenix.resolve_cryo_em half_map_A.mrc half_map_B.mrc seq_file=seq.dat
What to expect with density modification
If density modification is working well, you should see:
An improvement in reported resolution. Typically this is small, from 0.05 A
to 0.5 A. If it is zero, density modification probably did not work.
More fine detail. Density modification mostly changes
the high-resolution part of your map. (The low-resolution part of your map
is generally already very good, and usually if it is not then density
modification can't fix it. Occasionally low and high-resolution aspects
of a map can improve.) Compare your original (auto-sharpened) map with
your density-modified map and see if side chains and carbonyl oxygens are more
clear and if the connectivity of the chain improves.
If you supply or generate models, you should see improved high-resolution
detail in the region of the model. You are not likely to see improvement
away from the model as you would in crystallography. The multiple models
generated are basically identifying what locations in your map are
compatible with being the location of an atom, based on the density already
there and the geometrical restrictions on a model. That information then
is included in density modification and improves density in the region of
the model but is not strong enough to improve density elsewhere.
What to try to improve your map
Usually you should first just run with all defaults, supplying two half-maps
and a sequence file and nominal resolution. See how the map looks and use it
as a baseline. Then some things to try to improve the map are:
Add a full map file to recombine it with density-modified half-maps. This
can be helpful if your full map has been processed in a way that makes it
much better than an average of your half-maps.
Add a mask file to exclude part of the map. If your map has some huge noise
peaks or just density that is not part of your molecule, you can try to
exclude it by supplying a mask file (a map file with 1 where your molecule is
located and 0 elsewhere).
If you have a model, you can supply the model and it will be used to generate
a multi-model representation of your molecule. This multi-model representation
will be used in density modification. Note that you should supply a
symmetry file if there is symmetry in your reconstruction so that only
the unique part needs to be analyzed.
If you don't have a model but your resolution is high (about 3.5 A or better),
you can improve your map by automatically generating a multi-model
representation of your structure. This is used in the same way as the
multi-model representation starting with a supplied model, and a symmetry
file is helpful to identify just the unique part of the molecule.
You can try changing
the resolution (overall) and the density-modification resolution. Normally
the resolution is the gold-standard resolution from comparing your half-maps.
However varying this may help. The density-modification resolution is
normally set automatically to be somewhat finer than the overall resolution,
but its exact value does have an effect so you could try different values.
You can try changing some of the defaults that occasionally help. One is
setting density_modify_unsharpened_maps=True (density modifies the original
half-maps without auto-sharpening them). Another is setting
final_scale_with_fsc_ref=False (applies final resolution-dependent scaling
that is part-way between standard half-map FSC-based sharpening and
constant amplitudes over all resolutions).
These usually are best for high-resolution maps (better than 3 A).
Real_space_weighting, sigma_weighting, and spectral_scaling may improve the
final map. Real-space weighting weights original and density-modified maps
based on their local uncertainties. Sigma-weighting weights them by
uncertainties in individual Fourier coefficients. Spectral scaling scales the
final map by resolution to match the resolution-dependence of a typical
protein.
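When trying different density-modification resolutions, it helps to know the value that would be chosen automatically. The keyword list gives the formula res_dm = a + b*(res-res_c) + d*(res-res_c)**2 with defaults dm_res_a=2.4, dm_res_b=0.99, dm_res_c=3.0 and dm_res_d=-0.2. The sketch below simply evaluates that formula; the function name is an assumption, and any additional limits the program may apply are not included.

```python
def guess_dm_resolution(resolution, a=2.4, b=0.99, c=3.0, d=-0.2):
    """Evaluate res_dm = a + b*(res - c) + d*(res - c)**2 using the
    default dm_res_a, dm_res_b, dm_res_c and dm_res_d values."""
    delta = resolution - c
    return a + b * delta + d * delta ** 2


for resolution in (2.5, 3.0, 3.5, 4.0, 4.5):
    print("resolution = %.1f A   dm_resolution = %.2f A"
          % (resolution, guess_dm_resolution(resolution)))
```

For a 3.0 A map the formula gives a density-modification resolution of 2.4 A, and for a 4.0 A map about 3.19 A. Values a few tenths of an Angstrom on either side of the automatic one are reasonable starting points to test with dm_resolution.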
How to tell how well density modification is working
Density modification changes the sharpening of your map, so just because the
map looks better or worse doesn't necessarily mean that anything important
has happened.
If you have two maps and you want to know which is the better one, here
are some things you can do:
1. Create a matched version of your maps that have the same
resolution-dependence using phenix.auto_sharpen. Run it like this:
phenix.auto_sharpen n_bins=100 auto_sharpen_methods=external_map_sharpening \
external_map_file=target_map.map map_to_change.map resolution=4.1 \
sharpened_map_file=map_to_change_matched.map
Now you can look at target_map.map and map_to_change_matched.map and
differences you see are due to intrinsic properties of the maps, not just
sharpening.
2. If you have a good model, refine the model against each map. Then use
phenix.mtriage to estimate the resolution where the FSC(map,model)==0.5
for each map. The map with a lower value of this resolution may be better.
3. As in #2, once you have a pair of maps and a model refined against
each map, you can run phenix.auto_sharpen with each map and model and
it will print out the average FSC for each. The one with the higher
overall FSC may be better.
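The comparison in item 2 can be sketched as follows, assuming you have already extracted an FSC(map,model) curve for each map as a list of resolutions and FSC values (for example from phenix.mtriage output). The function name, the linear interpolation in 1/d and the sample curves are assumptions for illustration, not the method used by phenix.mtriage.

```python
def resolution_at_fsc(d_spacings, fsc_values, threshold=0.5):
    """Resolution (A) at which an FSC curve first drops below threshold.

    d_spacings must run from low resolution to high resolution
    (decreasing values in A). Interpolates linearly in 1/d between the
    two bins that bracket the threshold.
    """
    if fsc_values[0] < threshold:
        return d_spacings[0]
    for i in range(1, len(fsc_values)):
        if fsc_values[i] < threshold:
            s_low = 1.0 / d_spacings[i - 1]
            s_high = 1.0 / d_spacings[i]
            fraction = (fsc_values[i - 1] - threshold) / (
                fsc_values[i - 1] - fsc_values[i])
            return 1.0 / (s_low + fraction * (s_high - s_low))
    # The curve never drops below the threshold
    return d_spacings[-1]


d_spacings = [10.0, 5.0, 4.0, 3.0]
fsc_original = [0.99, 0.85, 0.45, 0.10]   # hypothetical original map
fsc_denmod = [0.99, 0.90, 0.60, 0.20]     # hypothetical density-modified map

res_original = resolution_at_fsc(d_spacings, fsc_original)
res_denmod = resolution_at_fsc(d_spacings, fsc_denmod)
print("original: %.2f A   density-modified: %.2f A" % (res_original, res_denmod))
```

A smaller value for the density-modified map suggests that it agrees with the model to higher resolution, subject to the caveat that each model must be refined against its own map.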
Possible Problems
If the half-maps have been masked the procedure may not work well.
If the solvent noise is very non-uniform the procedure may work poorly.
By default a rectangular solid region enclosing the molecule is cut out
and used in density modification. You can supply a boxed map and
set the keyword box_before_analysis=False to avoid this.
If the maps have very prominent density away from the macromolecule this
may interfere with density modification. You can get around this by supplying
a mask (as a map, 1=inside the molecule).
If there is non-macromolecule but real density in the maps this
may interfere with density modification (for example, lipid density).
Specific limitations and problems:
If your computer does not have a lot of memory and you use multiple
processors (nproc bigger than 1) then resolve_cryo_em can crash. If this
happens try nproc=1.
Density modification introduces some correlations between half-maps
due to solvent flattening. This can have a small effect on the resolution
estimates obtained with half-map FSC. The resolution estimates provided
by the program are corrected for this effect.
If you use the real_space_weighting or sigma_weighting or
sharpening_type=local_final_half_map options there may be some
extra correlations between half-maps introduced. Calculating resolution
using FSC between these density-modified maps can lead to overstating the
resolution. The resolution estimates provided by the program are before
applying these weighting schemes (unless you specify local_methods_final_cycle = False and run multiple cycles) so they are not normally affected by this.
The density modification procedure works best in the resolution range of about 4.5 A or better, but occasionally works very well at lower resolutions (not
tested beyond 9 A). The procedure does not work in every case (we do not
have a great metric for success but about half of cases appear to be
improved according to the metrics described above).
Varying the parameters can have a substantial effect on the outcome.
Model-based density modification necessarily biases the map towards the models
that are built. By building multiple models, the effect of this bias is
reduced but not eliminated. For example if the starting map has an error that
causes models to be built with a side chain in the wrong place, the new
model-based density will show even more density in that location. That is,
this process can accentuate errors in the original maps if they match
plausible locations of atoms in a model. Balancing this, the procedure
generally does not appear to yield density at positions where there is no
density in the original map even if atoms are located there in a model.
It is important that the original or non-model-based maps be consulted to
evaluate any specific density in the map.
Additional information
List of all available keywords
- job_title = None Job title in PHENIX GUI, not used on command line
- input_files
- half_map_file_name_1 = None Half map file name
- half_map_file_name_2 = None Half map file name
- map_file_name = None Optional map file name. If you supply a full map it will be used in recombination with the density modified half maps. If you have a good full map this may be useful.
- target_map_file_name = None Input target map file name
- mask_file_name = None Input mask file name (as map file with 1 for macromolecule, 0 solvent, smoothly varying between 1 and 0). Used instead of automatic mask to define region where map correlations are calculated.
- model_1 = None Model to be randomized to generate multiple-model representation of structure in density modification. Normally eight models will be generated for each half-map. If multiple model_1 and model_2 files are supplied then they will be used directly and should be refined against half_map_1 or denmod_half_map_1 before supplying them. Normally supply at least minimum_model_files corresponding to half-map 1 and the same for half-map 2. Generated automatically if density_modify_with_model is set and no model is supplied. If supplied, at least minimum_model_files must be supplied for each half-map.
- model_2 = None Model corresponding to half-map 2 (unique part of model) to use in density modification. All model_2 models should be refined against half_map_2 or denmod_half_map_2 before supplying them. Normally supply at least minimum_model_files corresponding to half-map 2 and the same for half-map 1. Generated automatically if density_modify_with_model is set.
- minimum_model_files = 3 Minimum number of models representing half-maps 1 and 2. NOTE: must be at least 2.
- denmod_map_file_name = None Optional input density-modified full map. Normally generated automatically
- denmod_half_map_file_name_1 = None Optional input density-modified half map file 1. Normally generated automatically
- denmod_half_map_file_name_2 = None Optional input density-modified half map file 2. Normally generated automatically
- truncate_density_file_name = None Input model file containing atomic coordinates around which density is to be truncated at 2 sigma in density modification. This can be used to truncate high density for heavy atoms or high density that is away from the macromolecule so that they do not interfere with density modification. Note that this means that the density in these regions will typically be much lower after density modification. See also rad_mask, the radius around the coordinates to mask.
- rad_mask = 5 Radius around truncate_density atom positions to truncate density in density modification
- histograms_file_name = None Input histograms file name. Normally this file is created automatically
- guess_histograms_path = True If histograms path is too long, try to guess the path based on the resolve default path and the working directory
- symmetry_file = None Symmetry file used to apply symmetry during density modification and model-building. Required if density_modify_with_model is set and ncs_copies is greater than 1.
- seq_file = None Sequence file used to estimate volume occupied by the molecule and to generate models (if density_modify_with_model is set). Either supply all copies of each chain or supply the unique part of the sequence file and use ncs_copies to specify how many copies of this are present. If ncs_copies = 1 or None, then the seq_file must include all copies (i.e., if a dimer, must have two copies of the chain) and no NCS will be applied.
- output_files
- pdb_out = merged_model.pdb Merged model (composite model obtained from model-building steps)
- initial_map_file_name = initial_map.ccp4 Output scaled original map file name. Only written out if apply_cc_star_to_initial_map is True. (Otherwise the initial map is highly sharpened and not suitable for viewing).
- model_density_map_file_name = model_density_12.ccp4 Output map containing density used as model-based target in density modification.
- denmod_map_file_name = denmod_map.ccp4 Denmod map file name
- denmod_blur_50_map_file_name = None Output scaled denmod map file name, blurred with B of 50
- denmod_half_map_1_file_name = denmod_half_map_1.ccp4 Output scaled denmod half map 1 file name
- denmod_half_map_2_file_name = denmod_half_map_2.ccp4 Output scaled denmod half map 2 file name
- map_phasing_half_map_1_file_name = None Map phasing half map 1 file name
- map_phasing_half_map_2_file_name = None Map phasing half map 2 file name
- output_mask_file_name = None Output mask file
- temp_dir = None Temporary directory. Default is resolve_cryo_em_xx where xx creates new directory. If specified, temp_dir will be used or created and used, and the working directory will be resolve_cryo_em_xx inside the specified temp_dir. Does not apply if running from GUI.
- restore_full_size = False Restore full size of maps on output by padding with zeroes.
- resample_on_fine_grid = False Resample output maps on fine grid with scale of resampling_ratio to starting resolution of map. Requires input maps have origin at (0, 0, 0).
- resampling_ratio_to_resolution = 0.2 Ratio of gridding for output map to starting resolution of map
- output_pixel_size = None You can specify the output pixel size. This will set resample_on_fine_grid to True and adjust the gridding to result in the specified pixel size.
- use_cubic_boxing = False When boxing map, make a cubic box
- soft_mask_at_end = False When boxing map, create a soft mask around the edges at the end. This is in addition to the soft_mask that is used in the analysis step. Does not apply if restore_full_size is used.
- adjust_local_resolution_coloring = False Adjust local resolution coloring based on limits of resolution. Default is to use a standard coloring scheme.
- output_directory = None Location for output files
- crystal_info
- resolution = None Estimated resolution of full map data. Used to set dm_resolution
- original_resolution = None Original resolution (set automatically).
- minimum_resolution = None Minimum resolution (set automatically).
- n_xyz = None Gridding for resolve density modification calculation. Normally set automatically as same as gridding of input map after boxing.
- auto_gridding = None Automatically set gridding in resolve density modification based on the resolution (not same as input map)
- use_model_mask = False Use model mask if a model is supplied. Only applies if use_model_as_target is False
- soft_mask = True Use soft mask when boxing map
- soft_mask_radius = None Radius for soft mask when boxing map. Default is twice the resolution of the map
- ncs_copies = None Number of copies in the entire structure of whatever is in seq_file or whatever is specified as the molecular_mass. You can leave ncs_copies = None or ncs_copies = 1 and specify entire molecule in seq_file or molecular mass. Or you can set ncs_copies = xx and specify just the unique part of the sequence or molecular_mass.
- box_before_analysis = True Cut out and soft-mask a box of density around the molecule before density modification. If you have already boxed your map specify False, otherwise normally specify True.
- box_cushion = 5 Buffer around box of density around molecule to be cut out for density modification
- density_select = False Use density_select option to cut out molecule in box_before_analysis
- box_with_model_if_present = True Use model to box density before analysis (if present)
- close_index = None Index within this value of edge is considered close. If edge of proposed box is within close_index of end, take the whole box.
- min_close_index = 10 Smallest value for estimate of close_index
- min_close_index_ratio = 10 Close index will be larger of min_close_index and cell grid units divided by min_close_index_ratio.
- solvent_content = None Solvent fraction (content) of the cell. You can specify the fraction of the volume of this cell that is taken up by the macromolecule. Normally set automatically. Values go from 0 to 1. This number applies to the boxed map if boxing is done here.
- solvent_content_iterations = 3 Iterations of solvent fraction estimation
- molecular_mass = None Molecular mass of molecule in Da. Used as alternative method of specifying solvent content. If ncs_copies is specified, total molecular mass = molecular_mass times ncs_copies.
- fraction_of_max_mask_threshold = None threshold of standard deviation map in low resolution mask identification of solvent content. Used if no solvent content and no molecular mass and no sequence file are specified. A good value is 0.05.
- wang_radius = None Local averaging radius for solvent identification in masking. Default is 1.5* resolution
- denmod_wang_radius = None Local averaging radius for solvent identification in density modification Default is 1.5* resolution
- smoothing_radius = None Radius for mask smoothing. Default is twice resolution.
- buffer_radius = None Radius for mask buffer in map_box. Default is smoothing radius value.
- minimum_smoothing_radius = 5 Minimum radius for mask smoothing.
- minimum_solvent_content = 0.5 Stop and ask for a bigger box if solvent content appears to be less than minimum_solvent_content
- maximum_solvent_content = 0.9999 Stop and ask for a smaller box if solvent content appears to be more than maximum_solvent_content
- overall_delta_b = -30 Overall delta B to apply to model map coefficients, relative to overall B of the map. (-30 means get the overall B of the map, subtract 30, set B of atoms in model to this value.)
- n_bins = 100 Number of resolution bins for sharpening. Default is 100. More bins may help slightly but will slow down density modification.
- set_resolution_to_dm_resolution = True Set resolution to dm_resolution once it is determined
- dm_resolution_offset_list = None List of resolution offsets to try for density modification. Applied if quick = False
- dm_resolution_list = None List of resolutions to try for density modification. Normally set automatically
- dm_resolution = None High-resolution limit for density modification. Normally set automatically from input resolution.
- dm_res_a = 2.4 dm_resolution will be guessed from resolution with the formula: res_dm = a + b*(res-res_c) + d * (res-res_c)**2
- dm_res_b = 0.99 dm_resolution will be guessed from resolution with the formula: res_dm = a + b*(res-res_c) + d * (res-res_c)**2
- dm_res_c = 3.0 dm_resolution will be guessed from resolution with the formula: res_dm = a + b*(res-res_c) + d * (res-res_c)**2
- dm_res_d = -0.2 dm_resolution will be guessed from resolution with the formula: res_dm = a + b*(res-res_c) + d * (res-res_c)**2
- chain_type = *None PROTEIN RNA DNA Chain type. Determined automatically from sequence file if not given. Mixed chain types are fine (leave blank if so).
- sequence = None Optional sequence. Can be used instead of a seq_file
- scattering_table = n_gaussian wk1995 it1992 *electron neutron Choice of scattering table for structure factor calculations. Standard for X-ray is n_gaussian, for cryoEM is electron.
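As an illustration of the dm_res_a, dm_res_b, dm_res_c and dm_res_d entries above, the following minimal Python sketch evaluates the stated formula res_dm = a + b*(res-res_c) + d*(res-res_c)**2 with the listed default values. The function name guess_dm_resolution is hypothetical and is not part of resolve_cryo_em; any limits or further adjustments the program may apply to the guessed value are not described here and are not modelled.

```python
def guess_dm_resolution(resolution, a=2.4, b=0.99, c=3.0, d=-0.2):
    """Evaluate res_dm = a + b*(res - c) + d*(res - c)**2.

    Defaults mirror dm_res_a, dm_res_b, dm_res_c and dm_res_d as listed above.
    Hypothetical helper for illustration only.
    """
    delta = resolution - c
    return a + b * delta + d * delta ** 2


if __name__ == "__main__":
    # Print the guessed dm_resolution for a few input resolutions (in A)
    for res in (2.5, 3.0, 4.0):
        print(res, round(guess_dm_resolution(res), 3))
```

With these defaults an input resolution equal to dm_res_c (3.0 A) gives a guessed dm_resolution equal to dm_res_a (2.4 A).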
- strategy
- calculate_local_resolution = True Get local resolution map
- geometric_scale = True Use geometric mean of low-res scale factor and shell scale factor in scaling. Prior to FSCref scaling, the low-res scale factor would make Fourier coefficients in each shell of resolution match those in the lowest-resolution shell (on average). The shell scale factor would make them match the resolution dependence of the (scaled) half-maps. The geometric mean is partway between these. Note: not applied if final_scale_with_fsc_ref is True.
- geometric_scale_half_maps_only = False Use geometric mean only for half-maps, not map combination
- geometric_scale_not_half_maps = False Use geometric mean except for half-maps, only map combination
- target_value = None Target value in scaling. None means scale to input map. You can use geometric_scale = False target_value = 1000 final_scale_with_fsc_ref = False to have a more-or-less resolution-independent final map (an E-map).
- final_scale_with_fsc_ref = True Adjust final resolution-dependent average Fourier coefficient amplitudes to match estimated correlation with true coefficients (FSCref). This uses fixed_target_value = 1000 in recombination and turns off geometric_scale in map recombination.
- fixed_target_value = 1000 Fixed target value to use if specified.
- include_all_in_target_norm = False Include all input data in calculating average values for normalization. Alternative is just half-map data
- require_significant_hm12 = None Require significant FSC for half-maps (within min_hm12ab_ratio of FSC for density-modified maps). Default is True except when models are used.
- half_map_fsc_to_keep_only = 0.9 If half-map fsc is this value or greater, skip density modification in this shell. Used to prevent density-modified information from making low-resolution terms worse than they started out.
- optimize_r_value = False Optimize choice of histograms/sharpening based on R-value for density modification. Alternative is based on estimated improvement in resolution.
- optimize_sharpening_method = None Try unsharpened and half-map sharpened map coefficients as starting point for density modification, then auto-set overall sharpening if remove_aniso = true, then density modify and choose based on minimum density modification R value
- density_modify_density_target_start_with_denmod = False Start with density-modified maps in density_modify_density_target
- density_modify_model_simple_merge = False Merge model_density with original without another density modification run. Normally choose either density_modify_model_simple_merge or density_modify_density_target for model-based density modification.
- density_modify_density_target = True Use density from unmerged models as target for density modification
- use_model_variance = True Use estimate of model variance to weight density target.
- mask_density_target = True Mask density target to use only region where model was present
- use_merged_models_in_denmod = False Use merged model in density modification (one for each half map)
- use_tight_mask = False Use tight mask in average_maps
- model_mask_radius = 3 Radius for masking density from model
- no_denmod_after_model = False No density modification after recombining model information. Not implemented.
- use_correlated_fsc = False Assume errors in model-based FSC values are highly correlated
- use_maximum_inside_mask = False Use map_1 density (original) if higher than map_2 (model-based) inside map_2 mask in average_maps
- sd_smoothing_radius_ratio = 0.75 Radius for smoothing density sd estimates as ratio to resolution. Used in average_maps
- smooth_model_map = False Smooth model map in average_maps
- weight_on_model = 1 Weight on model in average_maps
- sharpen_map_data_before_scaling = True Sharpen input map data before using it to scale half-map and model data
- use_half_datasets_for_sigmas_by_reflection = False Use differences between F1 and F2 for sigmas (times 0.707). Alternative is to use these differences in shells only. Do not use this as it introduces correlations between half-datasets.
- fix_amplitudes = False Fix amplitudes, only change phases
- use_variance = True Use model variance in model-based density modification
- refine_symmetry_in_denmod = True Refine symmetry operators in density modification.
- use_symmetry_in_denmod = False Use symmetry in density modification. Normally not used as maps already have symmetry applied.
- symmetry_in_denmod_last_cycle_only = True Use symmetry in density modification only on last cycle (if use_symmetry_in_denmod = True).
- fraction_ncs = None Fraction of macromolecule expected to follow symmetry. Use None to identify automatically
- use_denmod_map_in_denmod_with_model = False Use density-modified map to start density modification with model
- refine = True Refine models
- refine_after_randomize = False Refine models after running randomize routine. Note that randomize (used in density_modify_with_single_model) already refines the multiple models but does not refine B. This refines them further including B values. It may be better to keep B values of zero for the multiple-models and skip this.
- sequence_models = None Get sequences of models. Default is True if models are generated automatically at a resolution of sequence_models_resolution or better (normally 3.5 A) and False otherwise. Note that for a very big model this can take a very long time.
- sequence_models_resolution = 3.5 Resolution at which sequence_models is normally applied if models are generated automatically.
- rerun_with_avail_seq = True Rerun sequence alignment with parts of the sequence already used removed. With a large structure this can take a long time.
- write_map_files = True Write out final map files
- write_boxed_map_files = False Write out boxed versions of input map files
- delta_resolution = 0.0 0.5 Resolution offsets to try in model-building
- macro_cycles = 3 Cycles of real-space refinement in model-building and randomization
- build_thoroughness = *None quick medium thorough Thoroughness of model-building
- build_methods = high_density_from_model no_high_density_from_model Build methods to try in model-building. Allowed methods: trace_and_build phase_and_build high_density_from_model no_high_density_from_model quick medium
- backup_build_methods = phase_and_build Build methods to try in model-building if initial methods do not yield enough models. Allowed methods: trace_and_build phase_and_build high_density_from_model no_high_density_from_model quick medium
- vary_blur_and_spectral_scaling_with_model = True In density_modify_with_model, generate maps with and without blur_by_resolution and spectral_scaling (4 maps)
- density_modify_with_model = None Run model-based density modification. If one model is supplied, randomize and generate multi-model representation for each half-map. Models are deleted unless verbose = True. If multiple models (at least minimum_model_files) are supplied for each half-map, use them. Otherwise generate models and refine against half-maps. Use model information in density modification.
- randomize_single_model = True Randomize single model if supplied.
- density_modify_with_single_model = None Density modify with single model by randomizing model before use. Normally used internally only.
- model_cycles = 1 Number of cycles of model-based density modification. Normally use 1 if you supply a single model (if you use more it will generate new models on cycles 2-...). Use 1 or a few if you are generating new models.
- model_sharpen = True Carry out model-based sharpening based on final model if density modifying with model information
- use_model_as_target = True Use model as target in density modification. Alternative is to use density based on model as target for density modification.
- omit_with_model = False Use omit maps in density_modify_with_model. Only applies if use_model_as_target is True. Not implemented
- n_box_target = None Target number of boxes in omit maps
- scale_rmsd = 1 Scale on RMSD estimate for model-map agreement
- patterns = False Use resolve_patterns to identify patterns in map and use them in density modification. Requires box_before_analysis.
- update_starting_map = True Update starting half maps each cycle using half-maps from previous cycle. Alternative is to restart with original (sharpened) half maps each cycle.
- update_histograms = True Update histograms each cycle using working map. Alternative is to use input histogram file or generate histograms once.
- database_list = None List of histogram database entries to try one at a time if optimizing histograms. Database 0 is local database for this run. Databases 1 2 3 4 5 are 1 - 3 A databases. Default is 5 for quick, [0, 5] for non-quick.
- database_number = None Use this database entry by default (0 is local database, 5 std)
- rad_smooth_ratio = 1 Ratio of rad_smooth to resolution
- spectral_scaling = False Scale average Fourier coefficient amplitude vs resolution to match spectrum expected for a protein in a box of solvent. This can restore resolution-dependent features of a map.
- blur_by_resolution = False Blur after half-map averaging by B = 10 times resolution. This can be used along with blur_by_resolution_factor to apply a slight blurring that may make the final map easier to interpret.
- blur_by_resolution_factor = 10 Blur after half-map averaging by B = blur_by_resolution_factor times resolution
- sigma_weighting = None In recombination step, weight individual Fourier coefficients based on estimated variances: 1/sqrt(normalized_variance). Can be combined with real_space_weighting. Can cause correlations between density-modified half maps so that half-map FSC values may overstate resolution.
- average_neighbors = False In sigma_weighting, average neighboring weights
- apply_cc_star_to_initial_map = None Apply resolution-dependent cc_star (FSCref) weighting to initial map before density modification. Default is True. If False, apply any other scaling but set cc_star = 1.
- local_methods_final_cycle = True Apply any local sharpening/averaging only on last cycle
- real_space_weighting = None In recombination step, weight maps in half-map averaging based on local half-map variance in real space (smoothed with smoothing_radius). Can be combined with sigma_weighting. Can cause correlations between density-modified half maps so that half-map FSC values may overstate resolution.
- sharpening_type = None half_map *final_half_map local_final_half_map Sharpening to apply at each stage. Half_map means scale data in each shell to maximum then multiply by estimated FSCref from input first half maps. Final_half_map means same except use estimate of FSCref for final maps. Local_final_half_map uses local FSCref estimates.
- cycles = 1 Cycles of density modification
- damping = None Damping of shifts on each cycle.
- minimum_r_value_improvement_per_cycle = 0.001 Minimum r_value improvement to keep cycling. Does not apply if damping is used.
- force_cycles = True Keep results of last cycle even if not better
- minimum_improvement_per_cycle = -1 Minimum improvement to keep cycling. Does not apply if damping is used.
- zero_half_dataset_correlation = False Assume correlation between density-modified half-datasets is zero.
- very_high_error = 100 If an error (asqr, bsqr, csqr, ssqr) is this big or bigger, require it to keep this value for all subsequent bins
- cc_cut = 0.2 Estimate of minimum highly reliable CC in half-map FSC. Used to decide at what CC value to smooth the remaining CC values. Also used to decide whether scale_using_last will apply
- correct_for_final_value = True Correct FSC values for correlated errors by assuming that the last smoothed value represents the correlated error
- max_cc_for_rescale = 0.1 Min reliable CC in half-maps. Used along with cc_cut and scale_using_last to correct for small errors in FSC estimation at high resolution. If the value of FSC near the high-resolution limit is above max_cc_for_rescale, assume these values are correct and do not correct them.
- scale_using_last = 3 If set, assume that the last scale_using_last bins in the FSC for half-map or model sharpening are about zero (corrects for errors in the half-map process). Only applies if these values are less than cc_cut.
- mask_atoms_atom_radius = 5 Atom radius in model masking
- control_no_denmod = False Run dummy density modification just flattening solvent as estimated by probability mask
- zero_solvent = False Zero solvent region (for testing)
- randomize_solvent = False Randomize solvent region (for testing). Also set solvent_content and solvent_noise_ratio to run this.
- solvent_noise_ratio = 0.4 Solvent noise ratio for randomize_solvent
- use_bins_for_solvent_noise = True Use bins in solvent noise
- require_non_identical_half_maps = True Require half maps that are not identical
- density_modify_unsharpened_maps = None Use unsharpened (original) half maps in density modification. Normally using the sharpened map is better but it may be worth trying with unsharpened maps.
- default_fom = 0.9 Default FOM value if unknown
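As an illustration of the blur_by_resolution and blur_by_resolution_factor entries above, the following minimal Python sketch computes the blurring B value (B = blur_by_resolution_factor times resolution) and the amplitude scale factor such a B value implies at a given d-spacing. It assumes the conventional isotropic B-factor form exp(-B/(4 d^2)); the function names are hypothetical, and how resolve_cryo_em applies the blurring internally is not described here.

```python
import math


def blur_b_value(resolution, blur_by_resolution_factor=10.0):
    """Blurring B value: blur_by_resolution_factor times resolution."""
    return blur_by_resolution_factor * resolution


def blur_scale(d_spacing, b_value):
    """Amplitude scale factor at a given d-spacing for an isotropic B value.

    Assumes the standard convention exp(-B / (4 d^2)); this is an
    illustration, not the program's own code.
    """
    return math.exp(-b_value / (4.0 * d_spacing ** 2))


if __name__ == "__main__":
    b = blur_b_value(3.0)  # B value for a 3 A map with the default factor
    for d in (10.0, 5.0, 3.0):
        print(d, round(blur_scale(d, b), 3))
```

Under this convention the scale factor decreases toward high resolution (small d-spacing), which is why a positive B value blurs the map.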
- anisotropy
- remove_aniso = True Remove anisotropy from data and optionally sharpen before density modification
- b_iso = None Target overall B value for anisotropy correction. Ignored if remove_aniso = False. If None, default is target_b_ratio*resolution
- max_b_iso = None Default maximum overall B value for anisotropy correction. Ignored if remove_aniso = False. Recommended if b_iso is set. If used, default is minimum of (max_b_iso, B of dataset, target_b_ratio*resolution)
- target_b_ratio_list = 10. 20. Set of ratios of target B to resolution to try. Applied if quick = False and ignored if remove_aniso = False or b_iso is set
- target_b_ratio = 10. Default ratio of target B value to resolution for anisotropy correction. Ignored if remove_aniso = False. Ignored if quick or b_iso is set. If used, default is minimum of (max_b_iso, B of dataset, target_b_ratio*resolution)
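As an illustration of the b_iso, max_b_iso and target_b_ratio entries above, the following minimal Python sketch shows one reading of how the default target B could be chosen: target_b_ratio*resolution when max_b_iso is not set, and the minimum of (max_b_iso, B of dataset, target_b_ratio*resolution) when it is. The function name default_target_b and the argument dataset_b are hypothetical, and the treatment of the case where max_b_iso is None is an assumption rather than documented behavior.

```python
def default_target_b(resolution, dataset_b, target_b_ratio=10.0, max_b_iso=None):
    """Pick a default target B for anisotropy correction (illustrative only).

    resolution     -- nominal resolution of the map (A)
    dataset_b      -- overall B value of the data (hypothetical input)
    target_b_ratio -- ratio of target B to resolution
    max_b_iso      -- optional cap on the target B
    """
    ratio_based = target_b_ratio * resolution
    if max_b_iso is None:
        # Assumption: without max_b_iso only the ratio-based value is used
        return ratio_based
    return min(max_b_iso, dataset_b, ratio_based)


if __name__ == "__main__":
    print(default_target_b(3.0, dataset_b=80.0))
    print(default_target_b(4.0, dataset_b=100.0, max_b_iso=35.0))
```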
- control
- quick = True Quick run, do not optimize dm_resolution, target_b_iso or histogram database. Turns off real_space_weighting and sigma_weighting unless they are set.
- ignore_symmetry_conflicts = False Ignore symmetry conflicts in input files
- wrapping = False Allow wrapping of map around boundaries
- ignore_limitations = False Ignore restrictions on what can be run
- stop_after_mask = False Stop after getting mask
- verbose = False Verbose output
- max_dirs = 1000 Maximum number of directories (trace_and_build_xxx)
- resolve_size = None Size of resolve to use.
- resolve_command_file = None File with commands for resolve
- multiprocessing = *multiprocessing sge lsf pbs condor pbspro slurm Choices are multiprocessing (single machine) or queuing systems
- queue_run_command = None Run command for queue jobs, for example qsub.
- nproc = 1 Number of processors to use. NOTE: by default multiple processors will only be used in the map-to-model step (this is because multiprocessing requires writing out nproc sets of huge files and it can be very slow with distributed queues). You can override this with force_nproc = True.
- force_nproc = False Use all processors in all steps. If not set then multiprocessing will not be used in average_maps
- random_seed = 171731 Random seed
- max_wait_time = 1 Max time to wait for files to appear. Increase if files seem to be missing during run.
- test_gui = None Run from command line but test GUI features
- gui GUI-specific parameter required for output directory