Density modification with phenix.density_modification
- phenix.density_modification: Tom Terwilliger
phenix.density_modification is a tool that runs Resolve to carry out density
modification, including the use of non-crystallographic symmetry (NCS) and
electron-density distributions. Masks for the solvent boundary and for the region where
NCS operators are to be applied can be specified as PDB files with dummy
atoms marking the masked regions.
In phenix.density_modification, phases are improved by combining
any existing phase information with phase probabilities based on
the agreement of your electron density map with expectations about
that map. This procedure is known as statistical density modification and
is carried out by phenix.resolve. The expectations about a map include:
- A flat solvent region
- Matching NCS-related density
- Density distributions (histograms of density) matching expected
distributions
- Density in a region where a model has been built matching expected density
for the model
The inputs to phenix.density_modification are:
Required:
- F, SIGF: Structure factor amplitudes for your structure.
Optional:
- PHIB, FOM, HLA HLB HLC HLD: Phases, figure of merit, and Hendrickson-Lattman
(HL) coefficients. These normally come from experimental phasing, but
they can be from any source. They are optional if you supply a model.
The HL coefficients describe the phase probability distribution for each reflection.
- Model: This is normally a PDB file containing your current model.
It can be used as a source of phase information (calculated starting phases)
and also as a target for density modification.
- NCS operators: These are supplied as a .ncs_spec file
that you can create by running phenix.find_ncs on a model with NCS,
on heavy-atom sites and a map, or on a map alone.
- Solvent boundary file: This is a PDB file that defines the region
that is not solvent (the macromolecule) as dummy atoms.
- NCS boundary file: This is a PDB file that defines the region
where NCS is to be applied (all the NCS copies) as dummy atoms.
This kind of density modification is normally carried out after
experimental phasing, but it can also be carried out if you have
created HL coefficients from your data and a model.
In these cases you have phases and phase probability information
(F, SIGF, PHIB, FOM, HLA HLB HLC HLD).
Prior to density modification, the amplitudes (F, SIGF) are normally
corrected for anisotropy and sharpened (remove_aniso=True). You can control this
by specifying the target overall B value after sharpening (b_iso). By default
the target overall (Wilson) B is 10 times the high-resolution limit
of the data (or the resolution that you specify), up to a maximum
of 40 and no higher than the B of the dataset. The ratio and the maximum can be
changed with target_b_ratio and max_b_iso.
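The default described above can be written out as a small calculation. This is a minimal sketch, assuming the rule stated in the parameter descriptions (the minimum of max_b_iso, the B of the dataset, and target_b_ratio times the resolution); the function name default_target_b and the dataset_b argument are hypothetical and are not part of PHENIX.

```python
def default_target_b(resolution, dataset_b, max_b_iso=40.0,
                     target_b_ratio=10.0, b_iso=None):
    """Illustrative target overall B for sharpening (not PHENIX code).

    resolution : high-resolution limit in A
    dataset_b  : overall (Wilson) B of the data, assumed to be known
    b_iso      : if set, it overrides the default rule
    """
    if b_iso is not None:
        return float(b_iso)
    # Default rule: minimum of (max_b_iso, B of dataset, target_b_ratio*resolution)
    return min(max_b_iso, dataset_b, target_b_ratio * resolution)


print(default_target_b(resolution=3.0, dataset_b=55.0))  # 30.0
print(default_target_b(resolution=4.5, dataset_b=80.0))  # 40.0
```

With 3 A data the ratio term (30) sets the target; with 4.5 A data the cap of 40 applies instead.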
A starting map is calculated from F, PHIB, FOM. The locations of the
macromolecule and solvent are estimated probabilistically from the
local variation in the map (with a smoothing radius of rad_wang), which
discriminates between the two.
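To illustrate the idea of separating solvent from macromolecule by local variation, here is a toy one-dimensional sketch in the spirit of the simpler mask_type=wang option (take the points with the highest local rms as protein). It is not the probabilistic procedure used by Resolve: the function names, the periodic 1-D grid, and the use of the global mean when computing the local rms are all assumptions made for illustration.

```python
import math


def local_rms(density, radius):
    """RMS deviation from the global mean within +/- radius grid points.

    The grid is treated as periodic, as a crystallographic map would be.
    """
    n = len(density)
    mean = sum(density) / n
    out = []
    for i in range(n):
        window = [density[(i + k) % n] for k in range(-radius, radius + 1)]
        out.append(math.sqrt(sum((v - mean) ** 2 for v in window) / len(window)))
    return out


def wang_style_mask(density, solvent_content, radius):
    """Flag the (1 - solvent_content) fraction of points with the highest
    local rms as macromolecule (True); the rest are solvent (False)."""
    rms = local_rms(density, radius)
    n_protein = round((1.0 - solvent_content) * len(density))
    ranked = sorted(range(len(density)), key=lambda i: rms[i], reverse=True)
    protein = set(ranked[:n_protein])
    return [i in protein for i in range(len(density))]


# Toy map: nearly flat "solvent" followed by strongly varying "protein" density
toy_map = [0.0, 0.1, -0.1, 0.0, 0.1, 0.0, 2.0, -1.5, 1.8, -2.0, 1.6, -1.7]
mask = wang_style_mask(toy_map, solvent_content=0.5, radius=1)
print(mask)
```

Here the radius plays the role that rad_wang plays in the real calculation: a larger radius smooths the local variation over a wider neighbourhood.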
Phases are optimized to agree with the HL coefficients and to yield a map
that has a flat solvent, NCS (if present), and density distributions matching
model maps.
If you supply NCS operators, phenix.density_modification will automatically
figure out the region over which this NCS is present. Within this region,
it will weight the density for NCS calculations according to how similar the
density is at NCS-related locations. If you supply an NCS mask, that is used
instead. The density at NCS-related locations is used as part of the target
for density modification.
If you supply a model, density modification occurs in several steps. First
the data are corrected for anisotropy as above. Then
the model is used to estimate phases, FOM, and HL coefficients. Then density
modification is carried out much as described above for experimental phasing.
Then a final step is carried out in which model density is used as part of the
target for density modification.
To instead create HL coefficients from your data and model and use those
with phenix.density_modification, first run phenix.refine
to refine your model. Then run phenix.reciprocal_space_arrays with your model
and data and it will create HL coefficients for you.
The outputs of phenix.density_modification are:
- denmod.mtz: Density-modified map coefficients. If you specify
hl_in_output_mtz=True then it will also contain Hendrickson-Lattman
coefficients.
- solvent_boundary.map: CCP4-style map file showing the solvent boundary
- ncs_boundary.map: CCP4-style map file showing the asymmetric unit of NCS
Running phenix.density_modification from the command line is easy.
phenix.density_modification phaser_1.mtz solvent_content=0.5
In this example, phaser_1.mtz (the output of Phaser experimental phasing)
normally has F, SIGF, PHIB, FOM, HLA HLB HLC HLD, each of which is
identified automatically. Standard density modification using
these experimental phases and phase probabilities is carried out.
phenix.density_modification phaser_1.mtz solvent_content=0.64 \
resolution=3 output_mtz=denmod_std.mtz mask_type=histograms
In this example, the high-resolution limit is set to 3 A, the output
file is denmod_std.mtz, and the solvent mask is identified using a
histogram-based method (mask_type=histograms).
phenix.density_modification phaser_1.mtz solvent_content=0.64 \
mask_from_pdb=mask.pdb rad_mask=4
In this example, the file mask.pdb has dummy atoms that indicate where the
macromolecule is located. All points within 4 A (rad_mask=4) of an atom in
mask.pdb are considered to be within the macromolecule region.
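The distance criterion for the mask can be sketched as follows. This is a minimal illustration only: the function name within_mask is hypothetical, coordinates are assumed to be Cartesian in A, a point exactly at rad_mask is assumed to count as inside, and unit-cell periodicity and crystallographic symmetry are ignored.

```python
import math


def within_mask(point, atoms, rad_mask=4.0):
    """True if point (x, y, z in A) lies within rad_mask of any dummy atom."""
    return any(math.dist(point, atom) <= rad_mask for atom in atoms)


# Two hypothetical dummy atoms, as might be read from mask.pdb
dummy_atoms = [(10.0, 10.0, 10.0), (14.0, 10.0, 10.0)]

print(within_mask((12.0, 11.0, 10.0), dummy_atoms))  # True
print(within_mask((20.0, 10.0, 10.0), dummy_atoms))  # False
```

The same criterion, with its own rad_mask, describes how ncs_domain_pdb atoms define the region where NCS applies.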
phenix.density_modification phaser_1.mtz solvent_content=0.64 \
ncs_file=ncs.ncs_spec
In this example, the file ncs.ncs_spec (created with phenix.find_ncs or
phenix.autosol or phenix.simple_ncs_from_pdb or phenix.autobuild)
contains the NCS operators. These will be used to automatically find the
region where NCS applies. Then the NCS-related density will be used as
part of density modification.
phenix.density_modification phaser_1.mtz solvent_content=0.64 \
ncs_file=ncs.ncs_spec ncs_domain_pdb=ncs_mask.pdb rad_mask=5
This example is like the previous one, except instead of finding the region
where NCS applies automatically, it is defined by the atoms in ncs_mask.pdb.
All points within rad_mask (5 A in this example) of an atom in ncs_mask.pdb
are considered to be within the region where NCS applies. This is useful
if you want to apply NCS only to a part of the region where NCS is present.
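If you run many variations of these commands, it can be convenient to assemble them in a script. The following is a minimal sketch, not part of PHENIX: the helper build_denmod_command is hypothetical, and it simply joins keyword=value pairs in the same form as the examples above. Actually running the command requires a PHENIX installation on your PATH.

```python
import shlex


def build_denmod_command(data_file, **params):
    """Assemble an argument list for phenix.density_modification
    from a data file and keyword=value parameters."""
    args = ["phenix.density_modification", data_file]
    for key, value in params.items():
        args.append(f"{key}={value}")
    return args


cmd = build_denmod_command(
    "phaser_1.mtz",
    solvent_content=0.64,
    ncs_file="ncs.ncs_spec",
    ncs_domain_pdb="ncs_mask.pdb",
    rad_mask=5,
)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.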
- input_files
- data_file = None Mtz or other data file with intensities or amplitudes
- data_labels = None Optional data labels for Iobs,SIGIobs or Fobs,SIGFobs. Input data can be I, F, or I+/I- or F+/F-.
- phib_labels = None Optional data label for PHIB. You can force skipping with phib_labels=SKIP
- fom_labels = None Optional data label for FOM. You can force skipping with fom_labels=SKIP
- hl_file = None Mtz or other data file with HL coefficients. If None assumed to be the same as data_file. Required unless a pdb_file is supplied.
- hl_labels = None Optional labels for HL coefficients. You can force skipping with hl_labels=SKIP
- map_coeffs_file = None Mtz or other file with map coefficients for starting map. If None assumed to be same as data_file
- map_coeffs_labels = None Optional labels for map coefficients. You can force skipping with map_coeffs_labels=SKIP
- pdb_file = None Optional file with model to be used in density modification. If provided, the density modification will use model density as part of the density modification target. If supplied, HL coefficients are optional.
- ha_file = None Optional file with heavy-atom sites to be truncated in density modification. Density within rad_mask of a site in this file will be truncated at 2 x the rms of the map. Note: not compatible with mask_from_pdb.
- mask_from_pdb = None Optional file with dummy atoms marking the region to be considered as the macromolecule in density modification. The region within rad_mask of an atom in this file will be considered within the macromolecule. Note: not compatible with ha_file.
- ncs_file = None Optional file with NCS information (.ncs_spec file). This file can be obtained with phenix.find_ncs or phenix.simple_ncs_from_pdb
- ncs_domain_pdb = None Optional file with dummy atoms marking the region where NCS is to be applied. Supersedes ncs_domain_pdb in the ncs_file. Points within rad_mask of an atom in this file are considered for NCS. You need to supply a mask covering all NCS copies. NOTE: the output_ncs_mask_file map shows the boundary of the asymmetric unit of NCS (just one copy) while the input ncs_domain_pdb file must contain all NCS copies.
- resolve_commands_file = None Optional file with any commands for resolve density modification.
- directories
- temp_dir = temp_denmod Temporary directory. Not normally used. Contents deleted after the run if clean_up=True is specified.
- crystal_info
- resolution = None Resolution for density modification
- output_files
- output_mtz = denmod.mtz Output MTZ file with density-modified map coefficients
- hl_in_output_mtz = True Calculate density modification HL coefficients and write to output mtz file
- output_mask_file = solvent_boundary.map Output CCP4-style map file with solvent boundary mask.
- output_ncs_mask_file = ncs_boundary.map Output CCP4-style map file with mask marking region used for NCS. NOTE: the output_ncs_mask_file map shows the boundary of the asymmetric unit of NCS (just one copy) while the input ncs_domain_pdb file must contain all NCS copies.
- anisotropy
- remove_aniso = True Remove anisotropy from data and optionally sharpen
- b_iso = None Target overall B value for anisotropy correction. Ignored if remove_aniso = False. If None, default is minimum of (max_b_iso, B of dataset, target_b_ratio*resolution)
- max_b_iso = 40. Default maximum overall B value for anisotropy correction. Ignored if remove_aniso = False. Ignored if b_iso is set. If used, default is minimum of (max_b_iso, B of dataset, target_b_ratio*resolution)
- target_b_ratio = 10. Default ratio of target B value to resolution for anisotropy correction. Ignored if remove_aniso = False. Ignored if b_iso is set. If used, default is minimum of (max_b_iso, B of dataset, target_b_ratio*resolution)
- denmod
- rad_wang = None Radius for estimation of solvent boundary. Normally chosen automatically.
- optimize_rad_wang = False Optimize value of rad_wang
- mask_type = *None histograms probability wang Method for solvent identification. Default is histograms. If you have a SAD dataset with a heavy atom such as Pt or Au then you may wish to choose wang because the histogram method is sensitive to very high peaks. Options are: (1) histograms: compare local rms of map and local skew of map to values from a model map and estimate probabilities; this one is usually the best. (2) probability: compare local rms of map to distribution for all points in this map and estimate probabilities; in a few cases this one is much better than histograms. (3) wang: take points with highest local rms and define as protein.
- solvent_content = None Solvent content (0 to 1) of the crystal (required)
- mask_cycles = 5 Mask cycles (overall cycles of density modification)
- minor_cycles = 10 Minor cycles of density modification for each mask cycle
- rad_mask = 4 Radius (A) for mask calculation around heavy-atom sites (ha_file), for definition of protein boundary (mask_from_pdb), and for definition of the region used for NCS (ncs_domain_pdb).
- denmod_with_model = None By default if a pdb_file is supplied it will be used in density modification. You can turn this off with denmod_with_model=False
- remove_waters = True Remove waters from model before use in density modification or mask generation
- remove_hetatm = True Remove HETATM (ligands, metals etc.) from model before use in density modification or mask generation
- control
- verbose = False Verbose output
- clean_up = True Remove temp_dir when finished
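A few of the constraints stated in the parameter list above can be checked before a run. This is a minimal sketch under stated assumptions: the function check_denmod_params is hypothetical, it takes a plain dictionary of parameter names and values, and it covers only the rules that the list states explicitly (solvent_content is required and lies between 0 and 1; ha_file and mask_from_pdb are not compatible).

```python
def check_denmod_params(params):
    """Return a list of problems found in a dict of
    phenix.density_modification parameters (illustrative only)."""
    problems = []

    solvent_content = params.get("solvent_content")
    if solvent_content is None:
        problems.append("solvent_content is required")
    elif not 0.0 <= solvent_content <= 1.0:
        problems.append("solvent_content must be between 0 and 1")

    # The parameter list notes that these two inputs cannot be combined
    if params.get("ha_file") and params.get("mask_from_pdb"):
        problems.append("ha_file is not compatible with mask_from_pdb")

    return problems


print(check_denmod_params({"solvent_content": 0.64, "mask_from_pdb": "mask.pdb"}))
print(check_denmod_params({"ha_file": "ha.pdb", "mask_from_pdb": "mask.pdb"}))
```

The first call reports no problems; the second reports both the missing solvent_content and the incompatible pair of files.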