Morphing a model with morph_model
- morph_model: Tom Terwilliger
morph_model is a procedure for distorting a model to match an electron
density map. It is suited to cases where a poor but unbiased map is
available, along with a model that is locally similar to the target
structure but in which longer-range distances are not conserved. This is
often the case for homologous structures.
The basic idea in morph_model is to apply a smoothly-varying offset to
the template structure so as to make it match the map as closely as
possible. In this way information about the local structure of the
template is retained while allowing longer-range changes in structure.
Morphing is carried out in 3 steps:
- For each residue in the structure an offset is identified that
optimizes the match of the map to a sphere of atoms within rad_morph
(typically 6 A) of the CA atom of that residue.
- The offsets are smoothed in a window of 10 residues.
- The smoothed offset for each residue is applied to all the atoms in
that residue.
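The second and third steps can be illustrated with a minimal Python sketch. It is not the morph_model implementation: the per-residue offsets from the first step are taken as given, the function names smooth_offsets and apply_offsets are hypothetical, and the use of a plain average in a centred window that is truncated at the chain ends is an assumption.

```python
from typing import List, Tuple

Vec = Tuple[float, float, float]


def smooth_offsets(offsets: List[Vec], window: int = 10) -> List[Vec]:
    """Average each residue's offset over a sliding window of residues.

    Assumption: plain (unweighted) mean, window roughly centred on the
    residue and truncated at the ends of the chain.
    """
    n = len(offsets)
    half = window // 2
    smoothed = []
    for i in range(n):
        start = i - half
        lo = max(0, start)
        hi = min(n, start + window)  # exclusive upper bound
        chunk = offsets[lo:hi]
        smoothed.append(
            tuple(sum(v[k] for v in chunk) / len(chunk) for k in range(3))
        )
    return smoothed


def apply_offsets(residues: List[List[Vec]], offsets: List[Vec]) -> List[List[Vec]]:
    """Shift every atom of residue i by offsets[i]."""
    return [
        [(x + dx, y + dy, z + dz) for (x, y, z) in atoms]
        for atoms, (dx, dy, dz) in zip(residues, offsets)
    ]
```

Because every atom in a residue receives the same shift, the internal geometry of each residue is preserved, while neighbouring residues receive similar (smoothed) shifts.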
After morphing, the morphed model is refined. This process is then
repeated number_of_cycles (typically 6) times.
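The overall cycle can be sketched as a simple loop. This is only an illustration: morph and refine are hypothetical callables standing in for the morphing and refinement steps, and the linear change of the radius from rad_morph to rad_morph_final (a parameter described in the listing below) is an assumption, since the exact schedule is not stated here.

```python
from typing import Callable, List, Optional


def radius_schedule(rad_morph: float,
                    rad_morph_final: Optional[float],
                    number_of_cycles: int) -> List[float]:
    """Radius to use on each cycle (assumed linear from initial to final)."""
    if rad_morph_final is None or number_of_cycles < 2:
        return [rad_morph] * number_of_cycles
    step = (rad_morph_final - rad_morph) / (number_of_cycles - 1)
    return [rad_morph + step * i for i in range(number_of_cycles)]


def run_cycles(model,
               morph: Callable,
               refine: Callable,
               number_of_cycles: int = 6,
               rad_morph: float = 6.0,
               rad_morph_final: Optional[float] = None):
    """Alternate morphing and refinement for number_of_cycles cycles."""
    for radius in radius_schedule(rad_morph, rad_morph_final, number_of_cycles):
        model = morph(model, radius)  # distort the model to match the map
        model = refine(model)         # then refine the morphed model
    return model
```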
Normally you should specify that morph_model use a prime-and-switch map
(map_type=prime_and_switch), as that is usually the most effective
map for morphing. You can however choose instead to use a 2mFo-DFc map,
a density-modified map, or an omit map. The prime-and-switch map is a
map calculated in the same way as a density-modified map, except that
once the initial model-based map is calculated (priming), the model
information is no longer used (switching). This map is typically quite
unbiased by the model.
Running morph_model is easy. If you have a template (coords1.pdb) and a
data file (coords1.mtz) with FP, SIGFP and FreeR_flag, you can type:
phenix.morph_model data=coords1.mtz model=coords1.pdb
and morph_model will run automatically.
If you have your own map you want to use in morph_model, you can type:
phenix.morph_model data=coords1.mtz \
model=coords1.pdb map_coeffs=my_map_coeffs.mtz
and morph_model will start off with your map coefficients instead of
calculating a new map. If you specify update_map=True (default) then
new maps will be calculated on each subsequent cycle. If you specify
update_map=False then your map will be used throughout.
You can morph just one part of your model if you want. You can specify
it like this:
phenix.morph_model data=coords1.mtz \
model=coords1.pdb morph_selection="chain A"
The remainder of the model will be refined but not morphed.
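If you drive morph_model from a script, the same command lines can be assembled in Python. The sketch below is an illustration only: build_command is a hypothetical helper, not part of PHENIX, and it uses only the keywords shown in the examples above. Passing the arguments as a list means a value containing a space, such as the selection "chain A", needs no extra shell quoting.

```python
import shlex
from typing import Dict, List


def build_command(params: Dict[str, object]) -> List[str]:
    """Return an argument list: phenix.morph_model followed by key=value pairs."""
    return ["phenix.morph_model"] + [f"{key}={value}" for key, value in params.items()]


cmd = build_command({
    "data": "coords1.mtz",
    "model": "coords1.pdb",
    "morph_selection": "chain A",
})

# Show the equivalent shell command line (quoted where necessary).
print(shlex.join(cmd))

# To actually run it (requires a PHENIX installation on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```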
Usually you will want to edit a parameters file so that you can specify
more details of the run. You can get a default parameters file with:
phenix.morph_model
and then just edit the file morph_model_params.eff.
- Improved crystallographic models through iterated local density-guided model deformation and reciprocal-space refinement. T.C. Terwilliger, R.J. Read, P.D. Adams, A.T. Brunger, P.V. Afonine, R.W. Grosse-Kunstleve, and L.W. Hung. Acta Crystallogr D Biol Crystallogr 68, 861-70 (2012).
- input_files
- data = None Data file with experimental data ( FP SIGFP or I SIGI ).
- data_labels = None Optional labels for experimental data. Normally these would be something like I,SIGI or F,SIGF
- free_r_data = None Optional data file with free_r flags. By default this is the same file as used for experimental data.
- free_r_labels = None Optional labels for free_r flags. Normally these would be something like FreeR_flags or R_free_flags
- hl_data = None Optional data file with hl coeffs. By default this is the same file as used for experimental data.
- hl_labels = None Optional labels for hl coeffs. Normally these would be something like HLA,HLB,HLC,HLD
- labin = None Optional Labin line for file with data. This is
present for backward compatibility only.
Normally use data_labels instead. Only allowed if
the data file contains myFP mySIGFP and myFreeR_flag:
LABIN FP=myFP SIGFP=mySIGFP FreeR_flag=myFreeR_flag
- model = None Input PDB file with coordinates to be refined and morphed.
- map_coeffs = None Data file (mtz format) with map coeffs ( FP PHIB FOM or FWT PHWT) for use in morphing. Normally use None. If set, then these map coefficients will be used instead of an omit map.
- labin_map_coeffs = None Labin line for map coeffs file. Something like FP=FP PHIB=PHIM FOM=FOMM. Normally use None; used by morph_model in iteration
- output
- target_output_format = *None pdb mmcif Desired output format (if possible). Choices are None ( try to use input format), pdb, mmcif. If output model does not fit in pdb format, mmcif will be used. Default is pdb.
- output_files
- log = morph_model.log Output log file
- morphed_model = morphed_model.pdb Output morphed model
- params_out = morph_model_params.eff Parameters file to rerun morph_model
- directories
- temp_dir = "" Optional temporary work directory
- workdir = "" Optional work directory. Base path for all work
- output_dir = "" Output directory where files are to be written
- top_output_dir = None Output directory for entire set of runs
- gui_output_dir = None Base directory for PHENIX GUI. Ignored if running the command-line version.
- crystal_info
- space_group = None Optional space group specification
- resolution = 0. high-resolution limit for map calculation
- solvent_fraction = None Optional solvent fraction
- chain_type = *PROTEIN DNA RNA Chain type (for identifying main-chain and side-chain atoms)
- correct_aniso = False Remove anisotropy from data
- final_b = None Overall B of aniso-corrected data. If None, then minimum of the diagonal of anisotropic B values is used
- morphing
- number_of_cycles = 6 Number of cycles of morphing
- rad_morph = 6 Radius for morphing
- rad_morph_final = None Optional final radius for morphing. If set, rad_morph will be changed from initial value to final value over the cycles of morphing
- morph_main = False You can choose whether to use only main-chain atoms plus c-beta atoms in calculation of shifts in morphing. Default is morph_main=False; use all atoms including side-chain atoms.
- morph_selection = None You can specify an atom selection identifying which parts of your molecule to morph. These should be contiguous segments such as a chain or some segments in a chain. The remainder will be refined only.
- map_type = simple_omit refine_omit 2fofc density_modified *prime_and_switch Map type to use in morphing. Ignored if map_coeffs is set
- use_ncs_in_ps = False Use NCS (if available) in prime-and-switch maps
- update_map = True Create new map for morphing at each cycle. Ignored if map_coeffs is set
- refinement
- refinement_params = None Input parameters file for phenix.refine
- skip_clash_guard = True Skip clash guard check in refinement
- correct_special_position_tolerance = None Adjust tolerance for special position check. If 0., then check for clashes near special positions is not carried out. This sometimes allows phenix.refine to continue even if an atom is near a special position. If 1., then checks within 1 A of special positions. If None, then uses phenix.refine default. (1)
- reference_restraints = False Restrain angles to starting model. This may be useful at lower resolution.
- use_hl = None You can choose to use HL coefficients in refinement. If None, hl coeffs will be used if available
- test_flag_value = None You can set the test flag value if you want
- control
- verbose = False Verbose output
- debug = False Debugging output
- raise_sorry = False Raise sorry if problems
- dry_run = False Just read in and check parameter names
- nproc = 1 Number of processors to use
- group_run_command = "sh " Not currently enabled. Command to use to run multiple jobs. This may be sh if you are using a single machine (where you might set background=True) or something like 'qsub' or 'qsub -q all.q@theta' on a cluster (where you should leave background=False)
- queue_commands = None Not currently enabled. You can add any commands that need to be run for your queueing system. For example on a PBS system you might say: queue_commands='#PBS -N morph_model' queue_commands='#PBS -j oe' queue_commands='#PBS -l walltime=03:00:00' queue_commands='#PBS -l nodes=1:ppn=1' NOTE: you can put in the characters '<path>' in any queue_commands line and this will be replaced by a string of characters based on the path to the run directory. The first character and last two characters of each part of the path will be included, separated by '_',up to 15 characters. For example 'test_autobuild/WORK_5/AutoBuild_run_1_/TEMP0/RUN_1' would be represented by: 'tld_W_5_A1__TP0_1'
- condor_universe = vanilla The universe for condor is usually vanilla. However you might need to set it to local for your cluster
- add_double_quotes_in_condor = True You might need to turn on or off double quotes in condor job submission scripts. These are already default elsewhere but may interfere with condor paths.
- condor = None Not currently enabled. Specifies if the group_run_command is submitting a job to a condor cluster. Set by default to True if group_run_command=condor_submit, otherwise False. For condor job submission morph_model uses a customized script with condor commands. Also uses one_subprocess_level=True
- one_subprocess_level = None Specifies that a subprocess cannot submit a job
- single_run_command = "sh " Command to use to run single jobs. Normally this is sh
- background = None Run in background. If None, automatically set to True if nproc is greater than one and group_run_command is sh
- ignore_errors_in_subprocess = True Generally use ignore_errors_in_subprocess=True to ignore errors in sub-processes. This allows you to continue even if a few jobs crash. If all jobs in a group crash, the process will stop. NOTE: if a job hangs or never runs, this will not be detected and you will have to either put a file with the name FINISHED in the directory where the job was to run (e.g., MORPH_MODEL_3/FINISHED) or stop the whole job by putting a file with the name STOPWIZARD in the main run directory (e.g., MORPH_MODEL_3/STOPWIZARD)
- check_run_command = False Try out run command to make sure it works. Use False if your queue may not be available at the beginning of your run. Use True if you want to check things out
- max_wait_time = 100 Maximum time (sec) to wait for a file to be written (Useful for queues or nfs-mounted systems)
- wait_between_submit_time = 1.0 You can specify the length of time (seconds) to wait between each job that is submitted when running sub-processes. This can be helpful on NFS-mounted systems when running with multiple processors to avoid file conflicts. The symptom of too short a wait_between_submit_time is File exists:....
- wizard_directory_number = None Directory number for MORPH_MODEL_xx. Normally None except if called from GUI
- n_dir_max = 100000 Maximum number of directories to create (must be as big as nproc or nstruct/chunk)
- write_run_directory_to_file = None The working directory name is written to this file
- job_title = None Job title in PHENIX GUI, not used on command line
- non_user_params
- print_citations = True Print citation information at end of run
- is_sub_process = False identifies if this is a sub-process or top-level job