Quick model-building of a single model with resolve using build_one_model

Author(s)

build_one_model: Tom Terwilliger

Purpose

If you have an MTZ file with map coefficients and a sequence file, you can use build_one_model to build a single model with resolve. You can also extend an existing model.

Usage

How build_one_model works:

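build_one_model runs resolve model-building. According to the keyword list on this page, the default is quick = False (thorough_build, delta_phi=20.); set quick=True to run in superquick_build mode (delta_phi=30.). If you supply a model, then resolve will try to extend the ends of each chain in your model.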
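As a rough sketch, the build mode could be selected on the command line with the quick keyword listed under model_building. The file names are placeholders taken from the examples on this page, and the fallback that prints the command when Phenix is not on the PATH is only an illustration, not part of build_one_model.

```shell
# Placeholder input files; substitute your own.
MAP_COEFFS=map_coeffs.mtz
SEQ_FILE=sequence.dat
FREE_IN=exptl_fobs_phases_freeR_flags.mtz

# quick=True requests superquick_build; quick=False requests thorough_build
# (per the keyword descriptions on this page).
set -- phenix.build_one_model "$MAP_COEFFS" "$SEQ_FILE" \
  free_in="$FREE_IN" quick=True

# Run the command if Phenix is installed; otherwise just print it.
if command -v "$1" >/dev/null 2>&1; then
  "$@"
else
  echo "$@"
fi
```

Changing quick=True to quick=False in the same command would request the more thorough build.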

Output files from build_one_model

build_one_model.pdb: A PDB file with the resulting model

Examples

Standard run of build_one_model:

Running build_one_model is easy. From the command-line you can type:

phenix.build_one_model  map_coeffs.mtz \
   sequence.dat free_in=exptl_fobs_phases_freeR_flags.mtz

where exptl_fobs_phases_freeR_flags.mtz has your free R flags for refinement. To extend an existing model instead, add the PDB file to the command:

phenix.build_one_model map_coeffs.mtz \
  sequence.dat free_in=exptl_fobs_phases_freeR_flags.mtz \
  model.pdb
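A variation on the extension run, shown here only as a sketch: it adds the nproc and pdb_out keywords from the keyword list to use several processors and to name the output file. The output name extended_model.pdb is a hypothetical choice, and the print-only fallback is an illustration for machines without Phenix.

```shell
# Extend model.pdb using 4 processors and write the result to a chosen file.
# extended_model.pdb is a hypothetical output name.
set -- phenix.build_one_model map_coeffs.mtz sequence.dat \
  free_in=exptl_fobs_phases_freeR_flags.mtz \
  model.pdb nproc=4 pdb_out=extended_model.pdb

# Run the command if Phenix is installed; otherwise just print it.
if command -v "$1" >/dev/null 2>&1; then
  "$@"
else
  echo "$@"
fi
```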

Possible Problems

Specific limitations and problems:

Literature

Additional information

List of all available keywords

build_one_model
- input_files
  - seq_file = None File with 1-letter code sequence of molecule. Chains separated by blank line or greater-than sign
  - pdb_in = None Optional starting PDB file (ends will be extended if present)
  - map_coeffs = None MTZ file with coefficients for a map
  - map_coeffs_labels = None Map coeffs labels (like: FWT,PHFWT). Alternative to labin_map_coeffs
  - labin_map_coeffs = None Labin line for MTZ file with map coefficients. This is optional if build_one_model can guess the correct coefficients for FP and PHI. Otherwise specify: LABIN FP=myFP PHIB=myPHI where myFP is your column label for FP
  - ncs_info_file = None ncs_spec file with NCS information (written by simple_ncs_from_pdb or find_ncs)
  - remove_free
    - free_in = None MTZ file containing FreeR_flags. NOTE: free_in is only used in build_one_model and is ignored by phase_and_build. Used as source of freeR information for real-space refinement. Note that other columns of data may be present and can be used in reciprocal-space refinement. A suitable file is exptl_fobs_phases_freeR_flags.mtz from autosol/autobuild or my_model_refine_data.mtz from phenix.refine
    - labin_free = None Labin line for MTZ file with FreeR_flags. This is optional if build_one_model can guess the correct column for FreeR_flags. Otherwise specify: FreeR_flags==myFreeR_flags
    - map_coeffs_no_free = None Optional MTZ file with coefficients for a map with freeR set removed. Use instead of free_in. This map will be used for real-space refinement. Not needed if use_all_in_refine is True
    - map_coeffs_labels_no_free = None Map coeffs labels (like: FWT,PHFWT). Alternative to labin_map_coeffs_no_free
    - labin_no_free = None Labin line for MTZ file with map coefficients and freeR set removed. This is optional if build_one_model can guess the correct coefficients for FP and PHI. Otherwise specify: LABIN FP=myFP PHIB=myPHI where myFP is your column label for FP
    - test_flag_value = None This parameter sets the value of the test set that is to be free. Normally phenix sets up test sets with values of 0 and 1 with 1 as the free set. The CCP4 convention is values of 0 through 19 with 0 as the free set. Either of these is recognized by default in Phenix. If you have any other convention (for example values of 0 to 19 and test set is 1) then you can specify this with test_flag_value.
- output_files
  - pdb_out = build_one_model.pdb Output PDB file
  - log = build_one_model.log Output log file
  - params_out = build_one_model_params.eff Parameters file to rerun build_one_model
  - model_building_log = model_building.log Log file for model-building
- refinement
  - rs_refine = True You can run real-space refinement after model-building. NOTE: real-space refinement requires a source of FreeR_flag, and standard refinement requires Fobs, SigFobs and a source of FreeR_flag. For real-space refinement you can supply either an MTZ file with a FreeR_flag column or an MTZ map file that has all the FreeR reflections removed
  - use_all_in_refine = False You can use all your data in refinement (i.e., for cryo-EM data) if you want to. Not suitable for X-ray data.
  - macro_cycles = None Macro cycles in real-space refinement (None=default)
  - rs_refine_adp = False Refine with ADP in real-space refinement
  - real_space_refine_before_merge = False Run real-space refinement on partial models before merging
  - macro_cycles_in_refine_before_merge = 3 Macro cycles in real_space_refine_before_merge
- model_building
  - quick = False You can run quickly (superquick_build/delta_phi=30.) or more thoroughly (default, thorough_build/delta_phi=20.)
  - extend_only = True If input model is supplied, just extend it (do not build a new model from scratch).
  - extend = True If input model is supplied, extend it. (Set to False to not extend input model or helices_strands_only model).
  - helices_strands_only = None You can find just helices and strands and extend them. Same as insert_helices=True and include_strands=True. This is useful at resolutions of 3 A and worse.
  - insert_helices = None You can find helices and use them as a starting point for model-building. This is useful if your resolution is worse than 3 A. Same as helices_strands_only=True and include_strands=False.
  - include_strands = None You can find strands and use them as a starting point for model-building. This is useful if your resolution is worse than 3 A. Same as helices_strands_only=True and insert_helices=False.
  - build_rna_helices = None Use build_rna_helices tool to build RNA if chain_type=RNA
  - n_build_repeat = 0 Iterations of building
  - n_random_frag = None Tries of fragment extension
  - delta_phi = None Rotation angle for secondary structure search (set with quick/thorough to 30/20 degrees by default)
  - rna_rho_min_main_base = -1. starting minimum density at main-chain atoms for RNA building
  - rna_rho_min_main_low = -1. final minimum density at main-chain atoms for RNA building
  - rna_rho_min_main_seg = -1. minimum density at main-chain atoms in segment ID in RNA building
  - rna_rho_min_side_seg = -1. minimum density at side-chain atoms in segment ID in RNA building
  - cc_min = 0.40 Minimum map-model correlation to keep a segment after refinement
  - cc_min_rna = 0.25 Minimum map-model correlation (RNA building)
  - merge_with_combine_models = True Merge using parallel combine_models method. Can include standard merge. Alternative is merging with resolve.
  - merge_by_segment_correlation = True Merge using segment correlation if merge_with_combine_models is selected. Can be used along with merge_remainder=True/False and standard_merge=True/False
  - merge_remainder = True Merge remainder in merge_by_segment_correlation
  - standard_merge = True If merge_with_combine_models is set, merge by reading all chains and running resolve merging.
  - use_cc_in_combine_extend = False You can choose to use the correlation of density rather than density at atomic positions to score models in the merge_second_model or merge_both_models step. This may be useful at lower resolution (> 3 A)
- directories
  - temp_dir = "temp_dir" Optional temporary work directory
  - output_dir = "" Output directory where files are to be written
  - top_output_dir = "" Top output directory for control files
- crystal_info
  - resolution = 0. high-resolution limit for map calculation
  - solvent_fraction = None You can specify the solvent fraction. Normally it is set automatically
  - chain_type = *PROTEIN DNA RNA Chain type (for identifying main-chain and side-chain atoms)
  - scattering_table = *n_gaussian wk1995 it1992 electron neutron Choice of scattering table for structure factor calculations. Standard for X-ray is n_gaussian, for cryoEM is electron.
  - semet = False You can specify that your protein contains selenomethionine
- control
  - verbose = False Verbose output
  - raise_sorry = False Raise sorry if problems
  - debug = False Debugging output
  - dry_run = False Just read in and check parameter names
  - write_run_directory_to_file = None The working directory name is written to this file
  - multiprocessing = *multiprocessing sge lsf pbs condor pbspro slurm Choices are multiprocessing (single machine) or queuing systems
  - queue_run_command = None run command for queue jobs. For example qsub.
  - parallel_step = *final_assembly assembly peak_search all Last step to carry out in parallel in model-building. peak_search means find the secondary structure in parallel. assembly means find ss and extend in parallel. final_assembly means find ss, extend and create final model. all means do entire model-building nproc times.
  - nproc = 1 Number of processors to use (None is do not use multiprocessing)
  - i_ran_seed = 712341 Random seed for model-building
  - resolve_size = 12 Size for resolve
  - resolve_command_list = None You can supply any resolve command here NOTE: for command-line usage you need to enclose the whole set of commands in double quotes (") and each individual command in single quotes (') like this: resolve_command_list="'no_build' 'b_overall 23' "
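
The nested quoting described for resolve_command_list is easy to get wrong, so here is a hedged sketch of a complete command line. The resolve commands are simply the ones quoted in the keyword description above and the map labels are the FWT,PHFWT example from map_coeffs_labels; neither is a recommendation for a real run. The fallback that prints each argument on its own line when Phenix is not installed is an illustration only.

```shell
# The shell removes the outer double quotes; build_one_model then receives
# the inner single-quoted resolve commands as one argument.
set -- phenix.build_one_model map_coeffs.mtz sequence.dat \
  free_in=exptl_fobs_phases_freeR_flags.mtz \
  map_coeffs_labels="FWT,PHFWT" \
  resolve_command_list="'no_build' 'b_overall 23' "

# Run the command if Phenix is installed; otherwise print one argument per line
# so the effect of the quoting is visible.
if command -v "$1" >/dev/null 2>&1; then
  "$@"
else
  printf '%s\n' "$@"
fi
```

Printing one argument per line makes it clear that the whole resolve_command_list value, including the embedded single quotes and spaces, is passed as a single argument.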