Python-based Hierarchical ENvironment for Integrated Xtallography
Rapid phase improvement and model-building using phase_and_build
Author(s)
Purpose

phase_and_build is a new and rapid method for improving the quality of your map and building a model. The approach is to carry out an iterative process of building a model as rapidly as possible and using this model in density modification to improve the map. This approach is related to the older phenix.autobuild approach. The difference is that in phenix.autobuild much effort was spent on building the best possible model at each stage before carrying out density modification, while in phenix.phase_and_build speed of model-building is optimized. The result is that phenix.phase_and_build is 10 times faster than phenix.autobuild, yet it produces nearly as good a model in the end. The phenix.phase_and_build approach will also find NCS from your starting map and apply it during density modification.

Usage

How phase_and_build works:
Output files from phase_and_build

phase_and_build.pdb: A PDB file with the resulting model
phase_and_build_map_coeffs.mtz: An MTZ file with optimized phases

Parameters files in phase_and_build

When you run phenix.phase_and_build it will write out a phase_and_build_params.eff parameter file that can be used to re-run phenix.phase_and_build (just as for essentially all PHENIX methods). In addition, phenix.phase_and_build will write out the parameters files for the intermediate methods used as part of phenix.phase_and_build to the temporary directory used in building. You can run these with:

  phenix.find_ncs temp_dir/find_ncs_params.eff # runs NCS identification
  phenix.autobuild temp_dir/AutoBuild_run_1_/autobuild.eff # runs first cycle of density modification
  phenix.build_one_model temp_dir/build_one_model_params.eff # runs most recent model-building
  phenix.assign_sequence temp_dir/assign_sequence_params.eff # runs sequence assignment and filling short gaps
  phenix.fit_loops temp_dir/fit_loops_params.eff # runs loop fitting

This gives you control of all the steps in map improvement and model-building in addition to letting you run them all together with phenix.phase_and_build.

Examples

Standard run of phase_and_build:

Running phase_and_build is easy. From the command-line you can type:

  phenix.phase_and_build exptl_fobs_phases_freeR_flags.mtz sequence.dat

If you want to supply a file with anisotropy-corrected data to use in density modification you can do so:

  phenix.phase_and_build data=exptl_fobs_phases_freeR_flags.mtz \
    seq_file=sequence.dat \
    aniso_corrected_data=solve_1.mtz

where solve_1.mtz is anisotropy-corrected (the amplitudes are not measured amplitudes, but rather are corrected with an anisotropic B-factor), and exptl_fobs_phases_freeR_flags.mtz contains experimental amplitudes. These two files normally will contain the same phase information. (Usually these files will come from phenix.autosol.)
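If you drive phenix.phase_and_build from a script, the command line in the examples above can be assembled programmatically. The following is only a minimal Python sketch: the helper name phase_and_build_command is hypothetical (it is not part of PHENIX), and the only parameter names it uses are data, seq_file and aniso_corrected_data from this page. It prints the command rather than running it, so it does not need a PHENIX installation.

```python
import shlex


def phase_and_build_command(data, seq_file, **extra):
    """Assemble a phenix.phase_and_build command line as a list of arguments.

    Hypothetical helper for illustration; not part of PHENIX.
    """
    args = ["phenix.phase_and_build", f"data={data}", f"seq_file={seq_file}"]
    # Optional keyword=value parameters (for example aniso_corrected_data);
    # entries set to None are skipped so the program default is used.
    for name, value in extra.items():
        if value is not None:
            args.append(f"{name}={value}")
    return args


cmd = phase_and_build_command(
    "exptl_fobs_phases_freeR_flags.mtz",
    "sequence.dat",
    aniso_corrected_data="solve_1.mtz",
)
print(shlex.join(cmd))

# To actually run the job (requires PHENIX on your PATH), you could use:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Keeping the arguments as a list avoids shell-quoting problems when file names contain spaces.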
You can also add a starting model or a starting map to phenix.phase_and_build. This means that you can run it once, get a new model and map, then run it again to further improve your model and map.

Possible Problems

Specific limitations and problems:

phenix.phase_and_build does not have the full flexibility of phenix.autobuild, so you may want to get a nearly-complete model with phenix.phase_and_build and then use phenix.autobuild to increase the completeness and quality.

Literature

Additional information

List of all phase_and_build keywords
-------------------------------------------------------------------------------
Legend: black bold - scope names
black - parameter names
red - parameter values
blue - parameter help
blue bold - scope help
Parameter values:
* means selected parameter (where multiple choices are available)
False is No
True is Yes
None means not provided, not predefined, or left up to the program
"%3d" is a Python style formatting descriptor
-------------------------------------------------------------------------------
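As the legend notes, a parameter with multiple choices lists all of its options and marks the selected one with "*" (for example "*torsion cartesian None" for ncs_in_refinement). The following minimal Python sketch shows how such a value string can be read. The function name selected_choice is hypothetical and is not part of PHENIX; it only illustrates the notation used in the keyword list.

```python
def selected_choice(value):
    """Return the option marked with '*' in a multiple-choice value string.

    Returns None if no option is marked. Hypothetical helper for illustration.
    """
    for option in value.split():
        if option.startswith("*"):
            return option[1:]
    return None


print(selected_choice("*torsion cartesian None"))
print(selected_choice("*PROTEIN DNA RNA"))
```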
phase_and_build
input_files
data= None MTZ file containing FP SIGFP PHIB FOM HLA HLB HLC HLD
FreeR_flags Used as source of FP SIGFP freeR information in
refinement and as source of experimental phase information for
density modification. A suitable file is
exptl_fobs_phases_freeR_flags.mtz from autosol or autobuild NOTE:
This is a temporary requirement. You can also supply any other
format of file if the data columns can be identified automatically
labin= None Labin line for MTZ file with FreeR_flags. This is optional
if phase_and_build can guess the labels. Otherwise specify a line
like: FP=FP SIGFP=SIGFP PHIB=PHIB FOM=FOM HLA=HLA HLB=HLB HLC=HLC HLD=HLD
FreeR_flags=myFreeR_flags
aniso_corrected_data= None Optional MTZ file containing
anisotropy-corrected data with FP SIGFP PHIB FOM
HLA HLB HLC HLD Used as source of FP SIGFP
information for density modification. A suitable
file is solve_1.mtz or phaser_1.mtz If none
supplied, the mtz file specified as data will be
used.
labin_aniso_corrected_data= None Labin line for aniso_corrected data MTZ
file. This is optional if phase_and_build
can guess the labels
map_file_fom= None You can specify the FOM of the map_coeffs file
(useful in cases where the map file has only FWT PHFWT and
no FOM column). This FOM is used to set the default
smoothing radius for the density modification solvent
boundary.
map_file_is_density_modified= False You can specify that the
input_map_file has been density modified.
(This changes the assumptions on
statistics of the map.)
ha_file= None Heavy atom sites to be used to find NCS and to remove high
peaks of density in initial density modification
seq_file= None File with 1-letter code sequence of molecule. Chains
separated by blank line or greater-than sign
pdb_in= None Optional starting PDB file (ends will be extended if
present)
map_coeffs= None MTZ file with coefficients for a map
labin_map_coeffs= None Labin line for MTZ file with map coefficients.
This is optional if build_one_model can guess the
correct coefficients for FP PHI and FOM. Otherwise
specify: LABIN FP=myFP PHIB=myPHI FOM=myFOM where myFP
is your column label for FP
ncs_info_file= None ncs_spec file with NCS information (written by
simple_ncs_from_pdb or find_ncs)
remove_free NOTE remove_free params only used in build_one_model, not
phase_and_build
free_in= None MTZ file containing FreeR_flags NOTE free_in only used
in build_one_model. Ignored by phase_and_build Used as
source of freeR information for real_space refinement. Note
other columns of data may be present and can be used in
reciprocal-space refinement. A suitable file is
exptl_fobs_phases_freeR_flags.mtz from autosol/autobuild or
my_model_refine_data.mtz from phenix.refine
labin_free= None Labin line for MTZ file with FreeR_flags. This is
optional if build_one_model can guess the correct
coefficients for FreeR_flags. Otherwise specify:
FreeR_flags=myFreeR_flags
map_coeffs_no_free= None Optional MTZ file with coefficients for a
map with freeR set removed. Use instead of
free_in. This map will be used for real-space
refinement
labin_no_free= None Labin line for MTZ file with map coefficients and
freeR set removed. This is optional if build_one_model
can guess the correct coefficients for FP PHI and FOM.
Otherwise specify: LABIN FP=myFP PHIB=myPHI FOM=myFOM
where myFP is your column label for FP
output_files
mtz_out= 'phase_and_build_map_coeffs.mtz' Output MTZ file with map coeffs
pdb_out= build_one_model.pdb Output PDB file
log= build_one_model.log Output log file
params_out= phase_and_build_params.eff Parameters file to rerun
phase_and_build
job_title= None Job title in PHENIX GUI, not used on command line
cycles
ncycle= 2 Number of initial cycles of model-building, refinement and
density modification
nmodels= 1 Number of models to build with map from initial cycles
ncs
find_ncs= True Find NCS from input_ha_file or density or chains in the
model
update_ncs= True Update NCS as new information becomes available
use_ha_in_ncs= True Use ha_file as source of NCS information
optimize_ncs= True Try to map NCS operators close together
minimum_ncs_cc= None Minimum CC for NCS (default unless extreme denmod)
density_modification
truncate_ha_sites_in_resolve= True You can choose to truncate the
density near heavy-atom sites at a maximum
of 2.5 sigma. This is useful in cases
where the heavy-atom sites are very
strong, and rarely hurts in cases where
they are not. The heavy-atom sites are
specified with "ha_file"
use_hl_anom_in_denmod= False You can choose to use HL coefficients not
including model information (HLanom) in density
modification. They must be present in your data
file
use_hl_anom_in_denmod_with_model= False You can choose to use HL
coefficients not including model
information (HLanom) in density
modification when model information is
used. They must be present in your
data file
fom_for_extreme_dm= 0.35 If FOM of phasing is less than or equal to
fom_for_extreme_dm then defaults for density
modification become: mask_type=wang wang_radius=20
mask_cycles=1 minor_cycles=4
refinement
refine= True Refine with standard reciprocal-space refinement
refine_pdb_in= False Refine input model (if any) before using it
use_hl_anom_in_refinement= False You can choose to use HL coefficients
not including model information (HLanom) in
refinement. They must be present in your data
file
include_ha_in_refinement= True You can choose to include your heavy-atom
sites in the model for refinement. This is a
good idea if your structure includes these
heavy-atom sites (i.e., for SAD or MAD
structures where you are not using a native
dataset). Heavy-atom sites that overlap an
atom in your model will be ignored.
refine_se_occ= True You can choose to refine the occupancy of SE atoms
in a SEMET structure (default=True). This only applies if
semet=true
ordered_solvent= True You can add waters during refinement
flood_with_waters= False You can use the parameters file in
$PHENIX/phenix/phenix/autosol/flood.par to add lots
of waters during the phase improvement stage
macro_cycles= None You can set the number of macro_cycles in refinement
Default (None) will use phenix.refine default
add_free_r_if_needed= True If your input data file has no FreeR_flag
then it will be added
allow_overlapping= True You can allow atoms in your ligand files to
overlap atoms in your protein/nucleic acid model.
This overrides 'keep_pdb_atoms' Useful in early
stages of model-building and refinement The ligand
atoms get the altloc indicator 'L' NOTE: the ligand
occupancy gets refined by default. You can turn this
off with fix_ligand_occupancy=True
fix_ligand_occupancy= False If allow_overlapping=True then ligand
occupancies are refined as a group. You can turn
this off with fix_ligand_occupancy=true NOTE: has
no effect if allow_overlapping=False
skip_hexdigest= False You may wish to ignore the hexdigest of the free R
flags in your input PDB file if the dataset you provide
is not identical to the one that you refined with (but
has the same free R flags).
ncs_in_refinement= *torsion cartesian None Use torsion_angle refinement
of NCS. Alternative is cartesian or None (None will
use phenix.refine default)
correct_special_position_tolerance= None Adjust tolerance for special
position check. If 0., then check
for clashes near special positions
is not carried out. This sometimes
allows phenix.refine to continue
even if an atom is near a special
position. If 1., then checks within
1 A of special positions. If None,
then uses phenix.refine default. (1)
rs_refine= True You can run real-space refinement after model-building
NOTE: real-space refinement requires a source of FreeR_flag,
and standard refinement requires Fobs, SigFobs and a source of FreeR_flag.
For real-space refinement you can supply either an mtz file
with a FreeR_flag column or an mtz map file that has all the
FreeR reflections removed
model_building
fit_loops= True Include loop fitting in full model-building. At lower
resolution (3.5 A) it may be best to skip this step
trace_loops= False Use trace_loops algorithm in loop fitting
standard_loops= True Use standard_loops algorithm in loop fitting
loop_lib= False Use loop_lib algorithm in loop fitting
assign_sequence= True Include sequence assignment and short loop joining
in full model-building. At lower resolution (3.5 A) it
may be best to skip this step Only applicable for
chain_type=PROTEIN
min_percent_assigned_for_assign_sequence= 50 Skip assign_sequence if
initial percentage sequence
assigned is lower than
min_percent_assigned_for_assign_sequence
quick= False You can run quickly (superquick_build/delta_phi=30.) or
more thoroughly (default, thorough_build/delta_phi=20.)
insert_helices= False You can find helices and use them as a starting
point for model-building. This is useful if your
resolution is worse than 3 A.
i_ran_seed= 712341 Random seed for model-building
directories
temp_dir= "temp_dir" Optional temporary work directory
output_dir= "" Output directory where files are to be written
top_output_dir= "" Top output directory for control files
base_gui_dir= None Base output path for Phenix GUI only.
crystal_info
ncs_copies= none Number of NCS copies (defines solvent_fraction with
sequence) Normally determined automatically
resolution= 0. high-resolution limit for map calculation
solvent_fraction= None You can specify the solvent fraction Normally it
is set automatically
chain_type= *PROTEIN DNA RNA Chain type (for identifying main-chain and
side-chain atoms)
semet= False You can specify that your protein contains selenomethionine
control
verbose= False Verbose output
raise_sorry= False Raise sorry if problems
debug= False Debugging output
dry_run= False Just read in and check parameter names
write_run_directory_to_file= None The working directory name is written
to this file
resolve_command_list= None You can supply any resolve command here NOTE:
for command-line usage you need to enclose the
whole set of commands in double quotes (")
and each individual command in single quotes (')
like this: resolve_command_list="'no_build'
'b_overall 23' "
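The nested quoting required by resolve_command_list is easy to get wrong by hand. The following minimal Python sketch builds the argument from a list of resolve commands, following the rule stated above: each command in single quotes, the whole set in double quotes. The helper name format_resolve_command_list is hypothetical and is not part of PHENIX. The sketch assumes the trailing space before the closing double quote in the example above is not significant, and it rejects commands that themselves contain quote characters.

```python
def format_resolve_command_list(commands):
    """Format resolve commands for pasting into a shell command line.

    Each command is wrapped in single quotes and the whole set in double
    quotes. Hypothetical helper for illustration; not part of PHENIX.
    """
    for command in commands:
        if "'" in command or '"' in command:
            raise ValueError(
                f"quotes are not supported in a resolve command: {command!r}"
            )
    inner = " ".join(f"'{command}'" for command in commands)
    return f'resolve_command_list="{inner}"'


arg = format_resolve_command_list(["no_build", "b_overall 23"])
print(arg)
```

The resulting string is meant to be pasted into a shell command line, where the shell removes the outer double quotes before the program sees the argument.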