Automated Model Building and Rebuilding using AutoBuild


Author(s)
Purpose: Purpose of the AutoBuild Wizard
Usage: How the AutoBuild Wizard works; Automation and user control; Core modules in the AutoBuild Wizard; What the AutoBuild wizard needs to run; ...and optional files; Anisotropy correction and B-factor sharpening; Specifying which columns of data to use from input data files; Specifying other general parameters; Running from a parameters file; Picking waters in AutoBuild; Keeping waters from your input file in AutoBuild; Twinning and AutoBuild; Specifying phenix.refine parameters; Specifying resolve/resolve_pattern parameters; Including ligand coordinates in AutoBuild; Specifying arbitrary commands and cif files for phenix.refine; Output files from AutoBuild; Standard building, rebuild_in_place, and multiple-models; Parallel jobs, nproc, nbatch, number_of_parallel_models and how AutoBuild works in parallel; Model editing during rebuilding with the Coot-PHENIX interface; Resolution limits in AutoBuild
Sample AutoBuild Commands: Run AutoBuild beginning with experimental data; Run AutoBuild beginning with a model and rebuild in place; Add more residues to a model or rebuild a model; Run AutoBuild automatically after AutoSol; Merge in hires data; Truncate density at heavy-atom sites; Skip NCS in model_building and refinement; Make a SA-omit map around atoms in target.pdb; Make a simple composite omit map; Make a SA composite omit map; Combine composite OMIT files from a set of parallel runs on different computers; Make an iterative-build omit map around atoms in target.pdb; Make a sa-omit map around residues 3 and 4 in chain A of coords.pdb; Create one very good rebuilt model; Touch up a model; Remove the worst-fitting residues from a model; Create 20 very good rebuilt models that are as different as possible; Combining files from a nearly-complete autobuild run with rebuild-in-place=true; Morph an MR model and rebuild it; Build an RNA chain; Build a DNA chain; Density-modify with or without a model and make maps; Density-modify starting with your map coefficients and make maps; Calculate a prime-and-switch map
Possible Problems: General Limitations; Specific limitations and problems
Literature
Additional information: List of all AutoBuild keywords

Author(s)

AutoBuild Wizard: Tom Terwilliger
PHENIX GUI: Nathaniel Echols
phenix.refine: Ralf W. Grosse-Kunstleve, Peter Zwart and Paul D. Adams
RESOLVE: Tom Terwilliger
phenix.xtriage: Peter Zwart

Purpose

Purpose of the AutoBuild Wizard

The purpose of the AutoBuild Wizard is to provide a highly automated system for model rebuilding and completion. The Wizard design allows the user to specify data files and parameters through an interactive GUI, or alternatively through a parameters file. The AutoBuild Wizard begins with datafiles with structure factor amplitudes and uncertainties, along with either experimental phase information or a starting model, carries out cycles of model-building and refinement alternating with model-based density modification, and produces a relatively complete atomic model.

The AutoBuild Wizard uses RESOLVE, xtriage and phenix.refine to build an atomic model, refine it, and improve it with iterative density modification, refinement, and model-building.

The Wizard begins with either experimental phases (e.g., from AutoSol) or with an atomic model that can be used to generate calculated phases. The AutoBuild Wizard produces a refined model that can be nearly complete if the data are strong and the resolution is about 2.5 A or better. At lower resolutions (2.5 - 3 A) the model may be less complete, and at resolutions lower than 3 A the model may be quite incomplete and not well refined.

The AutoBuild Wizard can be used to generate OMIT maps (simple omit, SA-omit, iterative-build omit) that can cover the entire unit cell or specific residues in a PDB file.

The AutoBuild Wizard can generate a set of models compatible with experimental data (multiple_models).

Usage

The AutoBuild Wizard can be run from the PHENIX GUI, from the command-line, and from parameters files. All three versions are identical except in the way that they take commands from the user. See Using the PHENIX Wizards for details of how to run a Wizard. The command-line version will be described here.

How the AutoBuild Wizard works

The AutoBuild Wizard begins with experimental structure factor amplitudes, along with either experimental or model-based estimates of crystallographic phases. The phase information is improved by using statistical density modification to improve the correlation of NCS-related density in the map (if present) and to improve the match of the distribution of electron densities in the map with those expected from a model map. This improved map is then used to build and refine an atomic model.

In subsequent cycles, the models from previous cycles are used as a source of phase information in statistical density modification, iteratively improving the quality of the map used for model-building.

Additionally, during the first few cycles additional phase information is obtained by detecting and enhancing (1) the presence of commonly-found local patterns of density in the map, and (2) the presence of density in the shape of helices and strands. The final model obtained is analyzed for residue-based map correlation and density at the coordinates of individual atoms, and an analysis including a summary of atoms and residues that are in strong, moderate, or weak density and out of density is provided.

Automation and user control

The AutoBuild Wizard has been designed for ease of use combined with maximal user control, with as many parameters set automatically by the Wizard as possible, but maintaining parameters accessible to the user through a GUI and through parameters files. The Wizard uses the input/output routines of the cctbx library, allowing data files of many different formats so that the user does not have to convert their data to any particular format before using the Wizard. Use of the phenix.refine refinement package in the AutoBuild Wizard allows a high degree of automation of refinement so that neither the user nor the Wizard is required to specify parameters for refinement. The phenix.refine package automatically includes a bulk solvent model and automatically places solvent molecules.

Core modules in the AutoBuild Wizard

The five core modules in the AutoBuild Wizard are:

(1) building a new model into an electron density map,
(2) rebuilding an existing model,
(3) refinement,
(4) iterative model-building beginning from experimental phase information, and
(5) iterative model-building beginning from a model.

The standard procedures available in the AutoBuild Wizard that are based on these modules include:

(a) model-building and completion starting from experimental phases,
(b) rebuilding a model from scratch, with or without experimental phase information, and
(c) rebuilding a model in place, maintaining connectivity and sequence register.

Starting from a set of experimental phases and structure factor amplitudes, normally procedure (a) is carried out, and then the resulting model is rebuilt with procedure (b).

Starting from a model (e.g., from molecular replacement) and experimental structure factor amplitudes, procedure (c) is by default carried out if the starting model differs less than about 50% in sequence from the desired model, and otherwise procedure (b) is used. It is generally a good idea to specify which you want to happen using the keyword "rebuild_in_place=True" (to keep the basic input model) or "rebuild_in_place=False" (to build a new model).

What the AutoBuild wizard needs to run

(1) a data file, optionally with phases and HL coeffs and freeR flag (w1.sca or data=w1.sca)
(2) a sequence file (seq.dat or seq_file=seq.dat) or a model (coords.pdb or model=coords.pdb)

...and optional files

(3) coefficients for a starting map (map_file=resolve.mtz)
(4) a file for refinement (refinement_file=exptl_fobs_freeR_flags.mtz)
(5) a high-resolution datafile (hires_file=high_res.sca)

Anisotropy correction and B-factor sharpening

The AutoBuild Wizard will apply an anisotropy correction and B-factor sharpening to all the raw experimental data by default (controlled by the keyword remove_aniso=True). The target overall Wilson B factor can be set with the keyword b_iso, as in b_iso=25. By default the target Wilson B will be 10 times the resolution of the data (e.g., if the resolution is 3 A then b_iso=30), or the actual Wilson B of the data, whichever is lower.

If an anisotropy correction is applied then the entire AutoBuild run will be carried out with anisotropy-corrected and sharpened data. At the very end of the run the final model will be re-refined against the uncorrected refinement data and this re-refined model and the uncorrected refinement data (with freeR flags) will be written out as overall_best.pdb and overall_best_refine_data.mtz.
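
The default target B described in this section can be written as a one-line rule. The following is a minimal sketch in Python; the function name is hypothetical and is not part of PHENIX, and the code only restates the rule given above.

```python
def default_target_b_iso(resolution, wilson_b):
    """Return the default sharpening target described in the text.

    The target is 10 times the resolution (in A) or the Wilson B of the
    data, whichever is lower. Hypothetical helper, not a PHENIX function.
    """
    return min(10.0 * resolution, wilson_b)


# Data to 3 A with a Wilson B of 45: the target is limited to 10 * 3.
print(default_target_b_iso(3.0, 45.0))
# Data to 3 A with a Wilson B of 22.5: the Wilson B is already lower.
print(default_target_b_iso(3.0, 22.5))
```

If you want a different target, set b_iso explicitly as shown above.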

Specifying which columns of data to use from input data files

If one or more of your data files has column names that the Wizard cannot identify automatically, you can specify them yourself. You will need to provide one column "name" for each expected column of data, with "None" for anything that is missing.

For example, if your data file ref.mtz has columns FP SIGFP and FreeR then you might specify

refinement_file=ref.mtz
input_refinement_labels="FP SIGFP None None None None None None FreeR"

The keywords for labels and anticipated input labels (program labels) are:

input_labels (for data file): FP SIGFP PHIB FOM HLA HLB HLC HLD FreeR_flag
input_refinement_labels: FP SIGFP FreeR_flag
input_map_labels: FP PHIB FOM
input_hires_labels: FP SIGFP FreeR_flag
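
Writing a label string by hand is error-prone because every expected slot needs an entry. The sketch below, in Python, fills the nine slots listed for input_labels and writes "None" for anything missing, in the same way as the ref.mtz example above. The helper name and the dictionary layout are assumptions made for illustration; they are not part of PHENIX.

```python
# Program labels in the order listed above for input_labels.
INPUT_LABELS = ["FP", "SIGFP", "PHIB", "FOM",
                "HLA", "HLB", "HLC", "HLD", "FreeR_flag"]


def label_string(columns, program_labels=INPUT_LABELS):
    """Build a label string with "None" for every missing column.

    columns maps a program label (e.g. "FreeR_flag") to the column name
    actually present in your file (e.g. "FreeR"). Hypothetical helper.
    """
    unknown = set(columns) - set(program_labels)
    if unknown:
        raise ValueError("unexpected program labels: %s" % sorted(unknown))
    return " ".join(columns.get(label, "None") for label in program_labels)


labels = label_string({"FP": "FP", "SIGFP": "SIGFP", "FreeR_flag": "FreeR"})
print('input_refinement_labels="%s"' % labels)
```

The printed keyword can be pasted onto the command line or into a parameters file.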

You can find out all the possible label strings in a data file that you might use by typing:

phenix.autosol display_labels=w1.mtz  # display all labels for w1.mtz

NOTES: if your data files contain a mixture of amplitude and intensity data then only the amplitude data is available. If you have only intensity data in a data file and want to select specific columns, then you need to specify the column names as they are after importing the data and conversion to amplitudes (see below under General Limitations for details).

Specifying other general parameters

You can specify many more parameters as well. See the list of keywords, defaults and descriptions at the end of this page and also general information about running Wizards at Using the PHENIX Wizards for how to do this. Some of the most common parameters are:

data=w1.sca       # data file
model=coords.pdb  # starting model
rebuild_in_place=true # rebuild input model in place
rebuild_in_place=false # build a new model; add or subtract residues 
                       #   from input model as necessary
seq_file=seq.dat  # sequence file
map_file=map_coeffs.mtz # coefficients for a starting map for building
resolution=3     # dmin of 3 A
s_annealing=True  # use simulated annealing refinement at start of each cycle
n_cycle_build_max=5  # max number of build cycles (starting from experimental phases)
n_cycle_rebuild_max=5  # max number of rebuild cycles (starting from a model)

Running from a parameters file

You can run phenix.autobuild from a parameters file. This is often convenient because you can generate a default one with:

phenix.autobuild --show_defaults > my_autobuild.eff

and then you can just edit this file to match your needs and run it with:

phenix.autobuild  my_autobuild.eff

Picking waters in AutoBuild

By default AutoBuild will instruct phenix.refine to pick waters using its standard procedure. This means that if the resolution of the data is high enough (typically 3 A or better) then waters are placed.

You can tell AutoBuild not to have phenix.refine pick waters with the command:

place_waters=False

If you want to place waters at a lower resolution, you will need to reset the low-resolution cutoff for placing waters in phenix.refine. You would do that in a "refinement_params.eff" file containing lines like these (see below for passing parameters to phenix.refine with an ".eff" file):

refinement {
  ordered_solvent {
    low_resolution = 2.8
  }
}
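
Whether waters are expected in a given run follows from the place_waters keyword and the low-resolution cutoff, as stated in the keyword table later on this page (ordered_solvent_low_resolution=3.0 by default; no ordered solvent if the refinement resolution is larger than the cutoff). The following Python sketch only restates that rule; the function is hypothetical and does not call PHENIX.

```python
def waters_will_be_placed(refinement_resolution, place_waters=True,
                          ordered_solvent_low_resolution=3.0):
    """Return True if ordered solvent is expected to be added.

    Waters are skipped when place_waters is False, or when the resolution
    used for refinement (in A) is larger than the low-resolution cutoff.
    Hypothetical helper, not a PHENIX function.
    """
    if not place_waters:
        return False
    return refinement_resolution <= ordered_solvent_low_resolution


print(waters_will_be_placed(2.5))   # within the default 3.0 A cutoff
print(waters_will_be_placed(3.5))   # beyond the default cutoff
print(waters_will_be_placed(3.5, ordered_solvent_low_resolution=4.0))
```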

Keeping waters from your input file in AutoBuild

You can tell AutoBuild to keep the waters in your input file when you are using rebuild_in_place (the default is to toss them and replace them with new ones). You can say,

keep_input_waters=True
place_waters=No

NOTE: If you specify keep_input_waters=True you should also specify either "place_waters=No" or "keep_pdb_atoms=No" . This is because if place_waters=Yes and keep_pdb_atoms=Yes then phenix.refine will add waters and then the wizard will keep the new waters from the new PDB file created by phenix.refine preferentially over the ones in your input file.
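
The combination described in the NOTE can be checked before a run is started. This is a small Python sketch of that check; the function name and the message text are made up for illustration, and all three values must be supplied by the caller because the code does not read any PHENIX defaults.

```python
def check_water_keywords(keep_input_waters, place_waters, keep_pdb_atoms):
    """Return a warning string for the conflicting combination, else None.

    Per the NOTE above, keep_input_waters=True should be paired with
    place_waters=No or keep_pdb_atoms=No. Hypothetical helper.
    """
    if keep_input_waters and place_waters and keep_pdb_atoms:
        return ("keep_input_waters=True with place_waters=Yes and "
                "keep_pdb_atoms=Yes: newly placed waters would be kept "
                "in preference to the waters in the input file")
    return None


print(check_water_keywords(True, place_waters=True, keep_pdb_atoms=True))
print(check_water_keywords(True, place_waters=False, keep_pdb_atoms=True))
```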

Twinning and AutoBuild

AutoBuild does not know about twinning, but you can incorporate a twin law into the refinement steps in the AutoBuild procedure if your crystal is twinned. Use phenix.xtriage to identify twinning and the twin law. Then specify the twin law in a parameters file (see next section) and provide that to AutoBuild with a keyword such as "refine_eff_file=twin_law.eff".

You may also want to try using the keyword "two_fofc_in_rebuild" which will use the 2Fo-Fc map from phenix.refine in model-building.

Specifying phenix.refine parameters

You can control phenix.refine parameters that are not specified directly by AutoBuild using a refinement parameters (.eff) file:

refine_eff_file=refinement_params.eff    # set any phenix.refine params not set by AutoBuild

This file might contain a twin-law for refinement:

refinement {
  twinning {
    twin_law = "-k, -h, -l"
  }
}

You can put any phenix.refine parameters in this file, but a few parameters that are set directly by AutoBuild override your inputs from the refine_eff_file. These parameters are listed below.

Refinement parameters that must be set using AutoBuild Wizard keywords (overwriting any values provided by user in input_eff_file)

| phenix.refine keyword | Wizard keyword(s) and notes |
|---|---|
| refinement.main.number_of_macro_cycles | ncycle_refine |
| refinement.main.simulated_annealing | s_annealing (only applies to 1st refinement in rebuild. SA in any other refinements controlled by input_eff_file, if any) |
| refinement.ncs.find_automatically | refine_with_ncs=True turns on automatic ncs search |
| refinement.main.ncs | refine_with_ncs=True turns on ncs |
| refinement.ncs.coordinate_sigma | Normally not set by Wizard. However if the Wizard keyword ncs_refine_coord_sigma_from_rmsd is True then the ncs coordinate sigma is equal to ncs_refine_coord_sigma_from_rmsd_ratio times the rmsd among ncs copies |
| refinement.main.random_seed | i_ran_seed sets the random seed at the beginning of a Wizard... this affects refinement.main.random_seed but does not set it to the value of i_ran_seed (because i_ran_seed gets updated by several different routines) |
| refinement.main.ordered_solvent | place_waters=True will set ordered_solvent to True. Note that this only has an effect if the value of the resolution cutoff for adding waters (refinement.ordered_solvent.low_resolution) is higher than the resolution used for refinement. |
| refinement.main.ordered_solvent | place_waters_in_combine=True will set ordered_solvent to True, only applying this to the final combination step of multiple-model generation. Note that this only has an effect if the value of the resolution cutoff for adding waters (refinement.ordered_solvent.low_resolution) is higher than the resolution used for refinement. |
| refinement.ordered_solvent.low_resolution | ordered_solvent_low_resolution=3.0 (default) will set the resolution cutoff for adding waters (refinement.ordered_solvent.low_resolution) to 3 A. If the resolution used for refinement is larger than the value of ordered_solvent_low_resolution then ordered solvent is not added. |
| refinement.main.use_experimental_phases | use_mlhl=True will set refinement.main.use_experimental_phases to True |
| refinement.refine.strategy | The Wizard keywords refine refine_b refine_xyz all affect refinement.refine.strategy. If refine=True then refinement is carried out. If refine_b=True (default) isotropic displacement factors are refined. If refine_xyz=True (default) coordinates are refined. |
| refinement.main.occupancy_max | max_occ=1.0 sets the value of refinement.main.occupancy_max to 1.0. Default is to do nothing and use the default from phenix.refine (1.0) |
| refinement.refine.occupancies.individual | The combination of Wizard keywords of semet=True and refine_se_occ=True will add "(name SE)" to the value of refinement.refine.occupancies.individual. You can add to your .eff file other names of atoms to have occupancies refined as well. |
| refinement.main.high_resolution | Either of the Wizard keywords refinement_resolution and resolution will set the value of refinement.main.high_resolution, with refinement_resolution being used if available. |
| refinement.pdb_interpretation.link_distance_cutoff | link_distance_cutoff |

The following parameters controlling phenix.refine output are set directly in AutoBuild and cannot be set by the user:

refinement.output.write_eff_file
refinement.output.write_geo_file
refinement.output.write_def_file
refinement.output.write_maps
refinement.output.write_map_coefficients

Specifying resolve/resolve_pattern parameters

Similarly, you can control resolve and resolve_pattern parameters. For these parameters, your inputs will not be overridden by AutoBuild. The format is a little tricky: you have to put two sets of quotes around the command like this:

resolve_command="'resolution 200 3'"    # NOTE ' and " quotes

This will put the text

resolution 200 3

at the end of every temporary command file created to run resolve. (This is why it is not overridden by AutoBuild commands; they will all come before your commands in the resolve command file.) Note that some commands in resolve may be incompatible with this usage.
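
The two sets of quotes are easier to understand by looking at what the shell hands to the program. The Python sketch below uses the standard-library shlex module to split the command-line form the way a POSIX shell would: the outer double quotes are removed and the inner single quotes stay attached to the text. The helper function is hypothetical; it only builds the argument string and makes no claim about how AutoBuild parses it internally.

```python
import shlex


def resolve_command_arg(text):
    """Return the single argument the shell delivers for resolve_command.

    The returned string keeps the inner single quotes, so it can be used
    as one element of an argument list. Hypothetical helper.
    """
    if "'" in text or '"' in text:
        raise ValueError("quotes inside the resolve command are not handled")
    return "resolve_command='%s'" % text


# The form typed at the shell prompt, including both sets of quotes.
typed = 'resolve_command="\'resolution 200 3\'"'
print(shlex.split(typed))
print(resolve_command_arg("resolution 200 3"))
```

If you start AutoBuild from a script with an argument list instead of a shell string, only the inner single quotes need to be written, because no shell removes the outer pair.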

Including ligand coordinates in AutoBuild

If your input PDB file contains ligands (anything other than solvent that is not protein if your chain_type=PROTEIN, for example) then by default these ligands will be kept, used in refinement, and written out to your output PDB file. Any solvent molecules will by default be discarded. You can change this behavior by changing the keywords from these defaults:

keep_input_ligands=True
keep_input_waters=False

The AutoBuild Wizard will use phenix.elbow to generate geometries for any ligands that are not recognized.

You can also tell AutoBuild to add the contents of any PDB files that you wish to supply to the current version of the structure just before refinement, so all the refined models produced contain whatever AutoBuild has built, plus the contents of these PDB files. This can be done through the GUI, the command-line, or a parameters file. In the command-line version you do this with:

input_lig_file_list=my_ligand.pdb

NOTE: The files in input_lig_file_list will be edited to make them all HETATM records to tell AutoBuild to ignore these residues in rebuilding.

NOTE: You may need to tell phenix.refine about the geometry of your ligands. You will get an error message if the ligand is not recognized and an automatic run of phenix.elbow does not succeed in generating your ligand. In that case you will want to run phenix.elbow to create a cif definition file for this ligand:

phenix.elbow my_ligand.pdb --id=LIG

where LIG is the 3-letter ID code that you use in my_ligand.pdb to identify your ligand. If the automatic run does not work you may need to give phenix.elbow additional information to generate your ligand.

Once phenix.elbow has generated your ligand you can use the keyword "cif_def_file_list" to tell AutoBuild about this ligand:

cif_def_file_list=elbow.LIG.my_ligand.pdb.cif

Specifying arbitrary commands and cif files for phenix.refine

You can tell AutoBuild to apply any set of cif definitions to the model during refinement by using a combination of specification files and the commands cif_def_file_list and refine_eff_file_list:

refine_eff_file_list=link.eff cif_def_file_list=link.cif

This example comes from the phenix.refine manual page in which a link is specified in a cif definition file link.cif:

 data_mod_5pho
#
loop_
_chem_mod_atom.mod_id
_chem_mod_atom.function
_chem_mod_atom.atom_id
_chem_mod_atom.new_atom_id
_chem_mod_atom.new_type_symbol
_chem_mod_atom.new_type_energy
_chem_mod_atom.new_partial_charge
 5pho     add      .      O5T    O    OH      .
loop_
_chem_mod_bond.mod_id
_chem_mod_bond.function
_chem_mod_bond.atom_id_1
_chem_mod_bond.atom_id_2
_chem_mod_bond.new_type
_chem_mod_bond.new_value_dist
_chem_mod_bond.new_value_dist_esd
 5pho     add      O5T     P         coval        1.520    0.020

and this is applied with a parameters file link.eff:

 refinement.pdb_interpretation.apply_cif_modification
{
  data_mod = 5pho
  residue_selection = resname GUA and name O5T
}

You can have any number of cif files and parameters files.

Output files from AutoBuild

When you run AutoBuild the output files will be in a subdirectory with your run number:

AutoBuild_run_1_/   # subdirectory with results
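
Because each run gets its own numbered subdirectory, scripts that post-process results usually need to find the most recent one. This is a minimal Python sketch that assumes only the AutoBuild_run_N_ naming shown above; the function name is hypothetical and not part of PHENIX.

```python
import re
from pathlib import Path

# Matches directory names such as AutoBuild_run_1_ or AutoBuild_run_12_.
_RUN_DIR = re.compile(r"^AutoBuild_run_(\d+)_$")


def latest_autobuild_run(workdir="."):
    """Return the AutoBuild_run_N_ directory with the highest N, or None."""
    best = None
    for entry in Path(workdir).iterdir():
        match = _RUN_DIR.match(entry.name)
        if entry.is_dir() and match:
            number = int(match.group(1))
            if best is None or number > best[0]:
                best = (number, entry)
    return None if best is None else best[1]


run_dir = latest_autobuild_run(".")
if run_dir is not None:
    print(run_dir / "overall_best.pdb")
```

The run number is compared as an integer, so AutoBuild_run_10_ sorts after AutoBuild_run_2_.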

The key output files that are produced are:

A summary file listing the results of the run and the other files produced:
```
AutoBuild_summary.dat  # overall summary
```
A log file describing the entire run:
```
AutoBuild_run_1_1.log # overall log file
```
A warnings file listing any warnings about the run
```
AutoBuild_warnings.dat  # any warnings
```
Final refined model
```
overall_best.pdb
```
NOTE: The "working_best.pdb" file is the current working best model. If an anisotropy correction and sharpening are applied (remove_aniso=True) then working_best.pdb will be refined against the corrected data. At the end of the run the last working_best.pdb will be re-refined against the original data (overall B refined only) and written out as overall_best.pdb.
Final map coefficients used to build refined model. Use FWT PHWT in maps. Normally this is a density-modified map from resolve.
```
overall_best_denmod_map_coeffs.mtz
```
sigmaA-weighted 2mFo-DFc and Fo-Fc map coefficients from phenix.refine based on the last working_best.pdb model. These map coefficients will be sharpened and anisotropy-corrected if remove_aniso=True. (The file working_best.pdb is the same as overall_best.pdb, except it is refined against sharpened, anisotropy-corrected data if remove_aniso=True). The map coefficients are 2FOFCWT PH2FOFCWT for the 2mFo-DFc map and FOFC and PHFOFC for the Fo-Fc difference map. These map coefficients are filled (missing reflections are given Fc values).
```
overall_best_refine_map_coeffs.mtz
```
MTZ file with FP, phases and HL coeffs if present, and freeR_flags for refinement
```
overall_best_refine_data.mtz
```
NOTE: The labels for this mtz file are typically:
```
 FP SIGFP PHIM FOMM HLAM HLBM HLCM HLDM FreeR_flag
```
The file overall_best_refine_data.mtz (identical to the file exptl_fobs_phases_freeR_flags.mtz) has a copy of the (experimental) HL coefficients that were input to autobuild. The labels HLAM HLBM etc have the ending "M" because they were copied by resolve and it outputs these labels...but in fact they are not density modified phases from autobuild, just copied straight from the input data file.
Final log file for model-building
```
overall_best.log
```
Final log file for refinement
```
overall_best.log_refine
```
Evaluation of fit of model to map
```
overall_best.log_eval
```
Summary of NCS information
```
overall_best_ncs_info.ncs
```

Standard building, rebuild_in_place, and multiple-models

The AutoBuild Wizard has two overall methods for building a model. The first method (standard build) is to build a model from scratch. This involves identification of where helices (and strands, for proteins) are located, extension using fragment libraries, connection of segments, identification of side-chains, and sequence alignment. These methods are augmented in the standard building procedure by loop-fitting and by building the model outside of the region that has already been built.

The second method (rebuild_in_place) takes an existing model and rebuilds it without adding or deleting any residues and without changing the connectivity of the chain. The way this works is that a segment of the model is deleted and then filled in again by rebuilding from the remaining ends. This is repeated for overlapping segments covering the entire model.

NOTE: If you are using rebuild_in_place then your model must be quite similar to your sequence file, and in particular the model must not extend in the N-terminal direction beyond your sequence file. Minor edits (amino acid replacements) will be done automatically.

The multiple-models approach really has two levels of multiple models. At the first level, several (multiple_models_group_number, default is number_of_parallel_models) models are built (using rebuild_in_place) and are then recombined into a single good model. At the next level, this whole process may be done more than once (multiple_models_number times), yielding several very good models. By default, if you ask for rebuild_in_place, then you will get a single very good model, created by running rebuild_in_place several times and recombining the models.

Parallel jobs, nproc, nbatch, number_of_parallel_models and how AutoBuild works in parallel

The AutoBuild Wizard is set up to take advantage of multi-processor machines or batch queues by splitting the work into separate tasks. See Tutorial 4: Iterative model-building, density modification and refinement starting from experimental phases and Tutorial 6: Automatically rebuilding a structure solved by Molecular Replacement for a description of the method used by the AutoBuild Wizard to run build jobs as sub-processes and to combine the results into single models. Here are the key factors that determine how splitting model-building into batches and running them on one or more processors works:

nbatch is the number of batches of work. As long as nbatch is fixed then the results of running the Wizard will be the same, no matter how many processors are used. Normally you will not need to adjust it.
nproc is the number of processors to split the work among
number_of_parallel_models is the number of models to build at once. The default is to set number_of_parallel_models=nbatch. This affects both standard building (number_of_parallel_models sets how many initial models to build) and rebuild_in_place (number_of_parallel_models determines whether a single model is built or a set of models are built and recombined into a single model).

Phenix.autobuild is set up so that you can specify the number of processors (nproc). Here is how to choose how to set it:

If you are using rebuild_in_place=False, then use nproc=4. (Any more will not make any difference.)
If you are using rebuild_in_place=True, then use nproc=5. (Again, any more will not make any difference.)
If you are calculating an omit map, then use nproc=5 * number of omit regions (i.e., up to 100 or more, depending on how many processors you have)

Additionally you will want to set two more parameters:

run_command="command you use to submit a job to your system"
background=False   # probably false if this is a cluster, true if this is a multiprocessor machine

If you have a queueing system with 20 nodes, then you probably submit jobs with something like "qsub -someflags myjob.sh", where someflags are whatever flags you use (or just "qsub myjob.sh" if no flags). Then you might use

 run_command="qsub -someflags"  background=False nproc=20

or, if you use no flags,

 run_command="qsub"  background=False nproc=20

If you have a 20-processor machine instead, then you might say

 run_command=sh  background=True nproc=20

so that it would run your jobs with sh on your machine, and run them all in the background (i.e., all at one time).
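
The guidance in this section can be collected into one helper. The Python sketch below applies the nproc suggestions listed above (4 for rebuild_in_place=False, 5 for rebuild_in_place=True, 5 per omit region for omit maps) and picks run_command and background for a cluster or a multiprocessor machine. The function and its argument names are hypothetical, "qsub" is only the example submit command used above, and the result is a suggestion rather than a PHENIX default.

```python
def parallel_keywords(rebuild_in_place, n_omit_regions=0, on_cluster=False,
                      submit_command="qsub"):
    """Suggest nproc, run_command and background following the text.

    Hypothetical helper that restates the rules of thumb in this section.
    """
    if n_omit_regions > 0:
        nproc = 5 * n_omit_regions      # omit map: 5 per omit region
    elif rebuild_in_place:
        nproc = 5
    else:
        nproc = 4
    if on_cluster:
        # Jobs go through the queue and are not run in the background.
        run_command, background = submit_command, False
    else:
        # Multiprocessor machine: run with sh, all in the background.
        run_command, background = "sh", True
    return {"nproc": nproc, "run_command": run_command,
            "background": background}


print(parallel_keywords(rebuild_in_place=True))
print(parallel_keywords(rebuild_in_place=False, on_cluster=True))
```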

Model editing during rebuilding with the Coot-PHENIX interface

The AutoBuild Wizard allows you to edit a model and give it back to the Wizard during the iterative model-building, density modification and refinement process. The Wizard will consider the model that you give it along with the models that it generates automatically, and will choose the parts of your model that fit the density better than other models. You can edit a model using the PHENIX-Coot interface, which is accessible via the command-line. When a model has been produced by the AutoSol Wizard, you can open a new window and type:

phenix.autobuild coot

which will start Coot with your current map and model. When Coot has been loaded, your map and model will be displayed along with a PHENIX-Coot Interface window. You can edit your model and then save it, giving it back to PHENIX with the button labelled something like Save model as COMM/overall_best_coot_7.pdb. This button creates the indicated file and also tells PHENIX to look for this file and to try and include the contents of the model in the building process.

The precise use of the model that you save depends on the type of model-building that is being carried out by the AutoBuild Wizard. If you are using rebuild_in_place then the main-chain and side-chains of the model are considered as replacements for the current working model. Any ligands or unrecognized residues are (by default) not rebuilt but are included in refinement. By default, solvent in the model is ignored. If you are not using rebuild_in_place, only the main-chain conformation is considered, and the side-chains are ignored. Ligands (but not solvent) in the model are (by default) kept and included in refinement.

As the AutoBuild Wizard continues to build new models and create new maps, you can update in the PHENIX-Coot Interface to the current best model and map with the button Update with current files from PHENIX.

Resolution limits in AutoBuild

There are several resolution limits used in AutoBuild. You can leave them all to default, or you can set any of them individually. Here is a list of these limits and how their default values are set:

| Name | Description | How default value is set |
|---|---|---|
| resolution | Overall resolution. Used as high-resolution limit for density modification. Used as default for refinement resolution and model-building resolution if they are not set. | Resolution of input datafile. If a hires datafile is provided, the resolution of that data is used. |
| refinement_resolution | Resolution for refinement | value of "resolution" |
| resolution_build | Resolution for model-building | value of "resolution" |
| overall_resolution | Resolution to truncate all data. This should only be used if you need to truncate the data in order to get the Wizard to run. It causes the Wizard to ignore all data at higher resolution than overall_resolution. It is normally better to use the resolution keyword to define the resolution limits, as that will keep all the data in the output and working files. | None |
| multiple_models_starting_resolution | Resolution for the initial rebuilding of a model in the multiple-models procedure. Normally a low resolution to generate diversity. | 4 A by default |
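
The way the defaults fall back on "resolution" can be sketched in a few lines of Python. The function below is hypothetical and covers only the three limits whose defaults are tied together in the list above; it does not read a data file or call PHENIX.

```python
def effective_resolutions(resolution, refinement_resolution=None,
                          resolution_build=None):
    """Apply the fallbacks listed above for the resolution limits.

    refinement_resolution and resolution_build take the value of
    resolution when they are not set. Hypothetical helper.
    """
    return {
        "resolution": resolution,
        "refinement_resolution": (resolution if refinement_resolution is None
                                  else refinement_resolution),
        "resolution_build": (resolution if resolution_build is None
                             else resolution_build),
    }


# Only the overall resolution is given: all three limits are 2.5 A.
print(effective_resolutions(2.5))
# Model-building limited to 3 A while refinement uses all data to 2.5 A.
print(effective_resolutions(2.5, resolution_build=3.0))
```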

Sample AutoBuild Commands

NOTE: Output files will be in subdirectories labelled "AutoBuild_run_1_" "AutoBuild_run_2_" etc.

Run AutoBuild beginning with experimental data

phenix.autobuild data=solve_1.mtz seq_file=seq.dat \
input_ncs_file=ha.pdb

Here the data in solve_1.mtz (FP SIGFP PHIB FOM HLA HLB HLC HLD) will be used as the starting point for density modification. Then a model will be built and refined. In subsequent cycles the models that have been built will be used to improve the phases in density modification. If NCS can be found from the sites in ha.pdb or from any models that are built, then NCS will be used in density modification.
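
If you launch runs like this one from a script, it can help to assemble the keyword=value arguments programmatically. The following Python sketch builds an argument list for the command above without executing anything; the function name is hypothetical, and the keywords are the ones used in this example.

```python
def autobuild_command(**keywords):
    """Return an argument list for phenix.autobuild; nothing is executed.

    Each keyword becomes one name=value argument, in the order given.
    Hypothetical helper, not part of PHENIX.
    """
    command = ["phenix.autobuild"]
    for name, value in keywords.items():
        command.append("%s=%s" % (name, value))
    return command


cmd = autobuild_command(data="solve_1.mtz", seq_file="seq.dat",
                        input_ncs_file="ha.pdb")
print(" ".join(cmd))
```

On a machine where the PHENIX environment is set up, a list like this could be passed to subprocess.run(cmd, check=True); because no shell is involved, line continuations and shell quoting are not needed.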

Run AutoBuild beginning with a model and rebuild in place

phenix.autobuild data=w1.sca seq_file=seq.dat model=coords.pdb \
rebuild_in_place=True

Here "rebuild_in_place=True" tells AutoBuild to keep the overall model you have supplied, not to add or subtract residues from it, except that AutoBuild will try to edit the model to match the sequence in your sequence file. The AutoBuild Wizard will use your model and the data in w1.sca to generate starting phases, then it will carry out density modification to improve those phases, and adjust your model, rebuilding the model to match the resulting map and refining the model. This will be done iteratively, with the new model from each cycle being used at the start of the next one. If NCS is found in your model then it will be used in the density modification process.

Add more residues to a model or rebuild a model

phenix.autobuild data=solve_1.mtz seq_file=seq.dat \
   model=coords.pdb rebuild_in_place=False

Here "rebuild_in_place=False" tells AutoBuild to build a new model, adding or subtracting residues as necessary. The data in solve_1.mtz (FP SIGFP PHIB FOM HLA HLB HLC HLD) will be used along with your model as the starting point for density modification. Then a new model will be built and refined. In subsequent cycles the models that have been built will be used to improve the phases in density modification. If NCS is found in your model or any model that is built, then it will be used in density modification.

Run AutoBuild automatically after AutoSol

phenix.autobuild after_autosol

AutoBuild will identify the AutoSol run (in your working directory) with the highest overall score, then it will take the experimental phases (solve_xx.mtz or phaser_xx.mtz, where xx is the solution number) from that run, along with the corresponding density-modified map (resolve_xx.mtz) and the heavy_atom file (ha_xx.pdb_formatted.pdb) as inputs. Additionally, data for refinement are read in from exptl_fobs_freeR_flags_xx.mtz. AutoBuild will then build a model, refine it, use the refined model in density modification, then iterate the model-building, refinement, and density modification process until no further improvement in the model occurs.

Merge in hires data

phenix.autobuild data=solve_2.mtz hires_file=w1.sca  seq_file=seq.dat

The high-resolution data in w1.sca will be used for FP and SIGFP. Other information from solve_2.mtz (PHIB FOM HLA HLB HLC HLD) will be kept.

Truncate density at heavy-atom sites

phenix.autobuild data=solve_2.mtz seq_file=seq.dat input_ha_file=ha.pdb truncate_ha_sites_in_resolve=True

The heavy-atom sites in ha.pdb will be used to mark locations where high density is to be ignored during initial cycles of density modification. This can be useful if the heavy-atom peaks are very pronounced in the experimental map.

Skip NCS in model_building and refinement

phenix.autobuild data=solve_2.mtz seq_file=seq.dat find_ncs=False refine_with_ncs=False

The keyword "find_ncs=False" disables the finding of NCS from the models that are built and its use in density modification and model-building. The keyword "refine_with_ncs=False" disables finding NCS and its use in the refinement process. Together they prevent all use of NCS.

Make a SA-omit map around atoms in target.pdb

phenix.autobuild data=data.mtz model=coords.pdb omit_box_pdb=target.pdb   composite_omit_type=sa_omit

Coefficients for the output omit map will be in the file resolve_composite_map.mtz in the subdirectory OMIT/ . An additional map coefficients file omit_region.mtz will show you the region that has been omitted.

Make a simple composite omit map

phenix.autobuild data=data.mtz model=coords.pdb composite_omit_type=simple_omit

Coefficients for the output omit map will be in the file resolve_composite_map.mtz in the subdirectory OMIT/ .

Make a SA composite omit map

phenix.autobuild data=data.mtz model=coords.pdb composite_omit_type=sa_omit

Coefficients for the output simulated-annealing composite omit map will be in the file resolve_composite_map.mtz in the subdirectory OMIT/ .

Combine composite OMIT files from a set of parallel runs on different computers

If you run a composite OMIT job but it fails at the last step of combining files, or if you run all the individual omit boxes on different machines, you can still combine them all into one single composite omit map.

You can do this by copying all the individual mtz files with map coefficients for omit regions to a single directory.

Here is a script you can edit and use to combine omit maps representing different omit regions into one.

NOTE: you need to ensure that the OMIT regions are defined the same way in this run as in the runs that produced your overall_best_denmod_map_coeffs.mtz_OMIT_REGION_1 etc. files. You ensure that with the n_xyz command, which sets the grid. You can copy the values from one of the resolve log files created when you ran your omit job (for example, AutoBuild_run_1_/TEMP0/AutoBuild_run_1_/TEMP0/resolve.log will have a line like "nu nv nw: 32 32 32 ", and you copy those numbers).

 ------------------------------------
#!/bin/csh -f
# COMBINE OMIT SCRIPT
phenix.resolve << EOD
hklin exptl_fobs_phases_freeR_flags.mtz
labin FP=FP SIGFP=SIGFP
n_xyz 32 32 32  # YOU MUST SET THIS BASED ON THE nu nv nw in a resolve log file.
solvent_content 0.85
no_build
ha_file NONE
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_1
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_2
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_3
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_4
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_5
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_6
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_7
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_8
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_9
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_10
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_11
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_12
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_13
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_14
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_15
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_16
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_17
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_18
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_19
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_20
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_21
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_22
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_23
combine_map overall_best_denmod_map_coeffs.mtz_OMIT_REGION_24
omit
EOD
# END OF COMBINE OMIT SCRIPT
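
If the number of omit regions differs from the 24 shown above, typing the combine_map lines by hand is error-prone. The following Python sketch reads the grid from a resolve log line of the form quoted in the note and generates the resolve input. The function names are hypothetical, the file names and keywords are copied from the script above, and the solvent content default is simply the value used there.

```python
import re


def grid_from_resolve_log(log_text):
    """Return (nu, nv, nw) from a line like 'nu nv nw: 32 32 32', else None."""
    match = re.search(r"nu nv nw:\s*(\d+)\s+(\d+)\s+(\d+)", log_text)
    return tuple(int(g) for g in match.groups()) if match else None


def combine_omit_input(grid, n_regions, solvent_content=0.85,
                       prefix="overall_best_denmod_map_coeffs.mtz_OMIT_REGION_"):
    """Build the text that goes between 'phenix.resolve << EOD' and 'EOD'."""
    lines = [
        "hklin exptl_fobs_phases_freeR_flags.mtz",
        "labin FP=FP SIGFP=SIGFP",
        "n_xyz %d %d %d" % grid,               # must match the original omit runs
        "solvent_content %s" % solvent_content,
        "no_build",
        "ha_file NONE",
    ]
    # One combine_map line per omit region, numbered from 1.
    lines += ["combine_map %s%d" % (prefix, i) for i in range(1, n_regions + 1)]
    lines.append("omit")
    return "\n".join(lines) + "\n"


grid = grid_from_resolve_log("nu nv nw: 32 32 32 ")
resolve_input = combine_omit_input(grid, n_regions=24)
print(resolve_input)
```

The printed text can be pasted into the script above in place of the hand-written lines.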

Make an iterative-build omit map around atoms in target.pdb

phenix.autobuild data=w1.sca model=coords.pdb omit_box_pdb=target.pdb \
   composite_omit_type=iterative_build_omit

Make a sa-omit map around residues 3 and 4 in chain A of coords.pdb

phenix.autobuild data=w1.sca model=coords.pdb omit_box_pdb=coords.pdb \
   omit_res_start_list=3 omit_res_end_list=4 omit_chain_list=A   \
   composite_omit_type=sa_omit

Create one very good rebuilt model

phenix.autobuild data=data.mtz model=coords.pdb multiple_models=True \
  include_input_model=True  \
  multiple_models_number=1 n_cycle_rebuild_max=5

The final model will be in the subdirectory MULTIPLE_MODELS in the file all_models.pdb (this file will contain just one model). Note that this procedure will keep the sequence that is present in coords.pdb. If you supply a sequence file it will edit the sequence of coords.pdb to match your sequence file and discard any residues that do not match. (If you want to input a sequence file but not edit the sequence in coords.pdb and not discard any non-matching residues, then specify also edit_pdb=False.) Note also that if include_input_model=True then no randomization cycle will be carried out and multiple_models_starting_resolution is ignored.

Touch up a model

phenix.autobuild data=data.mtz model=coords.pdb \
touch_up=True worst_percent_res_rebuild=2 min_cc_res_rebuild=0.8

You can rebuild just the worst parts of your model by setting touch_up=True. You can decide what parts to rebuild based on a minimum model-map correlation (by residue). You can decide how much to rebuild using worst_percent_res_rebuild or with min_cc_res_rebuild, or both.

Remove the worst-fitting residues from a model

phenix.autobuild data=data.mtz model=coords.pdb \
 delete_bad_residues_only=True \
 input_map_file=map_coeffs.mtz \
 worst_percent_res_rebuild=2 min_cc_res_rebuild=0.8

The trimmed model will be in the file (the run number may vary):

AutoBuild_run_1_/starting_model_trimmed.pdb

and the removed residues will be in the file:

AutoBuild_run_1_/starting_model_removed_residues.pdb

You can delete just the worst parts of your model by setting delete_bad_residues_only=True. You can decide what parts to remove based on a minimum model-map correlation (by residue). You can decide how much to remove using worst_percent_res_rebuild or min_cc_res_rebuild, or both (these are the same parameters used to decide which residues to rebuild with touch_up=True). Here the input_map_file is optional; if you do not provide it, then a model-based density-modified map will be used to evaluate your model.
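
To make the two selection criteria concrete, here is a small Python sketch that flags residues from a table of per-residue model-map correlations. It is not AutoBuild's implementation. In particular, how AutoBuild combines the two parameters when both are given is not stated here, so the sketch assumes a residue is flagged if it meets either criterion, and it rounds the percentage down to a whole number of residues. The function name and the input dictionary are hypothetical.

```python
def residues_to_flag(cc_by_residue, worst_percent=None, min_cc=None):
    """Return residue numbers flagged by either criterion (assumed union)."""
    flagged = set()
    if min_cc is not None:
        # Criterion 1: correlation below the minimum.
        flagged.update(r for r, cc in cc_by_residue.items() if cc < min_cc)
    if worst_percent is not None:
        # Criterion 2: the lowest-correlation fraction of all residues.
        n_worst = int(len(cc_by_residue) * worst_percent / 100.0)
        ranked = sorted(cc_by_residue, key=cc_by_residue.get)
        flagged.update(ranked[:n_worst])
    return sorted(flagged)


# Hypothetical correlations for four residues.
cc = {1: 0.90, 2: 0.50, 3: 0.85, 4: 0.95}
print(residues_to_flag(cc, min_cc=0.8))
print(residues_to_flag(cc, worst_percent=50))
```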

Create 20 very good rebuilt models that are as different as possible

phenix.autobuild data=data.mtz model=coords.pdb multiple_models=True \
   multiple_models_number=20 n_cycle_rebuild_max=5

The 20 final models will be in the subdirectory MULTIPLE_MODELS in the file all_models.pdb. This procedure is useful for generating an ensemble of models that are each individually consistent with the data, and yet are diverse. The variation among these models is an indication of the uncertainty in each of the models. Note that the ensemble of models is not a representation of the ensemble of structures that is truly present in the crystal.
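
One simple way to look at the variation among the models is the per-atom spread of coordinates about the mean position. The Python sketch below assumes the coordinates have already been read from all_models.pdb into one list of (x, y, z) tuples per model, with the same atoms in the same order in every model; reading the file is not shown, and the function name is hypothetical.

```python
import math


def per_atom_spread(models):
    """RMS deviation of each atom from its mean position over all models."""
    n_models = len(models)
    spreads = []
    for atom_coords in zip(*models):  # the same atom taken from every model
        mean = [sum(c[i] for c in atom_coords) / n_models for i in range(3)]
        msd = sum(
            sum((c[i] - mean[i]) ** 2 for i in range(3)) for c in atom_coords
        ) / n_models
        spreads.append(math.sqrt(msd))
    return spreads


# Two toy models of two atoms: the first atom differs, the second is identical.
models = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
          [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]
print(per_atom_spread(models))
```

Atoms with a large spread are those whose positions are least well determined by the data and the rebuilding procedure.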

Combining files from a nearly-complete autobuild run with rebuild-in-place=true

If you have run autobuild with rebuild_in_place=True then the last step is combining the models that have been produced. If you ran the job in separate batches and want to combine the final models, you can use the script below.

Note that all the models must have exactly the same set of atoms (aside from any solvent).

Basically you run a dummy autobuild run to create a directory and database entries, then you copy your files there, then you run autobuild and tell it to carry on and do the combine step.

--------------------------------------------------------
#!/bin/csh -f
#COMBINE_MODELS SCRIPT

if (-d PDS || -d AutoBuild_run_1_) then
 echo "Please run in a directory without PDS or AutoBuild_run_1_"
 exit 1
endif

echo "Setting up combine models with a dummy run. NOTE: multiple_models_group_number must be correct"

phenix.autobuild fobs.mtz multiple_models=true seq_file=seq.dat \
  combine_only=true multiple_models_group_number=2 \
  multiple_models_number=1 > dummy_autobuild.log

echo "Copying files to AutoBuild_run_1_/MULTIPLE_MODELS"
mkdir AutoBuild_run_1_/MULTIPLE_MODELS
cp coords1.pdb AutoBuild_run_1_/MULTIPLE_MODELS/initial_model.pdb_1_1 
cp coords2.pdb AutoBuild_run_1_/MULTIPLE_MODELS/initial_model.pdb_1_2 
cp map_coeffs_1.mtz AutoBuild_run_1_/MULTIPLE_MODELS/initial_model.mtz_1_1
cp map_coeffs_2.mtz AutoBuild_run_1_/MULTIPLE_MODELS/initial_model.mtz_1_2

ls AutoBuild_run_1_/MULTIPLE_MODELS/

echo "Running autobuild to combine files in AutoBuild_run_1_/MULTIPLE_MODELS"

phenix.autobuild combine_only=true seq_file=seq.dat carry_on=true run=1 > autobuild_combine.log

# END OF COMBINE_MODELS SCRIPT
-------------------------------------------------------

Morph an MR model and rebuild it

phenix.autobuild data=data.mtz model=MR.pdb \
morph=True rebuild_in_place=False seq_file=seq.dat

You can have autobuild morph your input model, distorting it to match the density-modified map that is produced from your model and data. This can be used to make an improved starting model in cases where the MR model is very different than the structure that is to be solved. For the morphing to work, the two structures must be topologically similar and differ mostly by movements of domains or motifs such as a group of helices or a sheet. The morphing process consists of identifying a coordinate shift to apply to each N (or P for nucleic acids) atom that maximizes the local density correlation between the model and the map. This is smoothed and applied to the structure to generate a morphed structure.
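
The smoothing step can be pictured with a toy example. The Python sketch below applies a plain moving average to per-residue shift vectors along a chain. The moving average is a stand-in chosen for illustration; the smoothing actually used by AutoBuild is not described here, and the function name and window size are assumptions.

```python
def smooth_shifts(shifts, half_window=2):
    """Moving average of per-residue (dx, dy, dz) shifts along a chain."""
    n = len(shifts)
    smoothed = []
    for i in range(n):
        # The window is clipped at the chain ends.
        start, stop = max(0, i - half_window), min(n, i + half_window + 1)
        window = shifts[start:stop]
        smoothed.append(
            tuple(sum(s[k] for s in window) / len(window) for k in range(3))
        )
    return smoothed


# One residue with an outlying shift is pulled toward its neighbours.
raw = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(smooth_shifts(raw, half_window=1))
```

Smoothing of this kind keeps neighbouring residues moving together, which is consistent with the requirement that the structures differ mostly by movements of whole domains or motifs.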

Build an RNA chain

phenix.autobuild data=solve_1.mtz seq_file=seq.dat chain_type=RNA

Build a DNA chain

phenix.autobuild data=solve_1.mtz seq_file=seq.dat chain_type=DNA

Density-modify with or without a model and make maps

You can use the AutoBuild Wizard as a convenient way to run resolve density modification with or without including model-based information. Just use a command like this:

phenix.autobuild data=data.mtz model=coords.pdb \
   maps_only=True seq_file=seq.dat

phenix.autobuild data=data.mtz  \
   maps_only=True seq_file=seq.dat

The Wizard will calculate the same map that it would normally calculate given these data, and then it will write the map out and stop.

Density-modify starting with your map coefficients and make maps

You can use the AutoBuild Wizard as a convenient way to run resolve density modification starting with map coefficients you define. Just use a command like this:

phenix.autobuild data=data.mtz \
     maps_only=True  seq_file=seq.dat \
     map_file=starting_map.mtz map_labels="2FOFCWT PH2FOFCWT"

The Wizard will start with the phases in starting_map.mtz, calculate the same map that it would normally calculate given these data, and then write the map out and stop.

Calculate a prime-and-switch map

phenix.autobuild data=data.mtz solvent_fraction=.6 \
   ps_in_rebuild=True model=coords.pdb maps_only=True

The output prime-and-switch map will be in the file prime_and_switch.mtz.

Possible Problems

General Limitations

The AutoBuild wizard edits input PDB files to remove multiple conformations. It will also renumber residues if the file contains residues with insertion codes. All references to residue numbers (e.g. rebuild_res_start_list) refer to the edited, renumbered model. This model can be found in the AutoBuild_run_1_ (or appropriate) directory as "edited_pdb.pdb".
If you are using rebuild_in_place then your model must be quite similar to your sequence file, and in particular the model must not extend in the N-terminal direction beyond your sequence file. Minor edits (amino acid replacements) will be done automatically.
The AutoBuild wizard expects residue numbers to not decrease along a chain. It will stop if residue 250 in chain B is found between residues 116 and 117 in the same chain, for example. To get around this, use insertion codes (make residue 250 residue 116A instead).
The keywords "cell" and "sg" have been replaced with "unit_cell" and "space_group" to make the keywords the same as in other phenix applications.
The AutoBuild model-building can only build one type of chain at a time (default chain_type='PROTEIN'; other choices are RNA and DNA). If you supply a PDB file containing more than one type of chain for rebuilding, then all the residues that are not that type of chain are treated as ligands and are (by default, keep_input_ligands=True) included in refinement but not in rebuilding. Any input solvent molecules are (by default, keep_input_waters=False) ignored.
You can include more than one type of chain in rebuilding by supplying one type of chains as ligands with input_lig_file_list and rebuilding another type:
```
chain_type=PROTEIN  # build only protein
input_lig_file_list=MyDNA.pdb  # just read in DNA coordinates and include in refinement
```
In this case only protein chains will be built, but the DNA coordinates in MyDNA.pdb will be included in all refinements and will be written out to the final coordinate file. You may wish to add the keyword:
```
keep_pdb_atoms=False  #keep the ligand atoms if model (pdb) and ligand overlap
```
which will tell AutoBuild that the ligand (DNA) atoms are to be kept if the model that is being built (protein) overlaps with it. (The default is to keep the model that is being built and to discard any ligand atoms that overlap). This whole process is likely to require substantial editing of the PDB files by hand because when you build DNA, a lot of chains are going to be built into the protein region, and when you build protein, it is going to be accidentally built into the DNA.
Any file in input_lig_file_list containing ATOM records will have them replaced with HETATM records. This is so that the rebuild_in_place algorithm does not try to use them in rebuilding.
The ligand generation routine in phenix.elbow will not generate heme groups at this point. Most other ligands can be automatically generated.
If your input data file contains both intensity data and amplitude data, only the amplitude data is exposed in the AutoBuild Wizard. If you want to use the intensity data then you have to create a file that does not have amplitude data in it.
If your input data file has only intensity data and you wish to specify which columns of data the AutoBuild Wizard is to use, then you have to specify the names that the columns will have AFTER importing the data and conversion to amplitudes, not the original column names. These column names may not be obvious. Here is how to find out what they will be. Do a quick dummy run like this with XXX as labels:
```
phenix.autobuild w2.sca coords.pdb input_labels="XXX XXX"
```
The Wizard will print out a list of available labels like this:
```
Sorry, the label XXX does not exist as an amplitude array in
the input_data_file ImportRawData_run_8_/w2_PHX.mtz
...available labels are: ['w2', 'SIGw2', 'None']
```
Then you know that the correct command is:
```
phenix.autobuild w2.sca coords.pdb input_labels="w2 SIGw2"
```
The AutoBuild Wizard cannot build modified residues. If you supply a model with modified residues, these will be taken out of the chain and treated as ligands, and the chain will be broken at that point. By default the modified residues will be added to your model just before refinement and a cif definitions file will be automatically generated for these residues. You can also add these residues with the input_lig_file_list procedure if you want.
The AutoBuild Wizard will not build very short chains unless you set the variable group_ca_length (default=4 for building a model from scratch) to a smaller number. The shortest chain that will be built is group_ca_length. If you use rebuild_in_place, then the default shortest chain allowed is 1 residue, so any part of a model you supply is rebuilt.
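
Several of the limitations above concern features of the input PDB file that can be checked before a run. The Python sketch below scans ATOM records for alternate conformations, insertion codes, and residue numbers that decrease along a chain, using the standard fixed-column PDB layout. It is an independent pre-flight check, not part of AutoBuild; the function names are hypothetical, and `atom_line` exists only to build test records.

```python
def check_pdb_for_autobuild(pdb_lines):
    """Report ATOM-record features that AutoBuild will edit or reject."""
    has_altloc = False
    has_icode = False
    decreasing = []      # (chain, previous residue number, current number)
    last_resseq = {}     # chain ID -> last residue number seen
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        line = line.ljust(27)
        if line[16] != " ":          # column 17: alternate location
            has_altloc = True
        if line[26] != " ":          # column 27: insertion code
            has_icode = True
        chain = line[21]             # column 22: chain ID
        try:
            resseq = int(line[22:26])  # columns 23-26: residue number
        except ValueError:
            continue
        prev = last_resseq.get(chain)
        if prev is not None and resseq < prev:
            decreasing.append((chain, prev, resseq))
        last_resseq[chain] = resseq
    return {"alternate_conformations": has_altloc,
            "insertion_codes": has_icode,
            "decreasing_numbers": decreasing}


def atom_line(serial, resseq, chain="B", altloc=" ", icode=" "):
    """Build a minimal fixed-column ATOM record for the example."""
    return "ATOM  %5d %-4s%1s%3s %1s%4d%1s" % (
        serial, " CA ", altloc, "ALA", chain, resseq, icode)


# Residue 250 placed between residues 116 and 117 of chain B.
example = [atom_line(1, 116), atom_line(2, 250), atom_line(3, 117)]
report = check_pdb_for_autobuild(example)
print(report)
```

In this example the report lists one decreasing step in chain B (250 followed by 117), which is the situation described above that causes AutoBuild to stop.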

Specific limitations and problems

By default the AutoBuild Wizard splits jobs into one or more parts (determined by the parameter "nbatch") and runs them as sub-processes. These may run sequentially or in parallel, depending on the value of the parameter "nproc". In some cases the running of sub-processes can lead to timing errors in which a file is not written fully before it is to be read by the next process. This appears more often when jobs are run on nfs-mounted disks than on a local disk. If this occurs, a solution is to set the parameter "nbatch=1" so that the jobs are not run as sub-processes. You can also specify "number_of_parallel_models=1", which will do much the same thing. Note that changing the value of "nbatch" will normally change the results of running the Wizard. (Changing the value of "nproc" does not change the results; it changes only how many jobs are run at once.)
In many versions of the shell tcsh (and sh), the length of the shell variable PATH is limited (for example to 4096 characters). If your PATH is quite long, then when AutoBuild runs a sub-process it may accidentally increase the PATH to a value that is over the limit. The symptom is a message like "Word too long". If this happens, try 'echo $PATH' to see if it is very long, and if so see if you can remove some entries from it. Alternatively, you can shorten your path within PHENIX by specifying remove_path_word_list='coot cns ccp4' (listing as many paths as you have but do not need within PHENIX), or you can install a newer version of tcsh that allows a much longer path. You can get a new version from ftp://ftp.astron.com/pub/tcsh/
The size of the asymmetric unit in the SOLVE/RESOLVE portion of the AutoBuild wizard is limited by the memory in your computer and the binaries used. The Wizard is supplied with regular-size ("", size=6), giant ("_giant", size=12), huge ("_huge", size=18) and extra_huge ("_extra_huge", size=36). Larger-size versions can be obtained on request.
The AutoBuild Wizard can take most settings of most space groups; however, it can only use the hexagonal setting of rhombohedral space groups (e.g., #146 R3:H or #155 R32:H), and it cannot use space groups 114-119 (not found in macromolecular crystallography) even in the standard setting, due to difficulties with the use of asuset in the version of the ccp4 libraries used in PHENIX for these settings and space groups.
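
For the PATH-length problem described above, a quick check of the current PATH can help decide which entries to drop. The Python sketch below reports the length and lists entries that mention given words. The 4096-character limit is only the example figure quoted above, the function names are hypothetical, and substring matching is an assumption about how entries might be chosen, not a description of how remove_path_word_list works.

```python
import os


def path_report(path=None, limit=4096):
    """Summarize PATH length; the limit is an example value, not a constant."""
    if path is None:
        path = os.environ.get("PATH", "")
    entries = [entry for entry in path.split(os.pathsep) if entry]
    return {"length": len(path),
            "over_limit": len(path) > limit,
            "entries": entries}


def entries_containing(entries, words):
    """Entries that mention any of the given words (candidates to drop)."""
    return [e for e in entries if any(w in e for w in words)]


report = path_report()
print("PATH length:", report["length"], "over limit:", report["over_limit"])
print(entries_containing(report["entries"], ["coot", "cns", "ccp4"]))
```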

Literature

Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard. T. C. Terwilliger, R. W. Grosse-Kunstleve, P. V. Afonine, N. W. Moriarty, P. H. Zwart, L.-W. Hung, R. J. Read, and P. D. Adams. Acta Cryst. D64, 61-69 (2008).

Interpretation of ensembles created by multiple iterative rebuilding of macromolecular models. T. C. Terwilliger, R. W. Grosse-Kunstleve, P. V. Afonine, P. D. Adams, N. W. Moriarty, P. H. Zwart, R. J. Read, D. Turk, and L.-W. Hung. Acta Cryst. D63, 597-610 (2007).

Using prime-and-switch phasing to reduce model bias in molecular replacement. T. C. Terwilliger. Acta Cryst. D60, 2144-2149 (2004).

Improving macromolecular atomic models at moderate resolution by automated iterative model building, statistical density modification and refinement. T. C. Terwilliger. Acta Cryst. D59, 1174-1182 (2003).

Statistical density modification using local pattern matching. T. C. Terwilliger. Acta Cryst. D59, 1688-1701 (2003).

Automated side-chain model building and sequence assignment by template matching. T. C. Terwilliger. Acta Cryst. D59, 45-49 (2003).

Automated main-chain model building by template matching and iterative fragment extension. T. C. Terwilliger. Acta Cryst. D59, 38-44 (2003).

Rapid automatic NCS identification using heavy-atom substructures. T. C. Terwilliger. Acta Cryst. D58, 2213-2215 (2002).

Statistical density modification with non-crystallographic symmetry. T. C. Terwilliger. Acta Cryst. D58, 2082-2086 (2002).

Maximum likelihood density modification. T. C. Terwilliger. Acta Cryst. D56, 965-972 (2000).

Maximum-likelihood density modification with pattern recognition of structural motifs. T. C. Terwilliger. Acta Cryst. D57, 1755-1762 (2001).

Map-likelihood phasing. T. C. Terwilliger. Acta Cryst. D57, 1763-1775 (2001).

Additional information

List of all AutoBuild keywords

------------------------------------------------------------------------------- 
Legend: black bold - scope names
        black - parameter names
        red - parameter values
        blue - parameter help
        blue bold - scope help
        Parameter values:
          * means selected parameter (where multiple choices are available)
          False is No
          True is Yes
          None means not provided, not predefined, or left up to the program
          "%3d" is a Python style formatting descriptor
------------------------------------------------------------------------------- 
autobuild
   data= None Datafile. This file can be a .sca or mtz or other standard file.
         The Wizard will guess the column identification. You can specify the
         column labels to use with: input_labels='FP SIGFP PHIB FOM HLA HLB
         HLC HLD FreeR_flag' Substitute any labels you do not have with None.
         If you only have myFP and mysigFP you can just say input_labels='myFP
         mysigFP'. If you have free R flags, phase information or HL
         coefficients that you want to use then an mtz file is required. If
         this file contains phase information, this phase information should
         be experimental (i.e., MAD/SAD/MIR etc), and should not be
         density-modified phases (enter any files with density-modified phases
         as input_map_file instead). NOTE: If you supply HL coefficients they
         will be used in phase recombination. If you supply PHIB or PHIB and
         FOM and not HL coefficients, then HL coefficients will be derived
         from your PHIB and FOM and used in phase recombination. If you also
         specify a hires data file, then FP and SIGFP will come from that data
         file (and not this one). If an input_refinement_file is specified,
         then F, Sigma, FreeR_flag (if present) from that file will be used
         for refinement instead of this one.
   model= None PDB file with starting model. NOTE: If your PDB file has been
          previously refined, then please make sure that you provide the free
          R flags that were used in that refinement. These can come from the
          data file or from the refinement_file.
   seq_file= Auto Sequence file. The format is plain text, with chains
             separated by a line starting with > or a blank line. Any blanks
             and unrecognized characters are ignored. You need only input 1
             copy of each unique chain. Enter name of file with 1-letter code
             of protein sequence NOTES: 1. lines starting with > are
             ignored and separate chains 2. FASTA format is fine 3. If there
             are multiple copies of a chain, just enter one copy. 4. If you
             enter a PDB file for rebuilding and it has the sequence you want,
             then the sequence file is not necessary. NOTE: You can also enter
             the name of a PDB file that contains SEQRES records, and the
             sequence from the SEQRES records will be read, written to
             seq_from_seqres_records.dat, and used as your input sequence.
             NOTE: for AutoBuild you can specify start_chains_list on the
             first line of your sequence file: >> start_chains_list 23
             11 5 NOTE: default for this keyword is Auto, which means
             "carry out normal process to guess this keyword". This
             means if you specify "after_autosol" in AutoBuild,
             AutoBuild will automatically take the value from AutoSol. If you
             do not want this to happen, you can specify None which means
             "No file" If you have a duplex DNA, enter each strand
             as a separate chain.
   map_file= Auto MTZ file containing starting map. This file must be a mtz
             file. The Wizard will guess the column identification. You can
             specify the column labels to use with: input_map_labels='FP PHIB
             FOM' Substitute any labels you do not have with None. If you only
             have myFP and myPHIB you can just say input_map_labels='myFP
             myPHIB'. This map will be used in the first cycle of
             model-building. NOTE 1: If use_map_file_as_hklstart=True then
             this file will be used instead to start density modification.
             NOTE 2: default for this keyword is Auto, which means "carry
             out normal process to guess this keyword". This means if you
             specify "after_autosol" in AutoBuild, AutoBuild will
             automatically take the value from AutoSol. If you do not want
             this to happen, you can specify None which means "No
             file"
   refinement_file= Auto File for refinement. This file can be a .sca or mtz
                    or other standard file. This file will be merged with your
                    data file, with any phase information coming from your
                    data file. If this file has free R flags, they will be
                    used, otherwise if the data file has them, those will be
                    used, otherwise they will be generated. The Wizard will
                    guess the column identification. You can specify the
                    column labels to use with: input_refinement_labels='FP
                    SIGFP FreeR_flag' Substitute any labels you do not have
                    with None. If you only have myFP and mysigFP you can just
                    say input_refinement_labels='myFP mysigFP'. Data file to
                    use for refinement. The data in this file should not be
                    corrected for anisotropy. It will be combined with
                    experimental phase information (if any) from
                    input_data_file for refinement. If you leave this blank,
                    then the data in the input_data_file will be used in
                    refinement. If no anisotropy correction is applied to the
                    data you do not need to specify a datafile for refinement.
                    If an anisotropy correction is applied to the data files,
                    then you should enter an uncorrected datafile for
                    refinement. Any standard format is fine; normally only F
                    and sigF will be used. Bijvoet pairs and duplicates will
                    be averaged. If an mtz file is provided then a free R flag
                    can be read in as well. Any HL coeffs and phase
                    information in this file is ignored. NOTE: default for
                    this keyword is Auto, which means "carry out normal
                    process to guess this keyword". This means if you
                    specify "after_autosol" in AutoBuild, AutoBuild
                    will automatically take the value from AutoSol. If you do
                    not want this to happen, you can specify None which means
                    "No file"
   hires_file= Auto File with high-resolution data. This file can be a .sca or
               mtz or other standard file. The Wizard will guess the column
               identification. You can specify the column labels to use with:
               input_hires_labels='FP SIGFP'.
   crystal_info
      unit_cell= None Enter cell parameters (a b c alpha beta gamma)
      space_group= None Space Group symbol (e.g., C2221 or C 2 2 21)
      solvent_fraction= None Solvent fraction in crystals (0 to 1). This is
                        normally set automatically from the number of NCS
                        copies and the sequence.
      chain_type= *Auto PROTEIN DNA RNA You can specify whether to build
                  protein, DNA, or RNA chains. At present you can only build
                  one of these in a single run. If you have both DNA and
                  protein, build one first, then run AutoBuild again,
                  supplying the prebuilt model in the
                  "input_lig_file_list" and build the other. NOTE:
                  default for this keyword is Auto, which means "carry
                  out normal process to guess this keyword". The process
                  is to look at the sequence file and/or input pdb file to see
                  what the chain type is. If there is more than one type, the
                  type with the larger number of residues is guessed. If you
                  want to force the chain_type, then set it to PROTEIN RNA or
                  DNA.
      resolution= 0 High-resolution limit. Used as resolution limit for
                  density modification and as general default high-resolution
                  limit. If resolution_build or refinement_resolution are set
                  then they override this for model-building or refinement. If
                  overall_resolution is set then data beyond that resolution
                  is ignored completely. Zero means keep everything.
      dmax= 500 Low-resolution limit
      overall_resolution= 0 If overall_resolution is set, then all data beyond
                          this is ignored. NOTE: this is only suggested if you
                          have a very big cell and need to truncate the data
                          to allow the wizard to run at all. Normally you
                          should use 'resolution' and 'resolution_build' and
                          'refinement_resolution' to set the high-resolution
                          limit
      sequence= None Plain text containing 1-letter code of protein sequence.
                Same as seq_file except the sequence is read directly, not
                from a file. If both are given, seq_file is ignored.
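The way the resolution keywords fall back on one another can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not AutoBuild code: the function name and the returned dictionary keys are hypothetical, and it assumes that refinement_resolution treats 0 the same way resolution_build does (0 means "use the value of resolution"). The overall_resolution truncation is left out because it applies to the data before any of these limits.

```python
def effective_resolution_limits(resolution=0.0, resolution_build=0.0,
                                refinement_resolution=0.0):
    """Return the high-resolution limit (in A) used at each stage.

    Hypothetical helper. A value of 0 means "keep everything", as stated
    for the resolution keyword.
    """
    def pick(stage_limit):
        # A stage-specific limit overrides the general one only when set
        # (assumption: 0 means "not set" for both stage keywords).
        return stage_limit if stage_limit else resolution

    return {
        "density_modification": resolution,
        "model_building": pick(resolution_build),
        "refinement": pick(refinement_resolution),
    }


limits = effective_resolution_limits(resolution=2.5, resolution_build=3.0)
print(limits)
```

With resolution=2.5 and resolution_build=3.0, model-building uses 3.0 A while density modification and refinement fall back to 2.5 A.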
   input_files
      input_labels= None Labels for input data columns
      input_hires_labels= None Labels for input hires file (FP SIGFP
                          FreeR_flag)
      input_map_labels= None Labels for input map coefficient columns (FP PHIB
                        FOM) NOTE: FOM is optional (set to None if you wish)
      input_refinement_labels= None Labels for input refinement file columns
                               (FP SIGFP FreeR_flag)
      input_ha_file= None If the flag "truncate_ha_sites_in_resolve"
                     is set then density at sites specified with input_ha_file
                     is truncated to improve the density modification
                     procedure.
      cif_def_file_list= None You can enter any number of CIF definition
                         files. These are normally used to tell phenix.refine
                         about the geometry of a ligand or unusual residue.
                         You usually will use these in combination with
                         "PDB file with metals/ligands" (keyword
                         "input_lig_file_list" ) which allows you to
                         attach the contents of any PDB file you like to your
                         model just before it gets refined. You can use
                         phenix.elbow to generate these if you do not have a
                         CIF file and one is requested by phenix.refine
      input_lig_file_list= None This script adds the contents of these PDB
                           files to each model just prior to refinement.
                           Normally you might use this to put in any
                           heavy-atoms that are in the refined structure (for
                           example the heavy atoms that were used in phasing),
                           or to add a ligand to your model. If the atoms in
                           this PDB file are not recognized by phenix.refine,
                           then you can specify their geometries with a cif
                           definitions file using the keyword
                           "cif_def_file_list". You can easily
                           generate cif definitions for many ligands using
                           phenix.elbow in PHENIX. You can put anything you
                           like in the files in input_lig_file_list, but any
                           atoms that fall within 1.5 A of any atom in the
                           current model will be tossed (not written to the
                           model).
      keep_input_ligands= True You can choose whether to (by default) let the
                          wizard keep ligands by separating them out from the
                          rest of your model and adding them back to your
                          rebuilt model, or alternatively to remove all
                          ligands from your input pdb file before
                          rebuild_in_place.
      keep_input_waters= False You can choose whether to keep input waters
                         (solvent) when using rebuild_in_place. If you keep
                         them, then you should specify either
                         "place_waters=No" or
                         "keep_pdb_atoms=No" because if
                         place_waters=True and keep_pdb_atoms=True then
                         phenix.refine will add waters and then the wizard
                         will keep the new waters from the new PDB file
                         created by phenix.refine preferentially over the ones
                         in your input file.
      keep_pdb_atoms= True If true, keep the model coordinates when model and
                      ligand coordinates are within dist_close_overlap and
                      ligands in input_lig_file_list are being added to the
                      current model. If false, keep instead the ligand
                      coordinates.
      refine_eff_file_list= None You can enter any number of refinement
                            parameter files. These are normally used to tell
                            phenix.refine defaults to apply, as well as
                            creating specialized definitions such as unusual
                            amino acid residues and linkages. These parameters
                            override the normal phenix.refine defaults. They
                            themselves can be overridden by parameters set by
                            the Wizard and by you, controlling the Wizard.
                            NOTE: Any parameters set by AutoBuild directly
                            (such as number_of_macro_cycles, high_resolution,
                            etc...) will not be taken from this parameters
                            file. This is useful only for adding extra
                            parameters not normally set by AutoBuild.
      map_file_is_density_modified= False You can specify that the
                                    input_map_file has been density modified.
                                    (This changes the assumptions on
                                    statistics of the map.)
      map_file_fom= None You can specify the FOM of the input map file (useful
                    in cases where the map file has only FWT PHFWT and no FOM
                    column). This FOM is used to set the default smoothing
                    radius for the density modification solvent boundary.
      use_map_file_as_hklstart= False You can specify that the file named as
                                input_map_file will be used as starting
                                coefficients for density modification in the
                                first cycle. NOTE: if maps_only=True and
                                input_map_file is set, then
                                use_map_file_as_hklstart will be set to True
      use_map_in_resolve_with_model= False You can specify that the current
                                     map file be used as hklstart in density
                                     modification with a model.
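The overlap rule described for input_lig_file_list and keep_pdb_atoms (with the distance set by dist_close_overlap, default 1.5 A) can be sketched as follows. This is a minimal illustration, not the Wizard's implementation: the function name is hypothetical, atoms are represented as bare (x, y, z) tuples, "within" is taken to include the cutoff distance itself, and removal is assumed to be atom-by-atom in both directions.

```python
import math


def merge_ligand_atoms(model_xyz, ligand_xyz, dist_close_overlap=1.5,
                       keep_pdb_atoms=True):
    """Return (kept_model_atoms, kept_ligand_atoms) after resolving overlaps.

    Hypothetical helper working on lists of (x, y, z) tuples in Angstroms.
    """
    def close(a, b):
        # Assumption: "within" includes the cutoff distance itself.
        return math.dist(a, b) <= dist_close_overlap

    if keep_pdb_atoms:
        # Keep the model; toss ligand atoms that sit on top of model atoms.
        kept_ligand = [lig for lig in ligand_xyz
                       if not any(close(lig, m) for m in model_xyz)]
        return list(model_xyz), kept_ligand

    # Otherwise keep the ligand; toss model atoms that overlap it.
    kept_model = [m for m in model_xyz
                  if not any(close(m, lig) for lig in ligand_xyz)]
    return kept_model, list(ligand_xyz)


model = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(merge_ligand_atoms(model, ligand))
print(merge_ligand_atoms(model, ligand, keep_pdb_atoms=False))
```

In the example, the ligand atom 1.0 A from a model atom is dropped when keep_pdb_atoms=True, and the overlapping model atom is dropped instead when keep_pdb_atoms=False.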
   aniso
      remove_aniso= True Remove anisotropy from data files before use. Note:
                    map files are assumed to be already corrected and are not
                    affected by this. Also the input refinement file is not
                    affected by this.
      b_iso= None Target overall B value for anisotropy correction. Ignored if
             remove_aniso = False. If None, default is minimum of (max_b_iso,
             lowest B of datasets, target_b_ratio*resolution)
      max_b_iso= 40. Default maximum overall B value for anisotropy
                 correction. Ignored if remove_aniso = False. Ignored if b_iso
                 is set. If used, default is minimum of (max_b_iso, lowest B
                 of datasets, target_b_ratio*resolution)
      target_b_ratio= 10. Default ratio of target B value to resolution for
                      anisotropy correction. Ignored if remove_aniso = False.
                      Ignored if b_iso is set. If used, default is minimum of
                      (max_b_iso, lowest B of datasets,
                      target_b_ratio*resolution)
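The default target B value described for b_iso, max_b_iso and target_b_ratio amounts to taking a minimum of three numbers. The sketch below is an illustration only: the function name is hypothetical, and "lowest B of datasets" is assumed to mean the smallest overall B value among the input datasets, passed in here as a plain list.

```python
def default_target_b_iso(resolution, dataset_b_values, b_iso=None,
                         max_b_iso=40.0, target_b_ratio=10.0):
    """Return the target overall B for anisotropy correction.

    Hypothetical helper. An explicit b_iso wins; otherwise the target is
    min(max_b_iso, lowest dataset B, target_b_ratio * resolution).
    """
    if b_iso is not None:
        # max_b_iso and target_b_ratio are ignored when b_iso is set.
        return b_iso
    return min(max_b_iso, min(dataset_b_values), target_b_ratio * resolution)


# 2.5 A data, two datasets with overall B of 32 and 45:
# min(40, 32, 10 * 2.5) = 25
print(default_target_b_iso(2.5, [32.0, 45.0]))
```

At low resolution with high-B data, the max_b_iso cap is what limits the target; at high resolution, target_b_ratio*resolution usually does.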
   decision_making
      acceptable_r= 0.25 Used to decide whether the model is acceptable enough
                    to quit if it is not improving much. A good value is 0.25
      r_switch= 0.4 R-value criterion for deciding whether to use R-value or
                map correlation as the criterion for model quality. A good value
                is 0.40
      semi_acceptable_r= 0.3 Used to decide whether the model is acceptable
                         enough to skip rebuilding the model from scratch and
                         focus on adding loops and extending it. A good value
                         is 0.3
      min_cc_res_rebuild= 0.5 You can rebuild just the worst parts of your
                          model by setting touch_up=True. You can decide what
                          parts to rebuild based on a minimum model-map
                          correlation (by residue). You can decide how much to
                          rebuild using worst_percent_res_rebuild or with
                          min_cc_res_rebuild, or both.
      min_seq_identity_percent= 50 The sequence in your input PDB file will be
                                adjusted to match the sequence in your
                                sequence file (if any). If there are
                                insertions/deletions in your model and the
                                wizard does not seem to identify them, you can
                                split up your PDB file by adding records like
                                this: BREAK. You can specify the minimum
                                sequence identity between your sequence file
                                and a segment from your input PDB file to
                                consider the sequences to be matched. Default
                                is 50.0%. You might want a higher number to
                                make sure that deletions in the sequence are
                                noticed.
      dist_close= None If main-chain atom rmsd is less than dist_close then
                  crossover between chains in different models is allowed at
                  this point. If you input a negative number the defaults will
                  be used
      dist_close_overlap= 1.5 Model or ligand coordinates but not both are
                          kept when model and ligand coordinates are within
                          dist_close_overlap and ligands in
                          input_lig_file_list are being added to the current
                          model. NOTE: you might want to decrease this if your
                          ligand atoms get removed by the wizard. Default=1.5
                          A
      loop_cc_min= 0.4 You can specify the minimum correlation of density from
                   a loop with the map.
      group_ca_length= 4 In resolve building you can specify how short a
                       fragment to keep. Normally 4 or 5 residues should be
                       the minimum.
      group_length= 2 In resolve building you can specify how many fragments
                    must be joined to make a connected group that is kept.
                    Normally 2 fragments should be the minimum.
      include_molprobity= False You can choose to include the clash score from
                          MolProbity as one of the scoring criteria in
                          comparing and merging models. The score is combined
                          with the model-map correlation CC by summing in a
                          weighted clashscore. If clashscore for a residue has
                          a value < ok_molp_score then its value is
                          (clashscore-ok_molp_score)*scale_molp_score,
                          otherwise its value is zero.
      ok_molp_score= None You can choose to include the clash score from
                     MolProbity as one of the scoring criteria in comparing
                     and merging models. The score is combined with the
                     model-map correlation CC by summing in a weighted
                     clashscore. If clashscore for a residue has a value <
                     ok_molp_score then its value is
                     (clashscore-ok_molp_score)*scale_molp_score, otherwise
                     its value is zero.
      scale_molp_score= None You can choose to include the clash score from
                        MolProbity as one of the scoring criteria in comparing
                        and merging models. The score is combined with the
                        model-map correlation CC by summing in a weighted
                        clashscore. If clashscore for a residue has a value <
                        ok_molp_score then its value is
                        (clashscore-ok_molp_score)*scale_molp_score, otherwise
                        its value is zero.
   density_modification
      thorough_denmod= *Auto True False Choose whether you want to go for
                       thorough density modification when no model is used
                       ("False" speeds it up and for a terrible map
                       is sometimes better)
      hl= False You can choose whether to calculate hl coeffs when doing
          density modification (True) or not to do so (False). Default is No.
      mask_type= *histograms probability wang Choose method for obtaining
                 probability that a point is in the protein vs solvent region.
                 Default is "histograms". If you have a SAD dataset
                 with a heavy atom such as Pt or Au then you may wish to
                 choose "wang" because the histogram method is
                 sensitive to very high peaks. Options are: histograms:
                 compare local rms of map and local skew of map to values from
                 a model map and estimate probabilities. This one is usually
                 the best. probability: compare local rms of map to
                 distribution for all points in this map and estimate
                 probabilities. In a few cases this one is much better than
                 histograms. wang: take points with highest local rms and
                 define as protein.
      mask_from_pdb= None You can specify a PDB file to define a mask for the
                     macromolecule in density modification (i.e., the solvent
                     boundary). All points within rad_mask_from_pdb of an atom
                     in the PDB file defined by mask_from_pdb will be
                     considered to be within the macromolecule
      rad_mask_from_pdb= 2 You can define the radius for calculation of the
                         protein mask. Applies only to mask_from_pdb.
      modify_outside_delta_solvent= 0.05 You can set the initial solvent
                                    content to be a little lower than
                                    calculated when you are running
                                    modify_outside_model. Usually 0.05 is fine.
      modify_outside_model= False You can choose whether to modify the density
                            in the "protein" region outside the
                            region specified in your current model by matching
                            histograms with the region that is specified by
                            that model. This can help by raising the density
                            in this protein region up to a value similar to
                            that where atoms are already placed.
      truncate_ha_sites_in_resolve= *Auto True False You can choose to
                                    truncate the density near heavy-atom sites
                                    at a maximum of 2.5 sigma. This is useful
                                    in cases where the heavy-atom sites are
                                    very strong, and rarely hurts in cases
                                    where they are not. The heavy-atom sites
                                    are specified with
                                    "input_ha_file" and the radius
                                    is rad_mask
      rad_mask= None You can define the radius for calculation of the protein
                mask. Applies only to truncate_ha_sites_in_resolve. Default is
                resolution of data.
      use_resolve_fragments= True This script normally uses information from
                             fragment identification as part of density
                             modification for the first few cycles of
                             model-building. Fragments are identified during
                             model-building. The fragments are used, with
                             weighting according to the confidence in their
                             placement, in density modification as targets for
                             density values.
      use_resolve_pattern= True Local pattern identification is normally used
                           as part of density modification during the first
                           few cycles of model building.
      use_hl_anom_in_denmod= False Default is False (use HL coefficients in
                             density modification). NOTE: if True, you must
                             supply HLanom coefficients. Allows you to specify
                             that HL coefficients including only the phase
                             information from the imaginary (anomalous
                             difference) contribution from the anomalous
                             scatterers are to be used in density
                             modification. Two sets of HL coefficients are
                             produced by Phaser. HLA HLB etc are HL
                             coefficients including the contribution of both
                             the real scattering and the anomalous
                             differences. HLanomA HLanomB etc are HL
                             coefficients including the contribution of the
                             anomalous differences alone. These HL
                             coefficients for anomalous differences alone are
                             the ones that you will want to use in cases where
                             you are bringing in model information that
                             includes the real scattering from the model used
                             in Phaser, such as when you are carrying out
                             density modification with a model or refinement
                             of a model. If use_hl_anom_in_denmod=True then the
                             HLanom HL coefficients from Phaser are used in
                             density modification
      use_hl_anom_in_denmod_with_model= False See use_hl_anom_in_denmod. If
                                        use_hl_anom_in_denmod_with_model=True then the
                                        HLanom HL coefficients from Phaser are
                                        used in density modification with a
                                        model
      mask_as_mtz= False Defines how omit_output_mask_file,
                   ncs_output_mask_file and protein_output_mask_file are
                   written out. If mask_as_mtz=False it will be a ccp4 map. If
                   mask_as_mtz=True it will be an mtz file with map
                   coefficients FP PHIM FOMM (all three required)
      protein_output_mask_file= None Name of map to be written out
                                representing your protein (non-solvent)
                                region. If mask_as_mtz=False the map will be a
                                ccp4 map. If mask_as_mtz=True it will be an
                                mtz file with map coefficients FP PHIM FOMM
                                (all three required)
      ncs_output_mask_file= None Name of map to be written out representing
                            your ncs asymmetric unit. If mask_as_mtz=False the
                            map will be a ccp4 map. If mask_as_mtz=True it
                            will be an mtz file with map coefficients FP PHIM
                            FOMM (all three required)
      omit_output_mask_file= None Name of map to be written out representing
                             your omit region. If mask_as_mtz=False the map
                             will be a ccp4 map. If mask_as_mtz=True it will
                             be an mtz file with map coefficients FP PHIM FOMM
                             (all three required)
   maps
      maps_only= False You can choose whether to skip all model-building and
                 just calculate maps and write out the results. This also runs
                 just 1 cycle and turns on HL coefficients.
      n_xyz_list= None You can specify the grid to use for map calculations.
   model_building
      build_type= *RESOLVE RESOLVE_AND_BUCCANEER You can choose to build
                  models with RESOLVE or with RESOLVE and BUCCANEER, and how
                  many different models to build with RESOLVE. The more you
                  build, the more likely you are to get a complete model.
                  Note that rebuild_in_place can only be carried out with
                  RESOLVE model-building. For BUCCANEER model building you
                  need CCP4 version 6.1.2 or higher and BUCCANEER version
                  1.3.0 or higher
      allow_negative_residues= False Normally the wizard does not allow
                               negative residue numbers, and all residues with
                               negative numbers are rejected when they are
                               read in. You can allow them if you wish.
      highest_resno= None Highest residue number to be considered
                     "placed" in sequence for rebuild_in_place
      semet= False You can specify that the dataset that is used for
             refinement is a selenomethionine dataset, and that the model
             should be the SeMet version of the protein, with all SD of MET
             replaced with Se of MSE.
      use_met_in_align= *Auto True False You can use the heavy-atom positions
                        in input_ha_file as markers for Met SD positions.
      base_model= None You can enter a PDB file with coordinates to be used as
                  a starting point for model-building. These coordinates will
                  be included in the same way as fragments placed by searching
                  for helices and strands in initial model-building. Note the
                  difference from the use of models in
                  consider_main_chain_list, which are merged with models after
                  they are built. NOTE: Only use this if you want to keep the
                  input model and just add to it.
      consider_main_chain_list= None This keyword lets you name any number of
                                PDB files to consider as templates for
                                model-building. Every time models are built,
                                the contents of these files will be merged
                                with them and the best parts will be kept.
                                NOTE: this only uses the main-chain atoms of
                                your PDB files.
      dist_connect_max_helices= None Set maximum distance between ends of
                                helices and other ends to try and connect them
                                in insert_helices.
      edit_pdb= True You can choose to edit the input PDB file in
                rebuild_in_place to match the input sequence (default=True).
                NOTE: residues with residue numbers higher than
                'highest_resno' are assumed to not have a known sequence and
                will not be edited. By default the value of 'highest_resno' is
                the highest residue number from the sequence file, after
                adding it to the starting residue number from
                start_chains_list. You can also set it directly
      helices_strands_only= False You can choose to use a quick model-building
                            method that only builds secondary structure. At
                            low resolution this may be both quicker and more
                            accurate than trying to build the entire structure.
                            If you are running the AutoSol Wizard, normally
                            you should choose 'False' as standard building is
                            quick. When your structure is solved by AutoSol,
                            go on to AutoBuild and build a more complete model
                            (still using helices_strands_only=False). NOTE:
                            helices_strands_only does not apply in AutoSol if
                            phase_improve_and_build=True
      helices_strands_start= False You can choose to use a quick
                             model-building method that builds secondary
                             structure as a way to get started...then model
                             completion is done as usual. (Contrast with
                             helices_strands_only which only does secondary
                             structure)
      cc_helix_min= None Minimum CC of helical density to map at low
                    resolution when using helices_strands_only
      cc_strand_min= None Minimum CC of strand density to map when using
                     helices_strands_only
      trace_loops= False Use trace_loops algorithm in loop fitting
      standard_loops= True Use standard_loops algorithm in loop fitting
      loop_lib= False Use loop_lib algorithm in loop fitting
      include_input_model= True The keyword include_input_model defines
                           whether the input model (if any) is to be crossed
                           with models that are derived from it, and the best
                           parts of each kept. It also defines whether the
                           input model is to be included in combination steps
                           during initial model-building. Note that if
                           multiple_models=True and include_input_model=True
                           then no initial cycle of randomization will be
                           carried out and the keyword
                           multiple_models_starting_resolution is ignored. In
                           most cases you should use include_input_model=True.
                           If you want to generate maximum diversity with
                           multiple-models then you may wish to use
                           include_input_model=False. Also if you want to
                           decrease the amount of bias from your starting
                           model you may wish to use
                           include_input_model=False.
      input_compare_file= None If you are rebuilding a model or already think
                          you know what the model should be, you can include a
                          comparison file in rebuilding. The model is not used
                          for anything except to write out information on
                          coordinate differences in the output log files.
                          NOTE: this feature does not always work correctly.
      merge_models= False You can choose to only merge any input models and
                    write out the resulting model. The best parts of each
                    model will be kept based on model-map correlation.
                    Normally used along with number_of_parallel_models=1
      morph= False You can choose whether to distort your input model in order
             to match the current working map. This may be useful for MR
             models that are quite distant from the correct structure.
      morph_cycles= 2 Number of iterations of morphing each time it is run.
      morph_rad= 7 Smoothing radius for morphing. The density from your model
                 and from the map are calculated with the radius morph_rad,
                 then they are adjusted to overlap optimally
      n_ca_enough_helices= None Set maximum number of CA to add to ends of
                           helices and other ends to try and connect them in
                           insert_helices.
      delta_phi= 20 Approximate angular sampling for search for regular
                 secondary structure in building
      offsets_list= 53 7 23 You can specify an offset for the orientation of
                    the helix and strand templates in building. This is used
                    in generating different starting models.
      ps_in_rebuild= False You can choose to use a prime-and-switch resolve
                     map in all cycles of rebuilding instead of a
                     density-modified map. This is normally used in
                     combination with maps_only to generate a prime-and-switch
                     map.
      use_ncs_in_ps= False You can choose to use NCS in prime-and-switch
      remove_outlier_segments_z_cut= 3.0 You can remove any segments that are
                                     not assigned to sequence during
                                     model-building if the mean density at
                                     atomic positions is more than
                                     remove_outlier_segments_z_cut sd lower
                                     than the mean for the structure.
      refine= True This script normally refines the model during building. Say
              False to skip refinement
      reference_model= None You can specify a reference model for refinement
      resolution_build= 0 Enter the high-resolution limit for model-building.
                        If 0.0, the value of resolution is used as a default.
      restart_cycle_after_morph= 5 Morphing (if morph=True) will go only up to
                                 this cycle, and then the morphed PDB file
                                 will be used as a starting PDB file from then
                                 on, removing all previous models. If
                                 restart_cycle_after_morph=0 then the model
                                 will be morphed and not rebuilt
      retrace_before_build= False You can choose to retrace your model n_mini
                            times and use a map based on these retraced models
                            to start off model-building. This is the default
                            for rebuilding models if you are not using
                            rebuild_in_place. You can also specify
                            n_iter_rebuild, the number of cycles of
                            retrace-density-modify-build before starting the
                            main build.
      reuse_chain_prev_cycle= True You can choose whether model-building
                              includes atoms from each cycle in the model for
                              the next cycle.
      richardson_rotamers= *Auto True False You can choose to use the rotamer
                           library from SC Lovell, JM Word, JS Richardson and
                           DC Richardson (2000) "The Penultimate Rotamer
                           Library", Proteins: Structure, Function, and
                           Genetics 40, 389-408. Typically this
                           works well in RESOLVE model-building for
                           nearly-final models but not as well earlier in the
                           process. Default (Auto) is to use these rotamers
                           for rebuild_in_place but not otherwise.
      rms_random_frag= None Rms random position change added to residues on
                       ends of fragments when extending them. If you enter a
                       negative number, defaults will be used.
      rms_random_loop= None Rms random position change added to residues on
                       ends of loops in tries for building loops. If you enter
                       a negative number, defaults will be used.
      start_chains_list= None You can specify the starting residue number for
                         each of the unique chains in your structure. If you
                         use a sequence file then the unique chains are
                         extracted and the order must match the order of your
                         starting residue numbers. For example, if your
                         sequence file has chains A and B (identical) and
                         chains C and D (identical to each other, but
                         different than A and B) then you can enter 2 numbers,
                         the starting residues for chains A and C. NOTE: you
                         need to specify an input sequence file for
                         start_chains_list to be applied.
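
The pairing of start_chains_list values with unique chains can be pictured with a short Python sketch. This is only an illustration of the rule described above, not AutoBuild's own code: the function names, the chain IDs and the sequences are hypothetical, and "unique" is taken here to mean "first chain with a given sequence, in file order".

```python
from typing import Dict, List, Tuple


def unique_chains(chains: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the first chain seen for each distinct sequence, in file order."""
    seen = set()
    unique = []
    for chain_id, sequence in chains:
        if sequence not in seen:
            seen.add(sequence)
            unique.append((chain_id, sequence))
    return unique


def assign_start_residues(
    chains: List[Tuple[str, str]], start_chains_list: List[int]
) -> Dict[str, int]:
    """Pair each unique chain with its starting residue number, in order."""
    unique = unique_chains(chains)
    if len(unique) != len(start_chains_list):
        raise ValueError(
            "need one starting residue number per unique chain "
            f"({len(unique)} unique chains, {len(start_chains_list)} numbers)"
        )
    return {cid: start for (cid, _), start in zip(unique, start_chains_list)}


# Hypothetical sequence file: A and B identical, C and D identical.
chains = [
    ("A", "MKTAYIAK"),
    ("B", "MKTAYIAK"),
    ("C", "GSHMLEDP"),
    ("D", "GSHMLEDP"),
]
print(assign_start_residues(chains, [1, 101]))  # {'A': 1, 'C': 101}
```

Only two numbers are needed for four chains because the copies B and D follow the numbering of A and C.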
      trace_as_lig= False You can specify that in building steps the ends of
                    chains are to be extended using the LigandFit algorithm.
                    This is default for nucleic acid model-building.
      track_libs= False You can keep track of what libraries each atom in a
                  built structure comes from.
      two_fofc_in_rebuild= False You can choose to use a sigmaa-weighted
                           2Fo-Fc map in all cycles of rebuilding instead of a
                           density-modified map. If the model is poor this can
                           sometimes allow model-building in place to work
                           even when it will not for density-modified maps.
      refine_map_coeff_labels= "2FOFCWT PH2FOFCWT" You can pick which map
                               coefficients from phenix.refine to use if
                               two_fofc_in_rebuild=True
      filled_2fofc_maps= True You can choose to use filled 2Fo-Fc maps when
                         two_fofc_in_rebuild is used. Default is True
      map_phasing= False You can choose to use statistical density
                   modification starting with a 2mFo-DFc map, including model
                   information instead of a standard density-modified map.
                   This density modification will include NCS if present.
      use_any_side= True You can choose to have resolve model-building place
                    the best-fitting side chain at each position, even if the
                    sequence is not matched to the map.
      use_cc_in_combine_extend= False You can choose to use the correlation of
                                density rather than density at atomic
                                positions to score models in combine_extend
   multiple_models
      combine_only= False Once you have created a set of initial models you
                    can merge them together into a final set. This option is
                    useful if you have split up the creation of multiple
                    models into different directories, and then you have
                    copied all the initial models to one directory for
                    combining.
      multiple_models= False You can build a set of models, all compatible
                       with your data. You can specify how many models with
                       multiple_models_number. If you are using
                       rebuild_in_place you can specify whether to generate
                       starting models or not with multiple_models_starting.
      multiple_models_first= 1 Specify which model to build first
      multiple_models_group_number= 5 You can build several initial models and
                                    merge them. Normally 5 initial models is
                                    fine.
      multiple_models_last= 20 Specify which model to end with
      multiple_models_number= 20 Specify how many models to build.
      multiple_models_starting= True You can specify how to generate starting
                                models for multiple models. If you are using
                                rebuild_in_place and you specify
                                "True" then the Wizard will rebuild
                                your starting model at the resolution
                                specified in
                                multiple_models_starting_resolution. If you
                                are not using rebuild_in_place the Wizard will
                                always build a starting model at the current
                                resolution.
      multiple_models_starting_resolution= 4 You can set the resolution for
                                           rebuilding an initial model. A
                                           value of 0.0 will use the
                                           resolution of the dataset.
      place_waters_in_combine= True You can choose whether phenix.refine
                               automatically places ordered solvent (waters)
                               during the last cycle of multiple-model
                               generation. This is separate from place_waters,
                               which applies to all other cycles.
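
As a rough sketch of how the creation of multiple models might be split across directories before a combine_only run, the following Python helper divides the range multiple_models_first..multiple_models_last into batches and prints one command per batch. The batch size, the helper names and the placeholder for the input arguments are assumptions made for illustration; only the keyword names come from the list above.

```python
from typing import List, Tuple


def batch_ranges(first: int, last: int, batch_size: int) -> List[Tuple[int, int]]:
    """Split the inclusive range first..last into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ranges = []
    start = first
    while start <= last:
        end = min(start + batch_size - 1, last)
        ranges.append((start, end))
        start = end + 1
    return ranges


def batch_commands(first: int = 1, last: int = 20, batch_size: int = 5) -> List[str]:
    """Build one command line per batch; run each in its own directory."""
    # Placeholder only: substitute the data, model and sequence arguments
    # that you normally pass to AutoBuild.
    base = "phenix.autobuild <your usual inputs> multiple_models=True"
    return [
        f"{base} multiple_models_first={lo} multiple_models_last={hi}"
        for lo, hi in batch_ranges(first, last, batch_size)
    ]


for command in batch_commands():
    print(command)
```

After the batches finish, the initial models would be copied into one directory and merged with combine_only=True, as described for that keyword.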
   ncs
      find_ncs= *Auto True False This script normally deduces ncs information
                from the NCS in chains of models that are built during
                iterative model-building. The update is done each cycle in
                which an improved model is obtained. Say False to skip this.
                See also "input_ncs_file" which can be used to
                specify NCS at the start of the process. If
                find_ncs="No" then only this starting NCS will be
                used and it will not be updated. You can use
                find_ncs="No" to specify exactly what residues will be used
                in NCS refinement and exactly what NCS operators to use in
                density modification. You can use the function
                $PHENIX/phenix/phenix/command_line/simple_ncs_from_pdb.py to
                help you set up an input_ncs_file that has your specifications
                in it.
      input_ncs_file= None You can enter NCS information in 3 ways: (1) an
                      ncs_spec file produced by AutoSol or AutoBuild with NCS
                      information, (2) a heavy-atom PDB file that contains NCS
                      in the heavy-atom sites, or (3) a PDB file with a model
                      that contains chains with NCS. The wizard will derive NCS
                      information from any of these if specified. See also
                      "find_ncs" which determines whether the wizard
                      will update NCS from models that are built during
                      iterative building.
      ncs_copies= None Number of copies of the molecule in the asymmetric unit
                  (note: only one type of molecule allowed at present)
      ncs_refine_coord_sigma_from_rmsd= False You can choose to use the
                                        current NCS rmsd as the value of the
                                        sigma for NCS restraints. See also
                                        ncs_refine_coord_sigma_from_rmsd_ratio
      ncs_refine_coord_sigma_from_rmsd_ratio= 1 You can choose to multiply the
                                              current NCS rmsd by this value
                                              before using it as the sigma for
                                              NCS restraints. See also
                                              ncs_refine_coord_sigma_from_rmsd
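
The relationship between ncs_refine_coord_sigma_from_rmsd and ncs_refine_coord_sigma_from_rmsd_ratio amounts to a single multiplication, sketched below in Python. The function name and the default_sigma argument are hypothetical; the value used when the rmsd is not applied is not stated here, so the caller supplies it.

```python
def ncs_coord_sigma(
    current_rmsd: float,
    default_sigma: float,
    from_rmsd: bool = False,
    ratio: float = 1.0,
) -> float:
    """Return the sigma to use for NCS coordinate restraints.

    default_sigma is a stand-in for whatever sigma would otherwise be used;
    it is an assumption of this sketch, not a documented AutoBuild value.
    """
    if not from_rmsd:
        return default_sigma
    # Scale the current NCS rmsd by the ratio before using it as sigma.
    return current_rmsd * ratio


print(ncs_coord_sigma(0.4, default_sigma=0.05))                  # 0.05
print(ncs_coord_sigma(0.4, 0.05, from_rmsd=True, ratio=2.0))     # 0.8
```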
      no_merge_ncs_copies= False Normally False (do merge NCS copies). If
                           True, then do not use each NCS copy to try to build
                           the others.
      optimize_ncs= True This script normally deduces ncs information from the
                    NCS in chains of models that are built during iterative
                    model-building. Optimize NCS adds a step to try and make
                    the molecule formed by NCS as compact as possible, without
                    losing any point-group symmetry.
      use_ncs_in_build= True Use NCS information in the model assembly stage
                        of model-building. Also if no_merge_ncs_copies is not
                        set, then use each NCS copy to try to build the
                        others.
   omit
      composite_omit_type= *None simple_omit refine_omit sa_omit
                           iterative_build_omit Your choices of types of OMIT
                           maps are: None - normal operation, no omit;
                           simple_omit - omit the atoms in OMIT region in
                           calculating a sigmaA-weighted 2mFo-DFc map with no
                           refinement; refine_omit - as simple_omit, but
                           refine with standard refinement; sa_omit - omit the
                           atoms in OMIT region, carry out simulated-annealing
                           refinement, then calculate a sigmaA-weighted
                           2mFo-DFc map; iterative_build_omit - set occupancy
                           of atoms in OMIT region to 0 throughout an entire
                           iterative model-building, density modification and
                           refinement process (takes a long time). All these
                           omit map types are available as composite omit maps
                           (default) or as omit maps around a region defined
                           by a PDB file (using omit_box_pdb_list). The
                           resulting OMIT map will be in the directory OMIT
                           with file name resolve_composite_map.mtz. This mtz
                           file contains the map coefficients to create the
                           OMIT map. The file "omit_region.mtz"
                           contains the coefficients for a map showing the
                           boundaries of the OMIT region.
      n_box_target= None You can tell the Wizard how many omit boxes to try
                    and set up (but it will not necessarily choose your number
                    because it has to be nicely divisible into boxes that fit
                    your asymmetric unit). A suitable number is 24. The larger
                    the number of boxes, the better the map will be, but the
                    longer it will take to calculate the map.
      n_cycle_image_min= 3 Pattern recognition (resolve_pattern) and fragment
                         identification ("image based density
                         modification") are used as part of the density
                         modification process. These are normally only useful
                         in the first few cycles of iterative model-building.
                         This script tries model-building both with and
                         without including image information, and proceeds
                         with the most complete model. Once at least
                         n_cycle_image_min cycles have been carried out with
                         image information, if the image-based map results in
                         a less-complete model than the one without image
                         information, image information is no longer included.
      n_cycle_rebuild_omit= 10 Model-building is normally carried out using
                            the "best" available map. If
                            omit_on_rebuild is True, then every
                            n_cycle_rebuild_omit cycle of model rebuilding, a
                            composite omit map is used instead. If you specify
                            0 and omit_on_rebuild is True, omit maps will be
                            used every cycle. Normally every 10th cycle is
                            optimal.
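
The interplay of omit_on_rebuild and n_cycle_rebuild_omit can be sketched as a small Python predicate. This is an illustration under assumptions: cycles are numbered from 1 and the omit map is assumed to fall on exact multiples of n_cycle_rebuild_omit, which the text above does not spell out. The function name is hypothetical.

```python
def uses_omit_map(
    cycle: int, omit_on_rebuild: bool, n_cycle_rebuild_omit: int = 10
) -> bool:
    """Decide whether a rebuild cycle builds into a composite omit map."""
    if not omit_on_rebuild:
        return False          # always use the "best" available map
    if n_cycle_rebuild_omit == 0:
        return True           # 0 means an omit map every cycle
    # Assumption: the omit cycle is every n-th cycle, counting from 1.
    return cycle % n_cycle_rebuild_omit == 0


omit_cycles = [c for c in range(1, 31) if uses_omit_map(c, True, 10)]
print(omit_cycles)  # [10, 20, 30]
```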
      offset_boundary= 2. Specify the boundary in A around atoms in
                       omit_box_pdb for definition of omit region. Contrast
                       with omit_boundary which applies for composite omit
      omit_boundary= 2. Specify the boundary in A around atoms in omit_boxes
                     for definition of omit region. Contrast with
                     offset_boundary which applies for omit_box_pdb
      omit_box_start= 0 To only carry out omit in some of the omit boxes, use
                      omit_box_start and omit_box_end
      omit_box_end= 0 To only carry out omit in some of the omit boxes, use
                    omit_box_start and omit_box_end
      omit_box_pdb_list= None This keyword applies if you have set OMIT region
                         specification to "omit_around_pdb". To
                         automatically set an OMIT region specify a PDB
                         file(s) with omit_box_pdb_list. The omit region
                         boundaries will be the limits in x y z of the atoms
                         in this file, plus a border of offset_boundary. To
                         use only some of the atoms in the file, specify the
                         starting residue, ending residue and chain to omit
                         (omit_res_start_list, omit_res_end_list and
                         omit_chain_list). If you specify more than one file
                         (or if you specify more than one segment of a file
                         with omit_chain_list or omit_res_start_list and
                         omit_res_end_list) then a set of omit runs will be
                         carried out and combined into one composite omit.
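
The omit region described for omit_box_pdb_list (the limits in x, y and z of the atoms, plus a border of offset_boundary) corresponds to a padded bounding box. A minimal Python sketch follows; the function name and the coordinates are made up for illustration, and reading the PDB file is left out.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def omit_box(coords: Iterable[Point], offset_boundary: float = 2.0) -> Tuple[Point, Point]:
    """Return (lower, upper) corners of the box enclosing coords plus a border.

    Coordinates and offset_boundary are in Angstroms.
    """
    coords = list(coords)
    if not coords:
        raise ValueError("no atoms supplied for the omit region")
    xs, ys, zs = zip(*coords)
    lower = tuple(min(axis) - offset_boundary for axis in (xs, ys, zs))
    upper = tuple(max(axis) + offset_boundary for axis in (xs, ys, zs))
    return lower, upper


# Hypothetical atom positions.
atoms = [(10.0, 5.0, 3.0), (12.5, 7.0, 1.0), (11.0, 4.0, 6.0)]
lower, upper = omit_box(atoms)
print(lower)  # (8.0, 2.0, -1.0)
print(upper)  # (14.5, 9.0, 8.0)
```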
      omit_chain_list= None You can choose to omit just a portion of your
                       model with the keywords omit_res_start_list 3
                       omit_res_end_list 4 omit_chain_list chain1 (use ""
                       to select all chains). The residues from 3 to 4 of
                       chain1 will be omitted. You can specify more than one
                       region by listing them separated by spaces. If you
                       specify more than one region, a separate omit run will
                       be carried out for each one and then the maps will be
                       put together afterwards. If there is more than one
                       chain in the input PDB file then only the chain defined
                       by omit_chain will be omitted. NOTE: Zero for start and
                       end and "" for chain is the same as choosing
                       everything.
      omit_offset_list= 0 0 0 0 0 0 To carry out one iterative build omit,
                        with a region defined in grid units, enter
                        nxs,nxe,nys,nye,nzs,nze in omit_offset_list.
      omit_on_rebuild= False You can specify whether to use an omit map for
                       building the model on rebuild cycles. Default is True
                       if you start with a model, False if you are building a
                       model from scratch. The omit map is calculated every
                       n_cycle_rebuild_omit cycles
      omit_selection= None Selection string defining atoms in input pdb to be
                      used to define the OMIT region. For use with
                      omit_region_specification=omit_selection
      omit_region_specification= *composite_omit omit_around_pdb
                                 omit_selection You can specify what region an
                                 omit (simple/sa-omit/iterative-build-omit)
                                 map is to be calculated for. Composite omit
                                 will create a map over the entire asymmetric
                                 unit by dividing the asymmetric unit into
                                 overlapping boxes, calculating omit maps for
                                 each, and splicing all the results together
                                 into a single composite omit map. You can
                                 tell the Wizard how many omit boxes to try
                                 and set up with the keyword
                                 "n_box_target" (but it will not
                                 necessarily choose your number because it has
                                 to be nicely divisible into boxes that fit
                                 your asymmetric unit). Omit around PDB will
                                 omit around the region defined by the PDB
                                 file(s) you enter for omit_box_pdb (or around
                                 the residues in that PDB file that you
                                 specify). If you specify omit_around_pdb then
                                 you must enter a pdb file to omit around. If
                                 you specify omit_selection you must enter a
                                 selection string in omit_selection
      omit_res_start_list= None You can choose to omit just a portion of your
                           model with the keywords omit_res_start_list 3
                           omit_res_end_list 4 omit_chain_list chain1 (use
                           " " for blank). The residues from 3 to 4
                           of chain1 will be omitted. You can specify more
                           than one region by listing them separated by
                           spaces. If you specify more than one region, a
                           separate omit run will be carried out for each one
                           and then the maps will be put together afterwards.
                           If there is more than one chain in the input PDB
                           file then only the chain defined by omit_chain will
                           be omitted. NOTE: Zero for start and end and
                           "" for chain is the same as choosing
                           everything.
      omit_res_end_list= None You can choose to omit just a portion of your
                         model with the keywords omit_res_start_list 3
                         omit_res_end_list 4 omit_chain_list chain1 (use
                         " " for blank). The residues from 3 to 4 of
                         chain1 will be omitted. You can specify more
                         than one region by listing them separated by spaces.
                         If you specify more than one region, a separate omit
                         run will be carried out for each one and then the
                         maps will be put together afterwards. If there is
                         more than one chain in the input PDB file then only
                         the chain defined by omit_chain will be omitted. NOTE:
                         Zero for start and end and "" for chain is
                         the same as choosing everything.
   rebuild_in_place
      min_seq_identity_percent_rebuild_in_place= 50 The sequence in your input
                                                 PDB file will be adjusted to
                                                 match the sequence in your
                                                 sequence file (if any). You
                                                 can specify the minimum
                                                 sequence identity between
                                                 your sequence file and a
                                                 segment from your input PDB
                                                 file to consider the
                                                 sequences to be matched.
                                                 Default is 50.0%. You might
                                                 want a higher number to make
                                                 sure that deletions in the
                                                 sequence are noticed. The
                                                 value you specify applies to
                                                 rebuild_in_place only. Use
                                                 min_seq_identity_percent
                                                 instead for non
                                                 rebuild_in_place runs.
      n_cycle_rebuild_in_place= None Number of cycles for rebuild_in_place for
                                multiple models only
      n_rebuild_in_place= 1 You can choose how many times to rebuild your
                          model in place with rebuild_in_place
      rebuild_chain_list= None You can choose to rebuild just a portion of
                          your model with the keywords rebuild_res_start_list 3
                          rebuild_res_end_list 4 rebuild_chain_list chain1
                          (use " " for blank). The residues from 3
                          to 4 of chain1 will be rebuilt. You can specify more
                          than one region by using the Parameter Group Options
                          button to add lines. If there is more than one
                          chain in the input PDB file then only the chain
                          defined by rebuild_chain will be rebuilt. The
                          smallest region that can be rebuilt is 4 residues.
      rebuild_in_place= *Auto True False You can choose to rebuild your model
                        while fixing the sequence alignment by iteratively
                        rebuilding segments within the model. This is done
                        n_rebuild_in_place times, then the models are
                        recombined, taking the best-fitting parts of each.
                        Crossovers are allowed where main-chain atom rmsd is
                        less than dist_close. Note that the sequence of the input
                        model must match the supplied sequence closely enough
                        to allow a clear alignment. Also this method does not
                        build any new chain, it just moves the existing model
                        around. Normally this procedure is useful if the model
                        is greater than 95% identical with the target
                        sequence. You can include information directly from
                        the starting model if you want with the keyword
                        include_input_model. Then this model will be
                        recombined with the models that are built based on it.
                        Note that this requires that the input model have a
                        sequence that is identical to the model to be rebuilt.
                        You can also rebuild just a portion of the model with
                        the keywords rebuild_res_start_list 3
                        rebuild_res_end_list 4 rebuild_chain_list chain1 (use
                        " " for blank). The residues from 3 to 4 of
                        chain1 will be rebuilt. You can specify more than one
                        region by using the Parameter Group Options button to
                        add lines. NOTE: if a region cannot be rebuilt the
                        original coordinates will be preserved for that
                        region.
      rebuild_near_chain= None You can specify where to rebuild either with
                          rebuild_res_start_list rebuild_res_end_list
                          rebuild_chain_list or with rebuild_near_res and
                          rebuild_near_chain and rebuild_near_dist.
      rebuild_near_dist= 7.5 You can specify where to rebuild either with
                         rebuild_res_start_list rebuild_res_end_list
                         rebuild_chain_list or with rebuild_near_res and
                         rebuild_near_chain and rebuild_near_dist.
      rebuild_near_res= None You can specify where to rebuild either with
                        rebuild_res_start_list rebuild_res_end_list
                        rebuild_chain_list or with rebuild_near_res and
                        rebuild_near_chain and rebuild_near_dist.
      rebuild_res_end_list= None You can choose to rebuild just a portion of
                            your model with the keywords
                            rebuild_res_start_list 3 rebuild_res_end_list 4
                            rebuild_chain_list chain1 (use " " for
                            blank). The residues from 3 to 4 of chain1 will be
                            rebuilt. You can specify more than one region by
                            using the Parameter Group Options button to add
                            lines. If there is more than one chain in the
                            input PDB file then only the chain defined by
                            rebuild_chain will be rebuilt. The smallest region
                            that can be rebuilt is 4 residues.
      rebuild_res_start_list= None You can choose to rebuild just a portion of
                              your model with the keywords
                              rebuild_res_start_list 3 rebuild_res_end_list 4
                              rebuild_chain_list chain1 (use " " for
                              blank). The residues from 3 to 4 of chain1 will
                              be rebuilt. You can specify more than one region
                              by using the Parameter Group Options button to
                              add lines. If there is more than one chain in
                              the input PDB file then only the chain defined
                              by rebuild_chain will be rebuilt. The smallest
                              region that can be rebuilt is 4 residues.
      rebuild_side_chains= False You can choose to replace side chains (with
                           extend_only) before rebuilding the model (not
                           normally used)
      redo_side_chains= True You can have AutoBuild decide whether to replace
                        each of your side chains in rebuild_in_place, taking
                        new ones if they fit the density better. If
                        True, this is applied to all side chains, not only
                        those that are rebuilt.
      replace_existing= True In rebuild_in_place the usual default is to force
                        the replacement of all residues, even if the rebuilt
                        ones are not as good a fit as the original. The
                        rebuilt model is then crossed with the original model
                        (if include_input_model=True) and the better parts of
                        each are then kept. You can override the replacement
                        of all residues in the initial model-building by
                        saying "False" (do not force replacement of
                        residues, keep whatever is better). Additionally if
                        you set the "touch_up" flag then the default
                        is "True": keep whatever is better.
      delete_bad_residues_only= False You can simply delete the worst parts of
                                your model and write out the resulting model
                                with delete_bad_residues_only=True. The
                                criteria used are the ones set with touch_up.
                                Any residues that would be rebuilt by
                                touch_up=True will be deleted by
                                delete_bad_residues_only. NOTE:
                                delete_bad_residues_only ignores ligands,
                                waters etc. so you may need to put them back
                                in afterwards.
      touch_up= False You can rebuild just the worst parts of your model by
                setting touch_up=True. You can decide what parts to rebuild
                based on a minimum model-map correlation (by residue). This
                is set with min_cc_res_rebuild=0.82. Alternatively you can
                rebuild the worst percentage of residues:
                worst_percent_res_rebuild=6. If a value is set for both of
                these then residues qualifying in either way are rebuilt.
                NOTE: touch_up is only available with rebuild_in_place.
      touch_up_extra_residues= None Number of residues on each side of the
                               residues identified in touch_up that you want
                               to rebuild. Normally you will want to rebuild
                               one or more on each side.
      worst_percent_res_rebuild= 2 You can rebuild just the worst parts of
                                 your model by setting touch_up=True. You can
                                 decide how much to rebuild using
                                 worst_percent_res_rebuild or with
                                 min_cc_res_rebuild, or both.
      smooth_range= None You can specify what number of residues to smooth in
                    making choices for touch_up and delete_bad_residues_only.
                    Typically use 3 or 5.
      smooth_minimum_length= None If specified, then any segments remaining
                             after smoothing that are shorter than
                             smooth_minimum_length will be removed.
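
The selection rule described for touch_up (residues below a minimum model-map correlation, or among the worst percentage, or both, optionally widened by touch_up_extra_residues) can be sketched in Python as follows. This is an illustration only, not the AutoBuild implementation: the function name is hypothetical, the strict "less than" comparison and the rounding of the percentage are assumptions, residues are identified by their position in a single chain, and smoothing with smooth_range is not modelled.

```python
from typing import List, Optional


def residues_to_rebuild(
    cc_by_residue: List[float],
    min_cc: Optional[float] = None,
    worst_percent: Optional[float] = None,
    extra: int = 0,
) -> List[int]:
    """Return sorted indices of residues selected for rebuilding."""
    n = len(cc_by_residue)
    selected = set()
    if min_cc is not None:
        # Criterion 1: per-residue correlation below the cutoff.
        selected.update(i for i, cc in enumerate(cc_by_residue) if cc < min_cc)
    if worst_percent is not None:
        # Criterion 2: the worst-fitting percentage of residues.
        n_worst = int(round(n * worst_percent / 100.0))
        ranked = sorted(range(n), key=lambda i: cc_by_residue[i])
        selected.update(ranked[:n_worst])
    # Widen each selected residue by `extra` neighbours on each side.
    expanded = set()
    for i in selected:
        for j in range(i - extra, i + extra + 1):
            if 0 <= j < n:
                expanded.add(j)
    return sorted(expanded)


# Hypothetical per-residue correlations for a 10-residue chain.
cc = [0.90, 0.85, 0.50, 0.88, 0.91, 0.70, 0.95, 0.93, 0.90, 0.92]
print(residues_to_rebuild(cc, min_cc=0.6, worst_percent=20))           # [2, 5]
print(residues_to_rebuild(cc, min_cc=0.6, worst_percent=20, extra=1))  # [1, 2, 3, 4, 5, 6]
```

Residue 2 qualifies under both criteria and residue 5 only under the percentage criterion, which shows the "either way" behaviour when both values are set.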
   refinement
      refine_b= True You can choose whether phenix.refine is to refine
                individual atomic displacement parameters (B values)
      refine_se_occ= True You can choose to refine the occupancy of SE atoms
                     in a SEMET structure (default=True). This only applies if
                     semet=true
      skip_clash_guard= True Skip refinement check for atoms that clash
      correct_special_position_tolerance= None Adjust tolerance for special
                                          position check. If 0., then check
                                          for clashes near special positions
                                          is not carried out. This sometimes
                                          allows phenix.refine to continue
                                          even if an atom is near a special
                                          position. If 1., then checks within
                                          1 A of special positions. If None,
                                          then uses phenix.refine default. (1)
      use_mlhl= True This script normally uses information from the input file
                (HLA HLB HLC HLD) in refinement. Say No to refine on Fobs only.
      place_waters= True You can choose whether phenix.refine automatically
                    places ordered solvent (waters) during the refinement
                    process.
      refinement_resolution= 0 Enter the high-resolution limit for refinement
                             only. This high-resolution limit can be different
                             than the high-resolution limit for other steps.
                             The default ("None" or 0.0) is to use
                             the overall high-resolution limit for this run
                             (as set by resolution)
      ordered_solvent_low_resolution= None You can choose what resolution
                                      cutoff to use for placing ordered solvent
                                      in phenix.refine. If the resolution of
                                      refinement is greater than this cutoff,
                                      then no ordered solvent will be placed,
                                      even if
                                      refinement.main.ordered_solvent=True.
      link_distance_cutoff= 3 You can specify the maximum bond distance for
                            linking residues in phenix.refine called from the
                            wizards.
      r_free_flags_fraction= 0.1 Maximum fraction of reflections in the free R
                             set. You can choose the maximum fraction of
                             reflections in the free R set and the maximum
                             number of reflections in the free R set. The
                             number of reflections in the free R set will be
                             up to the lower of the values defined by these two
                             parameters.
      r_free_flags_max_free= 2000 Maximum number of reflections in the free R
                             set. You can choose the maximum fraction of
                             reflections in the free R set and the maximum
                             number of reflections in the free R set. The
                             number of reflections in the free R set will be
                             up to the lower of the values defined by these two
                             parameters.
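
The way these two limits combine can be sketched as follows. This is an
illustration only, not the Wizard's code: the function name is hypothetical,
and rounding the fractional limit down is an assumption.

```python
def max_free_reflections(n_reflections, fraction=0.1, max_free=2000):
    """Upper bound on the free R set size (illustrative helper).

    fraction -> r_free_flags_fraction, max_free -> r_free_flags_max_free.
    Rounding down with int() is an assumption made for this sketch.
    """
    limit_from_fraction = int(n_reflections * fraction)
    # The free R set can be no larger than the lower of the two limits.
    return min(limit_from_fraction, max_free)


if __name__ == "__main__":
    for n in (8000, 50000):
        print(n, max_free_reflections(n))
```

With the defaults, the fraction is the limiting value for small datasets and
r_free_flags_max_free takes over once the dataset exceeds 20000 reflections.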
      r_free_flags_use_lattice_symmetry= True When generating r_free_flags you
                                         can decide whether to include lattice
                                         symmetry (good in general, necessary
                                         if there is twinning).
      r_free_flags_lattice_symmetry_max_delta= 5 You can set the maximum
                                               deviation of distances in the
                                               lattice that are to be
                                               considered the same for
                                               purposes of generating a
                                               lattice-symmetry-unique set of
                                               free R flags.
      allow_overlapping= False You can allow atoms in your ligand files to
                         overlap atoms in your protein/nucleic acid model.
                         This overrides 'keep_pdb_atoms'. Useful in early
                         stages of model-building and refinement. The ligand
                         atoms get the altloc indicator 'L'. NOTE: The ligand
                         occupancy will be refined by default if you set
                         allow_overlapping=True (because of the altloc
                         indicator). You can turn this off with
                         fix_ligand_occupancy=True.
      fix_ligand_occupancy= None If allow_overlapping=True then ligand
                            occupancies are refined as a group. You can turn
                            this off with fix_ligand_occupancy=true. NOTE: has
                            no effect if allow_overlapping=False.
      remove_outlier_segments= True You can remove any segments that are not
                               assigned to sequence if their mean B values are
                               more than remove_outlier_segments_z_cut sd
                               higher than the mean for the structure. NOTE:
                               this is done after refinement, so the R/Rfree
                               are no longer applicable; the remarks in the
                               PDB file are removed
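
The B-value test described for remove_outlier_segments can be sketched like
this. It is a minimal illustration under stated assumptions: the function and
argument names are hypothetical, the population standard deviation is assumed,
and the z_cut value used in the example is arbitrary rather than the Wizard's
default.

```python
from statistics import mean, pstdev


def outlier_segments(unassigned_segments, all_b_values, z_cut):
    """Return names of segments whose mean B is more than z_cut sd above the mean.

    unassigned_segments: dict mapping a segment name to the B values of its
    atoms (only segments not assigned to sequence are candidates).
    all_b_values: B values for the whole structure.
    """
    overall_mean = mean(all_b_values)
    overall_sd = pstdev(all_b_values)  # population sd is an assumption
    threshold = overall_mean + z_cut * overall_sd
    return [name for name, b in unassigned_segments.items() if mean(b) > threshold]


if __name__ == "__main__":
    structure_b = [20, 22, 24, 26, 28]
    segments = {"seg_1": [25, 27], "seg_2": [60, 70]}
    print(outlier_segments(segments, structure_b, z_cut=3.0))
```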
      twin_law= None You can specify a twin law for refinement like this:
                twin_law='-h,k,-l'
      max_occ= None You can choose to set the maximum value of occupancy for
               atoms that have their occupancies refined. Default is None (use
               default value of 1.0 from phenix.refine)
      refine_before_rebuild= True You can choose to refine the input model
                             before rebuilding it
      refine_with_ncs= True This script can allow phenix.refine to
                       automatically identify NCS and use it in refinement.
                       NOTE: ncs refinement and placing waters automatically
                       are mutually exclusive at present.
      refine_xyz= True You can choose whether phenix.refine is to refine
                  coordinates
      s_annealing= False You can choose to carry out simulated annealing
                   during the first refinement after initial model-building
      skip_hexdigest= False You may wish to ignore the hexdigest of the free R
                      flags in your input PDB file if (1) the dataset you
                      provide is not identical to the one that you refined
                      with (but has the same free R flags), or (2) you are
                      providing both an input_data_file and an
                      input_refinement_file or input_hires_file. In the
                      second case, the resulting composite file may not have
                      the same hexdigest even though the free R flags are
                      copied over. The default is to set skip_hexdigest=True
                      for case #2. For case #1 you have to tell the Wizard to
                      skip the hexdigest (because it cannot know about this).
      use_hl_anom_in_refinement= False See use_hl_anom_in_denmod. If
                                 use_hl_anom_in_refinement=True then the
                                 HLanom HL coefficients from Phaser are used
                                 in refinement
   thoroughness
      build_outside= True Define whether to use the BuildOutside module in
                     model_building
      connect= True Define whether to use the connect module in
               model_building. This module tries to connect nearby chains with
               loops, without using the sequence. This is different than
               fit_loops (which uses the sequence to identify the exact number
               of residues in the loop).
      extensive_build= False You can choose whether to build a new model on
                       every cycle and carry out extra model-building steps
                       every cycle. Default is False (build a new model on
                       first cycle, after that carry out extra steps).
      fit_loops= True You can fit loops automatically if sequence alignment
                 has been done.
      insert_helices= True Define whether to use the insert_helices module in
                      model_building. This module tries to insert helices
                      identified with find_helices_strands into the current
                      working model. This can be useful as the standard build
                      sometimes builds strands into helical density at low
                      resolution.
      n_cycle_build= None Choose number of cycles of building and chain
                     extension during each cycle of model-building (default
                     of 1).
      n_cycle_build_max= 6 Maximum number of cycles for iterative
                         model-building, starting from experimental phases
                         without a model. Even if a satisfactory model is not
                         found, a maximum of n_cycle_build_max cycles will be
                         carried out.
      n_cycle_build_min= 1 Minimum number of cycles for iterative
                         model-building, starting from experimental phases
                         without a model. Even if a satisfactory model is
                         found, n_cycle_build_min cycles will be carried out.
      n_cycle_rebuild_max= 15 Maximum number of cycles for iterative
                           model-rebuilding, starting from a model. Even if a
                           satisfactory model is not found, a maximum of
                           n_cycle_rebuild_max cycles will be carried out.
      n_cycle_rebuild_min= 1 Minimum number of cycles for iterative
                           model-rebuilding, starting from a model. Even if a
                           satisfactory model is found, n_cycle_rebuild_min
                           cycles will be carried out.
      n_mini= 10 You can choose how many times to retrace your model in
              "retrace_before_build"
      n_random_frag= 0 In resolve building you can randomize each fragment
                     slightly so as to generate more possibilities for tracing
                     based on extending it.
      n_random_loop= 3 Number of randomized tries from each end for building
                     loops. If 0, then one try. If N, then N additional tries
                     with randomization based on rms_random_loop.
      n_try_rebuild= 2 Number of attempts to build each segment of chain
      ncycle_refine= 3 Choose number of refinement cycles (3)
      number_of_models= None This parameter lets you choose how many initial
                        models to build with RESOLVE within a single build
                        cycle. This parameter is now superseded by
                        number_of_parallel_models, which sets the number of
                        models (but now entire build cycles) to carry out in
                        parallel. None or zero means set it automatically.
                        That is what you normally should use. The
                        number_of_models is by default set to 1 and
                        number_of_parallel_models is set to the value of
                        nbatch (typically 4).
      number_of_parallel_models= 0 This parameter lets you choose how many
                                 models to build in parallel. None or 0 means
                                 set it automatically. That is what you
                                 normally should use. You can set this to 1 to
                                 prevent the wizard from running multiple jobs
                                 in parallel
      skip_combine_extend= False You can choose whether to skip the
                           combine-extend step in model-building if only one
                           model is available
      fully_skip_combine_extend= False You can choose whether to skip the
                                 combine-extend step in model-building in all
                                 cases
      thorough_loop_fit= True Try many conformations and accept them even if
                         the fit is not perfect? If you say True the
                         parameters for thorough loop fitting are:
                         n_random_loop=100 rms_random_loop=0.3
                         rho_min_main=0.5 while if you say False those for
                         quick loop fitting are: n_random_loop=20
                         rms_random_loop=0.3 rho_min_main=1.0
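
The two loop-fitting presets listed for thorough_loop_fit can be written out
as a lookup table. The values are the ones given above; the dictionary and
function names are hypothetical and only serve this illustration.

```python
# Presets as listed in the thorough_loop_fit description.
LOOP_FIT_PRESETS = {
    True: {"n_random_loop": 100, "rms_random_loop": 0.3, "rho_min_main": 0.5},
    False: {"n_random_loop": 20, "rms_random_loop": 0.3, "rho_min_main": 1.0},
}


def loop_fit_parameters(thorough_loop_fit=True):
    """Return a copy of the preset selected by thorough_loop_fit."""
    return dict(LOOP_FIT_PRESETS[bool(thorough_loop_fit)])


if __name__ == "__main__":
    print(loop_fit_parameters(True))
    print(loop_fit_parameters(False))
```

The thorough preset tries more randomized conformations and accepts weaker
density (lower rho_min_main) than the quick preset.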
   general
      coot_name= "coot" If your version of coot is called something else, then
                 you can specify that here.
      i_ran_seed= 72432 Random seed (positive integer) for model-building and
                  simulated annealing refinement
      raise_sorry= False You can have any failure end with a Sorry instead of
                   simply printing a message to the screen
      background= True When you specify nproc=nn, you can run the jobs in
                  background (default if nproc is greater than 1) or
                  foreground (default if nproc=1). If you set run_command=qsub
                  (or otherwise submit to a batch queue), then you should set
                  background=False, so that the batch queue can keep track of
                  your runs. There is no need to use background=True in this
                  case because all the runs go as controlled by your batch
                  system. If you use run_command='sh ' (or similar, sh is
                  default) then normally you will use background=True so that
                  all the jobs run simultaneously.
      max_wait_time= 1.0 You can specify the length of time (seconds) to wait
                     when looking for a file. If you have a cluster where jobs
                     do not start right away you may need a longer time to
                     wait. The symptom of too short a wait time is 'File not
                     found'
      wait_between_submit_time= 1.0 You can specify the length of time
                                (seconds) to wait between each job that is
                                submitted when running sub-processes. This can
                                be helpful on NFS-mounted systems when running
                                with multiple processors to avoid file
                                conflicts. The symptom of too short a
                                wait_between_submit_time is File exists:....
      cache_resolve_libs= True Use caching of resolve libraries to speed up
                          resolve
      resolve_size= 12 Size for solve/resolve
                    ("", "_giant", "_huge", "_extra_huge", or a number,
                    where 12=giant and 18=huge)
      check_run_command= False You can have the wizard check your run command
                         at startup
      run_command= "sh " When you specify nproc=nn, you can run the
                   subprocesses as jobs in background with sh (default) or
                   submit them to a queue with the command of your choice
                   (e.g., qsub). If you have a multi-processor machine, use
                   sh. If you have a cluster, use qsub or the equivalent
                   command for your system. NOTE: If you set run_command=qsub
                   (or otherwise submit to a batch queue), then you should set
                   background=False, so that the batch queue can keep track of
                   your runs. There is no need to use background=True in this
                   case because all the runs go as controlled by your batch
                   system. If nproc is greater than 1 and you use
                   run_command='sh ' (or similar, sh is default) then normally
                   you will use background=True so that all the jobs run
                   simultaneously.
      last_process_is_local= True If true, run the last process in a group in
                             background with sh as part of the job that is
                             submitting jobs. This prevents having the job
                             that is submitting jobs sit and wait for all the
                             others while doing nothing
      skip_r_factor= False You can skip R-factor calculation if refinement is
                     not done and maps_only=True
      skip_xtriage= False You can bypass xtriage if you want. This will
                    prevent you from applying anisotropy corrections, however.
      base_path= None You can specify the base path for files (default is
                 current working directory)
      temp_dir= None Define a temporary directory (it must exist)
      clean_up= True At the end of the entire run the TEMP directories will be
                removed if clean_up is True. The default is yes, delete these
                directories. If you want to remove them after your run is
                finished use a command like "phenix.autobuild run=1
                clean_up=True". Files listed in keep_files will not be
                deleted.
      solution_output_pickle_file= None At end of run, write solutions to this
                                   file in output directory if defined
      title= None Enter any text you like to help identify what you did in
             this run
      top_output_dir= None This is used in subprocess calls of wizards and to
                      tell the Wizard where to look for the STOPWIZARD file.
      wizard_directory_number= None This is used by the GUI to define the run
                               number for Wizards. It is the same as
                               desired_run_number NOTE: this value can only be
                               specified on the command line, as the directory
                               number is set before parameters files are read.
      verbose= False Command files and other verbose output will be printed
      extra_verbose= False Facts and possible commands will be printed every
                     cycle if True
      debug= False You can have the wizard stop with error messages about the
             code if you use debug. NOTE: you cannot use Pause with debug.
             Additionally the output goes to the terminal if you specify
             "debug=True"
      require_nonzero= True Require non-zero values in data columns to
                       consider reading in.
      remove_path_word_list= None List of words identifying paths to remove
                             from PATH. These can be used to shorten your PATH.
                             For example, "cns ccp4 coot" would remove all
                             paths containing these words except those also
                             containing phenix. Capitalization is ignored.
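
The filtering rule described for remove_path_word_list can be sketched as
follows. This is an illustration of the rule as stated, not the Wizard's
implementation; the function name and the sep argument are hypothetical.

```python
def shorten_path(path, words, sep=":"):
    """Drop PATH entries containing any of the words, unless they contain 'phenix'.

    Matching ignores capitalization, as described for remove_path_word_list.
    """
    lowered_words = [w.lower() for w in words]
    kept = []
    for entry in path.split(sep):
        lowered_entry = entry.lower()
        matches_word = any(w in lowered_entry for w in lowered_words)
        if matches_word and "phenix" not in lowered_entry:
            continue  # remove this entry
        kept.append(entry)
    return sep.join(kept)


if __name__ == "__main__":
    example = "/usr/bin:/opt/CCP4/bin:/opt/phenix/build/ccp4:/usr/local/coot/bin"
    print(shorten_path(example, ["cns", "ccp4", "coot"]))
```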
      fill= False Fill in all missing reflections to resolution res_fill.
            Applies to density modified maps. See also filled_2fofc_maps in
            autobuild.
      res_fill= None Resolution for filling in missing data (default = highest
                resolution of any datafile). Only applies to density modified
                maps. Default is fill to high resolution of data. Ignored if
                fill=False
      keep_files= overall_best* AutoBuild_run_*.log List of files that are not
                  to be cleaned up. wildcards permitted
      after_autosol= False You can specify that you want to continue on
                     starting with the highest-scoring run of AutoSol in your
                     working directory.
      nbatch= 3 You can specify the number of processors to use (nproc) and
              the number of batches to divide the data into for parallel jobs.
              Normally you will set nproc to the number of processors
              available and leave nbatch alone. If you leave nbatch as None it
              will be set automatically, with a value depending on the Wizard.
              This is recommended. The value of nbatch can affect the results
              that you get, as the jobs are not split into exact replicates,
              but are rather run with different random numbers. If you want to
              get the same results, keep the same value of nbatch.
      nproc= 1 You can specify the number of processors to use (nproc) and the
             number of batches to divide the data into for parallel jobs.
             Normally you will set nproc to the number of processors available
             and leave nbatch alone. If you leave nbatch as None it will be
             set automatically, with a value depending on the Wizard. This is
             recommended. The value of nbatch can affect the results that you
             get, as the jobs are not split into exact replicates, but are
             rather run with different random numbers. If you want to get the
             same results, keep the same value of nbatch. If you set
             nproc=Auto and your machine has n processors, then it will use
             n-1 processors, or 1 if only 1 is available
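
The nproc=Auto rule can be sketched as below. The function name is
hypothetical, and using os.cpu_count() to find the number of processors is an
assumption of this sketch, not a statement about how the Wizard detects them.

```python
import os


def resolve_nproc(nproc, available=None):
    """Return the number of processors to use for a given nproc setting.

    nproc=Auto gives n-1 processors on a machine with n processors,
    or 1 if only 1 is available. Any other value is used as given.
    """
    if available is None:
        available = os.cpu_count() or 1  # assumption: fall back to 1 if unknown
    if str(nproc).lower() == "auto":
        return max(1, available - 1)
    return int(nproc)


if __name__ == "__main__":
    print(resolve_nproc("Auto", available=8))
    print(resolve_nproc("Auto", available=1))
    print(resolve_nproc(4, available=8))
```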
      quick= False Run everything quickly (number_of_parallel_models=1
             n_cycle_build_max=1 n_cycle_rebuild_max=1)
      resolve_command_list= None Commands for resolve. One per line in the
                            form: keyword value (value can be optional).
                            Examples:
                              coarse_grid
                              resolution 200 2.0
                              hklin test.mtz
                            NOTE: for command-line usage you need to
                            enclose the whole set of commands in double quotes
                            (") and each individual command in single
                            quotes (') like this:
                            resolve_command_list="'no_build' 'b_overall 23' "
      resolve_pattern_command_list= None Commands for resolve_pattern. One per
                                    line in the form: keyword value (value can
                                    be optional). Examples:
                                      resolution 200 2.0
                                      hklin test.mtz
                                    NOTE: for command-line usage you need to
                                    enclose the whole set of commands in double
                                    quotes (") and each individual command in
                                    single quotes (') like this:
                                    resolve_pattern_command_list="'resolution 200 20' 'hklin test.mtz' "
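
If you generate AutoBuild command lines from a script, the quoting convention
for resolve_command_list and resolve_pattern_command_list can be produced with
a small helper such as the one below. The helper name is hypothetical; it
simply reproduces the form shown above, including the space before the closing
double quote.

```python
def command_list_argument(keyword, commands):
    """Format commands as keyword="'cmd 1' 'cmd 2' " for the command line.

    Each command is wrapped in single quotes and the whole set in double
    quotes, matching the documented examples.
    """
    for command in commands:
        if "'" in command or '"' in command:
            raise ValueError("commands must not contain quote characters")
    quoted = " ".join("'%s'" % command for command in commands)
    return '%s="%s "' % (keyword, quoted)


if __name__ == "__main__":
    print(command_list_argument("resolve_command_list", ["no_build", "b_overall 23"]))
```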
      ignore_errors_in_subprocess= False Try to ignore errors in sub-processes.
                                   This is useful in cases where a very rare
                                   crash occurs and you want to just ignore
                                   that step and go on.
      send_notification= False
      notify_email= None
   special_keywords
      write_run_directory_to_file= None Writes the full name of a run
                                   directory to the specified file. This can
                                   be used as a call-back to tell a script
                                   where the output is going to go.
   run_control
      coot= None Set coot to True and optionally run=[run-number] to run Coot
            with the current model and map for run run-number. In some wizards
            (AutoBuild) you can edit the model and give it back to PHENIX to
            use as part of the model-building process. If you just say coot
            then the facts for the highest-numbered existing run will be
            shown.
      ignore_blanks= None ignore_blanks allows you to have a command-line
                     keyword with a blank value like
                     "input_lig_file_list="
      stop= None You can stop the current wizard with "stopwizard"
            or "stop". If you type "phenix.autobuild run=3
            stop" then this will stop run 3 of autobuild.
      display_facts= None Set display_facts to True and optionally
                     run=[run-number] to display the facts for run run-number.
                     If you just say display_facts then the facts for the
                     highest-numbered existing run will be shown.
      display_summary= None Set display_summary to True and optionally
                       run=[run-number] to show the summary for run
                       run-number. If you just say display_summary then the
                       summary for the highest-numbered existing run will be
                       shown.
      carry_on= None Set carry_on to True to carry on with highest-numbered
                run from where you left off.
      run= None Set run to n to continue with run n where you left off.
      copy_run= None Set copy_run to n to copy run n to a new run and continue
                where you left off.
      display_runs= None List all runs for this wizard.
      delete_runs= None List runs to delete: 1 2 3-5 9:12
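
The run list format shown for delete_runs can be expanded into individual run
numbers as sketched below. This is an illustration only: the function name is
hypothetical, and treating both "3-5" and "9:12" as inclusive ranges is an
assumption based on the example above.

```python
def parse_run_list(text):
    """Expand a run list such as "1 2 3-5 9:12" into a list of run numbers.

    Assumption: "a-b" and "a:b" both mean runs a through b inclusive.
    """
    runs = []
    for token in text.split():
        for separator in ("-", ":"):
            if separator in token:
                first, last = token.split(separator)
                runs.extend(range(int(first), int(last) + 1))
                break
        else:
            # No range separator found: the token is a single run number.
            runs.append(int(token))
    return runs


if __name__ == "__main__":
    print(parse_run_list("1 2 3-5 9:12"))
```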
      display_labels= None display_labels=test.mtz will list all the labels
                      that identify data in test.mtz. You can use the label
                      strings that are produced in AutoSol to identify which
                      data to use from a datafile like this:
                      peak.data="F+ SIGF+ F- SIGF-". The entire
                      string in quotes counts here. You can use the individual
                      labels from these strings as identifiers for data
                      columns in AutoSol or AutoBuild like this:
                      input_refinement_labels="FP SIGFP FreeR_flags"
                      # each individual label counts
      dry_run= False Just read in and check parameter names
      params_only= False Just read in and return parameter defaults. Not for
                   general use
      display_all= False Just read in and display parameter defaults
   non_user_parameters These are obsolete parameters and parameters that the
                       wizards use to communicate among themselves. Not
                       normally for general use.
      gui_output_dir= None Used only by the GUI
      background_map= None You can supply an mtz file (REQUIRED LABELS: FP
                      PHIM FOMM) to use as map coefficients to calculate the
                      electron density in all points in an omit map that are
                      not part of any omitted region. (Default="")
      boundary_background_map= None You can supply an mtz file (REQUIRED
                               LABELS: FP PHIM FOMM) to use as map
                               coefficients to calculate the electron density
                               in all points in the boundary map that are not
                               part of any omitted region.
                               (Default="")
      extend_try_list= True You can fill out the list of parallel jobs to
                       match the number of jobs you want to run at one time,
                       as specified with nbatch.
      force_combine_extend= False You can choose whether to force the
                            combine-extend step in model-building
      model_list= None This keyword lets you name any number of PDB files to
                  consider as starting models for model-building. NOTE: This
                  differs from consider_main_chain_list which will try to add
                  your PDB files EVERY cycle of merging models. In contrast
                  model_list will only do it on the first cycle. NOTE: this
                  only uses the main-chain atoms of your PDB files.
      oasis_cnos= None Enter number of C N O and S atoms here if you have
                  OASIS and want to run it before resolve density modification
                  like this: "C 250 N 121 O 85 S 3"
      offset_boundary_background_map= None You can set the offset of the
                                      boundary_background_map.
      sg= None Obsolete. Use space_group instead
      input_data_file= None Not normally used (same as "data=").
      input_map_file= Auto Not normally used. (Same as map_file).
      input_refinement_file= Auto Not normally used. Same as refinement_file
      input_pdb_file= None Not normally used. Same as "model="
      input_seq_file= Auto Not normally used. Same as seq_file