Using the PHENIX Wizards
Purpose
Any Wizard can be run from the PHENIX GUI, from the command-line, and
from keyworded parameters files. All three versions are identical except
in the way that they take commands and keywords from the user.
This page describes how to run a Wizard and what a Wizard does in
general. The specific Wizard help pages describe the details of each
PHENIX Wizard.
Overview of Structure Determination with the PHENIX Wizards
You can use the AutoSol Wizard to solve structures by SAD, MAD,
SIR/SIRAS, and MIR/MIRAS.
The AutoSol Wizards together can carry out MRSAD. The AutoSol
Wizard can also combine SAD, MAD, SIR, and MIR datasets and solve the
structure using all available data.
Once you have experimental or MR phases, you can carry out iterative
model-building, density modification, and refinement with the AutoBuild
Wizard to improve your model. Finally you can use the rebuild_in_place
feature of the AutoBuild Wizard to make one very good final model.
If your structure contains ligands, you can place them using the
LigandFit Wizard
This help page describes how to run the Wizards from a GUI, the
command-line, or a parameters file. The individual Wizard documentation
pages describe the strategies and commands for each Wizard:
Usage
Wizard data directories, sub-directories, Facts, and the PDS (Project Data Storage)
- The directory that you are in when you start up PHENIX is your
working directory.
- Each run of a Wizard will have all output data in a subdirectory of
your working directory named like this (for AutoSol run 3):
AutoSol_run_3_/
- This subdirectory will have one or more temporary directories:
AutoSol_run_3_/TEMP0/
which contain intermediate files. These temporary directories will be
deleted when the Wizard is finished (unless you set the parameter
clean_up to False)
- For OMIT and MULTIPLE-MODEL runs, the final OMIT maps and multiple
models will be in a subdirectory of your run directory:
AutoSol_run_3_/OMIT/
AutoSol_run_3_/MULTIPLE_MODELS/
- All the parameter values as well as any other information that a
Wizard generates during its run is stored in the PDS (Project Data
Storage) and/or the Wizard Facts. The Facts are values of parameters
and pointers to files in the PDS. The Facts keep track of the current
knowledge available to the Wizard. Each time a step is completed by a
Wizard, the new Facts are saved (overwriting old ones for that run).
As the Facts define the state of the Wizard, the Wizard can be
restarted any time by loading the appropriate set of Facts.
- The PDS (Project data storage) will be in your working directory:
./PDS/
The PDS contains the output of each of your runs for all Wizards and a
record of all the Facts (parameters and data) for each run. If you
delete a run using the PHENIX Wizard GUI or with a command like
"phenix.autosol delete_runs=2", the corresponding entries in the PDS
are also deleted. You can copy the PDS from one place to another. Note
that if you delete directories such as "AutoSol_run_1_" by hand then
the corresponding information remains in the PDS. For this reason it is
best to use the GUI or specific commands to delete runs.
Running a Wizard using a multiprocessor machine or on a cluster
You can take advantage of having a multiprocessor machine or a cluster
when running the wizards (Currently this applies to the LigandFit and
AutoBuild Wizards). For example, adding command
nproc=4
to a command-line command for a Wizard will use 4 processors to run the
wizard (if possible). Normally you will run the parallel processes in
the background with the default of
background=True
If you have a cluster with a batch queue, you can send subprocesses to
the batch queue with
run_command=qsub
(or whatever your batch command is). In this case you will use
background=False
so that the batch queue can keep track of your jobs.
The Wizards divide the data into nbatch batches during processing. The
value of
nbatch=3
is set from 3 to 5 by default (depending on the Wizard) and is
appropriate if you have up to nbatch processors. If you have more, then
you may wish to increase nbatch to match the number of processors. The
reason it is done this way is that the value of nbatch can affect the
results that you get, as the jobs are not split into exact replicates,
but are rather run with different random numbers. If you want to get the
same results, keep the same value of nbatch.
Running a Wizard from a GUI
Basic operation of a Wizard from the GUI
- Start up the PHENIX GUI in your working directory by typing "phenix"
- Answer "yes" to the question "Do you want to make it a project
directory?".
- Launch a Wizard from the PHENIX GUI by double-clicking on the name of
the Wizard ("AutoSol") under "Wizards" in the Strategy Interface of
the main GUI.
- The Wizard will come up in a blue window and will open a grey
Parameters window asking you for information on what files to use and
what to do.
- Enter the file names and make choices as necessary (NOTE: to select a
file click on the yellow box to the right of the file entry field. To
add a new file entry field click on the "Parameter group options" tab
if present).
- Proceed to the next window by clicking "Continue" in the upper left
corner of the grey Parameters window.
- The Wizard will guide you through the necessary inputs, then it will
continue on its own until it is finished.
- When the Wizard is done, you can double-click on the Display icon
(the little magnifying glass on the upper left of the blue Wizard
window) to show a list of files and maps that can be displayed.
(NOTE: The Display Options window is updated when you open it. Once
this window is open you cannot open it again until you close it.
Sometimes this window may be behind other windows and this will
prevent you from opening it again.)
- You can open the Parameters window any time the Wizard is stopped by
clicking on the Parameters icon (4 little lines in the upper left
corner of the blue Wizard window). This allows you to carry out some
of the more advanced options below.
- Your output log file will be in a file called "AutoSol.1.output" for
an AutoSol run. You can also see the same file by clicking on the
"LOG" button at the lower right of the blue or green window.
Keeping track of multiple runs of a Wizard from the GUI
- You can run more than one Wizard job at a time if you want. Each run
of a Wizard is put in a separate sub-directory (e.g.,
"AutoSol_run_1_").
- When you start a Wizard, it will start a new run of that Wizard.
- If you want to continue on with the highest-numbered run of a Wizard,
you can start the Wizard with the continue button for that Wizard
(for example the continue_AutoSol button).
- If you want to go back to a previous run, you can use the Run Control
and Run Number selections near the bottom of any Parameters window
(NOTE: to open the parameters window click on the lines at the upper
left of the blue Wizard window). Select goto_run and choose a run
number to go to.
- If you want to copy a previous run and go on, use the Run Control and
Run Number selections and select copy_run and choose a run number to
copy. The Wizard will create a new run (with number equal to the
highest previous number plus one) and carry on with it.
- To see what runs are available, select View or Delete Runs in the
Navigate tab at the lower left of any Parameters window.
- If you want to stop the Wizard, hit the PAUSE button on the green
Wizard window (the Wizard is green when running, blue or purple when
stopped). NOTE: this may take a little time, particularly if Phaser
or HYSS or phenix.refine are running. In those cases if you really
want to stop the Wizard right away, got to "Strategy" and then select
"Stop Strategy" and it will be stopped.
Setting parameters of a Wizard from the GUI
- You can set any parameter in a Wizard by selecting the variable in
the Choose Variable to Set tab. The next time you click Continue, the
Wizard will save all the current inputs as usual, and then instead of
going on to the next step, it will open a window asking you for the
new value of that variable. When you enter it and press Continue, the
Wizard will continue on with what it was doing, but with this new
value.
- NOTE that some parameters (e.g., resolution) may affect many steps.
If a prior step is affected by a parameter that is changed, the
Wizard does not go back and change it. If you want the parameter
change to affect something that has already been done, you need to
re-run the corresponding step.
- NOTE that you can set any SOLVE, RESOLVE or RESOLVE_PATTERN keyword
when you are running a Wizard using the "resolve_command",
"solve_command" or "resolve_pattern_command" keywords. These can
be set in the GUI from the Choose Variable pull-down menu. You just
type in the command to the entry form like this: (for
resolve_command):
res_start 4.0
telling resolve in this case to start out density modification at a
resolution of 4 A. This allows you to control what solve, resolve and
resolve_pattern do more finely than you otherwise can in the Wizards.
Navigating steps in a Wizard from the GUI
- When the Wizard is done or Paused, you can select any available step
in the Navigate tab at the middle bottom of any Parameter window.
This tells the Wizard to get any necessary inputs for that step and
to then carry it out.
- The Wizards normally start out in Manual mode (one step at a time,
asking user for inputs). Once the necessary inputs are entered, the
Wizard enters Automatic mode (no more asking for inputs until
something required is missing). You can control this by specifying
Manual or Automatic in the Auto/Manual tab at the bottom right of any
Wizard.
Running a Wizard from the command-line
Basic operation of a Wizard from the command-line
- You can run a wizard from the command line like this (autosol is the
AutoSol wizard):
phenix.autosol data=w1.sca seq_file=seq.dat 2 Se
- The command_line interpreter will try to interpret obvious
information (2 means sites=2, Se means atom_type=Se) and will run
the wizard.
- To see all the information about this wizard and the keywords that
you can set for this wizard, type:
phenix.autosol --help all
- Any wizard keyword can be entered at the command line (not just the
ones labelled "command-line only"). The documentation for each wizard
lists all the keywords that apply to that wizard.
- If you want to stop a Wizard, you can create a file "STOPWIZARD" and
put it in the subdirectory (i.e., AutoSol_2_/) where the Wizard is
running. This is like hitting the PAUSE button on the GUI and stops
the wizard cleanly.
Keeping track of multiple runs of a Wizard from the command-line
- When you start a Wizard from the command line, the default is to
start a new run of that Wizard.
- To see all the available runs of this Wizard, type:
phenix.autosol show_runs
- To delete runs 1,2 and 4-7 of this Wizard, type something like this:
phenix.autosol delete_runs="1 2 4-7"
Note that the group of numbers is enclosed in quotes ("). This tells the
input parser (iotbx.phil) that all these numbers go with the one keyword
of delete_runs. Note also that there are no spaces around the "=" sign!
- To go back to run 2 and carry on (remembering all previous inputs and
possibly adding new ones, in this case setting the resolution) type
something like:
phenix.autosol run=2 resolution=3.0
- To carry on with the current highest-numbered run (remembering all
previous inputs and possibly adding new ones, in this case setting
the resolution) type something like:
phenix.autosol carry_on resolution=3.0
- To copy run 2 to a new run and carry on from there (remembering
all previous inputs and possibly adding new ones, in this case
setting the resolution) type something like:
phenix.autosol copy_run=2 resolution=3.0
Setting parameters of a Wizard from the command-line
When you run a Wizard from the command-line, two files are produced and
put in the subdirectory of the Wizard (e.g., AutoBuild_run_3_/).
- A parameters (".eff") file will be produced that you can edit to
rerun the Wizard:
phenix.autosol autosol.eff
This autosol.eff file (for AutoSol) contains the values of all the
AutoSol parameters at the time of starting the Wizard.
Note that the syntax in the autosol.eff file is very slightly different
than the syntax from the command line. From the command line, if a value
has several parts, you enclose them in quotes and there are no spaces
around the "=" sign:
phenix.autosol ... input_phase_labels="FP PHIM FOMM"
In the .eff file, you MUST leave off the quotes or the three values will
be treated as one, and you should leave blanks around the "=" sign:
input_phase_labels = FP PHIM FOMM
The reason these are different is that in the .eff file, the structure
of the file and the brackets tell the PHIL parser what is grouped
together, while from the commmand line, the quotes tell the parser what
is to be grouped together.
- A parameters file (".eff") is produced that you can edit and use like
this:
phenix.autosol parameters.eff
- To get keyword help on a specific keyword you can type:
phenix.autosol --help data # get help on the keyword data for autosol
- To show current Facts (values of all parameters) for
highest_numbered run:
phenix.autosol show_facts
- To show current Facts (values of all parameters) for run 3:
phenix.autosol run=3 show_facts
phenix.autosol show_summary
- When you use a keyword like data= you need to give enough
information to specify this keyword uniquely. You can see all the
keywords for each PHENIX Wizard or tool at the end of the
documentation for that Wizard or tool. This will have entries like
this (for AutoSol):
autosol
sites= None Number of heavy-atom sites. (Command-line only)
which describes the keyword sites in the scope defined by
autosol. You can explicitly specify this on the command line with:
autosol.sites=3
which in this case is entirely the same as
sites=3
- NOTE that you can set any SOLVE, RESOLVE or RESOLVE_PATTERN keyword
in PHENIX using the "resolve_command", "solve_command" or
"resolve_pattern_command" keywords from the command line. The
format is a little tricky: you have to put two sets of quotes around
the command like this:
resolve_command="'ligand_start start.pdb'" # NOTE ' and " quotes
This will put the text
ligand_start start.pdb
at the end of every temporary command file created to run resolve.
Running a Wizard from a parameters file
Parameters files are an easy way to specify any parameters that you want
to use when running a Wizard. They are structured in a clear way and you
can edit them to set the values that you want.
You can get a parameters file to edit with any wizard by running the
wizard with the flag "--show_defaults":
phenix.autosol --show_defaults
Here is a parameters file "sad.eff" to run a SAD dataset with
"phenix.autosol sad.eff":
autosol { atom_type = Se
sites = 2
seq_file = sequence.dat
crystal_info {
space_group = C2
unit_cell = 76.08 27.97 42.36 90 103.2 90
resolution = 2.6
}
wavelength {
data = high.sca
lambda = 0.9600
f_prime = -1.5
f_double_prime = 3
}
}
Note the scope names ("autosol" or "crystal_info") followed by paired
brackets ({ ....}) which enclose sets of parameters that are related.
The values of parameters are usually entered on one line, without
quotation marks (as in this example) unless they are to be all
considered as a single item.
You can specify almost anything either in a parameters file or on the
command line. In the above example you could also just say:
phenix.autosol atom_type = Se sites = 2 seq_file = sequence.dat \
space_group = C2 \
unit_cell = "76.08 27.97 42.36 90 103.2 90" \
resolution = 2.6 \
data = high.sca \
lambda = 0.9600 \
f_prime = -1.5 \
f_double_prime = 3
(note that the cell parameters are in quotes on the command line and not
in the parameters file) and you would get the same results. For simple
cases the command-line format is fine, but for anything with a lot of
parameters to set it is much easier to just edit a parameters file.
Specific limitations and problems:
- In the GUI version of Wizards, The Display Options window is updated
only when you open it. Further, once this window is open you cannot
open it again until you close it. Sometimes this window may be behind
other windows and this will prevent you from opening it again until
you close the open window.
- The Wizards use file names based on the names of your input files,
but they do not differentiate between files with the same name coming
from different directories. Consequently you should not use two files
with different contents but with the same file name as inputs to a
Wizard, even if they come from separate starting directories.
- If you stop a Wizard and continue on with a command such as
phenix.autobuild run=2 then you can change most parameters with
keywords just as if you were starting from scratch, but if you had
previously changed a keyword away from the default, you cannot set it
back to the default in this way (the Wizard ignores keywords that are
the same as the default).
- You should not work on the same run in two ways at the same time.
This can lead to unpredictable results because the two runs will
really be the same run and the data and databases for the two runs
will be overwriting each other. This means you need to be careful
that if you goto_run 1 of a Wizard in one window that you do not
also goto_run 1 of the same Wizard in another window. On the other
hand, it is perfectly fine to work on run 1 of a Wizard in one window
and run 2 of the same Wizard in another window.
- The PHENIX Wizards can take most settings of most space groups,
however they can only use the hexagonal setting of rhombohedral space
groups (eg., #146 R3:H or #155 R32:H), and cannot use space groups
114-119 (not found in macromolecular crystallography) even in the
standard setting due to difficulties with the use of asuset in the
version of ccp4 libraries used in PHENIX for these settings and space
groups.
Literature
Additional information