Using the PHENIX Wizards
- Purpose
- Overview of Structure Determination with the PHENIX Wizards
- Usage
- Wizard data directories, sub-directories, Facts, and the PDS (Project Data Storage)
- Running a Wizard using a multiprocessor machine or on a cluster
- Running a Wizard from a GUI
- Basic operation of a Wizard from the GUI
- Keeping track of multiple runs of a Wizard from the GUI
- Setting parameters of a Wizard from the GUI
- Navigating steps in a Wizard from the GUI
- Running a Wizard from the command-line
- Basic operation of a Wizard from the command-line
- Keeping track of multiple runs of a Wizard from the command-line
- Setting parameters of a Wizard from the command-line
- Running a Wizard from a parameters file
- Specific limitations and problems:
- Literature
- Additional information
Purpose
Any Wizard can be run from the PHENIX GUI, from the command-line,
and from keyworded parameters files. All three versions are identical except
in the way that they take commands and keywords from the user.
This page describes how to run a Wizard and what a Wizard does in general.
The specific Wizard help pages describe the details of each PHENIX Wizard.
Overview of Structure Determination with the PHENIX Wizards
You can use the AutoSol Wizard to solve structures by SAD, MAD, SIR/SIRAS,
and MIR/MIRAS. The AutoMR Wizard can solve a structure by MR. The AutoMR and
AutoSol Wizards together can carry out MRSAD.
The AutoSol Wizard can also combine SAD, MAD, SIR, and MIR datasets and solve
the structure using all available data.
Once you have experimental or MR phases, you can carry out iterative
model-building, density modification, and refinement with the AutoBuild Wizard
to improve your model. Finally, you can use the rebuild_in_place feature of the
AutoBuild Wizard to make one very good final model.
If your structure contains ligands, you can place them using the
LigandFit Wizard.
This help page describes how to run the Wizards from a GUI,
the command-line, or a parameters file. The
individual Wizard documentation pages
describe the strategies and commands for each Wizard.
Usage
Wizard data directories, sub-directories, Facts, and the
PDS (Project Data Storage)
- The directory that you are in when you start up PHENIX is
your working directory.
- Each run of a Wizard will have all output data in a subdirectory of
your working directory named like this (for AutoSol run 3):
AutoSol_run_3_/
- This subdirectory will have one or more temporary directories:
AutoSol_run_3_/TEMP0/
which contain intermediate files. These temporary
directories are deleted when the Wizard finishes
(unless you set the parameter clean_up to False).
- For OMIT and MULTIPLE-MODEL runs, the final OMIT maps and multiple
models will be in a subdirectory of your run directory:
AutoSol_run_3_/OMIT/
AutoSol_run_3_/MULTIPLE_MODELS/
- All the parameter values, as well as any other information that a
Wizard generates during its run, are stored in the PDS (Project Data
Storage) and/or the Wizard Facts. The Facts are values of parameters
and pointers to files in the PDS. The Facts keep track of the current
knowledge available to the Wizard. Each time a step is completed by
a Wizard, the new Facts are saved (overwriting old ones for that run).
As the Facts define the state of the Wizard, the Wizard can be restarted
any time by loading the appropriate set of Facts.
- The PDS (Project Data Storage) will be in your working directory:
./PDS/
The PDS contains the output of each of your runs for all
Wizards and a record of all the Facts
(parameters and data) for each run. If you delete a run using
the PHENIX Wizard GUI or with
a command like "phenix.autosol delete_runs=2", the corresponding
entries in the PDS are also deleted. You can copy the PDS from one place to
another. Note that if you delete directories such as
"AutoSol_run_1_" by hand then the corresponding information
remains in the PDS. For this reason
it is best to use the GUI or specific commands to delete runs.
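The run-directory naming convention described above can be used to check which runs have a subdirectory in your working directory. The following is a minimal Python sketch and is not part of PHENIX: the function name run_numbers is hypothetical, and it only looks at directory names, so it does not consult the PDS. The GUI or a command such as "phenix.autosol show_runs" remains the authoritative way to list runs.

```python
import re


def run_numbers(names, wizard="AutoSol"):
    """Return sorted run numbers from names shaped like AutoSol_run_3_."""
    # Hypothetical helper: matches <Wizard>_run_<N>_ exactly, nothing else.
    pattern = re.compile(rf"^{re.escape(wizard)}_run_(\d+)_$")
    numbers = []
    for name in names:
        match = pattern.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


# Made-up listing of a working directory, for illustration only.
listing = ["AutoSol_run_1_", "AutoSol_run_3_", "AutoBuild_run_2_", "PDS"]
print(run_numbers(listing))               # run numbers found for AutoSol
print(run_numbers(listing, "AutoBuild"))  # run numbers found for AutoBuild
```

In practice you might pass os.listdir(".") instead of the made-up listing. Because a run directory deleted by hand still has entries in the PDS, a list obtained this way can disagree with what the Wizard reports.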
Running a Wizard using a multiprocessor machine or on a cluster
You can take advantage of a multiprocessor machine or a cluster
when running the Wizards (currently this applies to the LigandFit and
AutoBuild Wizards). For example, adding the keyword
nproc=4
to a command-line command for a Wizard will use 4 processors to run the
wizard (if possible). Normally you will run the parallel processes in
the background with the default of
background=True
If you have a cluster with a batch queue, you can send subprocesses to the
batch queue with
run_command=qsub
(or whatever your batch command is). In this case you will use
background=False
so that the batch queue can keep track of your jobs.
The Wizards divide the data into nbatch batches during processing. The
default value of nbatch is between 3 and 5, depending on the Wizard; you can
set it explicitly with a keyword such as
nbatch=3
The default is appropriate if you have up to nbatch processors. If you
have more, you may wish to increase nbatch to match the number of
processors. The value is not simply tied to the number of processors because
nbatch can affect the results that you get: the jobs are not split into
exact replicates, but are instead run with different random numbers.
If you want to reproduce earlier results, keep the same value of nbatch.
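The choices described in this section (nproc, nbatch, background, run_command) can be collected in a small helper that assembles the keywords for a command line. This is only an illustrative Python sketch: the function parallel_keywords is hypothetical, it does not run PHENIX, and the rule of raising nbatch to the number of processors is one reading of the advice above rather than something the Wizards do for you.

```python
def parallel_keywords(nproc, nbatch=3, run_command=None):
    """Build command-line keywords for a parallel Wizard run (sketch)."""
    keywords = [f"nproc={nproc}"]
    # Raise nbatch only when there are more processors than batches.
    # Remember that changing nbatch can change the results.
    keywords.append(f"nbatch={max(nbatch, nproc)}")
    if run_command is None:
        # Multiprocessor machine: run subprocesses in the background.
        keywords.append("background=True")
    else:
        # Batch queue: let the queue keep track of the jobs.
        keywords.append(f"run_command={run_command}")
        keywords.append("background=False")
    return keywords


print(" ".join(parallel_keywords(4)))
print(" ".join(parallel_keywords(2, nbatch=5, run_command="qsub")))
```

The resulting strings would be appended to a Wizard command; whether a given Wizard accepts all of these keywords should be checked in that Wizard's keyword list.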
Running a Wizard from a GUI
Basic operation of a Wizard from the GUI
- Start up the PHENIX GUI in your working directory by typing "phenix"
- Answer "yes" to the question "Do you want to make it a project directory?".
- Launch a Wizard from the PHENIX GUI by double-clicking on the
name of the Wizard ("AutoSol")
under "Wizards" in the Strategy Interface of the main GUI.
- The Wizard will come up in a blue window and will
open a grey Parameters window asking you for information on what files to use
and what to do.
- Enter the file names and make choices as necessary (NOTE: to select a
file click on the yellow box to the right of the file entry field. To add a
new file entry field click on the "Parameter group options" tab if present).
- Proceed to the next window by clicking "Continue" in the upper left
corner of the grey Parameters window.
- The Wizard will guide you through the necessary inputs, then it will
continue on its own until it is finished.
- When the Wizard is done, you can double-click on the Display icon
(the little magnifying glass on the upper left of the blue Wizard window)
to show a list of files and maps that can be displayed. (NOTE:
The Display Options window is updated when you open it. Once this
window is open you cannot open it again until you close it. Sometimes this
window may be behind other windows and this will prevent you from
opening it again.)
- You can open the Parameters window any time the Wizard is stopped by
clicking on the Parameters icon (4 little lines in the upper left corner
of the blue Wizard window). This allows you to carry out some of the
more advanced options below.
- Your output log file will be in a file called "AutoSol.1.output" for
an AutoSol run. You can also see the same file by clicking on the "LOG" button
at the lower right of the blue or green window.
Keeping track of multiple runs of a Wizard from the GUI
- You can run more than one Wizard job at a time if you want. Each run of a
Wizard is put in a separate sub-directory (e.g., "AutoSol_run_1_").
- When you start a Wizard, it will start a new run of that Wizard.
- If you want to continue on with the highest-numbered run of a Wizard,
you can start the Wizard with the continue button for that Wizard
(for example the continue_AutoSol button).
- If you want to go back to a previous run, you can use the Run Control and
Run Number selections near the bottom of any Parameters window (NOTE: to open
the Parameters window, click on the lines at the upper left of the blue Wizard
window). Select goto_run and choose a run number to go to.
- If you want to copy a previous run and go on, use the Run Control and Run
Number selections and select copy_run and choose a run number to copy. The
Wizard will create a new run (with number equal to the highest previous number
plus one) and carry on with it.
- To see what runs are available, select View or Delete Runs in the
Navigate tab at the lower left of any Parameters window.
- If you want to stop the Wizard, hit the PAUSE button on the green Wizard
window (the Wizard is green when running, blue or purple when stopped). NOTE:
this may take a little time, particularly if Phaser, HYSS, or phenix.refine
is running. In those cases, if you really want to stop the Wizard right away,
go to "Strategy" and then select "Stop Strategy" and it will be stopped.
Setting parameters of a Wizard from the GUI
- You can set any parameter in a Wizard by selecting the
variable in the Choose Variable to Set tab. The next time you click
Continue, the Wizard will save all the current inputs as usual, and
then instead of going on to the next step, it will open a window asking
you for the new value of that variable. When you enter it and press
Continue, the Wizard will continue on with what it was doing, but with
this new value.
- NOTE that some parameters (e.g., resolution) may affect many steps.
If a prior step is affected by a parameter that is changed, the Wizard
does not go back and change it. If you want the parameter change to affect
something that has already been done, you need to re-run the corresponding
step.
- NOTE that you can set any SOLVE, RESOLVE or RESOLVE_PATTERN keyword
when you are running a Wizard using the "resolve_command", "solve_command" or
"resolve_pattern_command" keywords. These can be set in the GUI from the
Choose Variable pull-down menu. Type the command into the entry form like this
(for resolve_command):
res_start 4.0
telling resolve in this case to start out density modification at a resolution
of 4 A.
This allows you to control what solve, resolve and resolve_pattern do more
finely than you otherwise can in the Wizards.
Navigating steps in a Wizard from the GUI
- When the Wizard is done or Paused, you can select any available
step in the Navigate tab at the middle bottom of any Parameter window.
This tells the Wizard to get any necessary inputs for
that step and to then carry it out.
- The Wizards normally start out in Manual mode (one step at a time,
asking user for inputs). Once the necessary inputs are entered, the
Wizard enters Automatic mode (no more asking for inputs until something
required is missing). You can control this by specifying Manual
or Automatic in the Auto/Manual tab at the bottom right of any Wizard.
Running a Wizard from the command-line
Basic operation of a Wizard from the command-line
Keeping track of multiple runs of a Wizard from the command-line
- When you start a Wizard from the command line, the default is
to start a new run of that Wizard.
- To see all the available runs of this Wizard, type:
phenix.autosol show_runs
- To delete runs 1, 2, and 4-7 of this Wizard, type something like this:
phenix.autosol delete_runs="1 2 4-7"
Note that the group of numbers is enclosed in quotes ("). This tells the
input parser (iotbx.phil) that all these numbers go with the one keyword
of delete_runs. Note also that there are no spaces around the "=" sign!
- To go back to run 2 and carry on (remembering all previous inputs
and possibly adding new ones, in this case setting the resolution) type
something like:
phenix.autosol run=2 resolution=3.0
- To carry on with the current highest-numbered run
(remembering all previous inputs
and possibly adding new ones, in this case setting the resolution) type
something like:
phenix.autosol carry_on resolution=3.0
- To copy run 2 to a new run and carry on from there
(remembering all previous inputs
and possibly adding new ones, in this case setting the resolution) type
something like:
phenix.autosol copy_run=2 resolution=3.0
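The role of the quotes in delete_runs="1 2 4-7" can be illustrated with Python's standard shlex module, which splits a command line the way a POSIX shell does. The sketch below is only an illustration: expand_runs is a hypothetical helper showing one way to read the run list, and it is not the iotbx.phil parser or the Wizard's own code.

```python
import shlex


def expand_runs(spec):
    """Expand a run list such as '1 2 4-7' into individual run numbers."""
    runs = []
    for token in spec.split():
        if "-" in token:
            first, last = token.split("-")
            runs.extend(range(int(first), int(last) + 1))
        else:
            runs.append(int(token))
    return runs


# With quotes, the shell passes the whole run list as one argument.
quoted = shlex.split('phenix.autosol delete_runs="1 2 4-7"')
# Without quotes, the run list is broken into separate arguments.
unquoted = shlex.split("phenix.autosol delete_runs=1 2 4-7")

keyword, value = quoted[1].split("=", 1)
print(quoted)
print(unquoted)
print(keyword, expand_runs(value))
```

With the quotes, all the numbers arrive attached to the single keyword delete_runs; without them, only the first number does.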
Setting parameters of a Wizard from the command-line
When you run a Wizard from the command-line, two files are produced
and put in the subdirectory of the Wizard (e.g., AutoBuild_run_3_/).
- A parameters (".eff") file will be produced that you can
edit to rerun the Wizard:
phenix.autosol autosol.eff
This autosol.eff file (for AutoSol) contains the values of all the AutoSol
parameters at the time of starting the Wizard.
Note that the syntax in the autosol.eff file differs slightly from
the command-line syntax. On the command line, if a value has
several parts, you enclose them in quotes and there are no spaces around the
"=" sign:
phenix.autosol ... input_phase_labels="FP PHIM FOMM"
In the .eff file, you MUST leave off the quotes or the three values will
be treated as one, and you should leave blanks around the "=" sign:
input_phase_labels = FP PHIM FOMM
The reason for the difference is that in the .eff file, the structure of
the file and the brackets tell the PHIL parser what is grouped together,
while on the command line, the quotes tell the parser what is to be
grouped together.
- A parameters file (".eff")
is produced that you can edit and use like this:
phenix.autosol parameters.eff
- To get keyword help on a specific keyword you can type:
phenix.autosol --help data # get help on the keyword data for autosol
- To show current Facts (values of all parameters) for the highest-numbered run:
phenix.autosol show_facts
- To show current Facts (values of all parameters) for run 3:
phenix.autosol run=3 show_facts
- To show current summary:
phenix.autosol show_summary
- When you use a keyword like data= you need to give
enough information to specify this keyword uniquely. You can see all
the keywords for each PHENIX Wizard or tool at the end of the
documentation for that Wizard or tool. This will have entries like this
(for AutoSol):
autosol
sites= None Number of heavy-atom sites. (Command-line only)
which describes the keyword sites in the scope defined by
autosol. You can explicitly specify this on the command line with:
autosol.sites=3
which in this case is entirely the same as
sites=3
- NOTE that you can set any SOLVE, RESOLVE or RESOLVE_PATTERN keyword
in PHENIX using the "resolve_command", "solve_command" or
"resolve_pattern_command" keywords from the command line. The format is
a little tricky: you have to put two sets of quotes around the command like
this:
resolve_command="'ligand_start start.pdb'" # NOTE ' and " quotes
This will put the text
ligand_start start.pdb
at the end of every temporary command file created to run resolve.
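The two syntaxes described above (quoted and without spaces on the command line, unquoted and with blanks in the .eff file) can be summarized in a short Python sketch. The helpers command_line_arg and eff_line are hypothetical names introduced for illustration; they only format text and do not call PHENIX. The last lines use the standard shlex module to show why resolve_command needs two sets of quotes: the shell removes the outer pair and the inner pair is passed along.

```python
import shlex


def command_line_arg(name, values):
    """Command-line form: no spaces around '=', multi-part values quoted."""
    text = " ".join(str(v) for v in values)
    if len(values) > 1:
        return f'{name}="{text}"'
    return f"{name}={text}"


def eff_line(name, values):
    """Parameters-file form: blanks around '=', no quotes."""
    text = " ".join(str(v) for v in values)
    return f"{name} = {text}"


labels = ["FP", "PHIM", "FOMM"]
print(command_line_arg("input_phase_labels", labels))
print(eff_line("input_phase_labels", labels))

# What remains of the doubly quoted resolve_command after shell splitting.
after_shell = shlex.split("resolve_command=\"'ligand_start start.pdb'\"")
print(after_shell)
```

Single-valued keywords such as sites=3 need no quotes in either form.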
Running a Wizard from a parameters file
Parameters files are an easy way to specify any parameters that you
want to use when running a Wizard. They are structured in a clear way and
you can edit them to set the values that you want.
You can get a parameters file to edit with any wizard by running the
wizard with the flag "--show_defaults":
phenix.autosol --show_defaults
Here is a parameters file "sad.eff" to run a SAD dataset
with "phenix.autosol sad.eff":
autosol {
  atom_type = Se
  sites = 2
  seq_file = sequence.dat
  crystal_info {
    space_group = C2
    unit_cell = 76.08 27.97 42.36 90 103.2 90
    resolution = 2.6
  }
  wavelength {
    data = high.sca
    lambda = 0.9600
    f_prime = -1.5
    f_double_prime = 3
  }
}
Note the scope names ("autosol" or "crystal_info")
followed by paired brackets ({ ....}) which enclose
sets of parameters that
are related.
The values of parameters are usually entered on one line,
without quotation marks (as in this example) unless they are to be
all considered as a single item.
You can specify almost anything either in a parameters file or on
the command line. In the above example you could also just say:
phenix.autosol atom_type=Se sites=2 seq_file=sequence.dat \
  space_group=C2 \
  unit_cell="76.08 27.97 42.36 90 103.2 90" \
  resolution=2.6 \
  data=high.sca \
  lambda=0.9600 \
  f_prime=-1.5 \
  f_double_prime=3
and you would get the same results. (Note that the cell parameters are in
quotes on the command line but not in the parameters file.)
For simple cases the command-line format is
fine, but for anything with a lot of parameters to set it is much easier to
just edit a parameters file.
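If you generate many similar parameters files, the scope-and-brackets layout shown above is easy to produce from a nested dictionary. The following Python sketch is an assumption-laden illustration: render_scope is a hypothetical helper that writes plain text in the same layout as the sad.eff example, and it does not use or validate against the PHIL parser.

```python
def render_scope(name, params, indent=0):
    """Return lines for a scope 'name { ... }' from a nested dictionary."""
    pad = "  " * indent
    lines = [f"{pad}{name} {{"]
    for key, value in params.items():
        if isinstance(value, dict):
            # A nested dictionary becomes a nested scope.
            lines.extend(render_scope(key, value, indent + 1))
        else:
            # Values are written without quotes, with blanks around '='.
            lines.append(f"{pad}  {key} = {value}")
    lines.append(f"{pad}}}")
    return lines


# Subset of the SAD example above, for illustration.
sad = {
    "atom_type": "Se",
    "sites": 2,
    "crystal_info": {"space_group": "C2", "resolution": 2.6},
}
eff_lines = render_scope("autosol", sad)
print("\n".join(eff_lines))
```

Writing the joined lines to a file such as sad.eff would give something you could try with "phenix.autosol sad.eff", after checking the keywords against the output of --show_defaults.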
Specific limitations and problems:
- In the GUI version of the Wizards, the Display Options window is updated
only when you open it. Further,
once this window is open you cannot open it again until you close it.
Sometimes this window may be behind other windows and this will
prevent you from opening it again until you close the open window.
- The Wizards use file names based on the names of your input files, but
they do not differentiate between files with the same name coming from
different directories. Consequently you should not use two files with
different contents but with the same file name as inputs to a Wizard, even
if they come from separate starting directories.
- If you stop a Wizard and continue on with a command such as
phenix.autobuild run=2 then you can change most parameters with keywords
just as if you were starting from scratch, but if you had previously changed
a keyword away from the default, you cannot set it back to the default in
this way (the Wizard ignores keywords that are the same as the default).
- You should not work on the same
run in two ways at the same time. This can lead to unpredictable results
because the two runs will really be the same run and the data and
databases for the two runs will be overwriting each other.
This means you need to be careful that if you goto_run 1 of a Wizard
in one window that you do not also goto_run 1 of the same Wizard
in another window.
On the other hand, it is perfectly fine to work on run 1 of a Wizard in
one window and run 2 of the same Wizard in another window.
- The PHENIX Wizards can take most settings of most space groups.
However, they can only use the hexagonal setting of rhombohedral space
groups (e.g., #146 R3:H or #155 R32:H), and they cannot use space groups
114-119 (not found in macromolecular crystallography), even in the standard
setting, because of difficulties with the use of asuset in the version of the
CCP4 libraries used in PHENIX for these settings and space groups.
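The limitation above about resetting a keyword to its default when continuing a run can be pictured with a toy model. The Python sketch below is not the Wizard's actual code: apply_keywords is a hypothetical function, and the default values shown are made up for illustration. It simply mimics the stated behavior that keywords equal to the default are ignored.

```python
def apply_keywords(saved, supplied, defaults):
    """Toy model: supplied keywords equal to the default are ignored."""
    updated = dict(saved)
    for name, value in supplied.items():
        if value != defaults.get(name):
            updated[name] = value
    return updated


# Hypothetical defaults and saved state of an earlier run.
defaults = {"resolution": None, "clean_up": True}
saved = {"resolution": None, "clean_up": False}  # clean_up was changed earlier

# Changing a parameter away from its default works as expected.
changed = apply_keywords(saved, {"resolution": 3.0}, defaults)
# Trying to set clean_up back to its default has no effect.
reset_attempt = apply_keywords(saved, {"clean_up": True}, defaults)
print(changed)
print(reset_attempt)
```

In this model the only way to get the default back is to start a new run, which matches the advice implied by the limitation.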
Literature
Additional information