Using the PHENIX Wizards

Purpose

Any Wizard can be run from the PHENIX GUI, from the command line, or from a keyworded parameters file. The three versions are identical except in the way they take commands and keywords from the user.

This page describes how to run a Wizard and what a Wizard does in general. The specific Wizard help pages describe the details of each PHENIX Wizard.

Overview of Structure Determination with the PHENIX Wizards

You can use the AutoSol Wizard to solve structures by SAD, MAD, SIR/SIRAS, and MIR/MIRAS. The AutoMR and AutoSol Wizards together can carry out MRSAD. The AutoSol Wizard can also combine SAD, MAD, SIR, and MIR datasets and solve the structure using all available data.

Once you have experimental or MR phases, you can carry out iterative model-building, density modification, and refinement with the AutoBuild Wizard to improve your model. Finally you can use the rebuild_in_place feature of the AutoBuild Wizard to make one very good final model.

If your structure contains ligands, you can place them using the LigandFit Wizard.

This help page describes how to run the Wizards from a GUI, the command line, or a parameters file. The individual Wizard documentation pages describe the strategies and commands for each Wizard.

Usage

Wizard data directories, sub-directories, Facts, and the PDS (Project Data Storage)

AutoSol_run_3_/
AutoSol_run_3_/TEMP0/

Temporary directories such as TEMP0/ contain intermediate files. They are deleted when the Wizard is finished, unless you set the parameter clean_up to False.

AutoSol_run_3_/OMIT/
AutoSol_run_3_/MULTIPLE_MODELS/
./PDS/

The PDS contains the output of each of your runs for all Wizards and a record of all the Facts (parameters and data) for each run. If you delete a run using the PHENIX Wizard GUI or with a command like "phenix.autosol delete_runs=2", the corresponding entries in the PDS are also deleted. You can copy the PDS from one place to another. Note that if you delete directories such as "AutoSol_run_1_" by hand, the corresponding information remains in the PDS. For this reason it is best to use the GUI or the specific commands to delete runs.

Running a Wizard using a multiprocessor machine or on a cluster

You can take advantage of a multiprocessor machine or a cluster when running the Wizards (currently this applies to the LigandFit and AutoBuild Wizards). For example, adding the keyword

nproc=4

to a command-line command for a Wizard will use 4 processors to run the wizard (if possible). Normally you will run the parallel processes in the background with the default of

background=True

If you have a cluster with a batch queue, you can send subprocesses to the batch queue with

run_command=qsub

(or whatever your batch command is). In this case you will use

background=False

so that the batch queue can keep track of your jobs.

The Wizards divide the data into nbatch batches during processing. The value of

nbatch=3

is set to a value from 3 to 5 by default (depending on the Wizard) and is appropriate if you have up to nbatch processors. If you have more, you may wish to increase nbatch to match the number of processors. The number of batches is not tied automatically to the number of processors because the value of nbatch can affect the results you get: the batches are not exact replicates, but are run with different random numbers. If you want to reproduce the same results, keep the same value of nbatch.
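The keyword combinations described above can be collected in a small helper. This is only a sketch: parallel_args is a hypothetical function written for this page, not part of PHENIX, and it only assembles the nproc, background, run_command, and nbatch keywords. A real run also needs the Wizard's usual data inputs, which are omitted here.

```python
def parallel_args(nproc, use_queue=False, queue_command="qsub", nbatch=None):
    """Return keyword=value arguments for a parallel Wizard run (hypothetical helper)."""
    args = [f"nproc={nproc}"]
    if use_queue:
        # Send subprocesses to the batch queue and let the queue track the jobs.
        args.append(f"run_command={queue_command}")
        args.append("background=False")
    else:
        # Run the parallel processes in the background (the default).
        args.append("background=True")
    if nbatch is not None:
        # Only override nbatch when asked; changing it can change the results.
        args.append(f"nbatch={nbatch}")
    return args


# Multiprocessor machine, 4 processors, default nbatch.
print(" ".join(["phenix.autobuild"] + parallel_args(4)))

# Cluster with a batch queue, 8 processors, nbatch raised to match.
print(" ".join(["phenix.autobuild"] + parallel_args(8, use_queue=True, nbatch=8)))
```

Leaving nbatch unset by default mirrors the advice above: raise it only when you have more processors than batches, and keep it fixed when you need to reproduce a result.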

Running a Wizard from a GUI

Basic operation of a Wizard from the GUI

Keeping track of multiple runs of a Wizard from the GUI

Setting parameters of a Wizard from the GUI

res_start 4.0

telling resolve in this case to start out density modification at a resolution of 4 Å. This allows you to control what solve, resolve, and resolve_pattern do more finely than you otherwise can in the Wizards.

Running a Wizard from the command-line

Basic operation of a Wizard from the command-line

phenix.autosol data=w1.sca seq_file=seq.dat 2 Se
phenix.autosol --help all

Keeping track of multiple runs of a Wizard from the command-line

phenix.autosol show_runs
phenix.autosol delete_runs="1 2 4-7"

Note that the group of numbers is enclosed in quotes ("). This tells the input parser (iotbx.phil) that all these numbers go with the one keyword of delete_runs. Note also that there are no spaces around the "=" sign!
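The effect of the quotes can be checked with Python's standard shlex module, which splits a command line the way a POSIX shell does. The sketch below also expands a run list into individual run numbers. Here expand_runs is a hypothetical helper, not the iotbx.phil parser, and it assumes that "4-7" means the inclusive range of runs 4 through 7.

```python
import shlex


def expand_runs(spec):
    """Expand a run list such as "1 2 4-7" into run numbers (hypothetical helper).

    Assumes "a-b" denotes the inclusive range a..b.
    """
    runs = []
    for token in spec.split():
        if "-" in token:
            first, last = token.split("-")
            runs.extend(range(int(first), int(last) + 1))
        else:
            runs.append(int(token))
    return runs


# With quotes, the shell passes the whole list as one argument.
quoted = shlex.split('phenix.autosol delete_runs="1 2 4-7"')
print(quoted)

# Without quotes, the list is broken into separate arguments.
unquoted = shlex.split("phenix.autosol delete_runs=1 2 4-7")
print(unquoted)

keyword, value = quoted[1].split("=", 1)
print(keyword, expand_runs(value))
```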

phenix.autosol run=2 resolution=3.0
phenix.autosol carry_on resolution=3.0
phenix.autosol copy_run=2 resolution=3.0

Setting parameters of a Wizard from the command-line

When you run a Wizard from the command-line, two files are produced and put in the subdirectory of the Wizard (e.g., AutoBuild_run_3_/).

phenix.autosol autosol.eff

This autosol.eff file (for AutoSol) contains the values of all the AutoSol parameters at the time of starting the Wizard.

Note that the syntax in the autosol.eff file is slightly different from the syntax on the command line. On the command line, if a value has several parts, you enclose them in quotes and there are no spaces around the "=" sign:

phenix.autosol ... input_phase_labels="FP PHIM FOMM"

In the .eff file, you MUST leave off the quotes or the three values will be treated as one, and you should leave blanks around the "=" sign:

input_phase_labels = FP PHIM FOMM

The reason these are different is that in the .eff file, the structure of the file and the brackets tell the PHIL parser what is grouped together, while on the command line, the quotes tell the parser what is to be grouped together.
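The two syntaxes can be put side by side with a short sketch. The functions command_line_form and eff_form are hypothetical helpers written for this page. They only format text according to the two rules above and do not call PHENIX or iotbx.phil.

```python
def command_line_form(name, values):
    """Format a keyword for the command line: no spaces around '=',
    and quotes around a value that has several parts."""
    joined = " ".join(values)
    if len(values) > 1:
        return f'{name}="{joined}"'
    return f"{name}={joined}"


def eff_form(name, values):
    """Format a keyword for an .eff file: blanks around '=' and no quotes."""
    return f"{name} = {' '.join(values)}"


labels = ["FP", "PHIM", "FOMM"]
print(command_line_form("input_phase_labels", labels))
print(eff_form("input_phase_labels", labels))
```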

phenix.autosol parameters.eff
phenix.autosol --help data  # get help on the keyword data for autosol
phenix.autosol show_facts
phenix.autosol run=3 show_facts
phenix.autosol show_summary
autosol
       sites= None Number of heavy-atom sites. (Command-line only)

which describes the keyword sites in the scope defined by autosol. You can explicitly specify this on the command line with:

autosol.sites=3

which in this case is entirely the same as

sites=3

resolve_command="'ligand_start start.pdb'"    # NOTE ' and " quotes

This will put the text

ligand_start start.pdb

at the end of every temporary command file created to run resolve.
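What the nested quotes in the resolve_command example do at the shell level can be checked with Python's standard shlex module, which follows POSIX shell splitting rules. The sketch shows only what the shell hands to the Wizard: the outer double quotes are consumed, and the inner single quotes arrive as part of the value. How the Wizard then interprets the inner quotes is not covered here.

```python
import shlex

# The argument as typed on the command line (outer " quotes, inner ' quotes).
typed = "resolve_command=\"'ligand_start start.pdb'\""

# shlex.split mimics the shell: it strips the outer quotes only.
received = shlex.split(typed)
print(received)

keyword, value = received[0].split("=", 1)
print(keyword)
print(value)
```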

Running a Wizard from a parameters file

Parameters files are an easy way to specify any parameters that you want to use when running a Wizard. They are structured in a clear way and you can edit them to set the values that you want.

You can get a parameters file to edit with any wizard by running the wizard with the flag "--show_defaults":

phenix.autosol --show_defaults

Here is a parameters file "sad.eff" to run a SAD dataset with "phenix.autosol sad.eff":

autosol {
  atom_type = Se
  sites = 2
  seq_file = sequence.dat
  crystal_info {
    space_group = C2
    unit_cell = 76.08 27.97 42.36 90 103.2 90
    resolution = 2.6
  }
  wavelength {
    data = high.sca
    lambda = 0.9600
    f_prime = -1.5
    f_double_prime = 3
  }
}

Note the scope names ("autosol" or "crystal_info") followed by paired brackets ({ ....}) which enclose sets of parameters that are related.

The values of parameters are usually entered on one line, without quotation marks (as in this example) unless they are to be all considered as a single item.

You can specify almost anything either in a parameters file or on the command line. In the above example you could also just say:

phenix.autosol atom_type=Se sites=2 seq_file=sequence.dat \
    space_group=C2 \
    unit_cell="76.08 27.97 42.36 90 103.2 90" \
    resolution=2.6 \
    data=high.sca \
    lambda=0.9600 \
    f_prime=-1.5 \
    f_double_prime=3

You would get the same results. Note that the cell parameters are in quotes on the command line but not in the parameters file. For simple cases the command-line format is fine, but when there are many parameters to set it is much easier to edit a parameters file.

Specific limitations and problems:

Literature

Additional information