General issues

How do I know if Phaser has solved my structure?

This is answered in detail in the section "Has Phaser solved it?" in the main Phaser manual, but it can be summarized as follows: the final translation function Z-score (TFZ) should be above 8, and the log-likelihood gain (LLG) should be positive and as high as possible. As with experimental phasing, the ability to autobuild most of the model is the single best measure of a correct solution; alternatively, a single round of refinement should result in an R-free below 0.5. An R-free above 0.5 after refinement often indicates that the solution is actually incorrect, especially when combined with sub-standard TFZ and/or LLG values. Just because Phaser finds a solution does not mean that it has correctly solved the structure!
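The rules of thumb above can be written down as a small check. This is only an illustrative sketch: the class and function names are hypothetical, not part of Phaser or Phenix, and the cutoffs are simply the ones quoted in this answer. It does not replace inspecting maps or attempting autobuilding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MRResult:
    """Scores copied by hand from a Phaser run (hypothetical container)."""
    tfz: float                      # final translation function Z-score
    llg: float                      # log-likelihood gain
    r_free: Optional[float] = None  # R-free after one round of refinement, if run


def assess_solution(result, tfz_cutoff=8.0, r_free_cutoff=0.5):
    """Apply the rule-of-thumb cutoffs quoted in the text above."""
    # An R-free above 0.5 after refinement often means the solution is wrong.
    if result.r_free is not None and result.r_free > r_free_cutoff:
        return "probably incorrect"
    # TFZ above 8 together with a positive LLG suggests a correct solution.
    if result.tfz > tfz_cutoff and result.llg > 0:
        return "likely correct"
    return "doubtful"


if __name__ == "__main__":
    print(assess_solution(MRResult(tfz=12.0, llg=350.0)))
    print(assess_solution(MRResult(tfz=5.0, llg=40.0, r_free=0.55)))
```

The R-free test is applied first because a failed refinement outweighs good-looking search scores.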

Why doesn't Phaser report an R-factor?

The R-factor is far less sensitive than the TFZ and LLG scores, especially when searching for remote homologs or a small part of the overall scattering mass. Phaser does record R-factors for the rigid-body refinement step towards the end of a run, which can be found in the logfile. However, unless these values are relatively low (< 40%), they are not reliable indicators of solution quality, which is why only the Z-scores and LLG are reported in the GUI.

What if I don't get a good solution?

This is also covered in detail in the main Phaser manual, in the section "What to do in difficult cases".

What resolution cutoff should I use?

In short, in most cases you should let Phaser choose the resolution cutoffs. Phaser uses the assumed accuracy of the model (expressed directly as RMSD or indirectly as sequence identity), the size of the model, and both the resolution and number of diffraction observations to determine the signal that will be obtained for each model as a function of resolution. This is used both to determine a resolution cutoff that should be sufficient for a successful MR search, and an optimal search order for the models. The effects of bulk solvent at low resolution are taken into account internally, so you do not need to specify a low-resolution cutoff either.

Should I use the output MTZ file from Phaser for refinement?

No, it is always best to use the original data file for refinement. The MTZ file from Phaser contains map coefficients, but in most cases you will find that better maps can be obtained by immediately running a round of refinement or automated re-building.

How are the ASU contents and search ensembles related?

Technically, they aren't connected. The ensembles are used in the actual MR search; the ASU contents only tells Phaser what to expect for the overall scattering mass of the asymmetric unit (an important factor in the maximum likelihood scoring). Although you may only specify one sequence per component, you can instead provide a molecular weight for a collection of chains. You may also split a model into smaller ensembles (for instance, different domains) while keeping the ASU contents the same, or use a multi-chain search model while specifying a sequence file for each chain.

How can I make Phaser go faster?

Phaser is under continuous development, with a lot of emphasis being placed on speed, so anything we could advise here is probably already being done automatically! Because Phaser now automatically chooses the lowest resolution expected to give a clear solution, changing the resolution limits is not advisable and may even lengthen the run times. It's a good idea to set up a Phaser run to find all copies of all components in one job, because this allows Phaser to choose the optimal search order and to amalgamate solutions for multiple copies found in one search.

Can PHENIX do MRSAD?

Yes, PHENIX can run MRSAD (molecular replacement, combined with SAD phases) by using molecular replacement information in the process of determining the anomalous scatterer substructure. However, this requires separate steps for carrying out the MR and finding the substructure. There are two simple ways to do this; both are described in the AutoSol documentation.

I am using a dimer as my search model. Why am I getting a warning that the unit cell is too full?

Remember, the contents of the asymmetric unit are specified independently of the model. If you have supplied a single sequence to define the ASU contents, make sure you have specified that the asymmetric unit contains two copies of it.
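The mismatch behind this warning can be illustrated with simple arithmetic. The sketch below is not how Phaser performs its check; it only compares the approximate mass of the search model with the mass implied by the declared ASU contents, assuming a rough average of 110 Da per residue. All names are hypothetical.

```python
# Rough average mass of an amino-acid residue in daltons (an approximation).
AVERAGE_RESIDUE_MASS_DA = 110.0


def declared_asu_mass(sequence_length, copies):
    """Approximate mass implied by the ASU contents: one sequence times its copy number."""
    return sequence_length * copies * AVERAGE_RESIDUE_MASS_DA


def model_exceeds_contents(model_residues, sequence_length, copies):
    """True if the search model is heavier than the declared ASU contents."""
    model_mass = model_residues * AVERAGE_RESIDUE_MASS_DA
    return model_mass > declared_asu_mass(sequence_length, copies)


if __name__ == "__main__":
    # A dimer of a 250-residue protein has 500 residues in the search model.
    print(model_exceeds_contents(500, sequence_length=250, copies=1))  # one copy declared
    print(model_exceeds_contents(500, sequence_length=250, copies=2))  # two copies declared
```

With one declared copy the dimer is larger than the stated contents; declaring two copies makes the two consistent.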

I want to create an ensemble of models. Should I run Sculptor first and then Ensembler, or the other way round?

Unfortunately, neither way is optimal. When Sculptor is run first, it removes residues from the model, and this may impair Ensembler's ability to do an optimal superposition. If the default SSM matching is used, the fragmentation of secondary structure elements (prominent when models are distant) may result in incorrect matching, although this is rare. When Ensembler is run first with trimming switched on, the resulting main-chain trace can contain many short fragments. Sculptor will discard these because they cannot be reliably matched to the alignment sequence, which further decreases the completeness of the resulting model. The best compromise is probably to run Sculptor first, so that the sequence-to-chain alignment can be performed optimally. In common cases, this does not impair superposition drastically, since there are still many sites to use. If SSM alignment fails, other matching algorithms (e.g. residue sequence numbers) can be used.

Refinement after MR

What should I run after Phaser finds a good solution?

We usually recommend one (or both) of these two options (and see the next question for advice on more difficult cases):

If the search model is the same protein (or close enough) and is not expected to have large conformational changes, run phenix.refine using the default strategy for at least 8-10 macrocycles, but with options adjusted as appropriate for the resolution of the data: for instance, adding NCS restraints at low resolution if multiple copies of a chain are present, or adding the solvent update at moderate-to-high resolution.
Run AutoBuild; if the search model is of a significantly different protein, turn off "rebuild-in-place" mode so it will build an entirely new model.

I ran Phaser, it found a solution, then I ran phenix.refine, and the R-free is stuck above 0.4 (but below 0.5). How do I fix this?

This is an open-ended question, and there are multiple explanations and solutions possible. We usually recommend a standard series of checks:

Run Xtriage and check your data for any pathologies that may make refinement difficult (such as twinning, incomplete data, or translational NCS).
It is very important that the search model already match the sequence of the target protein. If you run Phaser with a homologous structure as input, you should first run Sculptor to edit the search model as needed. (This is not the same as homology modeling - it simply removes inappropriate atoms, renames and renumbers residues, and adjusts B-factors.) Refinement will not correct any residual problems if you skip this step.
Check whether the MR solution is truly complete - this means that you have placed enough copies of the search model, and that the search model already covers most or all of the target. Gaps in crystal packing or unusually large blobs of difference density may be indicative of missing scatterers. At low resolution the difference density will be weaker, but you can still inspect the lattice. If you are missing significant parts of the model, this will limit how much lower R-factors can get.
Make sure you are running enough cycles of refinement. Sometimes it will take a long time to converge; the default three cycles is the minimum recommended.

If all of the above indicate that the model and data are reasonably sensible, it is likely that there are differences or model errors beyond the radius of convergence of refinement. Fortunately, this is usually easy to recognize in the difference maps even at low resolution; for instance, a representative test case at 4.0Å (described in DiMaio et al. 2013):

Here we can see very clearly that the model is simply misfit (although the majority of the protein, mostly outside the field of view, is more or less correct). In this case, there are several tools you can try, many of which were specifically designed to cope with this class of problems:

If you have moderate-to-high-resolution data (at least 3.0Å), you can try building an entirely new model in AutoBuild. This is slower than refinement, but it has a very wide radius of convergence.
The model morphing procedure may be helpful; this is available both as an option in AutoBuild, and a standalone program.
The hybrid Rosetta-Phenix refinement was also designed to deal with cases like this.
DEN refinement was also designed to deal with structures that need more aggressive improvement, and has proven useful in many similar cases.
In extreme cases, MR-Rosetta may be tried; this is capable of improving very poor MR solutions, but is significantly more computationally expensive than the other methods.
Sometimes just adding simulated annealing in phenix.refine will be enough (although for the example pictured above, it was not).

Additionally, if multiple structures of the same protein are available for use as search models, we recommend checking whether any of these is closer to the target than the current search model.

Phaser-MR GUI

Do I actually have to include multiple models in my ensembles?

No, we use "ensemble" in this context to refer to any search model; in many cases a single model is adequate to solve the structure (and these may be much easier to work with).

What is the difference between "high resolution" and "high resolution for refinement"?

The latter is applied during the "refinement and phasing" step only; for rotation and translation searches, the first parameter is used.

How do I search for multiple unique ensembles?

The "Search procedure" tab allows you to set up this procedure. To search for multiple independent ensembles representing different entities in a complex ("a AND b AND c"), click the "Add search" button below the tab, and for each search, check the ensemble(s) to be used at that step. The searches will be performed in the order they appear in the GUI. Note that this is not the same as checking multiple ensembles for a single search, as described in the next question.

How do I tell Phaser to pick the best solution from among several similar ensembles?

To try multiple alternative ensembles (corresponding to the same actual entity in the crystal) at once ("a OR b OR c"), check the box next to each ensemble you want Phaser to try for a single search; only one of these will actually be placed in the output model. (You can, of course, specify any combination of searches and ensembles, for instance "(a OR b) AND c", which would be two separate search instructions.)
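One way to picture the "AND"/"OR" logic is as a list of searches, each holding the alternative ensembles checked for that search. The sketch below is purely illustrative: the data structure and function name are hypothetical and do not correspond to any Phaser or Phenix input format.

```python
def describe_search_plan(plan):
    """Render a search plan as a logical expression.

    `plan` is a list of searches, performed in order (joined by AND);
    each search is a list of alternative ensembles (joined by OR),
    of which only one is placed in the output model.
    """
    parts = []
    for alternatives in plan:
        if len(alternatives) == 1:
            parts.append(alternatives[0])
        else:
            parts.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(parts)


if __name__ == "__main__":
    # Two search instructions: try ensemble a or b, then place ensemble c.
    plan = [["a", "b"], ["c"]]
    print(describe_search_plan(plan))  # prints "(a OR b) AND c"
```

Each inner list corresponds to one search in the "Search procedure" tab, so the number of components placed equals the number of searches, not the number of ensembles loaded.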

I want to run a new search in Phaser starting from the results of a previous job. How do I specify this?

If the previous job was run in the Phenix GUI, you can select it from the drop-down menu labeled "Use partial solution from previous job" in the "Input and general options" tab. This will automatically load the solution. Note that unlike the CCP4i interface, Phaser-MR in Phenix does not make use of .sol files, since the information is passed internally. You should then load the new ensembles to search for; ensembles that were used in the previous run do not need to be re-loaded unless you wish to find more copies. If you have already provided complete and correct information about the ASU contents, this will not need to be changed.

How do I run a new search starting from an existing model in the correct orientation and translation?

This might apply if you have a refined model, or a solution from another program. In this case, you should add the model as an ensemble, and check the box "Ensemble is fixed partial solution". This will instruct Phaser to accept the input coordinates without modification, and skip to the next ensemble to search for. You do not need to specify a separate search for this ensemble, as it will be included automatically.

MRage

How do I tell the program to use a local PDB mirror instead of fetching entries remotely?

See the general FAQ list.

MR-Rosetta

Why does mr_rosetta bomb and say "error while loading shared libraries: libdevel.so: cannot open shared object file: No such file or directory"?

This may indicate that your environment already defines shared library paths somewhere, and that these point to a location other than where Rosetta expects its libraries to be. You can try to ignore the previous definitions this way:

If you are using the bash or sh shells:

export PHENIX_TRUST_OTHER_ENV=1

or csh (C-shell):

setenv PHENIX_TRUST_OTHER_ENV 1

in the script where you run mr_rosetta, or before you run it from the command line.
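If mr_rosetta is launched from a Python wrapper rather than a shell script, the same variable can be set for the child process only. The following is a minimal sketch under that assumption; the helper name is hypothetical, and a trivial stand-in command is used in place of the real mr_rosetta invocation so that the example runs anywhere.

```python
import os
import subprocess
import sys


def env_with_trust_flag(base=None):
    """Return a copy of the environment with PHENIX_TRUST_OTHER_ENV set to 1."""
    env = dict(os.environ if base is None else base)
    env["PHENIX_TRUST_OTHER_ENV"] = "1"
    return env


if __name__ == "__main__":
    # Stand-in child process; replace this command with your own mr_rosetta call.
    child = [sys.executable, "-c",
             "import os; print(os.environ['PHENIX_TRUST_OTHER_ENV'])"]
    result = subprocess.run(child, env=env_with_trust_flag(),
                            capture_output=True, text=True, check=True)
    print(result.stdout.strip())
```

Working on a copy of the environment leaves the parent process, and any other programs it starts, unaffected.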

Why does mr_rosetta or mr_model_preparation bomb and say "RuntimeError: Cannot contact EBI DbFetch service"?

This could mean just what it says...but also it could mean that you are behind a firewall and there is a proxy server you need to go through. You can use the following command to specify the proxy server (replacing it with YOUR proxy server).