Tutorial 8: Structure refinement

A set of structure refinement examples using phenix.refine.

Authors

Quick Facts

Run examples

There are two different ways to run the examples:

  1. Type the command:

    % phenix.run_refinement_example 5a
    

    this will create the directory example_5a and run the example 5a in this directory. Running the command above without arguments will run all examples (will take a long time). One can also run several examples like this:

    % phenix.run_refinement_example 1a 2b 3a 3c
    

    this will create 4 directories: example_1a, ..., example_3c and run the examples 1a, 2b, 3a and 3c in these directories.

  2. From PHENIX distribution copy the entire refinement_examples directory to where you plan to run the examples. Then go to any of sub-directories refinement_examples/example_xx and execute a run file. This will start the selected refinement example, wait until it's finished and inspect the output files.

Examples

Refinement of 1akg structure at 1.1 A resolution

In this series of examples we review the simple refinement with all default parameters, automatic picking and refinement of solvent molecules, switching from isotropic to anisotropic ADP refinement, practice monitoring of Rfactors change during refinement and experience the effect of adding and refinement of hydrogen atoms.

example_1a: The simplest possible refinement mode: refinement with all default parameters. This includes 3 macro-cycles of bulk-solvent and anisotropic scaling, individual coordinates and isotropic B-factors refinement, refinement of occupancies for atoms in alternative conformations.

Final: R-work = 0.2264 R-free = 0.2423

The start model does not contain any water molecules placed into it. Next example demonstrates how to add and refine ordered solvent.

example_1b: Same refinement run as in example_1a with automatic water picking and refinement.

Final: R-work = 0.1771 R-free = 0.1855

Automatic water picking and refinement dropped the R-factors by ~5%. At 1.1 A resolution all well ordered non-hydrogen atoms should be refined as anisotropic.

example_1c: Same refinement run as in example_1b with all non-water atoms refined anisotropically.

Final: R-work = 0.1534 R-free = 0.1849

Refinement of non-solvent atoms as anisotropic dropped both Rwork and Rfree ( as expected at this resolution).

example_1d: Same refinement run as in example_1c performed with hydrogen atoms added to the start model.

Final: R-work = 0.1426 R-free = 0.1703

Using the hydrogens in refinement may further improve the model. Indeed, adding the hydrogen atoms and refining them as riding model decreased R-factors by approximately 1%.

example_1e: Same refinement run as in example_1d performed with optimized X-ray target weights for both coordinate and ADP refinement.

Final: R-work = 0.1340 R-free = 0.1658

Although phenix.refine uses automatic procedure for relative xray/geometry weight calculation and finds good weights in most of cases, it may be a good idea to try to optimize the weights. A drop in R-factors by ~1% in this example clearly demonstrates this. Since it is very slow, this option is recommended to run overnight for a final model tune up.

Refinement of 1071B at 3.0A resolution, 2 NCS groups with 3 components in each group.

The aim of these four examples below is to practice determining and specifying of NCS groups (using atom selection syntax). Also it underlines the importance of checking NCS groups that determined automatically by phenix.refine. The last example demonstrates the use of TLS.

example_2a: Refinement with all default parameters. This includes 3 macro-cycles of bulk-solvent and anisotropic scaling, coordinates and individual isotropic B-factors refinement.

Final: R-work = 0.2082 R-free = 0.2729

The number of parameters and relatively low resolution produce fairly large gap between R-work and R-free. Hopefully we can add some more "observations" by using NCS restraints.

example_2b: Same refinement run as in example_2a with using the NCS restraints (NCS operators detected automatically).

Final: R-work = 0.2404 R-free = 0.2839

phenix.refine can automatically determine NCS groups and set up the selections for NCS restraints. It does it by simple analysis of chains in input PDB. Although it does the good job in many cases, in this particular one the found selections are not optimal and their use in refinement is counterproductive: both R-work and R-free jumped up. Visual inspection of the model on graphics suggests more smart selections as demonstrated below.

example_2c: Same refinement run as in example_2a with using the NCS restraints (NCS operators provided manually).

Final: R-work = 0.2147 R-free = 0.2566

The correct selections for NCS groups and their use in refinement produce nice R-factors.

example_2d: Same refinement run as in example_2c with using TLS to model ADP of selected domains.

The use of TLS model for ADP of selected chains results in some further model improvement.

Final: R-work = 0.2079 R-free = 0.2540

Refinement of 1038B at 3.0A resolution, 10 fold NCS.

These three examples below are similar to previous set of examples with two main differences: the automatic determination of NCS groups worked well and the use of TLS modeling resulted in more significant improvement.

example_3a: Refinement with all default parameters. This includes 3 macro-cycles of bulk-solvent and anisotropic scaling, coordinates and individual isotropic B-factors refinement.

Final: R-work = 0.2043 R-free = 0.2743

The gap between R-work and R-free clearly indicated the overfitting. The NCS restraints will help as shown in the next example.

example_3b: Same refinement run as in example_3a with using the NCS restraints (NCS operators detected automatically).

Final: R-work = 0.2207 R-free = 0.2527

Applying NCS restrains decreased R-free and narrowed the gap R-work - R-free. No manual work was necessary to specify NCS groups.

example_3c: Same refinement run as in example_3b with using TLS to model ADP of selected domains.

Final: r_work = 0.2008 r_free = 0.2410

Selecting TLS groups required a closer look at the model on graphics. This is rewarded by ~2% drop in R. Note that the refinement is not fully converged as indicated by continuous drop in R-factors between the last two macro-cycles (see PDB file header: in refinement statistics step-by-step). This suggests that adding a few more refinement macro-cycles may further improve the model.

Refinement using Simulated Annealing (SA)

In this couple of examples we demonstrate the power of SA refinement to improve a poor starting model. Combining it with automatic water picking is a good idea at the cost of just one parameter added to the command line.

example_4a: Refinement with all default parameters. This includes 3 macro-cycles of bulk-solvent and anisotropic scaling, coordinates and individual isotropic B-factors refinement.

Start: r_work = 0.4166 r_free = 0.4678 Final: r_work = 0.3621 r_free = 0.4188

The default refinement run (plus water picking) did a good progress in model improvement.

example_4b: Same refinement run as in example_4a with using SA.

Start: r_work = 0.4166 r_free = 0.4678 Final: r_work = 0.2771 r_free = 0.3414

Adding a SA refinement to the previous refinement run results in big model improvement as indicated by drop in both R-work and R-free factors.

Refinement of f' and f'' for a structure with unknown (to the dictionary) molecule in it

The goal of this example is to show that in case of refinement against anomalous data the refinement of f' and f'' to model the anomalous differences in F+ and F- may deliver another drop in R-factors from small fractions to ~1% or more. Also it shows how to use eLBOW program to generate a CIF (stereochemistry dictionary) file for the unknown item in the input PDB. A PDB entry with the code 1mdg is used in this example.

example_5a: Refinement with all default parameters. This includes 3 macro-cycles of bulk-solvent and anisotropic scaling, coordinates and individual isotropic B-factors refinement.

Final: R-work = 0.1944 R-free = 0.2368

example_5b: Same refinement run as in example_5a with f' and f'' refinement for anomalous scatterer.

Final: R-work = 0.1900 R-free = 0.2287

Note the drop in both R and R-free.

Refinement using twinned data

This example shows how to do the refinement if the data is twinned. phenix.xtriage was used to obtain the twinning operator. For more information about twinning analysis and using in refinement look separate paragraph in PHENIX documentation.

example_6a: Refinement with all default parameters. This includes 3 macro-cycles of bulk-solvent and anisotropic scaling, coordinates and individual isotropic B-factors refinement. A PDB entry with the code 1l2h is used in this example.

Final: R-work = 0.2476 R-free = 0.2755

example_6b: Same refinement run as in example_6a with the twinning taken into account.

Final: R-work = 0.1640 R-free = 0.2069

Questions, comments, more examples