Python-based Hierarchical ENvironment for Integrated Xtallography
Tutorial 5: Solving a structure with MR data
Introduction

This tutorial uses the a2u-globulin-mr example, in which the structure of a2u-globulin is solved with a search model that has 63% sequence identity, to show how to solve an MR dataset with the AutoMR Wizard. It is designed to be read all the way through, giving pointers for you along the way. Once you have read it all, run the example data and looked at the output files, you will be in a good position to run your own data through AutoMR.

Setting up to run PHENIX

If PHENIX is already installed and your environment is all set, then if you type:

    echo $PHENIX

then you should get back something like this:

    /xtal/phenix-1.3

If instead you get:

    PHENIX: undefined variable

then you need to set up your PHENIX environment. See the PHENIX installation page for details of how to do this. If you are using the C-shell environment (csh) then all you will need to do is add one line to your .cshrc (or equivalent) file that looks like this:

    source /xtal/phenix-1.3/phenix_env

(except that the path in this statement will be where your PHENIX is installed). Then the next time you log in, $PHENIX will be defined.

Running the demo a2u-globulin-mr data with AutoMR

To run AutoMR on the demo a2u-globulin-mr data, make yourself a tutorials directory and cd into that directory:

    mkdir tutorials
    cd tutorials

Now type the phenix command:

    phenix.run_example --help

to list the available examples. Choosing a2u-globulin-mr for this tutorial, you can now use the phenix command:

    phenix.run_example a2u-globulin-mr

to solve the a2u-globulin-mr structure with AutoMR. This command will copy the directory $PHENIX/examples/a2u-globulin-mr to your current directory (tutorials) and call it tutorials/a2u-globulin-mr/. Then it will run AutoMR using the command file run.sh that is present in this tutorials/a2u-globulin-mr/ directory.

This command file run.sh is simple. It says:

    #!/bin/sh
    echo "Running AutoMR on a2u-globulin data without building..."
    phenix.automr mup_search.pdb scale.mtz mass=18000. resolution=2.5 \
    component_type=protein RMS=1.0 sequence.dat copies=4 \
    space_group=p212121 unit_cell="106.820 62.340 114.190 90.00 90.00 90.00" \
    build=False

The first line (#!/bin/sh) tells the system to interpret the remainder of the text in the file using the sh (or bash) shell.

The command phenix.automr runs the command-line version of AutoMR (see Automated Molecular Replacement using AutoMR for all the details about AutoMR, including a full list of keywords).

The arguments on the command line tell AutoMR about the search model (mup_search.pdb), the data file with structure factors (scale.mtz), the molecular mass of the molecule that we are searching for (mass=18000.), and the resolution (resolution=2.5). The command continues by telling AutoMR that the component we are searching for is protein (component_type=protein) and that the search model has an estimated RMS difference from the true structure of about 1.0 A (RMS=1.0). Next the sequence file (sequence.dat) is specified, along with the number of copies of the search model to look for (copies=4). Then the space group and cell dimensions are specified (these could also have been simply read from the data file). Finally, build=False tells the Wizard not to rebuild the model after MR (the Wizard's log file echoes this setting as rebuild_after_mr=False).

Note that each keyword is specified with an = sign, and that there are no spaces around the = sign.

Note the backslash "\" at the end of some of the lines in the phenix.automr command. This tells the shell (sh, which interprets everything in this file) that the next line is a continuation of the current line. There must be no characters (not even a space) after the backslash for this to work.

Although the phenix.run_example a2u-globulin-mr command has just run AutoMR from a script (run.sh), you can run AutoMR yourself from the command line with the same phenix.automr seq_file= ... command. You can also run AutoMR from a GUI, or by putting commands in another type of script file.
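
As an illustration of the "another type of script file" option, here is a minimal Python sketch that launches the same phenix.automr command that run.sh uses. It relies only on the Python standard library; the helper names build_automr_command and main are hypothetical and are not part of PHENIX. It assumes phenix.automr is on your PATH and that the example input files are in the current directory.

```python
import shlex
import shutil
import subprocess


def build_automr_command():
    """Return the phenix.automr argument list from run.sh (hypothetical helper)."""
    return [
        "phenix.automr",
        "mup_search.pdb",          # search model
        "scale.mtz",               # structure factor data
        "mass=18000.",             # keyword=value, no spaces around "="
        "resolution=2.5",
        "component_type=protein",
        "RMS=1.0",
        "sequence.dat",            # sequence file
        "copies=4",
        "space_group=p212121",
        # One list element, so the spaces in the cell need no shell quoting.
        "unit_cell=106.820 62.340 114.190 90.00 90.00 90.00",
        "build=False",
    ]


def main():
    cmd = build_automr_command()
    # Show the equivalent shell command line before running it.
    print(shlex.join(cmd))
    if shutil.which(cmd[0]) is None:
        print("phenix.automr not found on PATH; set up the PHENIX environment first")
        return
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main()
```

Because the arguments are passed as a list, no backslash continuations are needed, and the unit cell stays a single argument just as the double quotes ensure in run.sh.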
All these possibilities are described in Using the PHENIX Wizards.

Where are my files?

Once you have started AutoMR or another Wizard, an output directory will be created in your current (working) directory. The first time you run AutoMR in this directory, this output directory will be called AutoMR_run_1_ (or AutoMR_run_1_/, where the slash at the end just indicates that this is a directory). All of the output from run 1 of AutoMR will be in this directory. If you run AutoMR again, a new subdirectory called AutoMR_run_2_ will be created.

Inside the directory AutoMR_run_1_ there will be one or more temporary directories such as TEMP0 created while the Wizard is running. The files in this temporary directory may be useful sometimes in figuring out what the Wizard is doing (or not doing!). By default these directories are emptied when the Wizard finishes (but you can keep their contents with the command clean_up=False if you want).

What parameters did I use?

Once the AutoMR wizard has started (when run from the command line), a parameters file called automr.eff will be created in your output directory (e.g., AutoMR_run_1_/automr.eff). This parameters file has a header that says what command you used to run AutoMR, and it contains all the starting values of all parameters for this run (including the defaults for all the parameters that you did not set).

The automr.eff file is good for more than just looking at the values of parameters, though. If you copy this file to a new one (for example automr_hires.eff) and edit it to change the values of some of the parameters (for example resolution=3.0), then you can re-run AutoMR with the new values of your parameters like this:

    phenix.automr automr_hires.eff

This command will do everything just the same as in your first run but use only the data to 3.0 A.

Reading the log files for your AutoMR run file

While the AutoMR wizard is running, there are several places you can look to see what is going on.
The most important one is the overall log file for the AutoMR run. This log file is located in:

    AutoMR_run_1_/AutoMR_run_1_1.log

for run 1 of AutoMR. (The second 1 in this log file name will be incremented if you stop this run in the middle and restart it with a command like phenix.automr run=1.)

The AutoMR_run_1_1.log file is a running summary of what the AutoMR Wizard is doing. Here are a few of the key sections of the log files produced for the a2u-globulin-mr MR dataset.

Summary of the command-line arguments

Near the top of the log file you will find:

    ------------------------------------------------------------
    Starting AutoMR with the command:

    phenix.automr coords=mup_search.pdb data=scale.mtz mass=18000. resolution=2.5 \
    component_type=protein RMS=1.0 seq_file=sequence.dat \
    input_seq_file=sequence.dat copies=4 space_group=p212121 \
    unit_cell='106.820 62.340 114.190 90.00 90.00 90.00' rebuild_after_mr=False

This is just a repeat of how you ran AutoMR; you can copy it and paste it into the command line to repeat this run.

Running Phaser molecular replacement

The AutoMR Wizard will take the information you have input and use it to run the Phaser molecular replacement algorithm:

    AutoMR_auto_MR AutoMR Run 1 Tue Jul 3 10:40:54 2007

This is followed by a summary of some of the input information:

    CRITERIA FOR PHASER MR RUN:
    sg : p212121
    selection_criteria_rot : Percent_of_best
    selection_criteria_rot_value : 75
    all_plausible_sg_list : ['P 2 2 2', 'P 2 2 21', 'P 21 2 2', 'P 2 21 2', 'P 21 21 2', 'P 2 21 21', 'P 21 2 21', 'P 21 21 21']
    use_all_plausible_sg : No
    overlap_allowed :
    DICTENS: {'ensemble_1': [['mup_search.pdb', 'RMS', 1.0]]}
    PDBList entry: ensemble_1 mup_search.pdb RMS 1.0
    ...
    HALL: P 2ac 2ab
    CELL: (106.81999999999999, 62.340000000000003, 114.19, 90.0, 90.0, 90.0)
    ENSEMBLE 0 : ensemble_1
    ENSEMBLE 1 : ensemble_1
    ENSEMBLE 2 : ensemble_1
    ENSEMBLE 3 : ensemble_1
    ENSEMBLE: ensemble_1 , 1 PDB file(s)

Here the plausible space groups listed are those with the same symmetry in reciprocal space as the one you have input, so any of them might be the correct space group. (If you are not sure which one is correct, then you can tell AutoMR to try all of these with use_all_plausible_sg=Yes.)

The AutoMR wizard then runs Phaser, and the log file for this is in MR.log. The summary is written to your AutoMR log file. It starts out with a list of steps to be carried out:

    Steps:
      Anisotropy correction
      Cell Content Analysis
      Fast Rotation Function
      Fast Translation Function
      Packing
      Refinement (if data higher resolution than search resolution)
    Number of search ensembles = 4
      #1: Ensemble ensemble_1
      #2: Ensemble ensemble_1
      #3: Ensemble ensemble_1
      #4: Ensemble ensemble_1
    Number of permutations of search ensembles = 1
    One test spacegroup P 21 21 21

Phaser then carries out each of these steps.
The final summary is:

    OUTPUT FILES
    ------------
    No script files output
    /net/cci-filer1/vol1/tmp/terwill/from_firebird/PHENIX/structure_lib_tests/MR/a2u-globulin/run_070307_new/AutoMR_run_1_/MR.1.pdb
    /net/cci-filer1/vol1/tmp/terwill/from_firebird/PHENIX/structure_lib_tests/MR/a2u-globulin/run_070307_new/AutoMR_run_1_/MR.1.mtz

followed by the final log likelihood gain (positive is good, anything over 100 is fine, and a very strong solution will be over 1000) and the orientation (EULER) and position (FRAC) of each of the 4 molecules:

    Solution #1: Likelihood Gain 1410.51
    ENSE ensemble_1 - EULER 295.533, 59.329, 229.980 - FRAC 0.096, -0.302, -0.115
    ENSE ensemble_1 - EULER 166.424, 152.963, 316.988 - FRAC -0.241, -0.440, 0.021
    ENSE ensemble_1 - EULER 183.783, 14.133, 133.534 - FRAC -0.217, -0.221, 0.085
    ENSE ensemble_1 - EULER 68.417, 114.921, 35.347 - FRAC 0.073, -0.100, -0.037

The AutoMR_summary.dat summary file

A quick summary of the results of your AutoMR run is in the AutoMR_summary.dat file in your output directory. This file lists the key files that were produced in your run of AutoMR (all these are in the output directory) and some of the key statistics for the run, including the overall log-likelihood gain. Here is the summary for this a2u-globulin-mr MR dataset:

    **************** SOLUTION MR *******
    Log likelihood gain for solution MR: 1410.5121
    Output PDB files for solution MR: MR.1.pdb
    Output MTZ files for solution MR: MR.1.mtz
    Output log file for solution MR: MR.log
    Output summary file for solution MR: MR.sum

How do I know if I have a good solution?

Here are some of the things to look for to tell if you have obtained a correct solution:
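
One criterion this tutorial states explicitly is the log-likelihood gain: positive is good, anything over 100 is fine, and a very strong solution is over 1000. The following Python sketch reads that value from text in the AutoMR_summary.dat layout shown above and applies those thresholds. The function names read_llg and classify_llg are hypothetical helpers for illustration, not part of PHENIX, and the regular expression assumes the "Log likelihood gain for solution ...:" line looks exactly as in the example summary.

```python
import re

# Matches e.g. "Log likelihood gain for solution MR: 1410.5121"
LLG_PATTERN = re.compile(
    r"Log likelihood gain for solution\s+\S+:\s*([-+]?\d+(?:\.\d+)?)"
)


def read_llg(summary_text):
    """Return the log-likelihood gain as a float, or None if not found."""
    match = LLG_PATTERN.search(summary_text)
    return float(match.group(1)) if match else None


def classify_llg(llg):
    """Apply the rule-of-thumb thresholds quoted in this tutorial."""
    if llg > 1000:
        return "very strong"
    if llg > 100:
        return "fine"
    if llg > 0:
        return "positive"
    return "not positive"


# Summary text copied from the a2u-globulin-mr example above.
sample_summary = """\
**************** SOLUTION MR *******
Log likelihood gain for solution MR: 1410.5121
Output PDB files for solution MR: MR.1.pdb
Output MTZ files for solution MR: MR.1.mtz
"""

llg = read_llg(sample_summary)
print(llg, classify_llg(llg))
```

To check your own run, you could read the text with open("AutoMR_run_1_/AutoMR_summary.dat").read() and pass it to read_llg. The log-likelihood gain is only one indicator and does not replace inspecting the model and map.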
What to do next

Once you have run AutoMR and have obtained a good solution and model, the next thing to do is to run the AutoBuild Wizard. If you run it in the same directory where you ran AutoMR, the AutoBuild Wizard will pick up where the AutoMR Wizard left off and carry out iterative model-building, density modification and refinement to improve your model and map. See the web page Automated Model Building and Rebuilding with AutoBuild for details on how to run AutoBuild.

If you do not obtain a good solution, then it's not time to give up yet. There are a number of standard things to try that may improve the structure determination. Here are a few that you should always try:
Additional information

For details about the AutoMR Wizard, see Automated molecular replacement with AutoMR. For help on running Wizards, see Using the PHENIX Wizards.