Rebuilding an RNA structure with ERRASER
Authors

erraser: Fang-Chieh Chou and Rhiju Das, Stanford University

Purpose

ERRASER (Enumerative Real-space Refinement ASsisted by Electron-density under Rosetta) is an application for improving RNA crystal structures based on Rosetta and Phenix.

How erraser works

By supplementing the Rosetta RNA scoring function with an electron-density restraint, ERRASER can confidently reduce the errors in RNA crystallographic models while retaining a good fit to the diffraction data. Two ERRASER applications are currently available. The standard ERRASER application automatically remodels all the potentially problematic nucleotides in an RNA model and outputs one final model. The ERRASER single-residue rebuilding application applies the ERRASER algorithm to a residue specified by the user and returns up to 10 top-scoring models plus the minimized starting model. The first application is useful for automatically improving the entire model globally; the second is useful when the user wants to explore all possible alternative conformations for a problematic nucleotide and examine each one manually.

Installation notes

To run ERRASER you need Rosetta version 3.5 or later installed on your machine. Rosetta can be downloaded at https://www.rosettacommons.org/software/. For Rosetta installation notes, and for setting the environment variable PHENIX_ROSETTA_PATH to tell Phenix where to find Rosetta, refer to the phenix.mr_rosetta documentation under "Installing Rosetta for use with mr_rosetta" (the same Rosetta installation will work for mr_rosetta and ERRASER).

Input files preparation

input_model.pdb: A PDB file with your starting RNA model. Please ensure it is in the standard PDB format; otherwise ERRASER might not run properly.

mapfile.ccp4: A 2mFo-DFc density map for your model in CCP4 format that covers the entire unit cell. Rfree reflections should be removed during map creation to avoid overfitting.
We also suggest filling in the missing data with calculated data to avoid Fourier truncation errors. If the PHENIX map calculation utility (GUI) is used to create the map, the following options are suggested: set "Kicked", "Fill missing f obs", and "Exclude free r reflections" to True, and select "Unit Cell" under "Map region".

Examples

Running ERRASER is easy and can be done through either the Phenix GUI or the command line. In the GUI, the program is listed under the Refinement category. Most configuration options are shown in the main window. Toolbar buttons to launch phenix.maps and the FFT utility are also displayed.

From the command line you can type:

  phenix.erraser model.pdb mapfile.ccp4

This command will have erraser rebuild your entire model. Here are some commonly used options to include:

"map_reso=2.2" gives the map resolution (2.2 angstrom here). Including it is highly recommended, as this usually gives better performance.

"n_iterate=2" allows ERRASER to iterate twice before outputting the model. This takes much longer but might give better improvement.

"fixed_res= A20 B15" gives residues to be fixed during remodeling. The format is chain ID followed by residue numbers. ERRASER currently does not model ligands (anything that starts with "HETATM" in the PDB file, including modified bases), proteins, or crystal contacts in a structure. Therefore we suggest fixing the positions of nucleotides that are in close contact with protein or ligand components.

You can also run ERRASER in single residue rebuilding mode:

  phenix.erraser model.pdb mapfile.ccp4 single_res_mode=True rebuild_res_pdb=A30

This will rebuild just residue 30 in chain A and output up to 10 models for manual inspection. (In the GUI, this is equivalent to checking the box labeled "Single residue rebuilding mode" and entering the residue ID in the field labeled "Residue to be rebuilt".)

Output files

model_erraser.pdb: A PDB file with your rebuilt RNA model.
For single residue rebuilding mode: model_0.pdb, model_1.pdb...: Up to 10 different models with the specified residue rebuilt.

In both modes, after the models are generated, the application analyzes them and outputs a detailed comparison of the changes introduced by ERRASER. The GUI displays a list of output files and a summary of validation statistics. For the standard rebuilding mode, a subset of the MolProbity analyses is displayed in additional tabs. Lists of outliers are interactive with Coot (if installed); see the validation documentation for more information. In single-residue mode, a summary of the validation criteria for the specific residue being rebuilt is shown instead.

Possible Problems

Specific limitations:

ERRASER currently works only for RNA. Other parts of the crystallographic model, including proteins, modified bases and ligands, are not modeled. Remodeling of RNA residues that are in close contact with these components may be problematic.

Crystal contacts are currently not modeled, which is known to cause problems in a few test cases where the RNA interacts strongly with its crystal-packing partner (e.g. base-pairing and base-stacking). For now, this problem can be resolved by manually adding the crystal-packing partner to the starting PDB file or by specifying these residues as "fixed_res" during the run.
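As a rough illustration, the command lines shown in the Examples section can be assembled in a script. The sketch below is Python; build_erraser_command is a hypothetical helper (not part of Phenix) that only builds the argument list from keywords documented on this page (map_reso, n_iterate, fixed_res, single_res_mode, rebuild_res_pdb). It assumes that fixed_res may be repeated once per residue, as in the example given in the keyword list, and it does not run ERRASER itself.

```python
import shlex


def build_erraser_command(model, map_file, map_reso=None, n_iterate=None,
                          fixed_res=(), rebuild_res=None):
    """Return a phenix.erraser argument list (hypothetical helper).

    Only keywords documented on this page are emitted. Passing
    rebuild_res switches to single residue rebuilding mode.
    """
    cmd = ["phenix.erraser", model, map_file]
    if map_reso is not None:
        cmd.append(f"map_reso={map_reso}")
    if n_iterate is not None:
        cmd.append(f"n_iterate={n_iterate}")
    # Assumption: one fixed_res=... argument per residue or range.
    for res in fixed_res:
        cmd.append(f"fixed_res={res}")
    if rebuild_res is not None:
        cmd.append("single_res_mode=True")
        cmd.append(f"rebuild_res_pdb={rebuild_res}")
    return cmd


if __name__ == "__main__":
    # Whole-model rebuild with residues near a ligand held fixed.
    full = build_erraser_command("model.pdb", "mapfile.ccp4",
                                 map_reso=2.2, n_iterate=2,
                                 fixed_res=["A20", "B15"])
    print(shlex.join(full))
    # Single residue rebuilding mode for residue 30 of chain A.
    single = build_erraser_command("model.pdb", "mapfile.ccp4",
                                   rebuild_res="A30")
    print(shlex.join(single))
```

The resulting list could be handed to subprocess.run on a machine where Phenix and Rosetta are installed; printing it first makes it easy to check the keywords before starting a long run.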
List of all erraser keywords

-------------------------------------------------------------------------------
Legend: black bold - scope names
        black - parameter names
        red - parameter values
        blue - parameter help
        blue bold - scope help

Parameter values:
  * means selected parameter (where multiple choices are available)
  False is No
  True is Yes
  None means not provided, not predefined, or left up to the program
  "%3d" is a Python style formatting descriptor
-------------------------------------------------------------------------------
erraser
  input_files
    pdb_in= None PDB file with starting model
    map_file= None Map file (CCP4 format). 2mFo-DFc map file in CCP4 format. Rfree should be excluded.
    map_reso= 2.5 The resolution of the input density map. It is highly recommended to input the map resolution whenever possible for better results.
    map_coeffs= None
    map_labels= None
    r_free_flags
      label= None
      test_flag_value= None
      disable_suitability_test= False
  output_files
    pdb_out= None Output pdb file name (optional)
    log= erraser.log Output log file
    params_out= erraser_params.eff Parameters file to rerun erraser
  directories
    temp_dir= "" Optional temporary work directory
    output_dir= "" Output directory where params files are to be written
    gui_output_dir= None
    rosetta_path= "" Location of rosetta directories. If you have set PHENIX_ROSETTA_PATH then this can be blank. All rosetta files are located relative to this path. You can set the environment variable 'PHENIX_ROSETTA_PATH' to indicate where rosetta is to be found. In csh/tcsh use something like: setenv PHENIX_ROSETTA_PATH /Users/Shared/unix/rosetta. In bash/sh use: export PHENIX_ROSETTA_PATH=/Users/Shared/unix/rosetta
    rosetta_erraser_dir= "" Directory with Rosetta tools for erraser. Path is relative to rosetta_path
  erraser_control
    single_res_mode= False When True, ERRASER rebuilds just the one residue specified in the rebuild_res_pdb option and outputs up to 10 models for manual inspection. Overrides the standard ERRASER protocol.
      Required option: rebuild_res_pdb. All other erraser_control options except native_screen_rmsd have no effect in this mode.
    n_iterate= 1 The number of rebuild-minimization iterations in ERRASER. The user can increase the number to achieve the best performance. Usually 2-3 rounds are enough. Alternatively, the user can take an ERRASER-refined model as the input for another ERRASER run to iterate manually.
    native_screen_rmsd= 3.0 In default ERRASER rebuilding, only conformations within 3.0 A of the starting model (the 'native' here) are sampled. The user can modify the RMSD cutoff. If the value of native_screen_rmsd is larger than 10.0, the RMSD screening is turned off.
    rebuild_all= False When True, ERRASER rebuilds all the residues instead of only the erroneous ones. Residues in 'fixed_res' (see below) are still kept fixed during rebuilding. This is more time-consuming but does not necessarily lead to better results. Standard rebuilding with more iteration cycles is usually preferred.
    fixed_res= None (Example: fixed_res=A1 fixed_res=A14-19 fixed_res=B9 fixed_res=B10-13; Format is chain ID followed by residue numbers). This allows users to fix selected RNA residues during ERRASER. For example, because proteins and ligands are not modeled in ERRASER, we recommend fixing RNA residues that interact strongly with these unmodeled atoms. ERRASER will automatically detect residues covalently bonded to removed atoms and hold them fixed during the rebuild, but users need to manually specify residues that have non-covalent interactions with removed atoms.
    extra_res= None (Example: extra_res=A1 extra_res=A14-19 extra_res=B9 extra_res=B10-13; Format is chain ID followed by residue numbers). This allows users to specify extra residues and force ERRASER to rebuild them. ERRASER will automatically pick out incorrect residues, but the user may find particular residues that were not fixed after one ERRASER run.
      The user can then re-run ERRASER with the extra_res argument and force ERRASER to remodel these residues.
    constrain_chi= True When True, ERRASER applies a weak constraint that keeps the chi angle near the input conformer. Only new chi conformers with a large energy bonus will be accepted.
    search_syn_pyrimidine_only_when_native_syn= True When True, ERRASER samples syn chi conformers for pyrimidines only if the input residue is in the syn conformation.
    rebuild_res_pdb= None (Example: rebuild_res_pdb=B21; Format is chain ID followed by residue number.) Residue to be rebuilt. Required input for single-residue rebuilding mode; it has no effect otherwise.
  control
    debug= False Debugging output
    job_title= None Job title in PHENIX GUI, not used on command line
  non_user_params
    print_citations= True Print citation information at end of run
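
The residue keywords (fixed_res, extra_res, rebuild_res_pdb) use the format "chain ID followed by residue numbers", with ranges such as A14-19. A minimal Python sketch of how such a specification could be checked and expanded into individual residues before it is passed to ERRASER is given below. expand_residue_spec is a hypothetical helper, not part of Phenix; it assumes a single-letter chain ID, non-negative residue numbers, and no insertion codes, which may not match how ERRASER itself parses these values.

```python
import re

# Assumed pattern: one letter for the chain, then a residue number,
# optionally followed by "-" and the last residue number of a range.
_SPEC = re.compile(r"([A-Za-z])(\d+)(?:-(\d+))?")


def expand_residue_spec(spec):
    """Expand e.g. "A14-19" into [("A", 14), ..., ("A", 19)].

    Hypothetical helper for sanity-checking values before they are
    used with fixed_res, extra_res, or rebuild_res_pdb.
    """
    match = _SPEC.fullmatch(spec.strip())
    if match is None:
        raise ValueError(f"unrecognized residue specification: {spec!r}")
    chain = match.group(1)
    first = int(match.group(2))
    last = first if match.group(3) is None else int(match.group(3))
    if last < first:
        raise ValueError(f"range runs backwards: {spec!r}")
    return [(chain, number) for number in range(first, last + 1)]


if __name__ == "__main__":
    for spec in ["A1", "A14-19", "B9", "B10-13"]:
        print(spec, expand_residue_spec(spec))
```

Expanding the ranges up front makes it straightforward to confirm that every listed residue actually exists in the input model before starting a run.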