Rebuild a model using fragments from the PDB
Author(s)
- replace_with_fragments_from_pdb: Tom Terwilliger
Purpose
Create a composite model similar to a supplied model using fragments
from the PDB.
This tool can be used in early stages of model-building to try to
improve parts of a preliminary model by replacing them with similar
fragments from a deposited structure.
Note that the composite model may not actually have good geometry because
the fragments from the PDB are morphed (distorted) to match the supplied
model and also because the junctions between fragments taken from the PDB
are not refined.
Usage
replace_with_fragments_from_pdb can create composite models for protein chains.
You can adjust the maximum number of insertions/deletions in each replacement
fragment and whether to morph structures obtained by SSM matching.
You can supply a map file for refinement or calculation of map-model
correlations.
You can also choose to exclude or include specific PDB entries in SSM
matching and fragment searching, or exclude all PDB entries with sequences
similar to a particular PDB entry.
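The include and exclude lists accept two entry forms: a PDB ID alone (2DY1)
or a PDB ID with a chain ID (2DY1_A). The Python sketch below illustrates one
way such a list could be matched against candidate chains. The function name
matches_pdb_list and the case-insensitive comparison of PDB IDs are
assumptions made for this example; they are not taken from the tool itself.

```python
def matches_pdb_list(pdb_id, chain_id, pdb_list):
    """Return True if the chain is named in pdb_list.

    Entries are either a bare PDB ID ("2DY1", matching every chain of that
    entry) or a PDB ID plus chain ID ("2DY1_A", matching that chain only).
    Case-insensitive comparison of the PDB ID is an assumption of this sketch.
    """
    for entry in pdb_list:
        entry_id, _, entry_chain = entry.partition("_")
        if entry_id.upper() != pdb_id.upper():
            continue
        if entry_chain == "" or entry_chain == chain_id:
            return True
    return False


# A list naming one chain matches only that chain.
print(matches_pdb_list("2DY1", "A", ["2DY1_A"]))
print(matches_pdb_list("2DY1", "B", ["2DY1_A"]))
# A list naming the whole entry matches every chain in it.
print(matches_pdb_list("2dy1", "B", ["2DY1"]))
```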
You can choose to create a replacement for just a selected part of your
model (by supplying a selection string that specifies what to select).
How replace_with_fragments_from_pdb works:
The tool searches structures in the PDB (using SSM matching and fragment
searching) for fragments similar to segments in a supplied model. It then
chooses the best-matching segments for each part of the supplied model and
creates a composite model using the fragments from the PDB.
replace_with_fragments_from_pdb uses SSM (secondary structure matching)
to find structures in the PDB that are similar to a supplied model. The SSM
matching is the same as in phenix.superpose_and_morph.
replace_with_fragments_from_pdb also uses fragment searching to find shorter
fragments matching each loop (segment between two helices or strands)
in the supplied model.
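To make the definition of a loop concrete, here is a minimal Python sketch
that locates segments lying between two helices or strands. It assumes the
secondary structure is already available as a per-residue string of H
(helix), E (strand) and L (anything else). That encoding and the function
name find_loops are assumptions of the example, not part of the tool.

```python
def find_loops(ss):
    """Return (start, end) index pairs (0-based, inclusive) of loop segments.

    ss is a per-residue string of H (helix), E (strand) and L (other); this
    three-letter encoding is an assumption of the sketch. Only runs of L that
    have secondary structure on both sides are reported, so unstructured
    chain termini are not counted as loops.
    """
    loops = []
    i = 0
    n = len(ss)
    while i < n:
        if ss[i] != "L":
            i += 1
            continue
        j = i
        while j < n and ss[j] == "L":
            j += 1
        # The run is ss[i:j]; keep it only if it is flanked on both sides.
        if i > 0 and j < n:
            loops.append((i, j - 1))
        i = j
    return loops


print(find_loops("LLHHHHHLLLEEEELLHHHHLL"))  # [(7, 9), (14, 15)]
```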
The fragments obtained with fragment searching are always morphed (distorted)
to make the distance between starting and ending residues match the distance
in the supplied structure.
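The idea of distorting a fragment so that its ends match the supplied
structure can be illustrated with a small Python sketch. The linear blending
of the two end shifts used here is an assumption chosen for simplicity; it is
not claimed to be the morphing procedure the tool applies, and the function
name morph_to_endpoints is hypothetical.

```python
import math


def morph_to_endpoints(fragment, target_start, target_end):
    """Shift fragment coordinates so that its ends land on the target ends.

    fragment is a list of at least two (x, y, z) tuples. The first point is
    moved onto target_start, the last point onto target_end, and interior
    points receive a linear blend of those two shifts. Linear blending is an
    assumption of this sketch.
    """
    n = len(fragment)
    d_start = [t - f for t, f in zip(target_start, fragment[0])]
    d_end = [t - f for t, f in zip(target_end, fragment[-1])]
    morphed = []
    for i, point in enumerate(fragment):
        w = i / (n - 1)  # 0 at the first residue, 1 at the last
        morphed.append(tuple(
            p + (1 - w) * ds + w * de
            for p, ds, de in zip(point, d_start, d_end)
        ))
    return morphed


# A straight three-residue fragment spanning 7.6 A is stretched to span 9 A.
fragment = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
morphed = morph_to_endpoints(fragment, (0.0, 0.0, 0.0), (9.0, 0.0, 0.0))
print(math.dist(morphed[0], morphed[-1]))
```

After morphing, the end-to-end distance equals that of the target ends, at
the cost of slightly stretched spacing between residues.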
Fragments obtained from SSM matching are optionally morphed to match the
target structure as well. The distance over which morphing occurs (the
shift-field distance) is adjustable (10 A by default). Structure is largely
preserved over distances shorter than this and may change over longer
distances.
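The effect of a shift-field distance can be sketched in Python as a smoothing
of per-atom shifts: atoms that are close together end up with nearly the same
shift, so local structure is preserved, while atoms far apart can move
independently. The Gaussian weighting and the function name smooth_shifts are
assumptions of this illustration, not a description of the tool's internals.

```python
import math


def smooth_shifts(coords, raw_shifts, shift_field_distance=10.0):
    """Return a distance-weighted average of the raw shifts for each atom.

    Each atom receives the weighted mean of all raw shifts, with weight
    exp(-(d / shift_field_distance)**2) for an atom at distance d. The
    Gaussian form of the weight is an assumption of this sketch.
    """
    smoothed = []
    for ci in coords:
        weights = [
            math.exp(-(math.dist(ci, cj) / shift_field_distance) ** 2)
            for cj in coords
        ]
        total = sum(weights)
        smoothed.append(tuple(
            sum(w * s[k] for w, s in zip(weights, raw_shifts)) / total
            for k in range(3)
        ))
    return smoothed


# Two neighbouring atoms (2 A apart) and one distant atom (40 A away).
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (40.0, 0.0, 0.0)]
raw = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
smoothed = smooth_shifts(coords, raw)
for shift in smoothed:
    print(shift)
```

With a 10 A shift-field distance, the two neighbouring atoms receive almost
identical shifts even though their raw shifts differ, while the distant atom
keeps essentially its own shift.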
The supplied model is then used as a template and matching parts of structures
from the PDB are superimposed on the template. The best-matching segments
(smallest rmsd and longest length) are chosen and connected. Gaps are filled
if possible by a targeted fragment search of the PDB.
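The selection step can be pictured with the following Python sketch, which
greedily picks non-overlapping candidate segments and reports the residue
ranges left uncovered (the gaps that a targeted fragment search would then
try to fill). The ranking used here (longer segments first, lower rmsd as
tie-breaker), the dictionary layout and the source labels are all assumptions
of the example; the tool's actual scoring may weigh rmsd and length
differently.

```python
def choose_segments(candidates, n_residues):
    """Greedily pick non-overlapping segments to cover a template.

    candidates is a list of dicts with keys start and end (inclusive residue
    indices on the template), rmsd and source. Longer segments are preferred
    and lower rmsd breaks ties; this ranking is an assumption of the sketch.
    Returns the chosen segments in template order and the uncovered gaps.
    """
    ranked = sorted(
        candidates,
        key=lambda c: (-(c["end"] - c["start"] + 1), c["rmsd"]),
    )
    covered = [False] * n_residues
    chosen = []
    for c in ranked:
        span = range(c["start"], c["end"] + 1)
        if any(covered[i] for i in span):
            continue  # overlaps a segment that was already accepted
        for i in span:
            covered[i] = True
        chosen.append(c)
    chosen.sort(key=lambda c: c["start"])

    # Collect runs of uncovered residues as gaps.
    gaps = []
    i = 0
    while i < n_residues:
        if covered[i]:
            i += 1
            continue
        j = i
        while j < n_residues and not covered[j]:
            j += 1
        gaps.append((i, j - 1))
        i = j
    return chosen, gaps


# Hypothetical candidates on a 20-residue template.
candidates = [
    {"start": 0, "end": 9, "rmsd": 0.8, "source": "fragment_a"},
    {"start": 0, "end": 9, "rmsd": 1.5, "source": "fragment_b"},
    {"start": 8, "end": 12, "rmsd": 0.5, "source": "fragment_c"},
    {"start": 14, "end": 19, "rmsd": 1.0, "source": "fragment_d"},
]
chosen, gaps = choose_segments(candidates, 20)
print([c["source"] for c in chosen])  # ['fragment_a', 'fragment_d']
print(gaps)                           # [(10, 13)]
```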
Examples
Standard run of replace_with_fragments_from_pdb:
Running replace_with_fragments_from_pdb is easy. From the command line you can type:
phenix.replace_with_fragments_from_pdb model.pdb nproc=8
This will find fragments in the PDB similar to model.pdb, put them together
and write out a new model using these fragments.
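If you prefer to launch the run from a script, the Python sketch below
assembles the same kind of command with additional keyword=value parameters
taken from the keyword list on this page. It assumes that the short keyword
names (morph, shift_field_distance, refine_cycles) are accepted on the
command line in the same way as nproc in the example above; the helper
build_command is hypothetical. The command is only executed if Phenix is on
the PATH and model.pdb exists.

```python
import os
import shutil
import subprocess

PROGRAM = "phenix.replace_with_fragments_from_pdb"


def build_command(model, **params):
    """Return the argument list for a run with keyword=value parameters."""
    return [PROGRAM, model] + [f"{key}={value}" for key, value in params.items()]


# Morph SSM-matched segments over a 10 A shift field and skip refinement.
cmd = build_command("model.pdb", morph=True, shift_field_distance=10,
                    refine_cycles=0, nproc=8)
print(" ".join(cmd))

# Run only when the program and the input file are actually available.
if shutil.which(PROGRAM) and os.path.isfile("model.pdb"):
    subprocess.run(cmd, check=True)
```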
Possible Problems
Specific limitations and problems:
The final model may have poor geometry at connections between fragments, so
further refinement is normally required. Additionally, the final model may
have poor geometry within segments because morphing is carried out to
match the target structure.
Literature
Additional information
List of all available keywords
- job_title = None Job title in PHENIX GUI, not used on command line
- input_files
- selection = None If specified, use selected part of model
- local_pdb_dir = None Local PDB mirror (something like /User/files/pdb_mirror/)
- map_model
- full_map = None Input full map file
- half_map = None Input half map files
- model = None Input model file
- output
- rebuilt_model = None Output file name. Default is to append _rebuilt to the input file name
- overwrite = True Overwrite files with same names
- replace_with_fragments_from_pdb
- refine_cycles = 3 Refinement cycles (set to zero to not refine)
- maximum_insertions_or_deletions = None Maximum insertions/deletions per fragment
- database = *pdb100 pdb2018_90 Database to search. pdb100 is unique set of PDB. pdb2018_90 is quality-filtered PDB at 90% identity
- number_of_sequence_neighbors = 100 Number of sequence neighbors to search if using pdb100
- max_models_to_fetch = 100 Maximum models to fetch from PDB
- allow_reverse_connectivity = False Allow connectivity to be backwards or mixed
- morph = False Morph replacement segments (NOTE: morphing of loops is intrinsic)
- shift_field_distance = None Shift field distance. Distance over which morphing leaves model mostly the same. Over longer distances morphing changes model. Default is 10 A.
- pdb_include_list = None List of PDB IDs, or PDB IDs with chain IDs, to include (2DY1 or 2DY1_A)
- pdb_exclude_list = None List of PDB IDs, or PDB IDs with chain IDs, to exclude (2DY1 or 2DY1_A)
- exclude_similar_to_this_pdb_id = None Exclude all PDB chains with sequence identity > maximum_percent_identity_to_this_pdb_id to exclude_similar_to_this_pdb_id
- maximum_percent_identity_to_this_pdb_id = 30 Maximum percent identity to supplied PDB chain to include. Exclude all PDB chains with sequence identity > maximum_percent_identity_to_this_pdb_id to exclude_similar_to_this_pdb_id
- maximum_residues_to_try_fragment_search = 30 If try_fragment_search_for_short_models and model has no more than maximum_residues_to_try_fragment_search, try simple fragment search first
- try_fragment_search_for_short_models = True If try_fragment_search_for_short_models and model has no more than maximum_residues_to_try_fragment_search, try simple fragment search first
- crystal_info
- resolution = None Nominal resolution of map
- scattering_table = n_gaussian wk1995 it1992 *electron neutron Choice of scattering table for structure factor calculations. Standard for X-ray is n_gaussian, for cryoEM is electron.
- sequence = None Sequences
- wrapping = None You can specify whether the map is wrapped (can map values outside bounds to inside with cell translations).
- control
- nproc = 1 Number of processors (if None, use all available)
- ignore_symmetry_conflicts = False You can ignore the symmetry information (CRYST1) from coordinate files. This may be necessary if your model has been placed in a box with box_map for example.
- verbose = False Verbose output
- quick = False Run quickly
- gui GUI-specific parameter required for output directory