	Python-based Hierarchical ENvironment for Integrated Xtallography

Merging two models with combine_models

Author(s)
Purpose
Usage: How combine_models works; Output files from combine_models
Examples: Standard run of combine_models; Selecting pieces from the two models; Replacing a specific segment; Crossing two models that have entirely matched residues
Possible Problems: Specific limitations and problems
Literature
Additional information: List of all combine_models keywords

Author(s)

combine_models: Tom Terwilliger

Purpose

This routine takes a starting model and searches a second model file for pieces that, when they replace the corresponding pieces of the first model, improve its fit to the density.

Usage

The main uses of phenix.combine_models are:

Taking the best parts of two models and merging them to make a new model.
Replacing a segment in one model with a corresponding segment from another.

How combine_models works:

Combine_models starts with two input models. The first model is used as the default; if nothing can be found in the second model that is better than what is in the first model, then that part of the first model is kept. The second model is used as a template for improving the first model. Fragments of the second model are considered as alternatives for corresponding segments in the first model.

The fit of each model to the density is used to decide which of a pair of fragments is better. In general, the criterion is the correlation of model density with the map. When the two fragments contain unequal numbers of residues, this correlation is weighted by the square root of the number of residues in each fragment. During the optional merge_second_model step, scoring is by default based on the density at the positions of the main-chain atoms in the model; it can instead be based on the correlation of density (use_cc_in_combine_extend=True).
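
The scoring idea can be sketched in a few lines of Python. This is only an illustration, not the code used by combine_models: the function names are hypothetical, the density values are assumed to be sampled at the same points for model and map, and multiplying the correlation by the square root of the residue count is one plausible reading of the weighting described above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of density values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat density carries no information
    return cov / math.sqrt(var_x * var_y)

def weighted_score(model_density, map_density, n_residues):
    """Assumed weighting: correlation times sqrt(number of residues)."""
    return pearson(model_density, map_density) * math.sqrt(n_residues)

def pick_better(frag_first, frag_second):
    """Each fragment is (model_density, map_density, n_residues).

    Ties go to the first model, which is the default.
    """
    if weighted_score(*frag_second) > weighted_score(*frag_first):
        return "second"
    return "first"
```

With this weighting, a longer fragment with a slightly lower correlation can still win over a shorter one, which is the intent when the two alternatives cover different numbers of residues.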

If the two input models are not in the same asymmetric unit of the crystal, then combine_models will move the pieces from the second model to the corresponding locations in the first model. In this way the final model has all its parts in the same place.
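
One way to picture this relocation is to search the symmetry copies of a piece for the one closest to its counterpart in the first model. The sketch below is a simplified illustration under stated assumptions, not the procedure combine_models actually uses: coordinates are fractional, the cell is assumed orthogonal (so distances can be scaled per axis), pieces are compared by centroid only, and all names are hypothetical.

```python
import math

def apply_op(op, xyz):
    """Apply a symmetry operator (3x3 rotation, translation) to fractional xyz."""
    rot, trans = op
    return tuple(
        sum(rot[i][j] * xyz[j] for j in range(3)) + trans[i] for i in range(3)
    )

def nearest_copy(piece_centroid, target_centroid, sym_ops, cell_lengths):
    """Return (distance, operator index, cell shift) of the closest copy.

    For each operator, the whole-cell shift that brings the copy nearest to
    the target is found by rounding the fractional difference on each axis.
    """
    best = None
    for index, op in enumerate(sym_ops):
        moved = apply_op(op, piece_centroid)
        shift = tuple(round(t - m) for t, m in zip(target_centroid, moved))
        delta = [
            (m + s - t) * length
            for m, s, t, length in zip(moved, shift, target_centroid, cell_lengths)
        ]
        dist = math.sqrt(sum(d * d for d in delta))
        if best is None or dist < best[0]:
            best = (dist, index, shift)
    return best

# Identity plus a two-fold screw axis along b, as an example operator set
IDENTITY = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], (0.0, 0.0, 0.0))
SCREW_B = ([[-1, 0, 0], [0, 1, 0], [0, 0, -1]], (0.0, 0.5, 0.0))

dist, op_index, shift = nearest_copy(
    (0.8, 0.8, 0.6), (0.2, 0.3, 0.4), [IDENTITY, SCREW_B], (50.0, 60.0, 70.0)
)
print(op_index, shift)
```

Once the operator and cell shift are known, the same transformation would be applied to every atom of the piece so that it lands next to the first model.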

Combine_models has five main steps, each of which is optional:

Selecting parts of the input models to consider, using atom selections. This step allows you to do things such as removing a segment from the first model, then crossing with the corresponding segment from the second model, effectively replacing that segment in a way that you specify.
Creating as complete a starting model as possible (merge_second_model=True). This is done by cutting the second model into small pieces and reassembling the first model, considering these pieces along with all the segments in the first model as potential pieces of the reassembled model. The result is a new working model.
Crossing the working model with the second input model, allowing only equal-length crossovers (matching=True). In this step, pairs of segments in the working model and the second input model that have overlapping residues are considered, one at a time. The local map correlation is calculated for each residue in each fragment. Then at each position, the coordinates from one fragment or the other are chosen, with the choice crossing over between one fragment and the other only at positions where the main-chain atoms in the residue are within match_distance (default 0.5 A) of each other.
Crossing the working model with the second input model, allowing only unequal-length crossovers (non_matching=True). This step is just like the previous one, except that only unequal-length crossovers are considered. If this step is included, it may be necessary to reassign the sequence afterwards, because the alignment may change.
Checking the sequence. If unequal-length crossovers are made, then the sequence alignment to the model may need to be changed. This step carries that out.
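
The equal-length crossover step described above can be illustrated with a small dynamic-programming sketch. It is not the combine_models implementation: the function name and inputs are hypothetical, the per-residue scores stand in for local map correlations, and the rule that a switch between residue i-1 and residue i is allowed only when the two fragments agree within match_distance at residue i is an assumption made for the example.

```python
def cross_fragments(scores_a, scores_b, distances, match_distance=0.5):
    """Pick fragment 'a' or 'b' at each residue to maximize the summed score.

    scores_a, scores_b: per-residue scores for the two fragments.
    distances: per-residue main-chain distance between the two fragments (A).
    """
    n = len(scores_a)
    neg_inf = float("-inf")
    # best[s]: best total so far if the current residue comes from source s
    best = [scores_a[0], scores_b[0]]
    back = []  # back[i-1][s]: source used at residue i-1 when residue i uses s
    for i in range(1, n):
        can_switch = distances[i] <= match_distance
        new_best = []
        choice = []
        for s, score in enumerate((scores_a[i], scores_b[i])):
            stay = best[s]
            switch = best[1 - s] if can_switch else neg_inf
            if switch > stay:
                new_best.append(switch + score)
                choice.append(1 - s)
            else:
                new_best.append(stay + score)
                choice.append(s)
        best = new_best
        back.append(choice)
    # Trace back from the better ending; ties favor fragment 'a'
    s = 0 if best[0] >= best[1] else 1
    path = [s]
    for choice in reversed(back):
        s = choice[s]
        path.append(s)
    path.reverse()
    return "".join("ab"[source] for source in path)

print(cross_fragments(
    [0.9, 0.9, 0.5, 0.4, 0.9],
    [0.5, 0.6, 0.8, 0.9, 0.8],
    [0.3, 0.2, 0.4, 2.0, 0.3],
))
```

In this example the fourth residue differs by 2.0 A between the fragments, so no crossover can happen on entering it, and the result keeps the second fragment across residues three and four before returning to the first.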

Output files from combine_models

combine_models.pdb: A PDB file with your combined model.

Examples

Standard run of combine_models:

Running combine_models is easy. From the command line, you can type:

phenix.combine_models pdb_in=first.pdb \
   second_pdb_in=second.pdb \
   seq.dat \
   map_coeffs.mtz

This will combine first.pdb and second.pdb based on fit to the map from map_coeffs.mtz, recheck the sequence alignment to seq.dat, and write out the resulting model.
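
If you prefer to drive the program from a script, the same command can be assembled in Python. This is a minimal sketch: the helper name build_combine_models_command is hypothetical, the file names are the placeholders used above, and actually running the command requires a PHENIX installation on your PATH.

```python
import shlex

def build_combine_models_command(pdb_in, second_pdb_in, seq_file, map_coeffs,
                                 **keywords):
    """Return the argument list for a phenix.combine_models run.

    Extra keyword=value pairs (for example preserve_connectivity=True)
    are appended unchanged.
    """
    args = [
        "phenix.combine_models",
        f"pdb_in={pdb_in}",
        f"second_pdb_in={second_pdb_in}",
        seq_file,
        map_coeffs,
    ]
    for name, value in keywords.items():
        args.append(f"{name}={value}")
    return args

command = build_combine_models_command(
    "first.pdb", "second.pdb", "seq.dat", "map_coeffs.mtz"
)
print(shlex.join(command))

# To run it (needs PHENIX installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a value, such as an atom selection, contains spaces or parentheses.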

Selecting pieces from the two models:

To take first.pdb and then see if residues A21-A30 of second.pdb can improve it, you can type:

phenix.combine_models pdb_in=first.pdb \
   second_pdb_in=second.pdb \
   seq.dat \
   map_coeffs.mtz \
   second_pdb_in_atom_selection="(chain A and resid 21:30)"

Replacing a specific segment:

To take first.pdb and then see if residues A21-A30 and B21-B30 can be improved by replacing them with residues C10-C20 and D10-D20 of second.pdb, you can tell combine_models to ignore residues A22-A29 and B22-B29 and to consider only residues C10-C20 and D10-D20 of second.pdb:

phenix.combine_models pdb_in=first.pdb \
   second_pdb_in=second.pdb \
   seq.dat \
   map_coeffs.mtz \
   pdb_in_atom_selection="(not ((chain A or chain B) and resid 22:29))" \
   second_pdb_in_atom_selection="((chain C or chain D) and resid 10:20)"
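
To make the effect of these two selections concrete, the sketch below mirrors them as plain Python predicates on (chain, residue number) pairs. It is an illustration only: the function names are hypothetical, and the residue ranges are treated as inclusive at both ends.

```python
def keep_in_first(chain, resseq):
    """Mirror of: not ((chain A or chain B) and resid 22:29)."""
    return not (chain in ("A", "B") and 22 <= resseq <= 29)

def keep_in_second(chain, resseq):
    """Mirror of: (chain C or chain D) and resid 10:20."""
    return chain in ("C", "D") and 10 <= resseq <= 20

# Residues 21 and 30 of chain A stay in the first model; 22-29 are dropped
print([r for r in range(20, 32) if keep_in_first("A", r)])
```

Note that residues 21 and 30 of chains A and B are kept in the first model, so the ends of the removed segment are still present when the segment from second.pdb is considered.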

Crossing two models that have entirely matched residues:

If first.pdb and second.pdb contain exactly the same residues and differ only in coordinates, you may want to preserve all the connectivity by skipping the merge_second_model step, the non_matching crossover step, and the reassignment of sequence. You can type preserve_connectivity=True as a shortcut for this:

phenix.combine_models pdb_in=first.pdb \
   second_pdb_in=second.pdb \
   seq.dat \
   map_coeffs.mtz \
   preserve_connectivity=True
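
The way the shortcut keywords map onto the individual steps can be summarized in a short Python sketch. The defaults and the effect of each shortcut follow the keyword descriptions at the end of this page; the function name resolve_steps is hypothetical, and treating an unset check_sequence as simply following non_matching is an assumption.

```python
def resolve_steps(merge_second_model=True, merge_both_models=False,
                  matching=True, non_matching=True, check_sequence=None,
                  preserve_connectivity=None, merge_only=None):
    """Return the steps that would run for a given set of keywords."""
    steps = {
        "merge_second_model": merge_second_model,
        "merge_both_models": merge_both_models,
        "matching": matching,
        "non_matching": non_matching,
        "check_sequence": check_sequence,
    }
    if preserve_connectivity:
        # Keep only equal-length crossovers so chain connectivity is unchanged
        steps.update(merge_second_model=False, merge_both_models=False,
                     non_matching=False, check_sequence=False)
    if merge_only:
        # Turn off everything except the merge steps
        steps.update(matching=False, non_matching=False, check_sequence=False)
    if steps["check_sequence"] is None:
        # Assumed: an unset check_sequence follows non_matching
        steps["check_sequence"] = steps["non_matching"]
    return steps

print(resolve_steps(preserve_connectivity=True))
```

With preserve_connectivity=True, only the equal-length crossover (matching) step remains active in this sketch.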

Possible Problems

Specific limitations and problems:

Literature

Additional information

List of all combine_models keywords

------------------------------------------------------------------------------- 
Legend:
        Parameter values:
          * means selected parameter (where multiple choices are available)
          False is No
          True is Yes
          None means not provided, not predefined, or left up to the program
          "%3d" is a Python style formatting descriptor
------------------------------------------------------------------------------- 
combine_models
   input_files
      pdb_in= None Input starting PDB file. This model will be the default
              output model. Parts of the model that are better in the second
              model will be replaced by the second model.
      pdb_in_atom_selection= None Any selection specified with
                             pdb_in_atom_selection is applied to the pdb_in
                             input model before using it.
      second_pdb_in= None Input second PDB file. Parts that are better in this
                     model will replace corresponding parts of the first
                     model.
      second_pdb_in_atom_selection= None Any selection specified with
                                    second_pdb_in_atom_selection is applied to
                                    the second_pdb_in input model before using
                                    it.
      mtz_in= None Input MTZ file with map coefficients
      map_coeff_labels= None If map coefficients cannot be identified
                        automatically from your MTZ file, you can specify the
                        label or labels for them. (Please separate labels with
                        blank space, MTZ columns grouped together separated by
                        commas with no blanks.) You can specify: map
                        coefficients (e.g., FWT,PHIFWT), amplitudes and
                        phases (e.g., FP,SIGFP PHIB), or amplitudes, phases,
                        and weights (e.g., FP,SIGFP PHIB FOM)
      seq_file= None Sequence file for sequence alignment
   output_files
      pdb_out= combine_models.pdb Output PDB file
      log= combine_models.log Output log file
      params_out= combine_models_params.eff Parameters file to rerun
                  combine_models
   directories
      output_dir= "" Output directory where files are to be written
   crossing
      high_resolution= None Resolution used in map calculation
      match_distance= 0.5 Residue pairs must have rmsd of match_distance (A)
                      or lower to be crossed. A value between 0.5 A and 1 A is
                      generally best.
      merge_second_model= True Cut second model into pieces and merge them
                          into first model. (Useful for filling in gaps in
                          first model.)
      merge_both_models= False Cut first and second models into pieces and
                         merge them (useful for combining all pieces of both
                         models). Alternative to merge_second_model.
      trim_in_merge= False Trim all the fragments of first and second model
                     back to match density before merging of models. See also
                     remove_bad_fragments_in_merge, which removes bad
                     fragments (after any trimming is done).
      remove_bad_fragments_in_merge= True Remove bad fragments in merge. See
                                     also trim_in_merge which trims back
                                     fragments before trying to merge them.
      extend_in_merge= False Extend fragments of first and second model during
                       merging of models
      fragment_length= 10 Length of pieces that second model will be cut down
                       to in merge_second_model or merge_both_models
      use_cc_in_combine_extend= False You can choose to use the correlation of
                                density rather than density at atomic
                                positions to score models in the
                                merge_second_model or merge_both_models step.
                                This may be useful at lower resolution (> 3
                                A)
      matching= True Carry out crossover using segments that match (have the
                same number of residues)
      non_matching= True Carry out crossover using segments that do not match
                    (do not have the same number of residues)
      check_sequence= None After running all other procedures, redo the
                      sequence assignment. This may be necessary if the models
                      did not have the same sequence assignments or if
                      unassigned pieces are now put together and can be
                      assigned. If None, then check_sequence will be set to
                      True if non_matching=True.
      preserve_connectivity= None This is a shortcut for turning off
                             merge_second_model, merge_both_models,
                             non_matching, and check_sequence. It is useful if
                             your two models have the same residues, just with
                             different coordinates, and you want to maintain
                             the connectivity.
      merge_only= None This is a shortcut for turning off everything except
                  merge_second_model and merge_both_models.
      solvent_fraction= None You can specify the solvent fraction
   control
      verbose= False Verbose output
      raise_sorry= False Raise sorry if problems
      debug= False Debugging output
      dry_run= False Just read in and check parameter names
      resolve_command_list= None You can supply any resolve command here.
                            NOTE: for command-line usage you need to enclose
                            the whole set of commands in double quotes (")
                            and each individual command in single quotes (')
                            like this: resolve_command_list="'no_build'
                            'b_overall 23' "