Comparison of CA positions in two models allowing any order of fragments

Author(s)

chain_comparison: Tom Terwilliger

Purpose

Identify how many of the CA atoms in one model match CA atoms in another model. Separately identify how many of these go in the same direction and how many of the residue names match.

Usage

How chain_comparison works:

chain_comparison normally selects the unique part of the target model (the first model specified). It then takes the entire query model and counts how many residues in the target model are matched by a residue in the query model. If NCS is supplied, the unique part of the query is selected, NCS is applied, and then the same analysis is carried out.
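The counting step can be illustrated with a minimal sketch. This is not the Phenix implementation: it assumes the CA coordinates have already been extracted as plain (x, y, z) tuples, it ignores NCS and crystal symmetry, and it uses a brute-force nearest-neighbor search. The function names match_ca and count_direction are hypothetical, and the way direction is counted here (consecutive target residues matched to consecutive query residues) is an assumption. The 3 A cutoff mirrors the default value of max_dist.

```python
import math

def match_ca(target, query, max_dist=3.0):
    """For each target CA, return the index of the closest query CA
    within max_dist, or None if no query CA is close enough."""
    matches = []
    for t in target:
        best_index, best_dist = None, max_dist
        for j, q in enumerate(query):
            d = math.dist(t, q)
            if d <= best_dist:
                best_index, best_dist = j, d
        matches.append(best_index)
    return matches

def count_direction(matches):
    """Count pairs of consecutive matched target residues whose query
    partners run in the same (forward) or opposite (reverse) order."""
    forward = reverse = 0
    for a, b in zip(matches, matches[1:]):
        if a is None or b is None:
            continue
        if b == a + 1:
            forward += 1
        elif b == a - 1:
            reverse += 1
    return forward, reverse

# Toy data: a 4-residue target and a 3-residue query traced in reverse order
target = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
query = [(7.5, 0.5, 0.0), (3.9, 0.4, 0.0), (0.2, 0.3, 0.0)]

matches = match_ca(target, query)
n_matched = sum(m is not None for m in matches)
forward, reverse = count_direction(matches)
print(n_matched, forward, reverse)  # prints: 3 0 2
```

In this toy case three of the four target CA atoms are matched, and both consecutive pairs run in the reverse direction because the query chain was traced backwards.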

If desired, you can have chain_comparison write out all the residues in the query model that match residues in the target model.
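The selection of query segments by percentage match (see minimum_percent_match_to_select, maximum_percent_match_to_select and match_pdb_file in the keyword list) can be sketched in the same spirit. This is only an illustration under stated assumptions: segments are given as lists of CA coordinates, the helper names percent_match and select_segments are hypothetical, and how chain_comparison actually splits a model into segments is not shown.

```python
import math

def percent_match(segment, target, max_dist=3.0):
    """Percentage of CA atoms in a query segment that lie within
    max_dist of any CA atom in the target."""
    if not segment:
        return 0.0
    close = sum(
        1 for q in segment
        if any(math.dist(q, t) <= max_dist for t in target)
    )
    return 100.0 * close / len(segment)

def select_segments(segments, target, min_percent, max_percent, max_dist=3.0):
    """Keep segments whose percentage match lies in
    [min_percent, max_percent]."""
    return [
        seg for seg in segments
        if min_percent <= percent_match(seg, target, max_dist) <= max_percent
    ]

# Toy data: one target chain and three query segments
target = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
seg_a = [(0.2, 0.3, 0.0), (3.9, 0.4, 0.0)]    # both atoms close to the target
seg_b = [(7.5, 0.5, 0.0), (20.0, 0.0, 0.0)]   # one of two atoms close
seg_c = [(30.0, 0.0, 0.0)]                    # no atom close

selected = select_segments([seg_a, seg_b, seg_c], target, 50.0, 100.0)
print(len(selected))  # prints: 2
```

With a range of 50 to 100 percent, seg_a (100 percent) and seg_b (50 percent) are kept, and seg_c (0 percent) is dropped.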

Examples

Standard run of chain_comparison:

Running chain_comparison is easy. From the command line you can type:

phenix.chain_comparison target.pdb query.pdb

If you want to compare P positions for RNA/DNA instead, you can say:

phenix.chain_comparison target.pdb query.pdb chain_type=RNA

Possible Problems

Specific limitations and problems:

Literature

Additional information

List of all available keywords

job_title = None Job title in PHENIX GUI, not used on command line
input_files
- pdb_in = None Input PDB file (enter target first and then query, unless query_dir is set)
- unique_query_only = False Use only unique chains in query. Normally use unique_query_only=False and unique_part_of_target_only=True.
- unique_target_pdb_in = None Target model identifying which element is selected with unique_query_only. NOTE: must be specified by keyword.
- unique_part_of_target_only = None Use only unique chains in target (see also unique_query_only).
- test_unique_part_of_target_only = True Try both unique_part_of_target_only as True and False and report result for whichever gives higher value of fraction matching. Cannot be used with match_pdb_file
- allow_extensions = False If True, ignore parts of chains that do not overlap. Normally use False: identity is identity of overlapping part times fraction of chain that overlaps.
- ncs_file = None NCS file. If unique_query_only is False (typically) apply NCS to it to generate full query. Normally used with test_unique_part_of_target_only=True. NOTE: if your structure has very high symmetry, including an NCS file can result in extremely long run times. It may be better in such cases to supply target and query files that have NCS applied (or matching structures without NCS) and not to supply an NCS file.
- query_dir = None directory containing query PDB files (any number)
output_files
- match_pdb_file = None Output file containing segments with specified match percentage
crystal_info
- chain_type = *PROTEIN RNA DNA Chain type. All residues of other chain types ignored.
- use_crystal_symmetry = None Default is True if space group is not P1. If set, use crystal symmetry to map atoms to closest positions
comparison
- max_dist = 3. Maximum distance between atoms to be considered close
- distance_per_site = None Maximum distance spanned by a pair of residues. Set by default as 3.8 A for protein and 8 A for RNA
- min_similarity = 0.99 When choosing unique chains, use min_similarity cutoff. This applies to both chain length and the sequence itself.
- target_length_from_matching_chains = False Use length of chains in target that are matched to define full target length (as opposed to all unique chains in target).
- minimum_percent_match_to_select = None You can specify minimum_percent_match_to_select and maximum_percent_match_to_select and match_pdb_file in which case all segments in the query model that have a percentage match (within max_dist of atom in target) in this range will be written out to match_pdb_file.
- maximum_percent_match_to_select = None You can specify minimum_percent_match_to_select and maximum_percent_match_to_select and match_pdb_file in which case all segments in the query model that have a percentage match (within max_dist of atom in target) in this range will be written out to match_pdb_file.
- remove_alt_conf = True Remove alternate conformers before analysis. This is normally required to align correctly.
- residue_groups = "VGASCTI P LDNEQM KR FHY W" Optional groups of residues to score together
- score_by_residue_groups = False Use residue groups in sequence alignment
control
- verbose = False Verbose output
- quiet = False No printed output
gui
- output_dir = None GUI-specific parameter required for output directory