Sorting heteroatoms
The program phenix.sort_hetatms is a utility designed to re-group the
non-polymeric molecules (heteroatoms) in a model in a manner roughly similar
to the format used by the Protein Data Bank. This consists of matching all
heteroatoms to the nearest polymer chain, resetting the chain IDs and
residue numbers, converting ATOM labels to HETATM, and optionally sorting
waters by B-factor. It is primarily intended for internal use, and is also
run within phenix.ligand_pipeline. The only
input required is a PDB file, since the program does not deal with molecular
geometry or experimental data. Any REMARK records will be removed, but this
can be disabled by setting preserve_remarks=True.
Note that the PDB will, inevitably, further modify the contents of the model
according to its own rules. However, the program output is significantly
closer to the PDB conventions than the output of phenix.refine or similar
programs.
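The matching step described above can be pictured with a short, self-contained
Python sketch. It is not the program's actual implementation: the function
name nearest_chain and the (chain ID, coordinates) data layout are hypothetical
and chosen only for illustration, symmetry-related copies are not considered,
and treating the cutoff as inclusive is an assumption. Only the defaults for
distance_cutoff and loose_chain_id are taken from the parameter list below.

```python
import math

def nearest_chain(het_xyz, polymer_atoms, distance_cutoff=6.0, loose_chain_id="X"):
    """Return the ID of the polymer chain with the atom closest to het_xyz.

    polymer_atoms is a list of (chain_id, (x, y, z)) tuples. If no polymer
    atom lies within distance_cutoff, loose_chain_id is returned instead.
    This is an illustrative sketch, not the code used by phenix.sort_hetatms.
    """
    best_id, best_dist = loose_chain_id, distance_cutoff
    for chain_id, xyz in polymer_atoms:
        d = math.dist(het_xyz, xyz)  # Euclidean distance (Python 3.8+)
        if d <= best_dist:
            best_id, best_dist = chain_id, d
    return best_id

# Two single-atom "chains" 10 Angstroms apart, purely for demonstration.
polymer_atoms = [("A", (0.0, 0.0, 0.0)), ("B", (10.0, 0.0, 0.0))]
print(nearest_chain((2.0, 0.0, 0.0), polymer_atoms))   # closest to chain A
print(nearest_chain((7.0, 0.0, 0.0), polymer_atoms))   # closest to chain B
print(nearest_chain((5.0, 20.0, 0.0), polymer_atoms))  # beyond the cutoff
```

A heteroatom farther than the cutoff from every polymer atom falls through to
the loose chain ID, which mirrors the trade-off noted for distance_cutoff:
a small cutoff is fast but may leave distant waters unassigned.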
List of all available parameters
- file_name = None Input file
- output_file = None
- unit_cell = None
- space_group = None
- ignore_symmetry = False Don't take symmetry-related chains into account when determining the nearest macromolecule.
- preserve_remarks = False Propagate all REMARK records to the output file.
- verbose = False
- remove_hetatm_ter_records = True The official PDB format only allows TER records at the end of polymer chains, whereas the CCTBX PDB-handling tools will insert TER after each chain of any type. If this parameter is True, the extra TER records will be removed.
- preserve_chain_id = False The default behavior is to group heteroatoms with the nearest macromolecule chain, whose ID is inherited. This parameter disables the change of chain ID, and preserves the original chain ID.
- waters_only = False Rearrange waters, but leave all other ligands alone.
- sort_waters_by = none *b_iso Ordering of waters - by default it will sort them by the isotropic B-factor.
- set_hetatm_record = True Convert ATOM to HETATM where appropriate.
- ignore_selection = None Selection of atoms to skip. Any residue group which overlaps with this selection will be preserved with the original chain ID and numbering.
- renumber = True Renumber heteroatoms once they are in new chains.
- sequential_numbering = True If True, the heteroatoms will be renumbered starting from the next available residue number after the end of the associated macromolecule chain. Otherwise, numbering will start from 1.
- distance_cutoff = 6.0 Cutoff for identifying nearby macromolecule chains. This should be kept relatively small for speed reasons, but a small cutoff may miss waters that are far out in solvent channels.
- remove_waters_outside_radius = False Remove waters farther than the specified distance cutoff from the nearest polymer chain (to avoid PDB complaints).
- loose_chain_id = X Chain ID assigned to heteroatoms that can't be mapped to a nearby macromolecule chain.
- job_title = None Job title in PHENIX GUI, not used on command line
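The interaction between sort_waters_by, renumber and sequential_numbering can
be illustrated with a minimal Python sketch. The function renumber_waters and
its (name, b_iso) input format are hypothetical, and two details are
assumptions rather than documented behavior: that waters are sorted in
ascending order of B-factor, and that the "next available residue number" is
the last polymer residue number plus one.

```python
def renumber_waters(last_polymer_resseq, waters,
                    sequential_numbering=True, sort_by_b_iso=True):
    """Return (new_resseq, name, b_iso) tuples for the waters of one chain.

    waters is a list of (name, b_iso) tuples. Illustrative sketch only;
    ascending B-factor order is an assumption.
    """
    if sort_by_b_iso:
        ordered = sorted(waters, key=lambda w: w[1])
    else:
        ordered = list(waters)
    # Continue after the polymer chain, or restart numbering from 1.
    start = last_polymer_resseq + 1 if sequential_numbering else 1
    return [(start + i, name, b) for i, (name, b) in enumerate(ordered)]

# Made-up waters attached to a chain whose last residue is numbered 250.
waters = [("W1", 35.2), ("W2", 18.7), ("W3", 27.0)]
for resseq, name, b_iso in renumber_waters(250, waters):
    print(resseq, name, b_iso)
```

With sequential_numbering=False the same call would number the waters
starting from 1 instead of continuing after the polymer chain.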