The simple_ncs_from_pdb method identifies NCS in the chains in a PDB file and writes out the NCS operators in forms suitable for phenix.refine, resolve, and the AutoSol and AutoBuild Wizards.
The basic steps that simple_ncs_from_pdb carries out are:
Chain matching is done using dynamic programming alignment of residues and atoms. The first pass applies a restriction on minimal similarity, set by min_percent, where similarity is (number of matching residues)/(number of residues in longer chain).
From the matching chains list, we remove chain pairs where the matching segments exceed the RMSD limit.
Matching segments are scanned for local residue misalignment. Residues where (max_atom_distance - min_atom_distance) > match_radius are excluded from the matching segment. This allows local differences between matching chains.
If matching residues have different numbers of atoms (for example, if one contains the side chain while the other does not), only the matching atoms are included.
Grouping of chains is not performed.
Alternative conformations are excluded from matching segments.
The matching is done by the residue name strings, not by the residue numbers. This allows handling of insertions in the PDB file.
The result of the NCS search is a combination of NCS-related groups and invariant or non-NCS-related regions, down to the atom level. In every NCS group all copies have the same number of atoms and can be reproduced by applying the appropriate rotation and translation to the master copy.
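The two numeric criteria in the steps above can be illustrated with a minimal sketch. It is not the Phenix implementation: the longest-common-subsequence alignment is only a stand-in for the dynamic programming alignment, and the function names, the example chains and the match_radius value of 4.0 are all hypothetical.

```python
def count_matching_residues(names_a, names_b):
    """Longest common subsequence length of two residue-name lists.

    A simple dynamic programming stand-in for the real alignment.
    """
    rows, cols = len(names_a), len(names_b)
    # table[i][j] = best match count using the first i and first j residues
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if names_a[i - 1] == names_b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[rows][cols]


def chain_similarity(names_a, names_b):
    """(number of matching residues) / (number of residues in longer chain)."""
    longer = max(len(names_a), len(names_b))
    if longer == 0:
        return 0.0
    return count_matching_residues(names_a, names_b) / float(longer)


def residue_is_misaligned(atom_distances, match_radius):
    """True when (max_atom_distance - min_atom_distance) > match_radius."""
    return (max(atom_distances) - min(atom_distances)) > match_radius


# Hypothetical chains: chain_b lacks the GLY present in chain_a.
chain_a = ["MET", "ALA", "GLY", "SER", "LEU"]
chain_b = ["MET", "ALA", "SER", "LEU"]
similarity = chain_similarity(chain_a, chain_b)
print("similarity:", similarity)

# Hypothetical per-atom distances (in Angstrom) for one residue pair.
print("excluded:", residue_is_misaligned([0.2, 0.5, 4.9], match_radius=4.0))
```

A chain pair would pass the first filter only if its similarity is at least the configured threshold, and a residue flagged by the second check would be dropped from the matching segment.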
For example, running
phenix.simple_ncs_from_pdb 2h50.pdb
produces the following files:
2h50_simple_ncs_from_pdb.phil
2h50_simple_ncs_from_pdb.ncs_spec
2h50_simple_ncs_from_pdb.resolve
The file that should be used for refinement is 2h50_simple_ncs_from_pdb.phil. This file can also be modified if particular NCS relations need to be changed. The content of that file is the exact selection string of the atoms in the NCS groups.
The content of 2h50_simple_ncs_from_pdb.ncs is
ncs_group {
  reference = chain 'A'
  selection = chain 'C'
  selection = chain 'E'
  selection = chain 'G'
  selection = chain 'I'
  selection = chain 'K'
  selection = chain 'M'
  selection = chain 'O'
  selection = chain 'Q'
  selection = chain 'S'
  selection = chain 'U'
  selection = chain 'W'
}
To get output in other formats:
phenix.simple_ncs_from_pdb 4boz.pdb write_spec_files=True
Simple_ncs_from_pdb will analyze the chains in 4boz.pdb and identify any NCS that exists. For this sample run the following output is produced:
GROUP 1
Summary of NCS group with 2 operators:
ID of chain/residue where these apply: [['A', 'D'], [[[147, 150], [152, 211], [213, 275], [280, 305], [307, 308]], [[147, 150], [152, 211], [213, 275], [280, 305], [307, 308]]]]
RMSD (A) from chain A: 0.0 0.72
Number of residues matching chain A:[155, 155]

OPERATOR 1
CENTER: 24.4880 -13.3177 -20.1848
ROTA 1: 1.0000 0.0000 0.0000
ROTA 2: 0.0000 1.0000 0.0000
ROTA 3: 0.0000 0.0000 1.0000
TRANS: 0.0000 0.0000 0.0000

OPERATOR 2
CENTER: 15.9430 11.8822 0.6609
ROTA 1: 0.7964 -0.5651 0.2152
ROTA 2: -0.5503 -0.8249 -0.1295
ROTA 3: 0.2507 -0.0152 -0.9680
TRANS: 18.3631 5.3425 -23.3604

GROUP 2
Summary of NCS group with 3 operators:
ID of chain/residue where these apply: [['B', 'C', 'E'], [[[1, 41], [43, 71]], [[1, 41], [43, 71]], [[1, 41], [43, 71]]]]
RMSD (A) from chain B: 0.0 0.8 0.77
Number of residues matching chain B:[70, 70, 70]

OPERATOR 1
CENTER: 47.5581 -9.5652 -26.8403
ROTA 1: 1.0000 0.0000 0.0000
ROTA 2: 0.0000 1.0000 0.0000
ROTA 3: 0.0000 0.0000 1.0000
TRANS: 0.0000 0.0000 0.0000

OPERATOR 2
CENTER: 27.7410 -11.0237 -49.7361
ROTA 1: -0.6661 0.2615 -0.6986
ROTA 2: 0.2866 -0.7749 -0.5634
ROTA 3: -0.6886 -0.5755 0.4412
TRANS: 34.1743 -54.0788 7.8610

OPERATOR 3
CENTER: 29.3812 -4.6379 12.2671
ROTA 1: 0.7539 -0.5901 0.2888
ROTA 2: -0.5860 -0.8027 -0.1106
ROTA 3: 0.2971 -0.0858 -0.9510
TRANS: 19.1286 5.2864 -24.3021
Another way to view the results is:
phenix.simple_ncs_from_pdb 4boz.pdb show_summary=true

Chains in model:
---------------------------------------------------
A B C D E
. . . . . . . . . . . . . . . . . . . . . . . . . .
NCS summary:
---------------------------------------------------
Number of NCS groups : 2
Group # : 1
Number of copies : 2
Chains in master : 'A'
Chains in copies : 'D'
Group # : 2
Number of copies : 3
Chains in master : 'B'
Chains in copies : 'C', 'E'
. . . . . . . . . . . . . . . . . . . . . . . . . .
Transforms:
---------------------------------------------------
Group # : 1
Transform # : 1
RMSD : 0
ROTA 0 1.0000 0.0000 0.0000
ROTA 1 0.0000 1.0000 0.0000
ROTA 2 0.0000 0.0000 1.0000
TRANS 0.0000 0.0000 0.0000
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Transform # : 2
RMSD : 0.720792956817
ROTA 0 0.7964 -0.5503 0.2507
ROTA 1 -0.5651 -0.8249 -0.0152
ROTA 2 0.2152 -0.1295 -0.9680
TRANS -5.8295 14.4282 -25.8708
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Group # : 2
Transform # : 1
RMSD : 0
ROTA 0 1.0000 0.0000 0.0000
ROTA 1 0.0000 1.0000 0.0000
ROTA 2 0.0000 0.0000 1.0000
TRANS 0.0000 0.0000 0.0000
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Transform # : 2
RMSD : 0.796197645664
ROTA 0 -0.6661 0.2866 -0.6886
ROTA 1 0.2615 -0.7749 -0.5755
ROTA 2 -0.6986 -0.5634 0.4412
TRANS 43.6766 -46.3175 -10.0616
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Transform # : 3
RMSD : 0.772963874518
ROTA 0 0.7539 -0.5860 0.2971
ROTA 1 -0.5901 -0.8027 -0.0858
ROTA 2 0.2888 -0.1106 -0.9510
TRANS -4.1022 13.4462 -28.0502
There are 5 chains in the PDB file (A, B, C, D, E). In the first group the master is A and the copy is D; in the second group the master is B and the copies are C and E.
RMSD (A) from chain A: 0.0 0.72
shows the RMSD of matching atoms between the master and every other copy. The list of numbers under "ID of chain/residue where these apply" gives the matching residue ranges by residue number. Note that this is not the exact selection as it appears in 4boz_simple_ncs_from_pdb.ncs.
ROTA and TRANS are the rotation and translation information. In the .ncs_spec file and the default on-screen representation of the results, the rotation and translation are:
Master = ROT x Copy + TRANS
In the summary and in the CCTBX implementation, the common convention is:
Copy = ROT x Master + TRANS
So the rotation and translation in the two formats are the inverse of each other.
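The relation between the two conventions can be checked numerically. The sketch below takes OPERATOR 2 of GROUP 1 from the sample output (Master = ROT x Copy + TRANS) and inverts it; the helper name invert_operator is hypothetical, and the code relies on the rotation matrix being orthogonal, so that its inverse is its transpose.

```python
def invert_operator(rota, trans):
    """Invert x_master = R * x_copy + T into x_copy = R^T * x_master - R^T * T.

    Assumes R is a proper rotation matrix, so its inverse is its transpose.
    """
    rota_inv = [[rota[j][i] for j in range(3)] for i in range(3)]
    trans_inv = [
        -sum(rota_inv[i][k] * trans[k] for k in range(3)) for i in range(3)
    ]
    return rota_inv, trans_inv


# OPERATOR 2 of GROUP 1, as printed in the default on-screen output.
rota = [
    [0.7964, -0.5651, 0.2152],
    [-0.5503, -0.8249, -0.1295],
    [0.2507, -0.0152, -0.9680],
]
trans = [18.3631, 5.3425, -23.3604]

rota_inv, trans_inv = invert_operator(rota, trans)
for row in rota_inv:
    print("ROTA ", " ".join("%8.4f" % value for value in row))
print("TRANS", " ".join("%8.4f" % value for value in trans_inv))
```

The inverted rotation rows reproduce ROTA 0 to ROTA 2 of Transform # 2 in the show_summary output, and the inverted translation agrees with its TRANS line to within the rounding of the printed four-decimal values.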
A portion of the contents of the 4boz_simple_ncs_from_pdb.ncs_spec file, which you can edit if you want and which you can use in the AutoBuild Wizard, is shown below. NOTE: The NCS operators describe how to map the N'th NCS-related copy onto the first copy.
Summary of NCS information
Thu Apr 2 15:44:03 2015
/net/cci-filer2/raid1/home/...

new_ncs_group
new_operator

rota_matrix 1.0000 0.0000 0.0000
rota_matrix 0.0000 1.0000 0.0000
rota_matrix 0.0000 0.0000 1.0000
tran_orth 0.0000 0.0000 0.0000

center_orth 24.4880 -13.3177 -20.1848
CHAIN A
RMSD 0
MATCHING 155
RESSEQ 147:150
RESSEQ 152:211
RESSEQ 213:275
RESSEQ 280:305
RESSEQ 307:308

new_operator

rota_matrix 0.7955 -0.5660 0.2164
rota_matrix -0.5511 -0.8242 -0.1300
rota_matrix 0.2520 -0.0159 -0.9676
tran_orth 18.4021 5.3569 -23.3621

center_orth 15.9430 11.8822 0.6609
CHAIN D
RMSD 0.817
MATCHING 155
RESSEQ 147:150
RESSEQ 152:211
RESSEQ 213:275
RESSEQ 280:305
RESSEQ 307:308
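If you want to inspect the operators outside of Phenix, a file like this can be read with a few lines of Python. The sketch below is an illustration only, not the Phenix reader: the function parse_ncs_spec and its dictionary keys are hypothetical, and it assumes that each keyword record sits on its own line and that only the keywords visible in the excerpt matter.

```python
def parse_ncs_spec(text):
    """Collect operators per NCS group from .ncs_spec-style text.

    Returns a list of groups; each group is a list of operator dicts.
    Unrecognized lines (header, date, path) are ignored.
    """
    groups = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        key = fields[0]
        if key == "new_ncs_group":
            groups.append([])
        elif key == "new_operator":
            if not groups:  # tolerate a missing new_ncs_group record
                groups.append([])
            groups[-1].append({"rota": [], "resseq": []})
        elif groups and groups[-1]:
            operator = groups[-1][-1]
            if key == "rota_matrix":
                operator["rota"].append([float(v) for v in fields[1:4]])
            elif key == "tran_orth":
                operator["trans"] = [float(v) for v in fields[1:4]]
            elif key == "center_orth":
                operator["center"] = [float(v) for v in fields[1:4]]
            elif key == "CHAIN":
                operator["chain"] = fields[1]
            elif key == "RMSD":
                operator["rmsd"] = float(fields[1])
            elif key == "MATCHING":
                operator["matching"] = int(fields[1])
            elif key == "RESSEQ":
                start, end = fields[1].split(":")
                operator["resseq"].append((int(start), int(end)))
    return groups


# Abbreviated records for chain D, copied from the excerpt.
sample = """Summary of NCS information
new_ncs_group
new_operator
rota_matrix 0.7955 -0.5660 0.2164
rota_matrix -0.5511 -0.8242 -0.1300
rota_matrix 0.2520 -0.0159 -0.9676
tran_orth 18.4021 5.3569 -23.3621
center_orth 15.9430 11.8822 0.6609
CHAIN D
RMSD 0.817
MATCHING 155
RESSEQ 147:150
RESSEQ 152:211
"""

groups = parse_ncs_spec(sample)
operator = groups[0][0]
print(operator["chain"], operator["rmsd"], operator["resseq"])
```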
If expected NCS groups are not found, try relaxing the search parameters, e.g. set a bigger chain_max_rmsd or a smaller chain_similarity_threshold.
Simple_ncs_from_pdb produces very complicated selections, like
reference = (chain 'A' and (resid 147:150 or resid 152:194 or (resid 195 and (name N or name CA or name C or name O or name CB )) or resid 196:211 or resid 213:229 or (resid 230 and (name N or name CA or name C or name O or name CB or name CG )) or resid 231:298 or (resid 299 and (name N or name CA or name C or name O or name CB or name CG )) or resid 300:305 or resid 307:308))
This indicates that the procedure is trying to exclude some residues or atoms from the selections. This could be caused by alternative conformations of residues (not supported), poor alignment of particular residues (try increasing residue_match_radius) or missing atoms in one of the chains.
The rotation and translation in the .ncs_spec file follow the convention:
Master = ROT x Copy + TRANS
and not:
Copy = ROT x Master + TRANS
They are the inverse of the rotation and translation that are used in the implementation of the NCS relation.