The simple_ncs_from_pdb method identifies NCS in the chains in a PDB file and writes out the NCS operators in forms suitable for phenix.refine, resolve, and the AutoSol and AutoBuild Wizards.
The basic steps that simple_ncs_from_pdb carries out are:
Chain matching is done using dynamic programming alignment of residues and atoms. The first pass imposes a minimum similarity, set by min_percent, where similarity is defined as (number of matching residues)/(number of residues in the longer chain).
From the matching chains list, we remove chain pairs where the matching segments exceed the RMSD limit.
Matching segments are scanned for local residue misalignment. Residues where (max_atom_distance - min_atom_distance) > match_radius are excluded from the matching segment. This allows local differences between matching chains.
If matching residues have different numbers of atoms (for example, if one contains the side chain and the other does not), only the matching atoms are included.
Grouping of chains is done in two steps. First, a single master chain is matched to the copies that share matching segments with it. Second, if two groups of master and copies share the same rotations and translations, we merge the masters and merge the copies.
When grouping chains together, we limit the similarity level of the chain pairs being grouped using similarity_threshold. A lower similarity_threshold allows more chains to be grouped.
Similarity grouping example: Consider four chains (A, B, C, D), where A has 100 residues and B, C and D have 50 each. Suppose that A and B have 50 contiguous matching residues and that C and D have 40. Also assume that A -> B has the same rotation and translation as C -> D. The (A,B) similarity is 50/100 = 0.5 and the (C,D) similarity is 40/50 = 0.8. The ratio of the two is 0.5/0.8 = 0.625. If similarity_threshold=0.95, the pairs cannot be grouped and we get two groups, [(A),(B)] and [(C),(D)]. If similarity_threshold=0.6, we get a single group [(A,C),(B,D)] in which each NCS copy contains two chains.
The search for masters to group is done using spatial proximity, to improve the probability that chains that are next to each other will be grouped together.
Alternative conformations are excluded from matching segments.
Matching is done by residue name strings, not by residue numbers; this allows handling of insertions in the PDB file.
The result of the NCS search is a combination of NCS-related groups and invariant (non-NCS-related) regions, defined down to the atom level. In every NCS group, all copies have the same number of atoms and can be reproduced by applying the appropriate rotation and translation to the master copy.
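The arithmetic in the similarity grouping example can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above, not the code used by simple_ncs_from_pdb: the function names pair_similarity and can_merge are hypothetical, and the use of ">=" in the threshold comparison is an assumption.

```python
def pair_similarity(n_matching, len_chain_1, len_chain_2):
    """Matching residues divided by the length of the longer chain."""
    return n_matching / max(len_chain_1, len_chain_2)


def can_merge(sim_1, sim_2, similarity_threshold):
    """Hypothetical merge test: the ratio of the smaller to the larger
    pair similarity must reach similarity_threshold (">=" is assumed)."""
    ratio = min(sim_1, sim_2) / max(sim_1, sim_2)
    return ratio >= similarity_threshold


# Chains from the example: A has 100 residues; B, C and D have 50 each.
sim_ab = pair_similarity(50, 100, 50)  # 50/100 = 0.5
sim_cd = pair_similarity(40, 50, 50)   # 40/50  = 0.8

for threshold in (0.95, 0.6):
    if can_merge(sim_ab, sim_cd, threshold):
        groups = "[(A,C),(B,D)]"
    else:
        groups = "[(A),(B)] , [(C),(D)]"
    print("similarity_threshold=%s -> %s" % (threshold, groups))
```

With the values from the example, the ratio 0.5/0.8 = 0.625 falls below 0.95 but above 0.6, which is why only the lower threshold merges the two pairs.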
When running
phenix.simple_ncs_from_pdb 4boz.pdb
the following files will be produced:
4boz_simple_ncs_from_pdb.ncs
4boz_simple_ncs_from_pdb.ncs_spec
4boz_simple_ncs_from_pdb.resolve
Running simple_ncs_from_pdb is easy. For example, you can type:
phenix.simple_ncs_from_pdb 4boz.pdb
Simple_ncs_from_pdb will analyze the chains in 4boz.pdb and identify any NCS that exists. For this sample run the following output is produced:
GROUP 1
Summary of NCS group with 2 operators:
ID of chain/residue where these apply: [['A', 'D'], [[[147, 150], [152, 211], [213, 275], [280, 305], [307, 308]], [[147, 150], [152, 211], [213, 275], [280, 305], [307, 308]]]]
RMSD (A) from chain A:  0.0  0.82
Number of residues matching chain A:[155, 155]

OPERATOR 1
CENTER:   24.4880  -13.3177  -20.1848

ROTA 1:    1.0000    0.0000    0.0000
ROTA 2:    0.0000    1.0000    0.0000
ROTA 3:    0.0000    0.0000    1.0000
TRANS:     0.0000    0.0000    0.0000

OPERATOR 2
CENTER:   15.9430   11.8822    0.6609

ROTA 1:    0.7955   -0.5660    0.2164
ROTA 2:   -0.5511   -0.8242   -0.1300
ROTA 3:    0.2520   -0.0159   -0.9676
TRANS:    18.4021    5.3569  -23.3621

GROUP 2
Summary of NCS group with 2 operators:
ID of chain/residue where these apply: [['B', 'E'], [[[1, 41], [43, 76]], [[1, 41], [43, 76]]]]
RMSD (A) from chain B:  0.0  0.88
Number of residues matching chain B:[75, 75]

OPERATOR 1
CENTER:   46.1427   -9.2567  -26.1789

ROTA 1:    1.0000    0.0000    0.0000
ROTA 2:    0.0000    1.0000    0.0000
ROTA 3:    0.0000    0.0000    1.0000
TRANS:     0.0000    0.0000    0.0000

OPERATOR 2
CENTER:   28.3065   -4.0803   11.2523

ROTA 1:    0.7550   -0.5940    0.2778
ROTA 2:   -0.5894   -0.8004   -0.1096
ROTA 3:    0.2874   -0.0810   -0.9544
TRANS:    19.2296    5.3926  -23.9141
Another way to view the results without creating files is:
phenix.simple_ncs_from_pdb 4boz.pdb show_summary=true

Chains in model:
---------------------------------------------------
A B C D E
. . . . . . . . . . . . . . . . . . . . . . . . . .

NCS summary:
---------------------------------------------------
Number of NCS groups :  2
Group #              :  1
Number of copies     :  2
Chains in master     :  A
Chains in copies     :  D
Group #              :  2
Number of copies     :  2
Chains in master     :  B
Chains in copies     :  E
. . . . . . . . . . . . . . . . . . . . . . . . . .

Transforms:
---------------------------------------------------
Group #              :  1
Transform #          :  1
RMSD                 :  0
ROTA  0    1.0000    0.0000    0.0000
ROTA  1    0.0000    1.0000    0.0000
ROTA  2    0.0000    0.0000    1.0000
TRANS      0.0000    0.0000    0.0000
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Transform #          :  2
RMSD                 :  0.817
ROTA  0    0.7955   -0.5511    0.2520
ROTA  1   -0.5660   -0.8242   -0.0159
ROTA  2    0.2164   -0.1300   -0.9676
TRANS     -5.7994   14.4602  -25.8920
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Group #              :  2
Transform #          :  1
RMSD                 :  0
ROTA  0    1.0000    0.0000    0.0000
ROTA  1    0.0000    1.0000    0.0000
ROTA  2    0.0000    0.0000    1.0000
TRANS      0.0000    0.0000    0.0000
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Transform #          :  2
RMSD                 :  0.8756
ROTA  0    0.7550   -0.5894    0.2874
ROTA  1   -0.5940   -0.8004   -0.0810
ROTA  2    0.2778   -0.1096   -0.9544
TRANS     -4.4667   13.8014  -27.5738
There are 5 chains in the PDB file (A, B, C, D, E). In the first group the master is A and the copy is D, and in the second group the master is B and the copy is E. Chain C is not in any group.
In the default output, the line
RMSD (A) from chain A: 0.0 0.82
shows the RMSD of matching atoms between the master and every other copy. The nested list of numbers gives the matching residue ranges, by residue number. Note that this is not the exact selection as it appears in 4boz_simple_ncs_from_pdb.ncs.
ROTA and TRANS are the rotation and translation information. In the .ncs_spec file and the default on-screen representation of the results, the rotation and translation are:
Master = ROT x Copy + TRANS
In the summary and in the CCTBX implementation, the common usage is instead:
Copy = ROT x Master + TRANS
So the rotations and translations in the two formats are the inverse of each other.
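The relation between the two conventions can be checked numerically. If Master = ROT x Copy + TRANS, then Copy = ROT^T x Master - ROT^T x TRANS, since the inverse of a rotation matrix is its transpose. The sketch below applies this to operator 2 of group 1 from the sample run; it uses plain Python lists rather than any CCTBX matrix class, and the helper names (transpose, mat_vec, invert_operator) are hypothetical.

```python
def transpose(m):
    """Transpose a 3x3 matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def invert_operator(rot, trans):
    """Given x_master = rot * x_copy + trans, return (rot_inv, trans_inv)
    such that x_copy = rot_inv * x_master + trans_inv.
    Assumes rot is a pure rotation, so its inverse is its transpose."""
    rot_inv = transpose(rot)
    trans_inv = [-t for t in mat_vec(rot_inv, trans)]
    return rot_inv, trans_inv


# Operator 2 of group 1 as printed in the default (.ncs_spec style) output.
rot_spec = [
    [0.7955, -0.5660, 0.2164],
    [-0.5511, -0.8242, -0.1300],
    [0.2520, -0.0159, -0.9676],
]
trans_spec = [18.4021, 5.3569, -23.3621]

rot_summary, trans_summary = invert_operator(rot_spec, trans_spec)
for row in rot_summary:
    print("ROTA  " + " ".join("%9.4f" % x for x in row))
print("TRANS " + " ".join("%9.4f" % x for x in trans_summary))
```

The resulting rotation is the transpose of the .ncs_spec matrix, and the translation agrees with transform 2 of group 1 in the show_summary output (-5.7994, 14.4602, -25.8920) to within the rounding of the printed four-decimal values, a difference of roughly 0.001.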
A portion of the contents of the 4boz_simple_ncs_from_pdb.ncs_spec file, which you can edit if you want and which you can use in the AutoBuild Wizard, is shown below. NOTE: The NCS operators describe how to map the N'th NCS-related copy onto the first copy.
Summary of NCS information
Thu Apr 2 15:44:03 2015
/net/cci-filer2/raid1/home/...

new_ncs_group
new_operator

rota_matrix    1.0000    0.0000    0.0000
rota_matrix    0.0000    1.0000    0.0000
rota_matrix    0.0000    0.0000    1.0000
tran_orth     0.0000    0.0000    0.0000

center_orth   24.4880  -13.3177  -20.1848
CHAIN A
RMSD 0
MATCHING 155
  RESSEQ 147:150
  RESSEQ 152:211
  RESSEQ 213:275
  RESSEQ 280:305
  RESSEQ 307:308

new_operator

rota_matrix    0.7955   -0.5660    0.2164
rota_matrix   -0.5511   -0.8242   -0.1300
rota_matrix    0.2520   -0.0159   -0.9676
tran_orth    18.4021    5.3569  -23.3621

center_orth   15.9430   11.8822    0.6609
CHAIN D
RMSD 0.817
MATCHING 155
  RESSEQ 147:150
  RESSEQ 152:211
  RESSEQ 213:275
  RESSEQ 280:305
  RESSEQ 307:308
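If you want to inspect or post-process an .ncs_spec file yourself, the keyword-per-line layout is easy to read with a few lines of Python. The sketch below is not part of Phenix: parse_ncs_spec is a hypothetical helper, it assumes each record sits on its own line as in the excerpt above, and it only handles the keywords that appear in that excerpt.

```python
SAMPLE = """\
new_ncs_group
new_operator
rota_matrix    0.7955   -0.5660    0.2164
rota_matrix   -0.5511   -0.8242   -0.1300
rota_matrix    0.2520   -0.0159   -0.9676
tran_orth    18.4021    5.3569  -23.3621
center_orth   15.9430   11.8822    0.6609
CHAIN D
RMSD 0.817
MATCHING 155
  RESSEQ 147:150
  RESSEQ 152:211
"""


def parse_ncs_spec(text):
    """Return a list of NCS groups; each group is a list of operator dicts."""
    groups = []
    op = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        key = fields[0]
        if key == "new_ncs_group":
            groups.append([])
        elif key == "new_operator":
            op = {"rota": [], "trans": None, "center": None,
                  "chain": None, "rmsd": None, "matching": None,
                  "resseq": []}
            groups[-1].append(op)
        elif op is None:
            continue  # header lines before the first operator
        elif key == "rota_matrix":
            op["rota"].append([float(x) for x in fields[1:4]])
        elif key == "tran_orth":
            op["trans"] = [float(x) for x in fields[1:4]]
        elif key == "center_orth":
            op["center"] = [float(x) for x in fields[1:4]]
        elif key == "CHAIN":
            op["chain"] = fields[1]
        elif key == "RMSD":
            op["rmsd"] = float(fields[1])
        elif key == "MATCHING":
            op["matching"] = int(fields[1])
        elif key == "RESSEQ":
            start, end = fields[1].split(":")
            op["resseq"].append((int(start), int(end)))
    return groups


groups = parse_ncs_spec(SAMPLE)
for op in groups[0]:
    print(op["chain"], op["rmsd"], op["matching"], op["resseq"])
```

Remember that the rota_matrix and tran_orth values read this way follow the Master = ROT x Copy + TRANS convention.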
The file used for refinement is 4boz_simple_ncs_from_pdb.ncs. This file can also be modified if a particular NCS relation needs to be enforced. The content of that file is the exact selection string of the atoms in the NCS groups.
As another example, for the model 1vcr.pdb the content of 1vcr_simple_ncs_from_pdb.ncs is:
refinement.ncs.restraint_group {
  reference = chain A
  selection = chain B
  selection = chain C
  selection = chain D
  selection = chain E
}
If the following option is used:
phenix.simple_ncs_from_pdb 1vcr.pdb ncs_file_format=constraints
the content of 1vcr_simple_ncs_from_pdb.ncs will have constraint_group instead of restraint_group:
refinement.ncs.constraint_group {
  reference = chain A
  selection = chain B
  selection = chain C
  selection = chain D
  selection = chain E
}
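NOTE: The rotations and translations in the .ncs_spec file and in the default on-screen output follow the convention
Master = ROT x Copy + TRANS
and not
Copy = ROT x Master + TRANS
They are the inverse of the rotation and translation that are used in the implementation of the NCS relation.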