Finding and analyzing NCS from heavy-atom sites or a model with find_ncs
- find_ncs: Tom Terwilliger
- simple_ncs_from_pdb : Tom Terwilliger
- Phil command interpreter: Ralf W. Grosse-Kunstleve
- find_domain: Peter Zwart
The find_ncs method identifies NCS from (a) the chains in a PDB file, (b) a set of heavy-atom sites, (c) the density in a map, or (d) an existing symmetry file (.ncs_spec). It can also (e) generate NCS for a helical array from helical parameters. The NCS operators are written out in forms suitable for phenix.refine, resolve, and the AutoSol and AutoBuild Wizards.
The basic steps that find_ncs carries out are:
- Decide whether to use simple_ncs_from_pdb (used if the input file contains chains from a PDB file), RESOLVE NCS identification (used if the input file contains heavy-atom sites), or find_ncs_from_density (used if a map is supplied and NCS is not found by either of the other two methods)
- Call simple_ncs_from_pdb, RESOLVE, or find_ncs_from_density to identify NCS
- Evaluate the NCS by calculating the correlation of NCS-related electron density based on the input map coefficients mtz file.
- Report the NCS operators and correlations
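The choice made in the first step can be pictured as a simple priority order. The following is only a sketch of the order described above, not the logic in the find_ncs source; the function name and its boolean arguments are hypothetical.

```python
from typing import Optional


def choose_ncs_method(has_chains: bool, has_sites: bool, has_map: bool) -> Optional[str]:
    """Return the name of the NCS search that would be tried first.

    Hypothetical helper: it only mirrors the priority order described in
    the text (chains, then heavy-atom sites, then density).
    """
    if has_chains:
        # Input file contains chains from a PDB file
        return "simple_ncs_from_pdb"
    if has_sites:
        # Input file contains heavy-atom sites
        return "RESOLVE"
    if has_map:
        # Only a map is available
        return "find_ncs_from_density"
    return None


print(choose_ncs_method(has_chains=False, has_sites=True, has_map=True))
```

This sketch does not model the fallback from a heavy-atom search to a density search that is controlled by try_density_after_ha.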
The output files that are produced are:
- NCS operators written in format for phenix.refine: find_ncs.ncs
- NCS operators written in format for the PHENIX Wizards: find_ncs.ncs_spec
find_ncs needs a file with map coefficients and optionally a second file containing NCS information.
The file with NCS information can be:
- a PDB file with a model (find_ncs will call simple_ncs_from_pdb to extract NCS operators from the chains in your model)
- a PDB file with heavy-atom sites (find_ncs will call RESOLVE to find NCS operators from your heavy-atom sites)
- an NCS definitions file written by a PHENIX wizard (e.g., AutoSol_1.ncs_spec, produced by AutoSol)
- a RESOLVE log file containing formatted NCS operators
The file with map coefficients can be any MTZ file with coefficients for a map. If find_ncs does not choose the correct columns automatically, then you can specify them with a command like:
labin="labin FP=FP PHIB=PHIB FOM=FOM "
If you have no map coefficients yet (you just have some sites and want to get operators, for example), you can tell find_ncs to ignore the map with:
ncs_parameters.force_ncs=True
You still need to supply a map coefficients file (the map itself will be ignored though).
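If you run find_ncs from a script, the options above can be assembled into one command line. The sketch below only builds the argument list and does not run anything; the helper name and the file names my_sites.pdb and my_map.mtz are hypothetical, and it assumes that files are passed positionally and parameters as name=value pairs, as in the commands shown in this document.

```python
import shlex
from typing import List, Optional


def build_find_ncs_command(mtz_file: str,
                           ncs_file: Optional[str] = None,
                           labin: Optional[str] = None,
                           force_ncs: bool = False) -> List[str]:
    """Assemble an argument list for phenix.find_ncs (hypothetical helper)."""
    argv = ["phenix.find_ncs"]
    if ncs_file is not None:
        argv.append(ncs_file)          # PDB file with chains or heavy-atom sites
    argv.append(mtz_file)              # map coefficients file (always supplied)
    if labin is not None:
        argv.append(f"labin={labin}")  # explicit column assignment
    if force_ncs:
        argv.append("ncs_parameters.force_ncs=True")  # ignore the map
    return argv


cmd = build_find_ncs_command(
    "my_map.mtz",
    ncs_file="my_sites.pdb",
    labin="labin FP=FP PHIB=PHIB FOM=FOM",
    force_ncs=True,
)
# shlex.join quotes the labin argument so it survives a shell
print(shlex.join(cmd))
```

With PHENIX installed and on the path, the list could be passed to subprocess.run(cmd) as is, since no shell quoting is needed for an argument list.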
If you have just a map coefficients file you can say:
phenix.find_ncs mlt.mtz
and find_ncs will look for NCS-related density in mlt.mtz (see find_ncs_from_density for more details on how this works).
If you have a model and a map coefficients file, you can type on the command line:
phenix.find_ncs anb.pdb mlt.mtz
This will produce the following output:
Getting column labels from mlt.mtz for input map file
FILE TYPE: ccp4_mtz
All labels: ['FP', 'SIGFP', 'PHIC', 'FOM']
Labin line will be: labin FP=FP PHIB=PHIC FOM=FOM
To change it modify this: params.ncs.labin="labin FP=FP PHIB=PHIC FOM=FOM "
This is the map that will be used to evaluate NCS
Reading NCS information from: anb.pdb
Copying mlt.mtz to temp_dir/mlt.mtz
This PDB file contains 2 chains and 636 total residues
and 636 C-alpha or P atoms and 4740 total atoms
NCS will be found using the chains in this PDB file
Chains in this PDB file: ['M', 'Z']
Two chains were found in the file anb.pdb, chain M and chain Z
GROUPS BASED ON QUICK COMPARISON: []
Looking for invariant domains for ...: ['M', 'Z'] [[[2, 138], [193, 373]], [[2, 138], [193, 373]]]
Residues 2-138, 193-373, matched between the two chains
Copying mlt.mtz to temp_dir/mlt.mtz
Copying temp_dir/NCS_correlation.log to NCS_correlation.log
Log file for NCS correlation is in NCS_correlation.log
List of refined NCS correlations: [1.0, 0.80000000000000004]
There were two separate groups of residues that had different NCS
relationships. Residues 193-373 of each chain were in one group, and
residues 2-138 in each chain were in the other group.
The electron density map had a correlation between the two NCS-related
chains of 1.0 for the first group, and 0.8 for the second
The NCS operators for each are listed.
GROUP 1
Summary of NCS group with 2 operators:
ID of chain/residue where these apply: [['M', 'Z'], [[[193, 373]], [[193, 373]]]]
RMSD (A) from chain M: 0.0 0.0
Number of residues matching chain M:[181, 181]
Source of NCS info: anb.pdb
Correlation of NCS: 1.0
OPERATOR 1
CENTER: 69.1058 -9.5443 59.4674
ROTA 1: 1.0000 0.0000 0.0000
ROTA 2: 0.0000 1.0000 0.0000
ROTA 3: 0.0000 0.0000 1.0000
TRANS: 0.0000 0.0000 0.0000
OPERATOR 2
CENTER: 37.5004 -37.0709 -62.5441
ROTA 1: 0.7751 -0.6211 -0.1162
ROTA 2: -0.3607 -0.5859 0.7256
ROTA 3: -0.5188 -0.5205 -0.6782
TRANS: 9.7485 27.6460 17.2076
GROUP 2
Summary of NCS group with 2 operators:
ID of chain/residue where these apply: [['M', 'Z'], [[[2, 138]], [[2, 138]]]]
RMSD (A) from chain M: 0.0 0.0
Number of residues matching chain M:[137, 137]
Source of NCS info: anb.pdb
Correlation of NCS: 0.8
OPERATOR 1
CENTER: 66.6943 -13.3128 21.6769
ROTA 1: 1.0000 0.0000 0.0000
ROTA 2: 0.0000 1.0000 0.0000
ROTA 3: 0.0000 0.0000 1.0000
TRANS: 0.0000 0.0000 0.0000
OPERATOR 2
CENTER: 39.0126 -53.7392 -13.4457
ROTA 1: 0.3702 -0.9275 -0.0516
ROTA 2: -0.8933 -0.3402 -0.2938
ROTA 3: 0.2549 0.1548 -0.9545
TRANS: 1.7147 -0.6936 7.2172
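The operators printed above can be read back and checked with a few lines of code. The sketch below parses one OPERATOR block from the log text, tests that the ROTA rows form a proper rotation, and applies the operator as x' = R x + T. The helper names are illustrative and are not part of PHENIX, and the convention x' = R x + T is an assumption; it is consistent with the example, because applying operator 2 of group 1 to its own CENTER reproduces the CENTER of operator 1 to within rounding.

```python
from typing import Dict, List

# Operator 2 of GROUP 1, copied from the output above
OPERATOR_TEXT = """\
OPERATOR 2
CENTER: 37.5004 -37.0709 -62.5441
ROTA 1: 0.7751 -0.6211 -0.1162
ROTA 2: -0.3607 -0.5859 0.7256
ROTA 3: -0.5188 -0.5205 -0.6782
TRANS: 9.7485 27.6460 17.2076
"""


def parse_operator(text: str) -> Dict[str, list]:
    """Collect the CENTER, ROTA and TRANS lines of one operator block."""
    op: Dict[str, list] = {"rota": [], "trans": [], "center": []}
    for line in text.splitlines():
        label, _, values = line.partition(":")
        numbers = [float(v) for v in values.split()]
        label = label.strip()
        if label.startswith("ROTA"):
            op["rota"].append(numbers)
        elif label == "TRANS":
            op["trans"] = numbers
        elif label == "CENTER":
            op["center"] = numbers
    return op


def is_proper_rotation(r: List[List[float]], tol: float = 1e-3) -> bool:
    """True if the rows are orthonormal and the determinant is +1."""
    for i in range(3):
        for j in range(3):
            dot = sum(r[i][k] * r[j][k] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    det = (r[0][0] * (r[1][1] * r[2][2] - r[1][2] * r[2][1])
           - r[0][1] * (r[1][0] * r[2][2] - r[1][2] * r[2][0])
           + r[0][2] * (r[1][0] * r[2][1] - r[1][1] * r[2][0]))
    return abs(det - 1.0) <= tol


def apply_operator(op: Dict[str, list], xyz: List[float]) -> List[float]:
    """Apply x' = R x + T (assumed convention, see the text above)."""
    return [sum(op["rota"][i][k] * xyz[k] for k in range(3)) + op["trans"][i]
            for i in range(3)]


op = parse_operator(OPERATOR_TEXT)
print(is_proper_rotation(op["rota"]))
# Should land close to the CENTER of operator 1: 69.1058 -9.5443 59.4674
print(apply_operator(op, op["center"]))
```

The tolerance of 1e-3 allows for the four decimal places printed in the log.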
- find_ncs
- job_title = None Job title in PHENIX GUI, not used on command line
- input_files
- ncs_in_type = *None chains sites ncs_file Type of ncs information. Choices are:
chains: a PDB file with two or more chains that have a consistent
residue-numbering system.
sites: a PDB file or fractional-coordinate file with atomic
positions of heavy-atoms that show NCS
ncs_file: an ncs object file from PHENIX.
- ncs_in = None File with NCS information (PDB file with heavy-atom sites
or with NCS-related chains) or NCS object
- mtz_in = None MTZ file with coefficients for a map that can be used
to assess NCS. Required for finding NCS from heavy-atom sites
- labin = "" Labin line for MTZ file with map coefficients.
This is optional if find_ncs can guess the correct coefficients
for FP PHI and FOM. Otherwise specify:
LABIN FP=myFP PHIB=myPHI FOM=myFOM
where myFP is your column label for FP
- map_in = None CCP4 or MRC-style map file (instead of mtz_in)
- output_files
- params_out = find_ncs_params.eff Parameters file to rerun find_ncs
- ncs_out = find_ncs.ncs_spec Output file with NCS information. If set, then files will also be written for resolve and phenix.refine and as BIOMT records
- ncs_format = *ncs_spec biomt Format of output NCS information. ncs_spec is standard phenix ncs format. biomt is BIOMT records.
- ncs_mask_file = None If defined and mtz_in is defined, a mask showing the asymmetric unit of NCS will be written to this file. By default the file contains map coefficients. To write a CCP4-style map instead, set ncs_mask_as_mtz=False.
- ncs_mask_as_mtz = True If True, the ncs_mask_file will be map coefficients. If False a CCP4-style map will be written.
- ncs_au_file = None If defined, map coefficients showing the asymmetric unit of NCS will be written to this file.
- crystal_info
- resolution = None high-resolution limit for map calculation
- directories
- temp_dir = "temp_dir" Temporary work directory
- output_dir = "" Output directory where files are to be written
- gui_output_dir = None
- control
- verbose = True Verbose output
- debug = False Debugging output
- raise_sorry = False Raise sorry if problems
- dry_run = False Just read in and check parameter names
- require_nonzero = True Require non-zero values in data columns to consider reading in.
- ncs_parameters
- invert_matrices = False Matrices normally describe how to map each operator on to the first one. To get operators for mapping the first one on the others, set invert_matrices=True
- chain_max_rmsd = 2 limit of rms difference between chains to be considered
as copies
- ncs_restrict = 0 You can specify the number of NCS operators to look for
- force_ncs = False You can tell find_ncs to ignore the map. This is useful
if you only have FP but no phases yet...
- optimize_ncs = False You can tell find_ncs to optimize the NCS by making
as compact a molecule as possible.
- no_ncs_sg_mismatch = True If one or more NCS operators is the same as a SG operator, skip the NCS
- try_density_after_ha = True If the search for NCS using heavy-atom sites does not yield an NCS correlation of at least target_ncs_cc, or the number of operators is less than target_ncs_operators, then try finding NCS directly from density
- target_ncs_operators = None Number of expected NCS operators (used to decide whether to try density search)
- ncs_copies_max = None Max number of allowed NCS operators in density search
- target_ncs_cc = 0.80 Expected NCS correlation
- minimum_ncs_cc = 0.30 Minimum NCS correlation. If the overall NCS correlation is less than minimum_ncs_cc, the NCS will be ignored. NOTE: This is applied at the very end of NCS identification. The parameter minimum_ncs_cc_from_density is applied in finding NCS from density as well.
- fraction_ncs_min = 0.05 Minimum fraction of the asymmetric unit covered by NCS. NCS ignored if less than this.
- helix_extend_range = None Range in z to extend helical operators (A)
- helix_trans_along_z = None Along with helix_theta, defines helical parameters. Note: matrices are for mapping position k on to identity, this may be the opposite of what you would expect.
- helix_theta = None Along with helix_trans_along_z, defines helical parameters. Note: matrices are for mapping position k on to identity, this may be the opposite of what you would expect.
- n_try_ncs = 3 Number of tries to find ncs from heavy-atom sites
- ncs_thorough = 8
- ha_ds_window = None Used to specify window of atom-atom lengths in heavy-atom search. None uses defaults
- ha_window_max = 30 Used to specify max atom-atom length in heavy-atom search. None uses defaults. Specify a longer distance to look more thoroughly, a shorter distance to speed up the process
- dist_cut_ncs = None Thoroughness for looking for heavy-atom
sites (high=more thorough)
- coordinate_offset = None Origin offset to apply to ncs_in
- tol_r = 0.02 tolerance in rotations for point group or helical symmetry
- abs_tol_t = 2 tolerance in translations (A) for point group or helical symmetry
- rel_tol_t = 0.05 tolerance in translations (fractional) for point group or helical symmetry
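The helical parameters helix_theta and helix_trans_along_z, together with invert_matrices, can be illustrated with a small sketch. It builds the operator for position k as a rotation of k*theta about z followed by a translation of k*trans along z, and also shows the inverse operator. The function names are hypothetical, the use of degrees for helix_theta is an assumption, and the sketch does not claim to reproduce the direction in which find_ncs writes its matrices (see the notes on helix_theta and invert_matrices above).

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def helical_operator(k: int, theta_deg: float, trans_along_z: float) -> Tuple[Matrix, Vector]:
    """Operator moving a point k steps along a helix around z.

    Assumes theta is given in degrees (an assumption, not from the docs).
    """
    angle = math.radians(k * theta_deg)
    c, s = math.cos(angle), math.sin(angle)
    rota = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    trans = [0.0, 0.0, k * trans_along_z]
    return rota, trans


def invert_operator(rota: Matrix, trans: Vector) -> Tuple[Matrix, Vector]:
    """Inverse of x' = R x + T, which is x = R^T x' - R^T T."""
    rt = [[rota[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(rt[i][k] * trans[k] for k in range(3)) for i in range(3)]
    return rt, t_inv


def apply_operator(rota: Matrix, trans: Vector, xyz: Vector) -> Vector:
    """Apply x' = R x + T."""
    return [sum(rota[i][k] * xyz[k] for k in range(3)) + trans[i] for i in range(3)]


# One step of a helix with a 90 degree twist and a 5 A rise
rota, trans = helical_operator(1, 90.0, 5.0)
moved = apply_operator(rota, trans, [1.0, 0.0, 0.0])

# The inverse operator maps the moved point back to where it started
rota_inv, trans_inv = invert_operator(rota, trans)
back = apply_operator(rota_inv, trans_inv, moved)
print(moved, back)
```

If the operators you obtain run in the opposite direction from what you need, inverting each one as in invert_operator corresponds to what the text describes for invert_matrices=True.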
- find_ncs_from_density
- input_files
- mtz_in = None MTZ file with coefficients for a map
- labin = "" Labin line for MTZ file with map coefficients.
This is optional if find_ncs_from_density
can guess the correct coefficients
for FP PHI and FOM. Otherwise specify:
LABIN FP=myFP PHIB=myPHI FOM=myFOM
where myFP is your column label for FP
- center_pdb_in = None Optional PDB file with list of centers of density to be used as search model
- density_mtz_in = None MTZ file with coefficients for density around the center defined by the first atom in center_pdb_in (required if center_pdb_in is used). NOTE: Must be in space group P1 with a cell identical to that of mtz_in
- density_labin = "" Labin line for MTZ file density_mtz_in.
This is optional if find_ncs_from_density
can guess the correct coefficients
for FP PHI and FOM. Otherwise specify:
LABIN FP=myFP PHIB=myPHI FOM=myFOM
where myFP is your column label for FP
- output_files
- centers_pdb_out = guess_molecular_centers.pdb Output PDB file with coords of centers
- log = find_ncs_from_density.log Output log file
- ncs_spec_file = 'find_ncs_from_density.ncs_spec' NCS specification file with all NCS information
- params_out = find_ncs_from_density_params.eff Parameters file to rerun find_ncs_from_density
- ncs_mask_file = None If defined, map coefficients will be written to this file that show the asymmetric unit of NCS.
- directories
- temp_dir = "temp_dir" Temporary work directory
- output_dir = "" Output directory where files are to be written
- density_search
- density_radius = 10. Radius for density to be cut out and compared
- peak_separation = 15 Minimum distance between centers. Use about 1.5*density_radius
- density_peaks = 20 Number of NCS-related peaks of density to output
- delta_phi = 20 Angular spacing of search
- rotz_only = False Rotate around Z only
- ncs_copies_max = None Maximum number of NCS copies to look for
- min_ratio_to_top_cc = 0.75 Peaks will be kept up to ncs_copies_max, or min_ratio_to_top_cc * best cc, whichever comes first
- minimum_ncs_cc_from_density = 0.30 Overall NCS CC at full resolution must be at least this high
- fraction_ncs_min_from_density = 0.05 Minimum fraction of the asymmetric unit covered by NCS. NCS ignored if less than this.
- dump_ncs_density = False You can dump 1 mtz file for each possible NCS copy so that you can see if the NCS is real. Each file has the local map transformed to the orientation and position of copy 1, so they should all superimpose. The files are written to the temp_dir (e.g., temp_dir/ncs_1_map_coeffs_001.mtz).
- map_operators_inside_unit_cell = False If set then put the operators inside the unit cell. This removes the use of space-group symmetry for SG P1 and is used normally for non-crystal data.
- find_centers
- smoothing_radius = None Radius for smoothing squared density to find centers. Choose a smaller value to get more center guesses. Default is the same value as density_radius.
- n_center_find = None Target number of centers to find. The smoothing radius will be varied from 2 to 2*smoothing_radius and the value giving the number of peaks closest to n_center_find will be used. Normally leave this set to None.
- n_center_use = 20 Number of locations to consider as molecular centers in density search. Try increasing this number to look harder for NCS.
- try_all_centers = False If set, try all n_center_use centers. Normally stop if a good solution is found.
- ok_ncs_cc = 0.6 NCS correlation good enough to terminate search
- crystal_info
- resolution = 4. High-resolution limit for map calculation. It is useful to cut the resolution at 3-5 A. If you use higher resolution, use a finer delta_phi.
- solvent_fraction = 0.5 solvent fraction
- control
- verbose = True Verbose output
- debug = False Debugging output
- raise_sorry = False Raise sorry if problems
- dry_run = False Just read in and check parameter names
- resolve_command_list = None You can supply any resolve command here. NOTE: for command-line usage you need to enclose the whole set of commands in double quotes (") and each individual command in single quotes ('), like this: resolve_command_list="'no_build' 'b_overall 23' "
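The rule given for min_ratio_to_top_cc and ncs_copies_max in the density_search parameters ("peaks will be kept up to ncs_copies_max, or min_ratio_to_top_cc * best cc, whichever comes first") can be sketched as follows. This is one reading of that sentence, not the code used by find_ncs_from_density; the function name and the example correlations are hypothetical.

```python
from typing import List, Optional


def select_peaks(ccs: List[float],
                 ncs_copies_max: Optional[int] = None,
                 min_ratio_to_top_cc: float = 0.75) -> List[float]:
    """Keep the best-correlated peaks under the two limits (hypothetical helper)."""
    ranked = sorted(ccs, reverse=True)
    if not ranked:
        return []
    cutoff = min_ratio_to_top_cc * ranked[0]  # fraction of the best correlation
    kept: List[float] = []
    for cc in ranked:
        if ncs_copies_max is not None and len(kept) >= ncs_copies_max:
            break  # reached the maximum number of copies
        if cc < cutoff:
            break  # remaining peaks are too weak relative to the best one
        kept.append(cc)
    return kept


example_ccs = [0.70, 0.80, 0.50, 0.65]  # made-up correlations
print(select_peaks(example_ccs))
print(select_peaks(example_ccs, ncs_copies_max=2))
```

With the default ratio of 0.75 and a best correlation of 0.80, the cutoff is about 0.60, so the peak at 0.50 is dropped in both calls.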