Hi Tanya,

thanks for sending the data and model files. Here are some results.

1) Starting values corresponding to the model you sent me:

r_work = 0.2891 r_free = 0.3267
MOLPROBITY STATISTICS.
 ALL-ATOM CLASHSCORE : 67.04
 RAMACHANDRAN PLOT:
   OUTLIERS : 6.04  %
   ALLOWED  : 16.58 %
   FAVORED  : 77.38 %
 ROTAMER OUTLIERS : 14.80 %
 CBETA DEVIATIONS : 13

2) I ran six refinement jobs. I used the input model with and without H atoms, and for each model I tried 3 ways of using NCS in refinement: (a) the new option that defines NCS in torsion angle space, (b) letting phenix.refine define NCS automatically and apply it in Cartesian space, and (c) Cartesian NCS using the NCS selections that you sent me. So 2 (model with and w/o H) * 3 (NCS options) = 6 refinements in total.

Here are the results:

MODEL with H:

- torsion NCS:

r_work = 0.2951 r_free = 0.3319
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 19.91
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 1.99  %
REMARK   3     ALLOWED  : 6.99  %
REMARK   3     FAVORED  : 91.02 %
REMARK   3   ROTAMER OUTLIERS : 0.45 %
REMARK   3   CBETA DEVIATIONS : 0

- Cartesian NCS defined automatically:

r_work = 0.3260 r_free = 0.3427
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 12.79
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 1.02  %
REMARK   3     ALLOWED  : 5.97  %
REMARK   3     FAVORED  : 93.01 %
REMARK   3   ROTAMER OUTLIERS : 14.07 %
REMARK   3   CBETA DEVIATIONS : 45

- Cartesian NCS using your selections for NCS groups:

r_work = 0.2792 r_free = 0.3245
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 15.29
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 0.90  %
REMARK   3     ALLOWED  : 7.77  %
REMARK   3     FAVORED  : 91.33 %
REMARK   3   ROTAMER OUTLIERS : 12.87 %
REMARK   3   CBETA DEVIATIONS : 36


MODEL without H:

- torsion NCS:

r_work = 0.2938 r_free = 0.3301
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 30.12
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 1.89  %
REMARK   3     ALLOWED  : 7.01  %
REMARK   3     FAVORED  : 91.11 %
REMARK   3   ROTAMER OUTLIERS : 0.22 %
REMARK   3   CBETA DEVIATIONS : 2

- Cartesian NCS defined automatically:

r_work = 0.3174 r_free = 0.3354
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 33.76
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 0.63  %
REMARK   3     ALLOWED  : 6.31  %
REMARK   3     FAVORED  : 93.06 %
REMARK   3   ROTAMER OUTLIERS : 12.60 %
REMARK   3   CBETA DEVIATIONS : 142

- Cartesian NCS using your selections for NCS groups:

r_work = 0.2762 r_free = 0.3233
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 41.83
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 0.73  %
REMARK   3     ALLOWED  : 8.32  %
REMARK   3     FAVORED  : 90.95 %
REMARK   3   ROTAMER OUTLIERS : 18.36 %
REMARK   3   CBETA DEVIATIONS : 118

3) Looking at the results above, I would say we have two overall equally good results (in my interpretation):

MODEL with H (torsion NCS):

r_work = 0.2951 r_free = 0.3319
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 19.91
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 1.99  %
REMARK   3     ALLOWED  : 6.99  %
REMARK   3     FAVORED  : 91.02 %
REMARK   3   ROTAMER OUTLIERS : 0.45 %
REMARK   3   CBETA DEVIATIONS : 0

and

MODEL without H (torsion NCS):

r_work = 0.2938 r_free = 0.3301
REMARK   3  MOLPROBITY STATISTICS.
REMARK   3   ALL-ATOM CLASHSCORE : 30.12
REMARK   3   RAMACHANDRAN PLOT:
REMARK   3     OUTLIERS : 1.89  %
REMARK   3     ALLOWED  : 7.01  %
REMARK   3     FAVORED  : 91.11 %
REMARK   3   ROTAMER OUTLIERS : 0.22 %
REMARK   3   CBETA DEVIATIONS : 2

Compare these to your original structure:

r_work = 0.2891 r_free = 0.3267
MOLPROBITY STATISTICS.
 ALL-ATOM CLASHSCORE : 67.04
 RAMACHANDRAN PLOT:
   OUTLIERS : 6.04  %
   ALLOWED  : 16.58 %
   FAVORED  : 77.38 %
 ROTAMER OUTLIERS : 14.80 %
 CBETA DEVIATIONS : 13

4) To be able to reproduce these numbers you need to use the most recent PHENIX version from the nightly builds:
http://www.phenix-online.org/download/nightly_builds.cgi
Use dev-838 or later.

5) I'm sending the relevant files off-list.

6) The six commands I used are:

phenix.refine data.mtz model_H.pdb ramachandran_restraints=true main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true --overwrite main.ncs=true ncs.type=torsion output.prefix=H secondary_structure_restraints=true *cif xray_data.high_res=3.6 > & zlogH &

phenix.refine data.mtz model.pdb ramachandran_restraints=true main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true --overwrite main.ncs=true ncs.type=torsion output.prefix=noH secondary_structure_restraints=true *cif xray_data.high_res=3.6 > & zlognoH &


phenix.refine data.mtz model_H.pdb excessive_distance_limit=None ramachandran_restraints=true main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true --overwrite main.ncs=true output.prefix=H_ncsC secondary_structure_restraints=true *cif xray_data.high_res=3.6 > & zlogH_ncsC &

phenix.refine data.mtz model.pdb   excessive_distance_limit=None ramachandran_restraints=true  main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true --overwrite main.ncs=true output.prefix=noH_ncsC secondary_structure_restraints=true *cif xray_data.high_res=3.6 > & zlognoH_ncsC &


phenix.refine data.mtz model_H.pdb excessive_distance_limit=None ncs_groups_H.params ramachandran_restraints=true main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true --overwrite main.ncs=true output.prefix=H_ncsC_cust secondary_structure_restraints=true *cif xray_data.high_res=3.6 > & zlogH_ncsC_cust &

phenix.refine data.mtz model.pdb   excessive_distance_limit=None ncs_groups_H.params ramachandran_restraints=true  main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true --overwrite main.ncs=true output.prefix=noH_ncsC_cust secondary_structure_restraints=true *cif xray_data.high_res=3.6 > & zlognoH_ncsC_cust &
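
If you would rather not paste six long lines, here is a minimal sketch that prints them from a loop. It is a dry run only (nothing is executed), it is written for sh rather than csh, and the option order differs from the lines above, which as far as I know should not matter. The function name, variable names, and NCS mode labels are my own; the options themselves are copied from the commands above, including ncs_groups_H.params for both models.

```shell
#!/bin/sh
# Dry run: print the six phenix.refine command lines (2 models x 3 NCS modes).
# Nothing is executed here; review the output before running any of it.

# Options shared by all six jobs, copied from the commands above.
# The string stays quoted so that *cif is not expanded by this script.
COMMON='ramachandran_restraints=true main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true --overwrite main.ncs=true secondary_structure_restraints=true *cif xray_data.high_res=3.6'

print_refine_commands() {
    for model in model_H.pdb model.pdb; do
        # Output prefix tag: H for the model with hydrogens, noH otherwise.
        case "$model" in
            model_H.pdb) tag=H ;;
            *)           tag=noH ;;
        esac
        for ncs in torsion cartesian_auto cartesian_custom; do
            case "$ncs" in
                torsion)
                    extra='ncs.type=torsion'
                    prefix="$tag" ;;
                cartesian_auto)
                    extra='excessive_distance_limit=None'
                    prefix="${tag}_ncsC" ;;
                cartesian_custom)
                    # The commands above pass ncs_groups_H.params for both models.
                    extra='excessive_distance_limit=None ncs_groups_H.params'
                    prefix="${tag}_ncsC_cust" ;;
            esac
            # "> file 2>&1" is the sh spelling of the csh ">&" redirect.
            printf '%s\n' "phenix.refine data.mtz $model $extra $COMMON output.prefix=$prefix > zlog$prefix 2>&1 &"
        done
    done
}

print_refine_commands
```

Once the six printed lines look right, they can be pasted into an sh-compatible shell or saved to a script.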

It may not be necessary to use weight optimization, although I did not try without it. I recommend you try running without it: if it turns out not to be necessary, then it may save you many hours of time.

Also, the number of CB-outliers and Ramachandran outliers in your original model probably indicates that it is far from final and requires some careful manual analysis.

As Nat mentioned before, fit_rotamers will not work at resolutions lower than 2.8-3.0A, as it relies on "reasonably good" density. I plan to extend it to lower resolutions, but it will take some time.

You don't need to run tools like phenix.ramalyze, etc. separately after refinement, since most MolProbity statistics are reported by phenix.refine in REMARK 3 records, as in the examples above.
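
To pull just that block out of a refined PDB file, something like the following should do. It is a rough sketch that assumes the REMARK 3 layout is exactly as pasted above; the function name is made up for illustration.

```shell
#!/bin/sh
# Print the MolProbity block that phenix.refine writes into REMARK 3 records.
# Reads the files given as arguments, or standard input when there are none.
# First sed: keep lines from the statistics header through the C-beta line.
# Second sed: strip the "REMARK   3  " prefix, keeping the indentation.
molprobity_summary() {
    cat "$@" |
        sed -n '/^REMARK   3  MOLPROBITY STATISTICS/,/CBETA DEVIATIONS/p' |
        sed 's/^REMARK   3  //'
}

# Example input: the "MODEL with H, torsion NCS" values quoted above.
printf '%s\n' \
    'REMARK   3  MOLPROBITY STATISTICS.' \
    'REMARK   3   ALL-ATOM CLASHSCORE : 19.91' \
    'REMARK   3   RAMACHANDRAN PLOT:' \
    'REMARK   3     OUTLIERS : 1.99  %' \
    'REMARK   3     ALLOWED  : 6.99  %' \
    'REMARK   3     FAVORED  : 91.02 %' \
    'REMARK   3   ROTAMER OUTLIERS : 0.45 %' \
    'REMARK   3   CBETA DEVIATIONS : 0' \
    | molprobity_summary
```

On a real run you would call it as molprobity_summary H_001.pdb, where H_001.pdb is my assumption for the output file name that goes with output.prefix=H; substitute whatever file your run actually wrote.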

Finally, you don't really have to type all these long commands. Using parameter files is easy, and can reduce the amount of typing to something like:

phenix.refine params.eff

See https://www.phenix-online.org/presentations/latest/pavel_phenix_refine.pdf
for details and examples.

7) Finally finally, at 3.6A resolution or so, R-factors within ~2% of each other can be considered the same. For example, I would not say that r_free = 0.3245 is better than 0.3319.
For discussion see
http://phenix-online.org/newsletter/CCN_2011_07.pdf
article "Improved target weight optimization in phenix.refine".
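
If you want to apply this rule of thumb when comparing several runs, here is a small sketch. It reads "~2%" as an absolute difference of 0.02 in R-free, which is my interpretation rather than a strict criterion, and the function name is made up for illustration.

```shell
#!/bin/sh
# Succeed (exit 0) when two R-free values differ by no more than a tolerance.
# Usage: rfree_same A B [TOL]; TOL defaults to 0.02 (my reading of "~2%").
rfree_same() {
    awk -v a="$1" -v b="$2" -v tol="${3:-0.02}" 'BEGIN {
        d = a - b
        if (d < 0) d = -d      # absolute difference
        exit !(d <= tol)       # exit status 0 means "within tolerance"
    }'
}

# The two values from the example above.
if rfree_same 0.3245 0.3319; then
    echo "0.3245 and 0.3319: no meaningful difference"
else
    echo "0.3245 and 0.3319: different"
fi
```

By this measure all six R-free values listed in 2) fall within 0.02 of each other, so the geometry statistics are the more useful discriminator here.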

Please let me know if you have any questions or need any help with this.

Pavel.


On 7/30/11 11:35 AM, Tatyana Sysoeva wrote:
I ran the same commands with the build dev-833.
Below are the results.
These parameters were obtained by running phenix.ramalyze, cbetadev, and clashscore.
I am almost done repeating the fix_rotamers run. The first time it gave much worse validation results than running phenix without fix_rotamers=true.
I don't understand what I am doing wrong in the runs and would appreciate your help!

Thanks,
Tanya


I used phenix-dev-833 to test the riding hydrogens refinement at 3.6A.

I ran two identical commands:

Command line arguments: "model.pdb" "data.mtz" "main.ncs=true" "ncs.find_automatically=false" "ncs_groups.params" "refinement.input.xray_data.high_resolution=3.6" "mgadpbef.cif" "refinement.ncs.excessive_distance_limit=None" "strategy=individual_sites+individual_sites_real_space+group_adp+occupancies"

with the only difference being the input files: ncs.params and model.pdb

The model.pdb for the second run contained a model identical to the control run but with H atoms added. I did not add H to the ADP molecules since I did not know how to write a correct CIF file for them. The NCS definitions were changed by adding "and not (element H)" to each line, in order to exclude H from the NCS groups.

                   no hydrogens          with hydrogens

Date               2011-07-29            2011-07-29
Time               19:09:58 EDT -0400    20:14:20 EDT -0400
Epoch time (s)     1311980998.46         1311984860.16
Wall clock time    2646.75 s             6517.60 s

Start R-work       0.2891                0.2910
Start R-free       0.3267                0.3273
Final R-work       0.2868                0.2720
Final R-free       0.3304                0.3353

clashscore         31.77                 96.12
cbeta              0                     151
rama outliers      12.62%                19.67%

Interestingly, in the previous release version the same refinement runs produced:

without H: clashscore = 77.580195, cbeta = 8

with H: clashscore = 119.729960, cbeta = 330